Forward Deployed Engineer, AI Inference (vLLM and Kubernetes)

Red Hat, LLC

United States, New York, United States Full-time May 27, 2026

Opportunity Description

The vLLM and LLM-D Engineering team at Red Hat is looking for a customer-obsessed developer to join our team as a Forward Deployed Engineer. In this role, you will not just build software; you will be the bridge between our cutting-edge inference platform and our customers' most critical production environments.

You will interface directly with our customers' engineering teams to deploy, optimize, and scale distributed Large Language Model (LLM) inference systems. You will solve last-mile infrastructure challenges that defy off-the-shelf solutions, ensuring that massive models run with low latency and high throughput on complex Kubernetes clusters. This is not a sales engineering role; you will be part of the core vLLM and LLM-D engineering team.

What You Will Do

Orchestrate Distributed Inference: Deploy and configure LLM-D and vLLM on Kubernetes clusters. You will set up and configure advanced deployments like disaggregated ser...

Location: United States, New York
Country: United States
Type: Full-time
Category: Computer Occupations
Posted: May 27, 2026
Deadline: July 06, 2026
