Remote AI Inference Engineer for Forward Deployed Projects at Red Hat, Inc.
Join Red Hat as a Forward Deployed Engineer
Overview of the Role
Red Hat, Inc. is seeking a customer-obsessed engineer to join our vLLM and LLM-D engineering team. As a Forward Deployed Engineer, you will bridge our state-of-the-art inference platforms (LLM-D and vLLM) with our customers’ critical production environments, deploying, optimizing, and scaling distributed Large Language Model (LLM) inference systems in direct collaboration with client engineering teams.
Key Responsibilities
Orchestrate Distributed Inference
- Deploy and configure LLM-D and vLLM on Kubernetes clusters.
- Implement advanced deployment strategies such as disaggregated serving, KV-cache-aware routing, and KV-cache offloading to optimize hardware utilization.
Optimize for Production
- Conduct performance benchmarks and tune vLLM parameters.
- Configure intelligent inference routing policies to meet Service Level Objectives (SLOs) for latency and throughput.
- Focus on critical metrics like Time Per Output Token (TPOT), GPU utilization, and Kubernetes scheduler efficiency.
Collaborate with Customer Engineers
- Write production-quality code in Python, Go, and YAML.
- Integrate our inference engine into existing Kubernetes ecosystems alongside customer engineers.
Solve Complex Infrastructure Challenges
- Debug intricate interactions between model architectures (such as MoE and large context windows), hardware accelerators (NVIDIA GPUs, AMD GPUs, TPUs), and Kubernetes networking frameworks (Envoy/Istio).
Provide Feedback to Engineering Teams
- Act as "Customer Zero" for core engineering teams.
- Share field learnings to influence the LLM-D and vLLM development roadmap.
- Travel as needed for customer presentations, demos, and proof-of-concept executions.
Qualifications
Experience
- 8+ Years of Engineering Experience: Proven track record in Backend Systems, SRE, or Infrastructure Engineering.
- Kubernetes Expertise: Deep understanding of Kubernetes primitives, CRDs, Operators, Controllers, and networking.
- AI Inference Proficiency: Familiarity with LLM forward passes, KV Caching, disaggregated decoding, and continuous batching.
Skills
- Systems Programming: Proficiency in Python (for model integration) and Go (for Kubernetes scheduling).
- Infrastructure as Code: Experience with Helm, Terraform, or similar tools.
- Cloud Knowledge: Comfortable deploying LLMs on bare-metal and hyperscaler Kubernetes clusters.
Additional Preferred Experience
- Contributions to open-source AI infrastructure projects (e.g., KServe, vLLM).
- Understanding of Envoy Proxy or Inference Gateway (IGW) frameworks.
- Knowledge of model optimization techniques such as Quantization and Speculative Decoding.
Salary and Benefits
The salary range for this position is $189,600 – $312,730, contingent upon qualifications. Compensation includes potential bonuses, commissions, and equity options.
Comprehensive Benefits
- Medical, dental, and vision coverage.
- Flexible Spending Account for healthcare and dependent care.
- 401(k) retirement plan with employer match.
- Paid time off and holidays.
- Paid parental leave and additional leave benefits.
- Employee stock purchase plans and tuition reimbursement.
About Red Hat
Red Hat is the world’s leading provider of enterprise open source software solutions. With a community-driven approach, we deliver high-performing Linux, cloud, container, and Kubernetes technologies across 40+ countries. Our work environment promotes flexibility, creativity, and collaboration, encouraging contributions from every team member.
Inclusion and Equal Opportunity
At Red Hat, our culture is built on transparency, collaboration, and inclusion. We celebrate diverse backgrounds and perspectives to drive innovation. We encourage applicants from all dimensions of our global community to join us.
Red Hat is an equal opportunity employer and is committed to providing reasonable accommodations for individuals with disabilities. For assistance with the online application, please contact us via email.
If you are looking for a new challenge in a dynamic and innovative environment, we welcome your application to join our team as a Forward Deployed Engineer at Red Hat.
