Staff Machine Learning Engineer, AI Serving

Jobgether
2 days ago
Full-time
Remote
Web Development

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Staff Machine Learning Engineer, AI Serving, in the United States.

This role sits at the core of a large-scale machine learning infrastructure organization focused on powering real-time recommendations, content discovery, and generative AI systems at massive scale. You will be responsible for designing and evolving high-performance inference systems that support millions of queries per second with strict latency and reliability requirements. The position combines deep systems engineering with advanced ML deployment, spanning GPU-based model serving, Kubernetes orchestration, and distributed cloud infrastructure. You will play a key role in shaping how large models and LLMs are served efficiently in production environments. Working in a highly collaborative and technically advanced team, you will influence platform architecture that directly impacts user experience, ranking systems, and AI-driven features. This is a high-impact engineering role where scalability, performance, and reliability are central to success.

Accountabilities:

  • Lead the design, development, and maintenance of a large-scale ML inference platform supporting low-latency, high-throughput model serving for search, ranking, and generative AI workloads.
  • Architect and implement GPU-based serving systems capable of handling millions of queries per second with strong reliability and performance guarantees.
  • Build and optimize end-to-end inference pipelines, including routing, caching, batching, and feature processing systems.
  • Develop and maintain model export frameworks to convert trained models into optimized formats for efficient GPU inference.
  • Design and improve observability systems for real-time monitoring of model performance, system health, and feature behavior.
  • Lead efforts in benchmarking, performance tuning, and scalability improvements across multi-cluster cloud environments.
  • Collaborate with cross-functional ML, infrastructure, and product teams to support production deployment of large-scale ML and LLM systems.

Requirements

  • 7+ years of experience in Machine Learning Engineering, AI Platform Engineering, or large-scale distributed systems development.
  • Strong experience operating and scaling Kubernetes-based infrastructure in production environments.
  • Deep knowledge of ML serving systems, inference pipelines, and production-grade AI deployment.
  • Strong programming skills in Python and/or Go, with experience in building scalable backend or ML systems.
  • Hands-on experience with modern ML/AI frameworks and tooling such as PyTorch, Triton, vLLM, or similar technologies.
  • Experience with cloud platforms (AWS, GCP) and infrastructure tooling such as Terraform or equivalent.
  • Strong understanding of observability, monitoring, and performance tuning for real-time systems.
  • Ability to communicate complex technical concepts clearly to both technical and non-technical stakeholders.
  • Strong ownership mindset with a focus on scalability, reliability, and developer experience.

Benefits

  • Competitive compensation package with base salary, equity (RSUs), and potential performance-based incentives.
  • Comprehensive healthcare coverage including medical, dental, and vision insurance.
  • Retirement plan with employer matching contributions.
  • Flexible remote-first work environment.
  • Generous paid time off, including vacation, holidays, and volunteer days.
  • Paid parental leave and family support programs.
  • Mental health support, coaching, and wellness resources.
  • Learning and development support for professional growth.
  • Additional benefits covering workspace support, caregiving, and family planning.

How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
 
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.