Principal Engineer, Inference

CoreWeave | Bellevue / Sunnyvale, United States | Hybrid
$206k - $303k USD (Verified)

Job Description

The Principal Engineer will lead CoreWeave’s next-generation GPU inference platform, architecting ultra-low-latency, large-scale model serving across massive GPU clusters.

Responsibilities

  • Define technical roadmap for high-throughput, low-latency inference
  • Design Kubernetes-native control-plane components for model serving
  • Implement optimizations like micro-batching and KV-cache reuse
  • Build observability, debugging and rollout tooling for models
  • Mentor engineers on large-scale inference best practices
  • Partner with customers to optimize production AI applications

Benefits

  • Salary range around 206,000–303,000 USD plus equity and benefits
  • Work on frontier-scale GPU clusters for major AI customers
  • Exposure to world-class AI labs and enterprise clients

Category

MLOps / AI Infrastructure

Posted

11/24/2025

Ready to Apply?

Applications go directly to CoreWeave's career portal
