Principal engineer leading CoreWeave’s next-generation GPU inference platform, architecting ultra-low-latency, large-scale model serving across massive GPU clusters.
Responsibilities
Define technical roadmap for high-throughput, low-latency inference
Design Kubernetes-native control-plane components for model serving
Implement optimizations like micro-batching and KV-cache reuse (a micro-batching sketch follows this list)
Build observability, debugging and rollout tooling for models
Mentor engineers on large-scale inference best practices
Partner with customers to optimize production AI applications
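As a rough illustration of the micro-batching technique named above, here is a minimal Python sketch: individual inference requests are coalesced into one batched forward pass, flushing when either a batch-size limit or a latency budget is hit. The MicroBatcher class, its batch_size and max_wait_ms parameters, and the toy model_fn are hypothetical illustrations for this posting, not CoreWeave's actual serving stack.

```python
import asyncio
import time


class MicroBatcher:
    """Hypothetical sketch: coalesce single requests into small batches.

    Flushes when batch_size requests have queued up, or when the oldest
    queued request has waited max_wait_ms, whichever comes first.
    """

    def __init__(self, model_fn, batch_size=8, max_wait_ms=5):
        self.model_fn = model_fn        # callable taking a list of inputs
        self.batch_size = batch_size    # flush when this many requests queue up
        self.max_wait_ms = max_wait_ms  # ...or when the oldest has waited this long
        self.queue = asyncio.Queue()

    async def infer(self, x):
        # Enqueue the input with a future the batching loop will resolve.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((x, fut))
        return await fut

    async def run(self):
        while True:
            # Block until at least one request arrives, then start the clock.
            batch = [await self.queue.get()]
            deadline = time.monotonic() + self.max_wait_ms / 1000
            while len(batch) < self.batch_size:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            # One batched forward pass, then fan results back out.
            inputs = [x for x, _ in batch]
            outputs = self.model_fn(inputs)
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)


async def main():
    # Toy model: doubles each input; stands in for a batched GPU forward pass.
    batcher = MicroBatcher(model_fn=lambda xs: [x * 2 for x in xs])
    asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.infer(i) for i in range(10)))
    print(results)


asyncio.run(main())
```

In a real serving stack this flush policy is tuned per model: larger batches improve GPU utilization, while the wait budget caps the tail latency the batching adds.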
Benefits
Salary range around 206,000–303,000 USD plus equity and benefits
Work on frontier-scale GPU clusters for major AI customers
Exposure to world-class AI labs and enterprise clients
Category
MLOps / AI Infrastructure
Posted
11/24/2025
Ready to Apply?
Applications go directly to CoreWeave's career portal