xAI

Member of Technical Staff – Multimodal (Real-time Video Understanding)

xAI | Palo Alto / Seattle, United States | Onsite
$180k - $440k USD (Verified)

Job Description

Join xAI’s Grok Voice team as an engineer pushing the frontier of multimodal intelligence for real-time video understanding. The work involves building and scaling models that can listen, see, and respond in real time, often using JAX, Python, and Rust, and the role expects hands-on engineering excellence.

Responsibilities

  • Design and implement multimodal models for real-time video and audio understanding
  • Optimize training and inference pipelines for low latency and high reliability
  • Collaborate with other engineers across infra, data, and product for Grok Voice
  • Contribute to xAI’s core codebase and shared modeling tooling
  • Participate in a fast-moving, highly technical culture focused on shipping

Benefits

  • High-end salary plus equity (e.g., ranges like $180,000–$440,000 have been reported for similar MTS roles)
  • Comprehensive medical, dental, vision, disability, and life insurance
  • 401(k) plan and visa sponsorship
  • Flexible vacation in a culture that still expects intense commitment

Category

LLM / Generative AI Engineer
