xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. The Member of Technical Staff - Inference will design and optimize large-scale model serving systems for Grok, building a high-performance inference platform to serve millions of users daily. The role involves architecting scalable distributed infrastructure, optimizing latency and throughput, building high-concurrency serving systems, benchmarking, fine-tuning inference engines, developing custom trace tools, creating CI/CD infrastructure, and accelerating research on scaling compute and model-hardware co‑design.
Salary
USD 180,000 - 440,000
Requirements
Skills
Deep low-level systems programming (C/C++ or Rust)
Experience with large-scale, high-concurrency production serving
Experience with GPU inference engines (vLLM, SGLang, Triton, TensorRT-LLM, etc.)
Strong background in systems optimization: batching, caching, load balancing, parallelism