Site Operations Manager at xAI

On-site - Memphis, TN

The Site Operations Manager oversees data center technicians and operations at xAI’s Memphis location. The role is responsible for keeping AI infrastructure uptime high, managing power, cooling, networking, and hardware, and leading the technician team, while driving sustainable solutions and maintaining efficient processes.

Requirements

Skills

  • 5+ years of experience in data center operations or similar critical environments, with 3+ years managing technical teams
  • Proven ability to lead teams effectively in fast-paced, high-responsibility settings
  • Solid expertise in server hardware, cabling, and data center technologies, from setup to lifecycle management
  • Experience supporting compute-heavy environments like AI, machine learning, or high-performance computing
  • Proficiency with tools like Jira and managing collaborative workflows across teams
  • Strong analytical skills and the ability to explain technical concepts clearly to diverse audiences
  • Familiarity with scripting (e.g., Python, Bash) to automate tasks and boost team efficiency
  • A history of partnering with vendors, scaling operations, and advancing sustainability initiatives
  • Enthusiasm for xAI’s mission to accelerate human discovery and unravel the universe
  • Ability to thrive in a dynamic, mission-focused environment with occasional on-call duties
  • Willingness to travel to data center locations as needed to support operations
  • Physical capability to handle data center tasks, including lifting up to 50 lbs, standing for long periods, and occasional ladder use

Responsibilities

  • Oversee Site Operations: Manage power, cooling, networking, and hardware deployments to ensure 99.999% uptime for xAI’s AI compute systems
  • Guide Your Team: Lead and develop a team of Data Center Operations Technicians through training, performance evaluations, and fostering a collaborative, high-performing environment tied to xAI’s objectives
  • Streamline Processes: Take charge of hardware lifecycles, incident resolution, and inventory management, refining procedures to ensure your team operates with precision and consistency
  • Connect Key Players: Coordinate between technicians, xAI’s AI specialists, and external vendors to integrate new technology and expand capacity seamlessly
  • Drive Sustainable Solutions: Champion energy-efficient practices and sustainability efforts, optimizing resources while supporting the demands of cutting-edge AI workloads
  • Measure Success: Track and report key metrics like uptime, power efficiency, and issue resolution times, using data to enhance site performance and inform decisions
  • Handle Emergencies: Lead the team through urgent situations with clear direction, resolving issues quickly to protect our AI systems from disruption
  • Optimize Operations: Build and refine processes—such as preventative maintenance schedules with vendors and ticket workflows in Jira—to keep operations efficient and scalable
  • Support Expansion: Work with leadership to standardize best practices across sites (if applicable), ensuring operations align with xAI’s ambitious growth plans

Technologies

Jira, Python, Bash
