Integrate simulators into the Cloud platform. Own the seam between our simulators and the rest of Cloud: how a simulator job gets deployed, how its physics and rendering components are packaged and plumbed together, how it’s instrumented, and how it runs efficiently at scale.
Run sim reliably in automation. Make it trivially easy to kick off simulation workflows as part of CI, and keep the nightly sim workflows that gate our releases reliable and fast.
Be Cloud’s point person for simulation engineers. Translate the Sim team’s needs into platform work, push composable primitives back into the rest of Cloud, and obsess over removing friction from the sim workflow so engineers can iterate as fast as possible.
Prepare for what’s next. Sim at Bedrock is evolving quickly. You’ll help us evaluate options and stand up the infrastructure for whichever directions we commit to.
Requirements
6+ years of professional software engineering experience with demonstrated ownership of production systems.
Strong Python skills and comfort with API design, async patterns, and cloud-native development.
Solid cloud infrastructure background. You’re comfortable in AWS, understand distributed systems concepts (orchestration, state management, retries, spot/preemptible compute), and can reason about cost and performance trade-offs.
Experience with deployment and service integration: gRPC, container orchestration, instrumentation. You’ve owned the operational glue between systems before.
Generalist platform sensibility. Comfortable working across the stack: backend services, data pipelines, CI systems, and internal UIs. You care about how the whole loop feels for the engineers using it.
Strong written communication and a bias toward small, well-instrumented systems over heavy frameworks.
Preferred Qualifications
Experience with Ray or comparable distributed compute frameworks (Spark, Kubernetes-native job systems, etc.).
Experience operating large-scale data systems, whether measured in petabytes of data, FLOPS, or number of vCPUs.
Experience in an early-stage startup environment designing, building, and launching new platform capabilities from scratch.