Scout AI is developing Fury, the first robotic foundation model for defense, to provide U.S. forces with advanced autonomous capabilities. The AI Infrastructure Engineer will design and scale the infrastructure for model training and deployment, keeping it efficient across cloud and on-prem environments.
Responsibilities:
- Design and implement data pipelines for ingesting, transforming, and storing petabytes of multimodal data from Fury’s robotic and operator systems
- Develop internal tooling for dataset exploration, curation, versioning, and quality monitoring over time
- Build and maintain distributed training infrastructure (cloud and on-prem) for large-scale multimodal and foundation model training
- Implement job orchestration workflows for launching, tracking, and debugging large-scale model runs
- Identify and remediate bottlenecks in compute, memory, storage, and network performance to optimize throughput and cost efficiency
- Collaborate with AI, autonomy, and systems teams to ensure that data and training infrastructure support real-time and mission-critical use cases
- Maintain observability and reliability tooling for training and inference pipelines
- Stay current on best practices in MLOps, distributed training frameworks, and AI infrastructure at scale