Mundane is a venture-backed, seed-stage robot learning startup founded by a team of Stanford researchers and builders. The company is seeking a Data/ML Infrastructure Lead to build and own the data engine that powers robot learning. The role ensures that the system scales efficiently, turning raw robot operation data into high-quality datasets and reliable compute workflows.
Responsibilities:
- Design and implement a robust dataset format for robot learning (episodes, metadata, manifests, versioning)
- Build ingestion pipelines from robots (office + customer deployments) into centralized storage
- Implement scalable storage and compression strategies, including time-synchronized video and high-rate sensor streams
- Build fast dataset indexing and sampling tools (task-balanced sampling, hard-example mining, curriculum support)
- Improve dataloader throughput and stability (prefetching, caching, sharding, distributed loading)
- Standardize reproducible training workflows (dataset version + config + code commit + artifact lineage)
- Own and improve on-prem GPU training usability (multi-user workflows, monitoring, job hygiene)
- Enable multi-GPU training and distributed experimentation when needed
- Partner closely with hardware, deployment, and learning teams to ensure that logging and calibration are correct and data integrity is maintained