Deccan AI is a model training and evaluation startup headquartered in Mountain View, CA. As a Robotics Simulation & Synthetic Data Intern, you will own the end-to-end synthetic data generation pipeline and contribute directly to training runs for vision-language-action (VLA) models.
Responsibilities:
- Build and operate synthetic trajectory pipelines using NVIDIA Isaac Sim, Isaac Lab, and the GR00T-Mimic blueprint to generate large-scale manipulation and locomotion datasets from a small seed of human demonstrations
- Configure simulation environments—scene composition, physics parameters, robot URDF/USD models, camera placements, and domain randomization—to maximize trajectory diversity and sim-to-real transfer quality
- Apply world-foundation-model augmentation (NVIDIA Cosmos Transfer / GR00T-Dreams) to transform sim-rendered frames into photorealistic training images with varied lighting, textures, and backgrounds
- Design and run data-quality experiments: measure success rates, trajectory smoothness, and visual fidelity, then iterate on pipeline parameters to improve downstream policy performance
- Curate and version datasets in formats compatible with VLA model training (e.g., the RLDS format used by Open X-Embodiment, or the LeRobot dataset format), ensuring metadata, task labels, and action annotations meet client specifications
- Benchmark synthetic vs. real data by training imitation-learning policies (e.g., ACT, Diffusion Policy) on mixed datasets and reporting sim-to-real transfer metrics
- Document pipelines and author technical guides so the team can reproduce, scale, and extend your work beyond the internship
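
For a sense of the data-quality work listed above, the following is a minimal sketch of two of the named measurements: success rate and trajectory smoothness. The `Trajectory` container and both function names are hypothetical and do not come from Isaac Lab, LeRobot, or any Deccan AI codebase. Mean squared jerk is used here only as one common smoothness proxy; the metric the team actually uses is not stated in this posting.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Trajectory:
    """Hypothetical container for one simulated episode."""
    positions: List[Sequence[float]]  # positions[t][j] = joint j at time step t
    dt: float                         # seconds between consecutive samples
    success: bool                     # task-success flag reported by the simulator


def success_rate(trajectories: Sequence[Trajectory]) -> float:
    """Fraction of episodes flagged as successful (0.0 for an empty batch)."""
    if not trajectories:
        return 0.0
    return sum(1 for t in trajectories if t.success) / len(trajectories)


def mean_squared_jerk(traj: Trajectory) -> float:
    """Smoothness proxy: mean of squared jerk over all joints and time steps.

    Jerk (third time derivative of position) is approximated with a
    third-order forward finite difference. Lower values mean smoother motion.
    """
    p = traj.positions
    if len(p) < 4:
        raise ValueError("need at least 4 samples to estimate jerk")
    total, count = 0.0, 0
    for t in range(len(p) - 3):
        for j in range(len(p[t])):
            third_diff = p[t + 3][j] - 3 * p[t + 2][j] + 3 * p[t + 1][j] - p[t][j]
            jerk = third_diff / traj.dt ** 3
            total += jerk * jerk
            count += 1
    return total / count
```

In practice a report of this kind would be computed per task and per randomization setting, so that a change to a pipeline parameter can be tied to a change in downstream policy performance.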