ChatGPT Jobs is an MIT-born, venture-backed Silicon Valley startup focused on reinventing design and manufacturing through Engineering General Intelligence. The Data Engineer - Manufacturing will transform raw customer data into structured datasets for AI model training, while collaborating with cross-functional teams to ensure data quality and reliability.
Responsibilities:
- Ingest, clean, transform, and structure customer and internally generated engineering data for AI training and inference
- Design and build high-quality mechanical components and assemblies in CAD to serve as authoritative ground truth for evaluating and training AI systems
- Produce labeled datasets, reference designs, annotations, exploded views, sequences, and other engineering artifacts that encode real-world reasoning
- Apply engineering judgment to define and assess output quality across datasets
- Continuously refine standards for metadata, annotation, and model quality, maintaining a living "definition of quality" for mechanical engineering (ME) datasets
- Collaborate with Product Managers to shape tooling used for annotation, data correction, model-output review, and pipeline automation
- Provide detailed feedback on tool usability, workflow efficiency, and automation opportunities
- Help develop scalable, repeatable data processes that improve throughput and data consistency
- Partner closely with engineering and research teams to understand model data requirements, failure modes, and areas needing new data
- Influence model behavior by supplying representative engineering examples and ground-truth mechanical designs
- Partner with customer-facing teams to translate domain requirements, industry standards, and customer data schemas into actionable dataset specifications
- Serve as a subject matter expert on mechanical engineering formats, CAD standards, manufacturing practices, and design artifacts
- Generate technical documentation, exploded views, sequences, and annotations that encode engineering reasoning into training data
- Ensure that datasets reflect real-world constraints, DFM (Design for Manufacturing) considerations, material behavior, and industry best practices
- Embed engineering reasoning into training data so that AI systems learn not just geometry or text, but engineering intent
- Work with customers to understand their data sources, schemas, formats, and quality expectations
- Guide customers in preparing high-quality datasets, defining structured schemas, and improving data pipelines
- Support delivery timelines by communicating progress clearly and surfacing risks or issues early
- Work with external contractors and review their output, ensuring high quality and adherence to standard operating procedures (SOPs)
Requirements:
- Strong domain expertise in mechanical engineering, manufacturing design, or industrial workflows
- Hands-on experience with CAD tools such as SolidWorks, CATIA, Siemens NX, or Creo
- Familiarity with annotation tools and illustration software (e.g., Creo Illustrate, Adobe Illustrator, Arbortext)
- Ability to interpret complex mechanical assemblies, technical drawings, GD&T, and engineering documentation
- Experience creating artifacts like exploded views, work-step sequences, repair manuals, or manufacturing instructions
- Strong problem-solving skills and the ability to translate domain workflows into structured data requirements
- Excellent communication and cross-functional collaboration skills
- Experience with data operations, labeling workflows, ML data pipelines, or the AI/ML data lifecycle (collection -> labeling -> QA -> training -> evaluation -> deployment)
- Experience in fast-paced startup or high-growth environments
- Comfort with customer-facing discovery or solutioning