YO IT CONSULTING is seeking a Senior Data Engineer who thrives on precision and systems thinking to contribute to the next generation of AI systems. The role involves designing and maintaining production-grade data pipelines, evaluating AI-generated content for accuracy, and shaping AI communication standards.

Responsibilities:

Evaluate AI-generated answers to data engineering prompts for technical accuracy, completeness, clarity, and real-world feasibility
Challenge advanced language models with complex data engineering scenarios involving SQL, Python, ETL/ELT design, orchestration, warehousing, data modeling, and pipeline reliability
Review and refine AI-generated prompts, responses, rubrics, and reference answers to ensure they reflect senior-level data engineering judgment
Provide structured feedback that identifies incorrect assumptions, missing constraints, weak reasoning, inefficient implementations, or unsafe recommendations
Shape AI communication standards by helping models explain data architecture, debugging steps, tradeoffs, and implementation patterns clearly and responsibly
Support benchmarking efforts by evaluating model performance across realistic data engineering workflows, edge cases, and failure modes
Develop and review high-quality examples that demonstrate strong reasoning around pipeline design, data quality checks, data contracts, schema evolution, and system scalability

Requirements:

4+ years of professional experience in data engineering, with significant hands-on work designing, building, and maintaining production-grade data pipelines
Deep knowledge of SQL, data modeling, ETL/ELT architecture, orchestration frameworks, warehouse/lakehouse patterns, and modern data stack tools such as dbt, Airflow, Snowflake, BigQuery, Databricks, Fivetran, or similar platforms
Strong understanding of distributed data systems, batch and streaming workflows, schema design, data validation, data observability, lineage, and pipeline reliability
Proven experience optimizing complex SQL queries, troubleshooting data quality issues, designing scalable transformations, and supporting analytics-ready or machine-learning-ready datasets
Demonstrated experience translating ambiguous business or technical requirements into reliable data models, pipeline designs, and implementation plans
Bachelor's degree in Computer Science, Data Engineering, Information Systems, Statistics, Engineering, or a related technical field; equivalent professional experience will also be considered
Previous experience with AI data training, annotation, or evaluating AI-generated technical content is a strong plus

Data Engineer - AI Model Training – Remote
