YO IT CONSULTING is seeking a Senior Data Engineer who thrives on precision and systems thinking to contribute to the next generation of AI systems. The role involves designing and maintaining production-grade data pipelines, evaluating AI-generated content for accuracy, and shaping AI communication standards.
Responsibilities:
- Evaluate AI-generated answers to data engineering prompts for technical accuracy, completeness, clarity, and real-world feasibility
- Challenge advanced language models with complex data engineering scenarios involving SQL, Python, ETL/ELT design, orchestration, warehousing, data modeling, and pipeline reliability
- Review and refine AI-generated prompts, responses, rubrics, and reference answers to ensure they reflect senior-level data engineering judgment
- Provide structured feedback that identifies incorrect assumptions, missing constraints, weak reasoning, inefficient implementations, or unsafe recommendations
- Shape AI communication standards by helping models explain data architecture, debugging steps, tradeoffs, and implementation patterns clearly and responsibly
- Support benchmarking efforts by evaluating model performance across realistic data engineering workflows, edge cases, and failure modes
- Develop and review high-quality examples that demonstrate strong reasoning around pipeline design, data quality checks, data contracts, schema evolution, and system scalability
Requirements:
- 4+ years of professional experience in data engineering, with significant hands-on work designing, building, and maintaining production-grade data pipelines
- Deep knowledge of SQL, data modeling, ETL/ELT architecture, orchestration frameworks, warehouse/lakehouse patterns, and modern data stack tools such as dbt, Airflow, Snowflake, BigQuery, Databricks, Fivetran, or similar platforms
- Strong understanding of distributed data systems, batch and streaming workflows, schema design, data validation, data observability, lineage, and pipeline reliability
- Proven experience optimizing complex SQL queries, troubleshooting data quality issues, designing scalable transformations, and supporting analytics or machine learning-ready datasets
- Demonstrated experience translating ambiguous business or technical requirements into reliable data models, pipeline designs, and implementation plans
- Bachelor's degree in Computer Science, Data Engineering, Information Systems, Statistics, Engineering, or a related technical field; equivalent professional experience will also be considered
- Previous experience with AI data training, annotation, or evaluation of AI-generated technical content is a strong plus