AddSource is seeking a Data Platform Reliability Engineer (DRE) to join its Agile team. In this role, you will keep production data engineering pipelines reliable, performant, and trustworthy across the enterprise data ecosystem.
Responsibilities:
- Monitor and maintain production data engineering pipelines to ensure high reliability, performance, and data quality
- Troubleshoot and resolve data pipeline failures using Apache Airflow, dbt, and Snowflake native ETL services
- Write, optimize, and tune complex SQL queries for transformations, validations, and performance improvements
- Participate in an on-call rotation to respond to production incidents, data quality issues, and platform alerts
- Collaborate with data engineers and analysts to support pipeline development and resolve data-related challenges
- Perform DataOps monitoring activities, including log analysis, SLA tracking, and bottleneck identification
- Manage code deployments using GitLab/GitHub, adhering to version control and change management best practices
- Create and maintain operational documentation, runbooks, and troubleshooting guides
- Identify ETL/ELT optimization opportunities and implement performance and efficiency improvements
- Coordinate with cross-functional teams during incidents, providing timely updates and driving resolution
- Conduct root cause analysis (RCA) on production issues and implement preventive measures
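To illustrate the kind of SQL validation and data-quality monitoring work described above, here is a minimal, self-contained sketch. The table, column, and threshold are invented for the example; in production, similar checks would typically run against Snowflake from an Airflow task or as a dbt test rather than against SQLite.

```python
import sqlite3

def null_rate(conn, table, column):
    """Return the fraction of NULL values in `column` of `table`.

    A simple example of the validation queries a DRE writes to
    catch data-quality regressions in a pipeline.
    """
    row = conn.execute(
        f"SELECT 1.0 * SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END)"
        f" / COUNT(*) FROM {table}"
    ).fetchone()
    return row[0]

# Build a toy table standing in for a production fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, None), (3, 11), (4, 12)],
)

# Compare the observed null rate against an SLA threshold and alert.
rate = null_rate(conn, "orders", "customer_id")
threshold = 0.05
if rate > threshold:
    print(f"ALERT: customer_id null rate {rate:.0%} exceeds {threshold:.0%}")
```

In a real DataOps setup the alert would feed a monitoring or paging system rather than `print`, and the threshold would come from the pipeline's SLA definition.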
Requirements:
- 5+ years of relevant experience in data engineering, platform reliability, or DataOps roles
- Engineering Degree – BE / ME / BTech / MTech / BSc / MSc
- Expert-level proficiency in SQL, including query writing, optimization, and tuning
- Strong understanding of both SQL and NoSQL database systems
- Experience implementing data quality frameworks and reliability practices
- Proven experience monitoring and maintaining ETL/ELT pipelines
- Hands-on experience with Apache Airflow
- Hands-on experience with dbt (data build tool)
- Hands-on experience with Snowflake native ETL services
- Production-level experience within the Snowflake ecosystem
- Working knowledge of data platform architecture and operations (intermediate proficiency)
- Experience with DataOps monitoring and on-call production support
- Strong exposure to AWS (preferred)
- Proficiency in GitLab and GitHub for version control and CI/CD practices
- Relevant technical certifications are a plus
- Experience implementing CI/CD pipelines in Snowflake using GitLab or GitHub Actions
- Experience automating deployments through job schedulers and deployment frameworks
- Familiarity with Snowflake Agentic Framework and Cortex Agents
- Exposure to AI orchestration tools such as LangChain and LangGraph