Carrot is a global fertility and family care platform that supports members through significant life moments. The Data Engineer role focuses on enhancing the reliability and scalability of Carrot's data platform by building and maintaining automated data pipelines, collaborating with cross-functional teams, and ensuring high-quality data solutions.
Responsibilities:
- Build, test, deploy, monitor, and iterate on scalable ETL/ELT pipelines using platforms such as dbt and Fivetran; contribute to standard workflows (e.g., member engagement, finance/billing reports) and support ad hoc reporting needs
- Contribute to the design and continuous improvement of Snowflake and related cloud data platforms (AWS/GCP), including source integrations, performance tuning, and cost optimizations
- Develop auditable orchestration flows with tools like Prefect or Airflow; integrate with services such as S3, SFTP, APIs, and Files.com; implement robust alerting, retries, and monitoring (see the orchestration sketch after this list)
- Identify opportunities to improve pipeline speed, reliability, and cost; help refactor legacy processes, reduce manual steps, and document changes for scale and reuse
- Support automated data integrations and exports (e.g., eligibility feeds, payroll/tax/billing files) in partnership with finance, product, legal, and commercial teams; help enforce data specifications and secure handling of sensitive information
- Follow best practices for Git/GitHub, code reviews, testing, Jira workflows, audit-trail documentation, and exception logging; uphold InfoSec and compliance standards across the data lifecycle
- Help design modular, well-documented data models and marts (dbt/Snowflake) that enable BI and data science use cases (e.g., segmentation, provider matching, engagement analytics)
- Participate in on-call or escalation routines for high-impact data incidents; assist with root-cause analysis, remediation, and clear stakeholder communications
- Codify business rules and automate recurring cycles (e.g., monthly finance handoffs, audit logs) to reduce manual intervention and operational risk
- Implement privacy-conscious practices such as masking and de-identification; contribute to safe, compliant access across the stack and external data exchanges
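As a rough illustration of the orchestration responsibilities above, here is a minimal Prefect sketch (assuming Prefect 2.x; the flow name, task names, and S3 path are hypothetical) showing automatic retries and a simple failure alert around a single load step:

```python
from prefect import flow, task, get_run_logger

# Hypothetical extract/load step: retried automatically on transient failures.
@task(retries=3, retry_delay_seconds=300)
def load_eligibility_file(path: str) -> int:
    logger = get_run_logger()
    logger.info(f"Loading eligibility feed from {path}")
    # ... fetch from SFTP/S3 and stage into Snowflake here ...
    return 0  # rows loaded (placeholder)

@task
def alert_on_failure(message: str) -> None:
    # Placeholder for a real alerting integration (e.g., a Slack or PagerDuty webhook).
    print(f"ALERT: {message}")

@flow(name="eligibility-feed")
def eligibility_feed(path: str = "s3://example-bucket/eligibility.csv") -> None:
    try:
        rows = load_eligibility_file(path)
        get_run_logger().info(f"Loaded {rows} rows")
    except Exception as exc:
        alert_on_failure(f"eligibility-feed failed: {exc}")
        raise  # keep the flow run marked as failed for the audit trail

if __name__ == "__main__":
    eligibility_feed()
```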
Requirements:
- Proficiency with Snowflake, dbt, and Python for data modeling, transformations, and pipeline development in production environments
- Strong SQL skills for complex querying and performance optimization across large datasets
- Experience building and maintaining automated ETL/ELT pipelines using orchestration tools such as Prefect or Airflow, including alerting, dependency management, and retries
- Hands-on exposure to cloud data platforms (e.g., Snowflake, AWS S3/Redshift, Google Cloud) and integrating new data sources at scale
- Understanding of secure external data flows via SFTP, Files.com, and APIs, with attention to reliability and compliance requirements
- Comfort with Git/GitHub and Jira for version control, code reviews, and operational transparency in a team setting
- Track record of automating manual workflows and creating reusable components that reduce operational overhead
- Collaborative, dependable working style with adaptability in dynamic environments and clear communication with technical and non-technical partners
- Experience in fast-paced or growth environments delivering data solutions under shifting priorities and timelines
- Background in enabling BI and data science teams through scalable integrations, data marts, and documentation that support multi-tenant reporting or advanced analytics
- Familiarity with batch and event-driven paradigms, including near-real-time pipelines using Snowflake, dbt, and Python
- Experience automating operational, financial, or compliance-driven workflows in cloud data environments
- Exposure to audit, privacy, and compliance frameworks (e.g., SOC, HIPAA, GDPR, ISO, SOX) and role-based access controls in regulated or healthtech contexts (a brief de-identification sketch follows this list)
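To illustrate the privacy-conscious handling referenced in the responsibilities and the final requirement, a minimal sketch (assuming pandas is available; the column names and salt are hypothetical, and a real implementation would follow the team's approved masking standards) of de-identifying member data before an external export:

```python
import hashlib
import pandas as pd

def deidentify_for_export(df: pd.DataFrame, salt: str) -> pd.DataFrame:
    """Return a copy safe for external exchange: hash member IDs, mask emails."""
    out = df.copy()
    # One-way hash so downstream joins still work without exposing the raw ID.
    out["member_id"] = out["member_id"].astype(str).map(
        lambda v: hashlib.sha256((salt + v).encode()).hexdigest()
    )
    # Simple masking for direct identifiers that downstream consumers do not need.
    out["email"] = "***MASKED***"
    return out

# Example usage with hypothetical data:
raw = pd.DataFrame({
    "member_id": [101, 102],
    "email": ["a@example.com", "b@example.com"],
    "plan": ["gold", "silver"],
})
print(deidentify_for_export(raw, salt="rotate-me"))
```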