Design, build, and maintain scalable data pipelines that power analytics and data-driven decision-making across Replit (e.g., tracking Repl deployments and AI agent usage).
Develop ETL/ELT workflows using modern data stack tools and transform raw data into clean, reliable datasets that enable self-service analytics.
Partner with teams across the company to understand data needs, deliver robust solutions, and implement data quality monitoring to ensure accuracy and reliability.
Examples of what you could do:
Build unified data models combining product usage, billing, and customer data to enable cohort analysis and retention tracking (a rough sketch of such a model follows this list).
Design real-time pipelines that surface key metrics, along with automated data quality checks that catch inconsistencies before they reach downstream users.
Create dimensional models that enable flexible analysis of user behavior, feature adoption, and conversion funnels.
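As a loose illustration of the first example above (not part of the role description), a dbt-style model along these lines might join product usage, billing, and customer data into a cohort-ready table. The staging model names (stg_product_usage, stg_billing, stg_customers) and columns are hypothetical, and the SQL assumes a Snowflake-flavored warehouse.

    -- models/marts/fct_user_cohorts.sql (hypothetical dbt model)
    -- Combines usage, billing, and customer data into one fact table
    -- keyed by user and activity month, for cohort/retention analysis.
    with usage as (
        select
            user_id,
            date_trunc('month', event_timestamp) as activity_month,
            count(*) as event_count
        from {{ ref('stg_product_usage') }}
        group by 1, 2
    ),

    billing as (
        select
            user_id,
            date_trunc('month', invoice_date) as activity_month,
            sum(amount_usd) as revenue_usd
        from {{ ref('stg_billing') }}
        group by 1, 2
    ),

    customers as (
        select
            user_id,
            date_trunc('month', signup_date) as cohort_month
        from {{ ref('stg_customers') }}
    )

    select
        c.user_id,
        c.cohort_month,
        u.activity_month,
        u.event_count,
        coalesce(b.revenue_usd, 0) as revenue_usd
    from customers as c
    left join usage as u
        on c.user_id = u.user_id
    left join billing as b
        on u.user_id = b.user_id
        and u.activity_month = b.activity_month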
Requirements:
5+ years of experience building production data pipelines with strong SQL skills and experience designing data models.
Experience with modern data transformation tools (dbt preferred), proficiency in Python, and hands-on experience with cloud data warehouses (BigQuery, Snowflake, Redshift).
Understanding of data warehouse design principles and ability to communicate effectively with both technical and non-technical stakeholders.
Preferred Qualifications:
Experience with modern data stack tools (dbt, Fivetran, Segment, HEX, Databricks, Amplitude) and background in high-growth SaaS or PLG companies.
Familiarity with event-based analytics platforms, data visualization tools, and software engineering best practices.
Bonus Points:
Experience with real-time data processing or reverse ETL tools, or a background in developer tools and collaborative coding environments.
Knowledge of data governance frameworks, or experience with machine learning pipelines and feature engineering.