Omada Health is on a mission to inspire and engage people in lifelong health, one step at a time. They are seeking a highly skilled Data Engineer to design, build, and maintain robust data architectures, data models, and pipelines, ensuring data integrity, scalability, and performance.
Responsibilities:
- Data Architecture: Design, develop, and implement scalable, secure, and efficient data solutions that meet the needs of the organization
- Data Modeling: Create and maintain logical and physical data models to support business intelligence, analytics, and reporting requirements
- Pipeline Engineering: Design, build, and optimize ETL (Extract, Transform, Load) processes and data pipelines to ensure smooth and efficient data flow from various sources
- Data Integration: Integrate diverse data sources, including APIs, databases, and third-party data, into a unified data platform
- Performance Optimization: Monitor and optimize the performance of data systems and pipelines to ensure low latency and high throughput
- Data Quality and Governance: Implement data quality checks, validation processes, and governance frameworks to ensure the accuracy and reliability of data
- Collaboration: Partner closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet their needs
- Documentation: Maintain comprehensive documentation of data architectures, models, and pipelines for ongoing maintenance and knowledge sharing
- Training: Train teammates in data engineering best practices and collaborate with them effectively
- Technical Influence/Leadership: Recommend policy changes, establish department-wide procedures, and use extensive experience and knowledge to resolve complex problems
- Production Support: Monitor and manage the production environment to deliver data within defined SLAs
Requirements:
- 5+ years of experience building, maintaining, and orchestrating scalable data pipelines
- 3+ years of experience as a data engineer developing or maintaining integrations with orchestration software such as Airflow, or any Python-based data pipeline codebase
- Experience applying a variety of integration patterns for different use cases
- Experience in backend software development, enabling contributions to distributed computing and data technologies, with broad exposure across systems, contexts, and ideas
- Experience implementing data pipelines and improving the performance of ETL processes and related SQL queries
- Experience in data modeling for OLTP and OLAP applications
- Experience with cloud platforms such as AWS
- Familiarity with workflow management tools (Airflow preferred)
- Familiarity with cloud-based data warehouses (Amazon Redshift preferred)
- Exceptional problem-solving and analytical skills
- Experience working with sensitive data (i.e., PHI/PII) and security best practices
- Familiarity with data governance practices and principles
- Proficiency in SQL and experience with relational databases (e.g., MySQL, PostgreSQL)
- Proficiency in analytical SQL (e.g., analytics queries, distributed database queries) and experience working with massively parallel processing (MPP) databases (e.g., Redshift, BigQuery, Snowflake)
- Proficiency in programming languages such as Python, Java, or Scala
- Knowledge of data modeling techniques (e.g., 3NF) and tools (e.g., ER/Studio, ERwin)
- Software engineering mindset: apply best practices to write elegant, maintainable code, with a solid understanding of automated testing concepts and the ability to apply them consistently
- Familiarity with business intelligence tools and environments
- Familiarity with big data technologies (e.g., Lambda, Hadoop, Spark, Kafka)
- Excellent communication and collaboration skills, both written and verbal, with the ability to convey complex technical concepts to non-technical stakeholders
- Ability to lead projects and tasks effectively with cross-functional stakeholders and minimal guidance
- Bachelor's degree in Computer Science or a similar discipline preferred
- Experience building data infrastructure, internal frameworks, automation, or development productivity tools is a big plus