LifeStance Health is a leading mental health practice group dedicated to helping individuals and communities with their mental health needs. The Senior Data Engineer will be responsible for building, maintaining, and optimizing cloud-based data solutions to support data-driven decision-making, collaborating closely with data analysts and software engineers.
Responsibilities:
- Serve as the primary data engineer for real-time production support on a rotating on-call basis, ensuring minimal downtime of mission-critical data pipelines
- Monitor, troubleshoot, and resolve data pipeline failures, query performance bottlenecks, and data discrepancies using AWS CloudWatch, Redshift, and PostgreSQL
- Automate Root Cause Analysis (RCA) reporting, reducing resolution time by 30%+ through proactive alerting and system monitoring
- Implement best practices for IAM roles, encryption, and regulatory compliance (HIPAA, GDPR) to ensure data security and governance
- Design, develop, and maintain scalable ETL pipelines using AWS Glue, Lambda, Redshift, S3, and PostgreSQL
- Optimize query performance through partitioning, indexing, query tuning, and materialized views, achieving significant performance improvements
- Implement CI/CD pipelines for automated deployments using AWS CodePipeline, Terraform, and CloudFormation to improve system stability and deployment efficiency
- Support streaming data pipelines using Kafka, Kinesis, or Spark Streaming for real-time data ingestion and processing
- Collaborate with business intelligence teams and analysts to develop high-quality data models that support analytics and decision-making
- Build interactive dashboards and reports using Power BI, Tableau, or Looker to enable real-time insights for stakeholders
- Automate data validation, cleansing, and quality checks, ensuring 99.9%+ data accuracy across critical business functions
Requirements:
- 5+ years of experience in data engineering, cloud-based data platforms, and big data processing
- Bachelor's degree in Computer Science, Data Engineering, or a related field
- Expertise in AWS services, including Glue, Lambda, Redshift, S3, CloudFormation, IAM, and CloudWatch
- Strong SQL & Python experience for data transformations, query optimization, and automation
- Deep knowledge of PostgreSQL, including performance tuning, indexing, query optimization, and schema design
- Experience with Apache Spark for big data processing and real-time analytics
- Expertise in ETL frameworks and data modeling (Star Schema, Snowflake Schema, OLAP/OLTP optimization)
- Hands-on experience with Infrastructure as Code (IaC) tools such as Terraform and CloudFormation
- CI/CD expertise with AWS CodePipeline, Jenkins, or GitHub Actions for data pipeline deployments
- Data security & governance expertise including IAM roles, encryption, and HIPAA/GDPR compliance
- Proven track record in troubleshooting production issues, conducting root cause analysis (RCA), and improving system performance
- Experience working in cross-functional teams with business analysts, data scientists, and DevOps engineers
- Master's degree in Computer Science, Data Engineering, or a related field (preferred but not required)
- Experience with streaming data frameworks (Kafka, Kinesis, Spark Streaming)
- Familiarity with advanced analytics frameworks and ML pipelines (MLflow, SageMaker, Feature Stores)
- Certifications: AWS Certified Data Analytics, Google Cloud Professional Data Engineer, or Databricks Certified Data Engineer Associate