CalOptima is a mission-driven, community-based organization focused on serving member health with excellence. The Data Operations Engineer & Analyst will design, build and operate large-scale enterprise cloud data platforms, ensuring the delivery of reliable data pipelines that support analytics and reporting initiatives.
Responsibilities:
- Participates in a mission-driven culture of high-quality performance, with a member focus on customer service, consistency, dignity and accountability
- Assists the team in carrying out department responsibilities and collaborates with others to support the department's short- and long-term goals and priorities
- Designs, supports and maintains scalable, modular ETL/ELT pipelines (e.g., SSIS, PySpark) for ingestion, transformation and delivery, embedding governance controls for data quality, lineage and compliance
- Partners with the Data Warehouse team to co-develop solutions on the modern data platform (e.g., Snowflake, Databricks, Microsoft Fabric), embedding data quality, lineage and compliance requirements into engineering workflows
- Modernizes legacy SQL-based transformations into parameterized, config-driven pipelines for reusability and scalability (see the config-driven sketch after this list)
- Designs and maintains data pipelines for historical tracking and snapshots, handling slowly changing datasets by leveraging modern data platforms and frameworks (see the SCD Type 2 sketch after this list)
- Contributes to continuous integration/continuous delivery (CI/CD) automation for data workflows, including version control (e.g., GitHub), orchestration frameworks and automated testing (see the pipeline test sketch after this list)
- Develops and maintains automation scripts to streamline data governance processes and operational tasks
- Monitors and troubleshoots data workflows, resolving issues related to data movement, transformation, and system performance
- Stays informed on emerging data technologies, tools and best practices, and proactively recommends enhancements to data engineering processes and platform capabilities
- Supports end-to-end pipeline operations, including orchestration, monitoring, alerting and service level agreement (SLA) management using appropriate tools (e.g., Airflow; see the Airflow sketch after this list)
- Implements data quality checks and anomaly detection as part of engineering workflows to ensure trust in analytical datasets (see the validation sketch after this list)
- Partners with data engineers to optimize jobs, SQL queries and data platform workloads for performance and cost efficiency
- Drives governance of the enterprise data platform to ensure consistent ingestion, naming conventions, schema management, and enrichment of data assets for reliable analytics
- Partners with cybersecurity to implement robust data protection standards and maintain regulatory compliance (e.g., Health Insurance Portability and Accountability Act (HIPAA)) to reduce risk exposure
- Tracks and reports governance maturity through key performance indicators (KPIs) and dashboards; contributes to policy development and council initiatives
- Maintains clear documentation of data flows, definitions and validation processes to ensure transparency and traceability
- Creates process maps and diagrams to support stewardship and communicate effectively with both technical and business stakeholders
- Queries and analyzes large-scale health care datasets (e.g., claims, pharmacy and patient data) and their underlying business logic using SQL and other tools
- Leads employee engagement sessions to strengthen data literacy and reinforce governance principles, practices, roles and accountability expectations
- Manages multiple projects simultaneously, ensuring timely delivery and alignment with stakeholder expectations
- Completes other projects and duties as assigned
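The config-driven modernization bullet above is easiest to picture with a sketch: each pipeline is described as data, and one parameterized template replaces many hand-edited SQL scripts. This is a minimal illustration only; the table names, columns and the :watermark bind parameter are hypothetical, not CalOptima objects.

```python
from string import Template

# Hypothetical pipeline definitions; in practice these would live in YAML
# or a control table rather than in code.
PIPELINES = [
    {"source_table": "stg.claims", "target_table": "dw.fact_claims", "load_date_col": "received_date"},
    {"source_table": "stg.pharmacy", "target_table": "dw.fact_pharmacy", "load_date_col": "fill_date"},
]

# One parameterized statement replaces N hand-written load scripts.
# ":watermark" is left as a bind parameter for the database driver.
LOAD_SQL = Template(
    "INSERT INTO $target_table "
    "SELECT * FROM $source_table "
    "WHERE $load_date_col >= :watermark"
)

def render_loads() -> list[str]:
    """Render one load statement per configured pipeline."""
    return [LOAD_SQL.substitute(p) for p in PIPELINES]

for stmt in render_loads():
    print(stmt)
```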
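The historical-tracking bullet typically implies Type 2 slowly changing dimension (SCD) handling: when a tracked attribute changes, the old row is closed out with an end date and a new current row is opened. A minimal PySpark sketch, assuming a hypothetical member/plan dimension with validity dates:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2_sketch").getOrCreate()

# Current dimension rows (hypothetical columns, for illustration only).
dim = spark.createDataFrame(
    [(1, "Plan A", "2024-01-01", None, True)],
    "member_id INT, plan STRING, valid_from STRING, valid_to STRING, is_current BOOLEAN",
)
# Today's source extract.
src = spark.createDataFrame([(1, "Plan B")], "member_id INT, plan STRING")

today = F.current_date().cast("string")

# Members whose current row no longer matches the source.
changed = (
    dim.filter("is_current")
       .join(src.withColumnRenamed("plan", "new_plan"), "member_id")
       .filter(F.col("plan") != F.col("new_plan"))
)
changed_ids = changed.select("member_id")

# 1) Close out the superseded rows.
closed = changed.select(
    "member_id", "plan", "valid_from",
    today.alias("valid_to"), F.lit(False).alias("is_current"),
)
# 2) Open a new current row for each change.
opened = changed.select(
    "member_id", F.col("new_plan").alias("plan"),
    today.alias("valid_from"),
    F.lit(None).cast("string").alias("valid_to"),
    F.lit(True).alias("is_current"),
)
# Keep unchanged members, plus the already-closed history of changed members.
keep = dim.join(changed_ids, "member_id", "left_anti").unionByName(
    dim.filter(~F.col("is_current")).join(changed_ids, "member_id", "left_semi")
)

new_dim = keep.unionByName(closed).unionByName(opened)
new_dim.orderBy("member_id", "valid_from").show()
```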
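For the CI/CD bullet, "automated testing" usually means unit tests over transformation logic that run on every commit or pull request. A minimal pytest-style sketch; the member-ID cleanup function is invented for illustration:

```python
# test_transformations.py -- executed by the CI pipeline (e.g., on every pull request).
import pandas as pd

def standardize_member_id(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transformation: trim whitespace and upper-case member IDs."""
    out = df.copy()
    out["member_id"] = out["member_id"].str.strip().str.upper()
    return out

def test_standardize_member_id():
    raw = pd.DataFrame({"member_id": [" abc123 ", "def456"]})
    result = standardize_member_id(raw)
    assert result["member_id"].tolist() == ["ABC123", "DEF456"]
```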
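For orchestration and SLA management, here is a minimal Airflow sketch, assuming Airflow 2.x, where per-task SLAs are attached through default_args. The DAG name, schedule, retry policy and two-hour SLA are all illustrative values, not CalOptima's:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest_claims():
    # Placeholder for the real extract/load step.
    print("loading daily claims extract")

# Illustrative DAG: an SLA miss triggers Airflow's SLA-miss alerting,
# and failed runs retry twice before the task is marked failed.
with DAG(
    dag_id="claims_daily_load",
    start_date=datetime(2024, 1, 1),
    schedule="0 6 * * *",  # daily at 06:00
    catchup=False,
    default_args={
        "retries": 2,
        "retry_delay": timedelta(minutes=10),
        "sla": timedelta(hours=2),
    },
):
    PythonOperator(task_id="ingest_claims", python_callable=ingest_claims)
```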
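And for the data quality bullet, a minimal sketch of threshold-based validation in pandas; the 1,000-row floor and per-column null-rate limits are invented values that would in practice come from governance requirements:

```python
import pandas as pd

# Illustrative thresholds; real limits come from data governance standards.
MIN_ROWS = 1_000
MAX_NULL_RATE = {"member_id": 0.0, "paid_amount": 0.02}

def validate_extract(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable data-quality failures."""
    failures = []
    if len(df) < MIN_ROWS:
        failures.append(f"row count {len(df)} below floor {MIN_ROWS}")
    for col, limit in MAX_NULL_RATE.items():
        rate = df[col].isna().mean()
        if rate > limit:
            failures.append(f"{col} null rate {rate:.1%} exceeds {limit:.1%}")
    return failures

claims = pd.DataFrame({"member_id": [1, 2], "paid_amount": [100.0, None]})
for problem in validate_extract(claims):
    print("DQ FAILURE:", problem)  # in production, fail the pipeline and alert
```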
Requirements:
- Bachelor's degree in computer science, health informatics, data analytics or a related field PLUS 5 years of professional experience in data management, data governance or enterprise information management required; an equivalent combination of education and experience sufficient to successfully perform the essential duties of the position such as those listed above may also be qualifying
- 4 years of hands-on experience working with health care data, including administrative and clinical datasets, and familiarity with HIPAA compliance required
Preferred Qualifications:
- Master's degree in computer science, health informatics, data analytics or a related field
- Experience with at least one programming language beyond SQL (e.g., Python, Scala) for automation and analysis
- Experience designing and implementing data pipelines in cloud environments, including orchestration frameworks (e.g., Apache Airflow, Azure Data Factory)
- Experience with CI/CD tools and practices (e.g., GitHub, Azure DevOps) for data workflows
- In-depth experience with SQL Server Integration Services (SSIS) for complex ETL development and optimization
- Hands-on experience with cloud data platforms and big data architectures, including Snowflake, Databricks or Microsoft Fabric on Azure
- Experience in regulated industries (e.g., government)
- Relevant certifications