Personify Health is a company focused on creating personalized health solutions. The Senior Data Engineer will design and optimize scalable cloud-based data architectures while developing robust data pipelines and collaborating with cross-functional teams to deliver impactful data solutions.
Responsibilities:
- Build data applications and processes using Python, SQL, and Django; manage and query data in PostgreSQL, Oracle, and cloud-native databases
- Examine, extract, cleanse, and load data while implementing quality assurance rules and tools to ensure consistent and accurate data
- Work with healthcare-specific data processes such as EDI file transfers, claims adjudication, audits, eligibility verification, and reporting workflows
- Collaborate with cross-functional teams (Data Analysts, Data Scientists, Product, Reporting, Account Management) to define requirements and deliver data-driven solutions
- Ensure data quality, integrity, and security through automated validation, auditing, and monitoring, with compliance to HIPAA and CMS regulations
- Monitor, maintain, and tune pipeline performance; proactively troubleshoot and resolve complex data flow and system issues
- Provide technical mentorship to Data Engineers, sharing expertise in data modeling, pipeline development, and troubleshooting practices
- Research and propose improvements to the tech stack and data engineering processes
- Participate in sprint planning, refinement, and estimation to keep the team aligned on implementation scope and delivery
Requirements:
- At least one AWS certification (e.g., AWS Certified Data Analytics – Specialty, Big Data – Specialty, Developer – Associate)
- 7+ years in data engineering or analytics engineering, with a strong focus on cloud-native architectures. Proven experience designing and operating scalable data platforms in AWS
- 5+ years in healthcare, insurance, or claims processing, including hands-on experience with EDI transaction sets (834, 835, 837, 2222, 2223, 999), X12 or HL7 standards, and familiarity with HIPAA and CMS compliance
- Expert-level proficiency in SQL (including pivots, window functions, and complex date calculations) and Python for data processing, transformation, and application development
- Hands-on experience with orchestration tools like Airflow, containerization with Docker, and CI/CD pipelines. Strong bias for automation and continuous improvement
- Proficient in consuming and transforming REST APIs and JSON data into relational models. Skilled in building robust data ingestion and transformation pipelines
- Experience with Jira, Bitbucket (Git), and Bitbucket Pipelines, and collaboration with cross-functional teams including Data Analysts, Data Scientists, Product, and Account Management
- Proficient in Excel and BI tools such as Tableau, Power BI, and MicroStrategy for data analysis and reporting
- Detail-oriented with a strong focus on data quality, accuracy, and performance tuning for large-scale data systems. Background in cost optimization and system reliability
- Ability to mentor engineers, share technical knowledge, and communicate effectively with both technical and non-technical stakeholders. Strong documentation and systems thinking
- Proven ability to design, build, and maintain scalable ETL/ELT pipelines using AWS, Airflow, and related technologies to ingest and transform healthcare and TPA data, including claims, provider, and eligibility sources
- Experience developing and operating resilient data architectures and workflows (Airflow DAGs, CloudWatch, ECS) with strong CI/CD, observability, and governance
- Deep proficiency in AWS services including S3, Glue, EMR, EC2, MWAA, Lambda, Kinesis, ECS, and experience with Infrastructure as Code tools like Terraform
- Deep understanding of relational and non-relational data models, including star/snowflake schemas and dimensional modeling. Skilled in PostgreSQL, Oracle, AWS RDS, Snowflake, and Redshift. Ability to mentor others in data modeling best practices