Cotiviti, a company focused on data-driven solutions, is seeking a Data Engineer to architect, develop, and maintain scalable data infrastructure and pipelines. The role emphasizes data quality, integrity, and accessibility, and involves collaborating with analytics, business, and development teams to deliver robust analytics and reporting solutions.
Responsibilities:
- Design, build, and maintain ETL/ELT pipelines to ingest, process, and transform data from multiple sources
- Architect and optimize data storage solutions including databases, data lakes, and warehouses (e.g., Hadoop, PostgreSQL, MySQL, Oracle)
- Develop and manage data workflows in Databricks, including authoring, executing, and optimizing Spark-based queries and models
- Implement data quality assessment frameworks in Databricks, including profiling, validation, and monitoring of data assets
- Perform data gap analysis to identify missing, incomplete, or inconsistent data and recommend remediation strategies
- Develop and maintain a semantic layer to support reporting and business analytics, ensuring data is accessible in business-friendly formats
- Automate data processes and workflows to improve efficiency and reliability
- Ensure data security, governance, and compliance in all data engineering solutions
- Collaborate with analytics, business, and development teams to understand data requirements and deliver robust, scalable solutions
- Maintain comprehensive documentation of data flows, pipelines, repositories, and semantic layers
- Deliver solutions using Agile development processes and participate in code reviews for adherence to standards and best practices
Requirements:
- Bachelor's degree in Computer Science or Information Technology, or equivalent work experience
- 2+ years of experience in data engineering, including designing and maintaining data pipelines and infrastructure
- Experience with Databricks for data processing, pipeline development, and data quality assessment
- Proficiency in Spark, Python, Scala, and SQL across multiple database platforms (Oracle, MySQL, PostgreSQL)
- Experience with big data tools and distributed systems (e.g., Hadoop, Kafka, Pig)
- Familiarity with data warehousing concepts and semantic layer development
- Experience in data gap analysis and implementing data quality frameworks
- Ability to work independently and collaboratively within Agile teams
- Strong problem-solving and creative thinking skills
- Must be able to sit and use a computer keyboard for extended periods
- Willingness to work after hours/weekends as required for major deadlines
- Flexibility to participate in international work processes, including conference calls across global time zones
- Experience in healthcare data engineering, including claims data, billing, and coding
- Knowledge of data governance, security, and compliance practices
- Familiarity with infrastructure-as-code, containerization, and CI/CD pipelines for data workflows
- Ability to document and communicate technical concepts to both technical and non-technical audiences
- Proficient with Microsoft Office Suite (Word, Excel, PowerPoint)
- Experience integrating Databricks workflows with other big data tools and platforms
- Ability to collaborate with analytics teams to support business-friendly reporting solutions