Piper Companies is seeking a Data Engineer II to support a key federal healthcare initiative. This role involves designing and optimizing modern data pipelines and backend systems for mission-critical federal health programs.
Responsibilities:
- Build and maintain PySpark data pipelines within the Databricks platform
- Optimize Spark job performance by addressing bottlenecks and improving resource utilization across distributed systems
- Design, develop, and maintain backend data components and services that support large-scale data processing
- Conduct research and develop proof-of-concepts for new tools, frameworks, and solutions in the data engineering ecosystem
- Write clean, maintainable, and scalable code following best practices and coding standards
- Perform code reviews to ensure quality, consistency, and reliability across the team
- Debug and troubleshoot backend data issues, proactively identifying potential system risks
- Develop and maintain thorough technical documentation
- Actively participate in Agile ceremonies (standups, sprint planning, retrospectives)
- Support estimation, task breakdown, and prioritization to meet delivery timelines
- Stay current with emerging technologies and share insights with the engineering team
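As a rough illustration of the pipeline work described above, here is a minimal sketch of a PySpark batch job in the style used on Databricks. The dataset, column names (`member_id`, `paid_amount`), and paths are hypothetical, not taken from the posting; the aggregation logic is kept in a pure Python function so it can be unit-tested without a Spark cluster.

```python
"""Minimal sketch of a PySpark batch pipeline step.

Assumes a hypothetical claims dataset with `member_id` and
`paid_amount` columns; all names and paths are illustrative.
"""

def summarize_paid(records):
    """Pure aggregation logic: total paid amount per member.

    Keeping business logic out of Spark-specific code makes it
    testable without a cluster.
    """
    totals = {}
    for rec in records:
        member = rec["member_id"]
        totals[member] = totals.get(member, 0.0) + rec["paid_amount"]
    return totals


def run_pipeline(input_path, output_path):
    """Run the same aggregation at scale with the Spark DataFrame API."""
    # Spark imports kept local so summarize_paid() above stays
    # importable even where PySpark is not installed.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("claims-summary").getOrCreate()
    df = spark.read.parquet(input_path)
    summary = (
        df.groupBy("member_id")
          .agg(F.sum("paid_amount").alias("total_paid"))
    )
    summary.write.mode("overwrite").parquet(output_path)
    spark.stop()


if __name__ == "__main__":
    # Hypothetical S3 locations for input and output.
    run_pipeline("s3://example-bucket/claims/", "s3://example-bucket/claims_summary/")
```

Separating the pure function from the Spark job mirrors the posting's emphasis on clean, maintainable code and code review: the logic can be verified in isolation, while the DataFrame version handles distributed execution.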
Requirements:
- Bachelor's Degree in Computer Science, Computer Engineering, or related technical discipline
- 5+ years of experience in data engineering, backend engineering, or similar roles
- Strong experience with Python and Apache Spark
- Working knowledge of R
- Solid understanding of data modeling, ETL processes, and distributed computing architectures
- Strong foundation in software engineering fundamentals, including data structures, algorithms, and design patterns
- Experience working within Agile/Scrum development teams
- Ability to work independently and collaboratively
- Strong analytical thinking and problem-solving skills
- Excellent written and verbal communication abilities
Preferred Qualifications:
- Experience with AWS services (S3, EC2, Glue, Lambda, etc.)
- Additional hands-on experience with R
- Databricks or Apache Spark professional certifications
- Experience with SAS