Build and maintain high-volume, scalable data pipelines using Apache Kafka and Apache Spark, supporting both real-time and batch data processing needs.
Design, develop, and optimize data ingestion, transformation, and integration workflows across enterprise systems.
Ensure data quality, consistency, and integrity across four (4) disparate data sources, implementing validation, cleansing, and reconciliation processes.
Develop and maintain SQL-based data solutions, including complex queries, stored procedures, performance tuning, and data modeling.
Collaborate with data analysts, product owners, and application teams to define data requirements and ensure alignment with business needs.
Implement monitoring, logging, and alerting mechanisms to ensure reliability and observability of data pipelines.
Support data architecture design and contribute to best practices for scalable and secure data engineering solutions.
Ensure compliance with federal data governance, security, and privacy requirements.
Participate in Agile ceremonies and support iterative development and delivery of data capabilities.
Troubleshoot and resolve data pipeline issues, ensuring minimal disruption to downstream systems and reporting.
Requirements
Bachelor’s degree in Computer Science, Information Systems, Engineering, Data Science, or related field (or equivalent experience).
3+ years of experience in data engineering, data integration, or related technical roles.
Strong hands-on experience with Apache Kafka for streaming data pipelines.
Strong experience with Apache Spark for large-scale data processing (batch and/or streaming).
Advanced SQL development experience, including complex queries, performance tuning, and data transformation logic.
Experience integrating and managing data across multiple heterogeneous data sources.
Experience working in the federal government or other highly regulated environments with security and compliance requirements.
Strong understanding of data quality management, data validation, and data governance practices.
Strong problem-solving and analytical thinking abilities.
Excellent communication skills, with the ability to explain technical concepts to non-technical stakeholders.
Strong attention to detail, especially in ensuring data accuracy and consistency.
Ability to work independently in a fast-paced, mission-driven environment.
Strong collaboration skills across cross-functional technical and business teams.
US Citizenship or Permanent Residency required.
Must reside in the Continental US.
Depending on the government agency, specific requirements may include a public trust background check or a security clearance.
Tech Stack
Apache Kafka
Apache Spark
SQL
Benefits
Health care
Dental
Vision
Life insurance
401(k)
Paid time off, including holidays and any other paid leave required by law