Design, develop, and implement end-to-end data pipelines using ETL processes and technologies such as Databricks, Python, Spark, Scala, JavaScript, JSON, SQL, and Jupyter Notebooks.
Create and optimize data pipelines from scratch, ensuring scalability, reliability, and high-performance processing.
Perform data cleansing, data integration, and data quality assurance activities to maintain the accuracy and integrity of large datasets.
Leverage big data technologies to efficiently process and analyze large datasets, particularly those encountered in a federal agency.
Troubleshoot data-related problems and provide innovative solutions to address complex data challenges.
Implement and enforce data governance policies and procedures, ensuring compliance with regulatory requirements and industry best practices.
Work closely with cross-functional teams to understand data requirements and design optimal data models and architectures.
Collaborate with data scientists, analysts, and stakeholders to provide timely and accurate data insights and support decision-making processes.
Maintain documentation for software applications, workflows, and processes.
Stay current with emerging trends and advancements in data engineering, and recommend suitable tools and technologies for continuous improvement.
Requirements
3+ years of experience in Data Engineering, Data Science, Data Analytics, Software Engineering, or DevOps
Experience developing scalable ETL workflows or data pipelines
Experience working with large datasets and using big data technologies to process and analyze data efficiently
Experience with distributed data computing tools such as Spark, Databricks, Hadoop, Hive, AWS EMR, Jupyter Notebooks, or Kafka
Experience with programming languages and data formats such as C++, JavaScript, SQL, Python, or JSON
Experience creating solutions within a collaborative, cross-functional team environment
Knowledge of data modeling and visualization, database design principles, and data governance practices
Ability to develop scripts and programs that convert various types of data into usable formats, and to support the project team in scaling, monitoring, and operating data platforms
Secret clearance
Bachelor's degree
Tech Stack
AWS
ETL
Hadoop
JavaScript
Kafka
Python
Scala
Spark
SQL
Benefits
Health, life, disability, financial, and retirement benefits