Performs data engineering and data analysis to create, store, locate, retrieve, analyze, manage, and share documentation/artifacts.
Leads and develops data products across the system/software development life cycle, from requirements definition through architecture, design, implementation, and testing to deployment, operations, and support.
Defines and develops techniques to integrate, consolidate, and structure data for analytical use, including advising on data aggregation and its security implications and leveraging logical data models to support data source mappings.
Identifies data migration strategies to transition data from legacy systems and technologies to advanced, enterprise-based solutions.
Performs assessments of data architecture, design, and implementation, and identifies system impacts associated with new and changed requirements.
Works in collaboration with developers, data scientists, and other engineers to facilitate data movement and security, both on-premises and in the cloud.
Contributes to technical collaboration with internal and external stakeholders.
Requirements
Active SECRET clearance required
Bachelor's Degree in a STEM or related field
Minimum of 3 years of demonstrated experience with enterprise data management solutions and platforms in the areas of master data management, data quality management, data governance, metadata management, Extract-Transform-Load (ETL), data warehousing, and data lakes
Minimum of 2 years of experience modeling and managing structured and unstructured data, including photos/images and physical specifications/technical diagrams
Experience leading legacy data integration and remediation (e.g., facades, strangler-fig approaches)
Deep understanding of integration patterns and best practices, such as events, synchronous vs. asynchronous messaging, peer-to-peer, publish-subscribe, distributed logs, and RESTful APIs
Demonstrated experience working with relational (SQL) database management systems, normalizing complex data to optimize for both writes and reads, and designing efficient queries for operational and analytical purposes.
Demonstrated experience working with messaging platforms (Kafka, RabbitMQ, Redis queueing) to create data pipelines that allow applications to operate as loosely coupled components and facilitate easy onboarding of new features.
Has worked with message encoding formats such as Protobuf, Avro, and JSON, and can select the appropriate format for a given message stream.
Demonstrated experience working with specific non-relational databases such as Apache Druid, Cassandra, and Amazon Simple Storage Service (S3).
Should be able to explain when to use these solutions instead of relational databases.
Some experience working with distributed, high-volume data processing frameworks such as Apache Hadoop and Apache Spark.
Proficiency in scripting data transformation jobs and writing API servers with Python
Has worked with domain experts in fields outside their own expertise to deliver value to customers
Demonstrated experience working with Kubernetes (k8s) and containerization (Docker).
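To illustrate the publish-subscribe integration pattern named in the requirements, here is a minimal in-memory sketch in Python. The `Broker` class and the "orders" topic are illustrative only, not part of any specific messaging platform; in practice this role would use a platform such as Kafka or RabbitMQ.

```python
from collections import defaultdict
from typing import Any, Callable

# Tiny in-memory publish-subscribe broker: publishers and subscribers
# share only a topic name, never direct references to each other,
# which is what keeps the components loosely coupled.
class Broker:
    def __init__(self) -> None:
        self._subs: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, message: Any) -> None:
        for handler in self._subs[topic]:
            handler(message)

broker = Broker()
received: list[Any] = []
broker.subscribe("orders", received.append)
broker.publish("orders", {"id": 1})
print(received)  # [{'id': 1}]
```

A real broker adds durability, partitioning, and delivery guarantees on top of this basic decoupling idea.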
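The relational-database requirement above can be illustrated with a small sketch using Python's built-in `sqlite3`: a denormalized "orders" feed split into two normalized tables, then joined for an analytical query. The table names and data are hypothetical.

```python
import sqlite3

# Normalize: customer attributes live once in `customers`;
# `orders` references them by foreign key instead of repeating them.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "total REAL, FOREIGN KEY (customer_id) REFERENCES customers(id))"
)
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0)],
)

# Analytical query: total order value per customer, via a join.
rows = cur.execute(
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
print(rows)  # [('Acme', 200.0), ('Globex', 50.0)]
```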
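The message-encoding requirement is ultimately about trade-offs between human-readable and schema-based binary formats. As a toy illustration using only the standard library, `struct` stands in for a schema-based binary encoding like Protobuf or Avro (the record fields here are made up):

```python
import json
import struct

record = {"sensor_id": 42, "temp_c": 21.5, "ok": True}

# Human-readable, self-describing, but verbose.
json_bytes = json.dumps(record).encode("utf-8")

# Fixed binary layout: unsigned int (4) + double (8) + bool (1) = 13 bytes.
# Like Protobuf/Avro, decoding requires the shared schema
# (here, the format string "<Id?").
binary_bytes = struct.pack("<Id?", record["sensor_id"], record["temp_c"], record["ok"])

print(len(json_bytes), len(binary_bytes))
sensor_id, temp_c, ok = struct.unpack("<Id?", binary_bytes)
```

For a high-volume stream the compact schema-based encoding usually wins; for debugging or low-volume interchange, self-describing JSON is often the better pick.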
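Finally, a minimal sketch of the kind of Python data transformation scripting the requirements describe: reading a raw CSV extract, normalizing field names and types, and deriving a computed column. The input columns and values are hypothetical.

```python
import csv
import io

# Hypothetical raw extract with inconsistent header casing/spacing.
RAW = """Name,Qty,Unit Price
widget,3,2.50
gadget,1,10.00
"""

def transform(raw_csv: str) -> list[dict]:
    """Clean one CSV extract: snake_case the headers, cast numeric
    fields, and add a derived `total` column."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_csv)):
        rec = {k.strip().lower().replace(" ", "_"): v for k, v in rec.items()}
        rec["qty"] = int(rec["qty"])
        rec["unit_price"] = float(rec["unit_price"])
        rec["total"] = rec["qty"] * rec["unit_price"]
        rows.append(rec)
    return rows

cleaned = transform(RAW)
print(cleaned[0]["total"])  # 7.5
```

A production job would read from files or a queue rather than an inline string, but the shape (parse, normalize, derive, emit) is the same.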