Design and implement data ingestion, transformation, and enrichment pipelines across multiple concurrent projects with varying data modalities (time-series sensor data, video, images, documents, and metadata).
Develop and manage cloud-native data services including object storage workflows, vector database integration, and structured data warehousing to support multi-modal AI/ML systems.
Work closely with AI/ML engineers to operationalize data pipelines that feed training, inference, and retrieval-augmented generation (RAG) workloads in production.
Establish data quality, lineage, and governance practices across projects that are maturing from prototype to product, bringing structure and repeatability to evolving data ecosystems.
Support the processing and organization of unstructured data (video files, PDFs, technical manuals) into formats suitable for embedding generation, semantic search, and summarization.
Present technical approaches and data architecture decisions to both technical teammates and non-technical stakeholders.
Requirements
Active US Government-issued Secret clearance, or the ability to obtain one
Bachelor’s degree in Computer Science, Data Science, Information Systems, or a related field
0–2 years of experience in data engineering, data management, or database development
Familiarity with relational databases, data modeling, and ETL processes
Understanding of data quality principles and metadata management
Ability to write and optimize SQL queries
Exposure to data integration and interface management concepts