Lifescale Analytics helps organizations unlock the power of data through advanced analytics, AI, and modern digital solutions. They are currently seeking an LLM/AI Data Engineer to support client engagements by designing and validating high-quality, production-grade data pipelines with integrated LLM capabilities.

Responsibilities:

Design, build, and operate LLM-assisted analytics pipelines in structured data environments
Implement retrieval-augmented generation (RAG) and structured data grounding patterns
Validate and improve LLM output quality, consistency, and traceability
Develop and maintain production-grade ETL/ELT pipelines
Review and test pipelines to identify logic errors, data gaps, and performance issues
Define and track pipeline SLAs (latency, throughput, data freshness)
Build and enforce data quality frameworks and validation processes
Document engineering processes including QC logs, test cases, and schema documentation
Collaborate with cross-functional teams to ensure scalable and auditable data systems
All other duties as assigned

Requirements:

Applicants for this position must be U.S. citizens, may be subject to a government security investigation, and must currently meet the eligibility requirements for access to classified government information
The candidate must have lived in the United States for the past 5 years
The Employer will not sponsor applicants for any employment visas, at hiring or in the future, including but not limited to H-1B visas
Corp-to-Corp or subcontract personnel will not be considered for this position
Experience designing, building, or operating LLM-assisted analytics pipelines
Experience validating and improving LLM output quality and reliability
Strong understanding of: Prompt engineering for structured outputs, Retrieval-Augmented Generation (RAG) patterns, Structured-data grounding & hallucination mitigation
At least 4 years of experience in: Data engineering, ETL/ELT pipeline development, Data quality assurance in production environments, High-volume structured data systems
Advanced proficiency in SQL and Python
Experience with tools such as dbt, Spark, or similar frameworks
Hands-on experience with Snowflake, including: Snowpark or equivalent transformation frameworks, Data modeling and performance optimization, Snowflake Cortex
Ability to design and implement data quality frameworks
Experience reviewing and validating production pipelines: Logic validation and transformation accuracy, Data completeness and integrity checks, Identification of edge cases and failure modes
Ability to benchmark and optimize pipelines against performance targets
Experience defining and measuring: Pipeline latency, Throughput, Data freshness SLAs
Experience supporting auditable and explainable data systems
Strong documentation practices, including: QC logs and validation reports, Test case design and execution records, Schema and lineage documentation, Issue tracking and remediation workflows
Bachelor's degree in Computer Science, Data Engineering, or related field (or equivalent experience)
Experience supporting U.S. Department of Defense (DoD) environments: Air Force Life Cycle Management Center (LCMC), Army Materiel Command (AMC)
Familiarity with Palantir Foundry: Ontology modeling concepts, Data product consumption patterns
Experience with defense datasets: Government-Industry Data Exchange Program (GIDEP), Federal Logistics Information System (FED-LOG)
Exposure to: Entity resolution and part matching, ERP data integration into analytics platforms, Data normalization across fragmented systems
