Quantiphi is an award-winning Applied AI and Big Data software and services company, driven by a deep desire to solve transformational problems at the heart of businesses. As a Senior Data Engineer, you will architect and build scalable ETL pipelines while collaborating with customers to deliver impactful AI solutions.
Responsibilities:
- Design and implement resilient ETL/ELT workflows using tools such as Airflow, Dagster, or Prefect (a minimal sketch of this kind of workflow follows this list)
- Build scalable batch and real-time data pipelines
- Ensure end-to-end data integrity from ingestion through AI consumption
- Develop low-latency “Hot Path” pipelines to enable real-time agent decisioning
- Implement streaming architectures to support event-driven workflows
- Optimize pipelines for high throughput, performance, and cost efficiency
- Design systems supporting both batch and real-time analytics use cases
- Implement automated testing and validation frameworks for data pipelines
- Proactively monitor and detect data drift before it impacts AI model performance
- Establish observability standards across the data platform
- Maintain high standards for reliability, scalability, and production readiness
- Manage the CI/CD lifecycle for data pipelines
- Promote code seamlessly across development, staging, and production environments
- Implement Infrastructure-as-Code best practices where applicable
- Design scalable data models for AI and analytics use cases
- Implement and maintain Knowledge Graph structures to model complex relationships
- Optimize indexing and schema design to support RAG-based AI systems
- Engage directly with customers to translate business challenges into technical solutions
- Lead whiteboarding sessions to design data schemas collaboratively
- Iterate data models in tight feedback loops with stakeholders
- Create clear, maintainable technical documentation
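
To make the workflow responsibilities above concrete, here is a minimal sketch of a daily ETL pipeline, assuming Airflow 2.x's TaskFlow API; the DAG name, the stubbed records, and the load step are hypothetical placeholders, and Dagster or Prefect would express the same extract/transform/load shape.

```python
# A minimal sketch of an ETL workflow of the kind described above, using
# Airflow's TaskFlow API (Airflow 2.4+). The DAG name, records, and load
# step are hypothetical placeholders, not a prescribed implementation.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_orders_etl():
    @task
    def extract() -> list[dict]:
        # Stubbed source read; in practice, an API call or object-store pull.
        return [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": 7.5}]

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Basic validation: flag non-positive amounts rather than dropping them.
        return [{**r, "valid": r["amount"] > 0} for r in records]

    @task
    def load(records: list[dict]) -> None:
        # Placeholder warehouse write (e.g. a Snowflake COPY INTO or MERGE).
        print(f"loaded {len(records)} records")

    load(transform(extract()))


daily_orders_etl()
```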
Requirements:
- 7+ years of experience in Data Engineering or Data Platform development
- Expert-level proficiency in Python (Pandas, PySpark, FastAPI)
- Advanced SQL skills including complex joins, window functions, and query optimization (window functions are illustrated after this list)
- Strong experience with Snowflake (Snowpark, Streams) or Kinetica
- Hands-on experience building scalable ETL/ELT pipelines
- Experience implementing streaming or real-time data processing architectures
- Experience with NoSQL databases such as MongoDB or DynamoDB
- Experience designing or working with Knowledge Graph technologies (Neo4j, AWS Neptune, etc.)
- Familiarity with CI/CD pipelines and automated deployment practices
- Strong understanding of data quality, observability, and production reliability standards
- Experience working directly with enterprise customers or stakeholders
- Excellent communication and documentation skills
- Experience with agent orchestration frameworks such as LangChain, LlamaIndex, or CrewAI
- Familiarity with vector similarity search and managing embeddings in Pinecone, Milvus, or Snowflake native vector types (see the similarity-search sketch after this list)
- Understanding of Retrieval-Augmented Generation (RAG) architectures and optimization strategies
- Experience building robust GraphQL or REST APIs that agents can use as operational tools
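
For illustration, a self-contained example of the window-function skill called out above, run through Python's built-in sqlite3 so it executes anywhere with SQLite 3.25+ (which modern Python builds bundle); the orders table and its values are toy data.

```python
# A small runnable illustration of a SQL window function, using Python's
# built-in sqlite3 module. The table and values are toy data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES ('a', 10), ('a', 25), ('b', 5), ('b', 40);
""")

# Rank each customer's orders by amount with RANK() OVER a partition.
rows = conn.execute("""
    SELECT customer,
           amount,
           RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk
    FROM orders
""").fetchall()
print(rows)
```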
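And a minimal, library-free sketch of the vector similarity search that underpins RAG retrieval: rank documents by cosine similarity to a query embedding. The document names and embedding values here are toy stand-ins; in production the vectors would come from an embedding model and live in a store such as Pinecone, Milvus, or a Snowflake vector column.

```python
# Rank documents by cosine similarity to a query embedding, as a RAG
# retriever would. Embeddings are toy 3-dimensional stand-ins.
import numpy as np

docs = {
    "refund-policy": np.array([0.1, 0.9, 0.2]),
    "shipping-faq": np.array([0.8, 0.1, 0.3]),
    "returns-howto": np.array([0.2, 0.8, 0.1]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product normalized by vector magnitudes.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.15, 0.85, 0.15])  # hypothetical query embedding
ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked)  # most similar documents first
```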