Quantiphi is an award-winning, AI-first digital engineering and consulting company focused on delivering high-impact services and solutions. The company is seeking a Senior Machine Learning Engineer to support a large-scale healthcare modernization initiative, with a focus on building and scaling environments for AI model training, tuning, and deployment.
Responsibilities:
- Design and implement end-to-end ML workflows using Vertex AI Pipelines (Kubeflow Pipelines v2)
- Build and optimize model training and tuning pipelines (custom jobs, HPO with Vizier)
- Deploy scalable models for online inference (Vertex AI Endpoints) and batch inference (batch prediction jobs)
- Optimize models for performance and cost (quantization, ONNX, TensorRT)
- Manage model lifecycle including experimentation, versioning, and monitoring
- Fine-tune LLMs using SFT, RLHF/RLAIF, and parameter-efficient methods (LoRA, QLoRA)
- Work with models such as Gemini, PaLM, Llama, and Mistral
- Build robust evaluation frameworks (BLEU, ROUGE, RAGAS, custom metrics)
- Implement A/B testing and experiment tracking (Vertex AI / MLflow)
- Design and deploy end-to-end RAG pipelines (ingestion, chunking, embeddings, retrieval, reranking)
- Build and optimize vector search systems using Vertex AI Vector Search or similar
- Develop hybrid search architectures (dense + sparse retrieval, BM25, RRF)
- Improve search relevance via query understanding, ranking models, and LLM-based enhancements
- Build multi-step reasoning systems using LangChain, LlamaIndex, LangGraph, CrewAI, or AutoGen
- Design and expose agent tools (Text-to-SQL, vector search, APIs, knowledge graph tools)
- Implement memory and state management for multi-turn workflows
- Develop secure and scalable prompt management and guardrails
- Build and maintain training data pipelines including labeling and dataset curation
- Leverage Vertex AI Feature Store for feature management
- Collaborate closely with data teams on data contracts and schema design
- Implement CI/CD pipelines for ML using Cloud Build & Artifact Registry
- Manage model registry, deployment gates, and monitoring (drift, performance)
- Support distributed training workloads (GPU/TPU, multi-node environments)
- Work with clinical NLP pipelines (NER, entity extraction, classification)
- Work with healthcare data standards such as FHIR and with de-identified datasets
- Ensure solutions align with healthcare compliance standards (e.g., HIPAA)
Requirements:
- 5–10+ years of experience in Machine Learning / AI Engineering
- 2+ years of hands-on experience with LLMs / Generative AI in production
- Strong expertise in GCP Vertex AI (training, pipelines, deployment)
- Proficiency in Python and ML frameworks (PyTorch, TensorFlow, Hugging Face)
- Experience building RAG systems or agentic workflows in production
- Solid understanding of embeddings, vector databases, and search systems
- Experience with API development (FastAPI, gRPC)
- Experience with Azure ML / Azure OpenAI in a hybrid cloud environment
- Background in Healthcare / Clinical AI (FHIR, NLP, regulated datasets)
- Familiarity with distributed training (DeepSpeed, multi-GPU/TPU)
- GCP Professional ML Engineer certification
- Strong collaboration skills working across data, platform, and business teams
- Ability to communicate complex AI concepts to non-technical stakeholders
- Self-starter with experience working in client-facing, fast-paced environments
- Ability to scope, estimate, and deliver independently