Build data pipelines that scrub PII, create research datasets, and power the research portal for educational AI studies
Architect the path toward self-hosted and on-device model deployments for privacy and global accessibility
Design and implement model orchestration systems that intelligently route requests across multiple AI providers (OpenAI, Anthropic, AWS Bedrock, open-source models)
Build cost optimization infrastructure: implement conversation compression, prompt caching, and smart model selection to keep AI accessible
Create comprehensive observability systems for ML operations: track costs, latency, quality, and usage patterns across thousands of applications
Design and implement infrastructure for fine-tuning and deploying custom models
Build monitoring and alerting systems that help us maintain reliability as AI interactions scale
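To give a flavor of the "smart model selection" work above, here is a minimal, hypothetical sketch of a cost-aware router. Every model name, price, and capability tier below is illustrative, not a description of our actual stack:

```python
# Illustrative sketch of cost-aware model routing (all names, prices,
# and tiers are hypothetical).

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD, hypothetical pricing
    quality: int               # rough capability tier; higher is better

CANDIDATES = [
    Model("small-open-source", 0.0002, 1),
    Model("mid-tier-hosted", 0.002, 2),
    Model("frontier-hosted", 0.03, 3),
]

def route(required_quality: int) -> Model:
    """Pick the cheapest model that meets the required capability tier."""
    eligible = [m for m in CANDIDATES if m.quality >= required_quality]
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)
```

In this sketch, easy requests are served by the cheapest eligible model (`route(1)` picks `small-open-source`) while demanding ones escalate to the top tier (`route(3)` picks `frontier-hosted`), which is the core idea behind keeping per-request cost down without sacrificing quality where it matters.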
Requirements
7+ years building production ML/data systems, with experience in ML operations and infrastructure
Strong experience with model serving, orchestration, and optimization in production environments
Proficiency in Python and data pipeline technologies (Airflow, ETL tools, etc.)
Experience with cloud infrastructure (AWS preferred) and containerization (Kubernetes, Docker)
Experience with cost optimization strategies for LLM-based systems
Thrive in high-agency, high-collaboration cultures
Strong communication skills that make remote-first collaboration work