10a Labs is a company focused on safety and threat intelligence for AI systems. It is seeking a Machine Learning Engineer to design, build, evaluate, and deploy machine learning systems for safety, security, and intelligence applications.
Responsibilities:
- Design, train, evaluate, and deploy machine learning models across text, image, audio, and multimodal domains
- Develop and improve classification systems for safety, security, abuse detection, and intelligence applications
- Conduct experiments to benchmark, evaluate, and compare AI models, including large language models and multimodal systems
- Contribute to model distillation, optimization, and fine-tuning efforts to improve performance, efficiency, and deployability
- Design evaluation pipelines, metrics, and testing frameworks to measure model capabilities, reliability, and safety
- Build agentic systems and automated workflows for evaluation, red teaming, research, and large-scale experimentation
- Own ML projects from initial research and prototyping through production deployment and monitoring
- Partner with software engineers to productionize ML systems and support ongoing improvements
- Provide technical expertise and guidance across client engagements and internal research initiatives
Requirements:
- 3–5+ years of professional experience building and deploying machine learning systems
- Strong proficiency in Python and modern machine learning frameworks such as PyTorch and/or TensorFlow
- Experience working across multiple modalities, with expertise in one or more of:
  - Computer Vision: image classification, object detection, OCR, segmentation, deepfake detection, multimodal vision-language systems, or related areas
  - Natural Language Processing: LLMs, text classification, information extraction, retrieval systems, speech-to-text, agentic applications, or related areas
- Experience training, fine-tuning, evaluating, and deploying machine learning models in production environments
- Experience designing evaluation methodologies, benchmarking systems, and model performance metrics
- Experience with MLOps tools and practices (Docker, Kubernetes, CI/CD for ML, MLflow, etc.)
- Experience with cloud platforms such as Google Cloud Platform (preferred), AWS, or Azure, including ML infrastructure, workflow orchestration, storage, and database services
- Familiarity or experience with model distillation, synthetic data generation, reinforcement learning, or AI evaluation research is strongly preferred
- Experience working with frontier language models, multimodal foundation models, or AI safety evaluations
- Prior experience in cybersecurity, trust and safety, abuse prevention, threat intelligence, or related domains
- Experience with retrieval-augmented generation (RAG), AI agent frameworks, and context orchestration systems such as LangChain, LlamaIndex, OpenAI Agents, or AutoGen