Cambium Learning Group is an award-winning educational technology solutions leader dedicated to helping all students reach their potential through individualized and differentiated instruction. We are seeking a talented Machine Learning Engineer II to join our machine learning and scoring development team, where you will design and deploy custom machine learning solutions for our clients and internal platforms.
Responsibilities:
- Lead the transition of machine learning models from research prototypes to scalable, high-performance production systems
- Architect and deploy ML solutions utilizing AWS ECS (Elastic Container Service) for containerized workloads and AWS Lambda for serverless, event-driven inference pipelines
- Optimize PyTorch models for production deployment by converting them to the ONNX format, applying advanced inference optimization techniques (quantization, pruning, ONNX Runtime) and memory-efficient attention mechanisms such as Flash Attention to minimize latency and maximize throughput
- Champion infrastructure best practices for machine learning systems, establishing reliable CI/CD pipelines and ensuring robust, secure, and reproducible deployments across the AWS ecosystem
- Design, develop, and evaluate algorithms that generate descriptive, diagnostic, predictive, and prescriptive insights from both structured and unstructured data
- Write clean, efficient, and well-tested code, and perform rigorous testing, debugging, and documentation to ensure smooth deployment and long-term maintainability
- Actively participate in research discussions, requirements gathering, and system design alongside domain experts to build tailored scoring and ML solutions
Requirements:
- 2–5 years of industry experience in Machine Learning Engineering, Software Engineering, or Data Science, with a proven track record of architecting and deploying models to production
- Deep, hands-on experience with the AWS ecosystem, specifically AWS ECS and Lambda. Solid understanding of containerization (Docker) and event-driven architectures
- Strong proficiency in modern programming languages used in ML (e.g., Python, C++, Java) and familiarity with industry-standard coding practices
- Hands-on experience with PyTorch and other machine learning libraries (e.g., Scikit-Learn, TensorFlow). Deep understanding of model optimization pipelines, including PyTorch-to-ONNX conversion, ONNX Runtime, and memory-efficient attention mechanisms (e.g., Flash Attention)
- Experience working with large-scale computing frameworks, data analysis systems, and relational/non-relational databases
- Experience utilizing AWS SageMaker for managed model training and hosting
- Hands-on experience applying modern parameter-efficient fine-tuning methods (such as LoRA and QLoRA) to large language models
- Experience building, integrating, and deploying autonomous or semi-autonomous AI agents to automate complex workflows and connect ML models with external tools/APIs
- Proven experience and familiarity with deep learning technologies applied specifically to Natural Language Processing (NLP) and complex text-based modeling
- Experience collaborating with specialized researchers (e.g., psychometricians, statisticians) to operationalize complex mathematical concepts
- Experience implementing infrastructure as code (IaC) using tools like Terraform or AWS CloudFormation
- Experience setting up comprehensive model monitoring systems to detect data drift, concept drift, and model degradation in production AWS environments