GiveCampus is the world's leading fundraising platform for non-profit educational institutions, dedicated to enhancing the quality and accessibility of education. They are seeking a Senior Machine Learning Engineer to own the productionization and operational lifecycle of machine learning models, collaborating closely with Data Scientists to transition validated models into production systems. This role is pivotal in defining the direction of their ML Platform and involves responsibilities across model productionization, pipeline development, deployment, and ongoing maintenance.
Responsibilities:
- Transform non-production prototypes (e.g., Jupyter notebooks, standalone scripts) into modular, tested, production-ready Python code
- Containerize models with proper dependency management (Docker, ECR)
- Implement comprehensive testing: unit tests, integration tests, model validation
- Build automated training pipelines using SageMaker Pipelines and Step Functions
- Develop batch and real-time inference pipelines based on use case requirements
- Integrate with Snowflake for feature retrieval and prediction storage
- Deploy models to SageMaker endpoints for real-time inference
- Configure batch transform jobs for bulk predictions
- Integrate predictions with the company's Rails application via APIs and webhooks
- Monitor model performance, latency, and drift in production
- Build automated retraining pipelines triggered by schedule or drift detection
- Own incident response for ML systems—you're on the hook when models break
- Optimize costs across compute, storage, and inference
- Build reusable templates, libraries, and tooling that accelerate future model deployments
- Create self-service capabilities that enable Data Science to deploy and test models with minimal friction
- Document patterns, runbooks, and best practices for ML operations
Requirements:
- 5+ years of software engineering experience, with 3+ years focused on ML systems
- Strong Python skills with emphasis on production code quality (not just notebooks)
- Experience deploying and operating ML models in production environments
- Hands-on experience with AWS (SageMaker preferred, but strong AWS fundamentals are also acceptable)
- Proficiency with Docker and containerization best practices
- Understanding of ML concepts sufficient to work effectively with Data Scientists
- Experience building data pipelines and working with data warehouses (Snowflake a plus)
Preferred Qualifications:
- Experience with SageMaker Pipelines, Feature Store, and Model Registry
- Familiarity with Step Functions, EventBridge, or similar orchestration tools
- Infrastructure as Code experience (Terraform, CDK, CloudFormation)
- Experience with LLMs, RAG architectures, or generative AI applications
- Experience integrating ML systems with web applications (Rails, APIs)
- Background in B2B SaaS or EdTech