Button empowers the creator and affiliate economy through innovative mobile solutions. As a Senior Machine Learning Engineer, you will own the entire ML lifecycle, collaborating with data scientists, software engineers, and product managers to develop and optimize the machine learning models that enhance Button's commerce and monetization products.
Responsibilities:
- Own the full ML lifecycle including feature pipelines, training workflows, model deployment, inference services, monitoring, and retraining
- Design and build reliable data and feature pipelines, including feature store patterns that support reproducible training and consistent features across training, batch scoring, and online inference
- Build and optimize machine learning models including regression, classification, ranking, and recommender systems
- Implement and manage batch scoring pipelines and online inference services with clear performance, reliability, and latency standards
- Partner with data scientists to operationalize models and build the tooling needed to run consistent evaluation, experimentation, and model iteration
- Collaborate with software engineers to ensure smooth integration of models into production services and APIs
- Establish observability for ML systems including monitoring of data freshness, feature drift, model performance, and pipeline health
- Design systems that support rapid experimentation and safe rollout of new models
- Document architecture clearly, establish best practices for ML engineering at Button, and mentor teammates through thoughtful code reviews and design discussions
- Contribute to the design of decisioning systems that power ranking, recommendations, and commerce optimization across Button’s platform
Requirements:
- 5+ years of professional experience in machine learning engineering, software engineering, data engineering, or similar roles
- Fluency with Python and SQL
- Proven experience designing, building, and operating data pipelines at scale
- Hands-on experience deploying and maintaining machine learning models in production environments
- Experience working in cloud environments, especially AWS
- Ability to write clear, maintainable code with strong software engineering practices including testing, documentation, debugging, and thoughtful system design
- Experience building and operating production machine learning systems rather than only training models
- Understanding of the full ML lifecycle including feature generation, training pipelines, deployment strategies, and monitoring
- Practical experience designing scalable data pipelines and feature generation workflows
- Experience building or working with feature pipelines or feature stores that support both training and online inference
- Careful attention to reliability, scalability, latency, and cost efficiency when building ML systems
- Comfort working in ambiguous problem spaces and translating product questions into measurable ML solutions
- Enjoyment of close collaboration with engineers, data scientists, and product managers, and the ability to clearly communicate technical tradeoffs and design decisions
- Familiarity with Amazon SageMaker, Redis, Spark, streaming systems, or distributed data processing frameworks
- Experience with machine learning frameworks such as PyTorch, TensorFlow, or scikit-learn
- Familiarity with orchestration and data modeling tools such as Airflow, dbt, or similar systems
- Experience building ranking, recommendation, or decisioning systems is a plus