Harnham is an early-stage, cutting-edge organization seeking a Machine Learning Engineer to drive infrastructure scalability across state-of-the-art GenAI products. The role involves building and scaling ML infrastructure, optimizing models for performance, and architecting deployment strategies that support growth and increasing complexity.
Responsibilities:
- Build and scale ML infrastructure capable of serving high-volume, low-latency model inference
- Optimize models and pipelines for performance, cost, and reliability in production environments
- Productionize research and experimental models into scalable, maintainable ML systems
- Architect infrastructure and deployment strategies that support continuous growth and evolving model complexity
- Drive infrastructure development to support a diverse and growing portfolio of models
Requirements:
- Experience in ML platform infrastructure and deployment, including scaling training and inference, concurrency, queuing, back pressure, and orchestration
- Experience designing and operating high-performance model-serving systems, with proven ownership of system stability beyond initial deployment
- Engineer solutions that efficiently manage parallel inference workloads at scale
- Tune end-to-end serving pipelines to maximize responsiveness and overall system capacity
- Strong Python skills
- AWS-native stack experience
- Containerization and orchestration tooling: Docker, Kubernetes, SageMaker