SilverSearch, Inc. is a globally recognized media and information organization seeking a Machine Learning Engineer specializing in MLOps. The role focuses on deploying, operating, and scaling the ML infrastructure that supports enterprise intelligence products across a range of AI workloads, with an emphasis on production reliability and efficiency.
Responsibilities:
- Design, deploy, and operate production ML infrastructure across Dev, QA, and Prod environments
- Manage ML deployment pipelines and runtime operations in AWS SageMaker
- Configure and optimize GPU/CPU infrastructure for large-scale inference workloads
- Implement monitoring, alerting, drift detection, and observability for ML systems
- Build deployment governance processes including rollout, rollback, and recovery strategies
- Support high-throughput ML workloads across text, image, and video pipelines
- Optimize infrastructure scalability, cost efficiency, and operational reliability
- Partner with ML Engineers and Data Scientists to operationalize new models and workflows
- Implement A/B testing and controlled rollout strategies for production ML systems
Requirements:
- Hands-on experience deploying and operating ML systems in production
- Strong AWS SageMaker experience, including Pipelines, Endpoints, Monitoring, and multi-environment deployments
- Experience with containerized ML deployment and orchestration
- Experience operating PyTorch and TensorFlow inference systems
- Strong understanding of autoscaling, infrastructure optimization, and runtime reliability
- Experience implementing monitoring and observability frameworks for ML systems
- Experience supporting distributed ML workloads in cloud environments
- Experience supporting NLP and computer vision ML systems
- Familiarity with semantic/vector search infrastructure
- Experience with ranking/reranking systems
- Familiarity with ANN/vector indexing approaches
- Experience supporting large-scale text, image, and video processing pipelines
- Experience optimizing GPU-based infrastructure