Paramount Pictures is on a mission to unleash the power of content and is seeking a Lead DevOps Engineer for their Applied Intelligence Personalization Team. This role focuses on building and maintaining scalable infrastructure for real-time machine learning inference to enhance engagement and personalized messaging.
Responsibilities:
- Design, implement, and manage scalable and reliable infrastructure for online inference services
- Optimize Kubernetes-based deployments for low-latency model serving and real-time personalization
- Automate CI/CD pipelines to streamline the deployment of ML models and services
- Develop observability and monitoring solutions using tools like Prometheus, New Relic, and OpenTelemetry
- Ensure high availability, security, and performance of real-time inference APIs
- Work with ML engineers and backend teams to integrate inference models efficiently into production
- Implement autoscaling strategies for inference workloads based on traffic patterns and model demand
- Manage Pub/Sub and event-driven architectures to enable real-time messaging and engagement analytics
- Optimize model-serving infrastructure with caching layers such as Redis and Memcached
- Debug and resolve production issues related to latency, scaling, and reliability
Requirements:
- 4+ years of experience in DevOps, Site Reliability Engineering (SRE), or Cloud Infrastructure Engineering
- Solid experience with Kubernetes and container orchestration
- Hands-on experience with CI/CD tools such as GitHub Actions, Jenkins, or Argo CD
- Experience working with real-time inference and ML model deployment
- Deep knowledge of Google Cloud Platform (GCP), AWS, or Azure
- Expertise in infrastructure as code (IaC) using Terraform or Helm
- Experience with message queues and event-driven architectures (Pub/Sub, Kafka, etc.)
- Proficiency in monitoring and logging solutions (New Relic, Prometheus, OpenTelemetry, etc.)
- Strong scripting skills in Python, Bash, or Go for automation
- Hands-on experience with ML model serving frameworks (TensorFlow Serving, Triton, TorchServe, etc.)
- Familiarity with load balancing, API gateways, and caching strategies
- Understanding of A/B testing frameworks and experimentation analysis
- Experience optimizing low-latency microservices for ML-based personalization
- Passion for building and maintaining high-performance infrastructure for real-time applications