LVT (LiveView Technologies) is on a mission to enhance safety and security through innovative technology solutions. As an MLOps Engineer III, you will develop and maintain infrastructure for machine learning operations, provide leadership and mentorship to a team, and ensure the integration of ML workflows across various platforms.
Responsibilities:
- Provide technical leadership and mentorship to a growing team of engineers working on machine learning infrastructure
- Help set the technical direction, define best practices, and drive the adoption of modern ML Ops methodologies and technologies
- Design, build, and maintain scalable and robust ML Ops infrastructure for cloud and edge deployments
- Evaluate and make build vs. buy decisions for ML tools and platforms
- Develop and optimize data pipelines to support machine learning workflows
- Integrate and manage machine learning tools for model training, validation, and deployment
- Oversee the deployment of machine learning models to cloud environments and edge devices
- Ensure robust and reliable deployment processes, including continuous integration and continuous deployment (CI/CD)
- Implement observability solutions to monitor the performance and health of ML models and infrastructure
- Proactively identify and address issues in ML pipelines and deployments
- Collaborate with cross-functional teams, including data scientists, software engineers, and product managers, to ensure seamless integration and successful delivery of ML products
- Foster effective communication channels and promote a culture of collaboration and knowledge sharing
- Drive continuous improvement initiatives to enhance ML Ops processes, productivity, and efficiency
- Identify bottlenecks, streamline workflows, and implement tools and methodologies to optimize the ML development lifecycle
Requirements:
- Bachelor's or Master's degree in Computer Science, Software Engineering, Electrical/Computer Engineering, or a related field
- 5-7+ years of professional experience in software engineering, with a focus on ML Ops or related fields
- Strong expertise in cloud platforms (e.g., AWS, GCP, Azure) and edge computing
- Proficiency in programming languages such as Python, Go, or C++
- Experience with ML Ops tools and frameworks (e.g., Kubeflow, MLflow, PyTorch)
- Excellent problem-solving, debugging, and analytical skills, with the ability to navigate complex technical challenges
- Strong interpersonal and communication skills, with the ability to collaborate effectively with diverse stakeholders
- Demonstrated ability to thrive in a fast-paced and dynamic environment, managing multiple priorities simultaneously
- Track record of delivering high-quality machine learning infrastructure and/or products on time and within budget
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes)
- Familiarity with data engineering and big data technologies
- Knowledge of security best practices in ML Ops and cloud deployments
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana)