FICO is a leading analytics and decision management company that empowers businesses and individuals around the world with data-driven insights. As a DataOps/DevOps/MLOps Engineer on the Generative AI team, you will design and maintain scalable data and ML pipelines, collaborate with cross-functional teams, and operationalize machine learning models to drive innovation across FICO’s platform.
Responsibilities:
- Design, build, and maintain scalable, resilient data and ML pipelines, infrastructure, and workflows using tools such as GitHub Actions, Terraform, Helm, ArgoCD, and Crossplane
- Automate infrastructure provisioning and configuration management using cloud-native services (preferably AWS) with tools like Terraform and CloudFormation
- Design, deploy, and manage containerized workloads on Kubernetes (EKS) clusters and/or ECS environments in AWS; collaborate with development teams to optimize performance, deployment, and cost
- Partner with DevOps and SRE teams to ensure high availability, observability, scalability, and security of the data and ML infrastructure
- Work closely with Data Scientists and ML Engineers to operationalize machine learning models, including building CI/CD pipelines for model training, validation, and deployment
- Implement observability for data pipelines and ML services using tools like Prometheus, Grafana, Datadog, or similar
- Develop and maintain automated pipelines for model retraining, monitoring drift, and versioning in production
- Support experimentation and prototyping in areas such as Machine Learning and Generative AI, transitioning successful prototypes into production systems
- Ensure cloud infrastructure is secure, compliant, and cost-efficient, following best practices in governance, identity, and access management
Requirements:
- 8+ years of experience in DataOps, MLOps, or related fields, with at least 2 years focused on ML model operationalization and workflow automation
- Proficient in AWS services including EC2, S3, IAM, ACM, Route 53, CloudWatch, EKS, and ECS
- Strong experience with infrastructure as code (IaC) tools such as Terraform, CloudFormation, and Helm
- Experience with CI/CD for ML pipelines, GitOps practices, and tools like GitHub Actions, Jenkins, or Argo Workflows
- Strong scripting and automation skills using Bash, Python, or GitHub Actions workflows
- Understanding of observability and monitoring tools (e.g., Prometheus, Grafana, Datadog, or OpenTelemetry)
- Solid understanding of security best practices for cloud and Kubernetes environments, including secrets management, identity & access control, and policy enforcement
- Excellent collaboration and communication skills, with a proven ability to work effectively in cross-functional, globally distributed teams
- A Bachelor's or Master's degree in Computer Science, Engineering, or a related discipline, or equivalent hands-on industry experience
- Familiarity with data governance, lineage, and metadata management is a plus