FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. As a DataOps / DevOps Engineer on the Generative AI team, you will build innovative solutions across the FICO platform, focusing on operationalizing machine learning models and maintaining scalable infrastructure.
Responsibilities:
- Design, build, and maintain scalable, resilient data and ML pipelines, infrastructure, and workflows using tools such as GitHub Actions, ArgoCD, Crossplane, Terraform, Helm, and others
- Automate infrastructure provisioning and configuration management using cloud-native services (preferably AWS) with tools like Terraform, CloudFormation, or Crossplane
- Design and manage Kubernetes (EKS) clusters and/or ECS environments in AWS, and containerize the workloads that run on them. Collaborate with development teams to optimize performance, deployment, and cost
- Partner with DevOps and SRE teams to ensure high availability, observability, scalability, and security of the data and ML infrastructure
- Work closely with Data Scientists and ML Engineers to operationalize machine learning models, including building CI/CD pipelines for model training, validation, and deployment
- Implement observability for data pipelines and ML services using tools like Prometheus, Grafana, Datadog, or similar
- Develop and maintain automated pipelines for model retraining, drift monitoring, and versioning in production
- Support experimentation and prototyping in areas such as Machine Learning and Generative AI, transitioning successful prototypes into production systems
- Ensure cloud infrastructure is secure, compliant, and cost-efficient, following best practices in governance, identity, and access management
Requirements:
- 7+ years of experience in DataOps, MLOps, or related fields, with at least 2 years focused on ML model operationalization and workflow automation
- Proficient in AWS services including EC2, S3, IAM, ACM, Route 53, CloudWatch, EKS, and ECS
- Experience with infrastructure as code (IaC) tools such as Terraform, CloudFormation, and Helm
- Familiarity with CI/CD for ML pipelines, GitOps practices, and tools like GitHub Actions, Jenkins, or Argo Workflows
- Strong scripting and automation skills using Python or GitHub workflows
- Understanding of observability and monitoring tools (e.g., Prometheus, Grafana, Datadog, or OpenTelemetry)
- Solid understanding of security best practices for cloud and Kubernetes environments, including secrets management, identity & access control, and policy enforcement
- Excellent collaboration and communication skills, with a proven ability to work effectively in cross-functional, globally distributed teams
- A bachelor's degree in Computer Science, Engineering, or a related discipline, or equivalent hands-on industry experience
- Familiarity with data governance, lineage, and metadata management is a plus