EXL is seeking a Cloud Data Platform Engineer to design and operate the cloud infrastructure that powers advanced analytics and machine learning platforms for its clients. The role involves building scalable, secure environments for data scientists and machine learning engineers, and collaborating with data science, ML engineering, and data engineering teams to deliver reliable cloud platforms for enterprise data initiatives.
Responsibilities:
- Build and manage cloud environments supporting data science and ML workloads (AWS, Databricks)
- Design scalable compute environments using Docker and container orchestration (Kubernetes on EKS, or ECS)
- Optimize performance, reliability, and cost efficiency of cloud resources
- Implement Infrastructure as Code using tools such as Terraform or CloudFormation
- Develop and maintain CI/CD pipelines for platform services and ML deployment workflows
- Automate provisioning and configuration of cloud infrastructure
- Partner with data scientists, ML engineers, and data engineers to support analytics and model development workflows
- Enable reproducible development environments and self-service infrastructure capabilities
- Support integrations between data pipelines and cloud platforms
- Implement observability and monitoring using tools such as Datadog, Prometheus, Grafana, or CloudWatch
- Troubleshoot and resolve infrastructure and platform performance issues
- Implement cloud security best practices including IAM, secrets management, and secure data access controls
- Support compliance and governance requirements for sensitive financial and enterprise data
Requirements:
- 5+ years of experience in Cloud Engineering, DevOps, Platform Engineering, or SRE
- Strong hands-on experience with AWS
- Experience implementing Infrastructure as Code (Terraform, CloudFormation, or similar)
- Experience with containerized environments (Docker, Kubernetes, EKS/ECS)
- Strong scripting skills (Python, Bash, or PowerShell)
- Experience supporting data-intensive or analytics workloads
- Experience with Databricks, SageMaker, or other ML platforms
- Familiarity with distributed compute frameworks (Spark, Ray, Dask)
- Experience supporting data science or analytics teams
- Experience working in financial services or regulated environments
- Exposure to MLOps or LLMOps workflows