Fusion Risk Management is a fast-growing, innovative company recognized for its supportive culture and commitment to operational resilience. The Machine Learning Engineer will design, build, and maintain production-grade machine learning systems, focusing on reinforcement learning and optimization-driven intelligence to enhance resilience capabilities.
Responsibilities:
- Design, build, deploy, and maintain production machine learning systems, including reinforcement learning components and intelligent optimization-driven features
- Architect scalable ML pipelines for training, validation, deployment, monitoring, and automated retraining
- Maintain and extend the simulation engines (Monte Carlo, Bayesian networks) and optimization engines (linear programming, constraint programming, CP-SAT) so they continue to run reliably
- Own MLOps and AIOps practices, including CI/CD for models, automated testing, model validation, performance monitoring, drift detection, observability, and governance frameworks
- Refactor and harden existing AI systems to improve scalability, latency, cost efficiency, and fault tolerance
- Build and maintain data pipelines and feature engineering workflows that support reliable and reproducible model training
- Collaborate closely with product and engineering teams to translate resilience use cases into scalable, maintainable ML-powered product capabilities
- Contribute to the design of Fusion's ML architecture, infrastructure standards, and long-term intelligent systems roadmap
Requirements:
- Bachelor's or Master's degree in Computer Science, Machine Learning, Artificial Intelligence, Engineering, or a related field
- 3+ years of experience building, deploying, and operating machine learning systems in production environments
- Strong software engineering foundation, applied hands-on to building and deploying production machine learning systems
- Experience designing ML architectures, APIs, and services that integrate with enterprise SaaS platforms
- Deep understanding of model lifecycle management: experimentation, validation, deployment, monitoring, retraining, and versioning
- Experience with reinforcement learning, decision systems, simulation modeling, or optimization techniques
- Strong experience building scalable data and feature pipelines using cloud-native tools (e.g., Azure, Snowflake, dbt, Salesforce integrations, or similar platforms)
- Proficiency in writing clean, maintainable, well-tested code with version control, CI/CD, and observability best practices
- Familiarity with containerization and distributed systems (Docker, Kubernetes, serverless architectures, or similar)
- Ability to design modular, extensible ML systems that evolve alongside product requirements
- Strong communication skills and the ability to explain system behavior, tradeoffs, and architectural decisions to technical and non-technical stakeholders
Preferred Qualifications:
- Experience with reinforcement learning, decision intelligence systems, or control systems
- Experience with simulation, optimization, constraint programming, or operations research techniques
- Experience building ML pipelines in cloud environments (Azure preferred)
- Experience implementing MLOps tooling for testing, validation, monitoring, retraining, and governance
- Experience deploying AI-powered systems within enterprise SaaS environments