Site Reliability Engineer at TWG Global | JobVerse
Site Reliability Engineer
TWG Global
Jacksonville, Florida, United States of America
Full Time
2 hours ago
$120,000 - $190,000 USD
No H1B
Key skills
Airflow
AWS
Azure
Cloud
Docker
Google Cloud Platform
Grafana
Kubernetes
Linux
Prometheus
Python
Terraform
Bash
AI
ML
MLOps
MLflow
Kubeflow
GCP
Google Cloud
GitHub Actions
SageMaker
Datadog
GitHub
GitLab
CI/CD
Collaboration
About this role
Role Overview
Build and maintain infrastructure to support real-time and batch ML workloads
Implement observability tools (logging, monitoring, alerting) for model performance and system uptime
Design and manage CI/CD pipelines for ML and data applications
Ensure high availability, disaster recovery, and rollback capabilities for production environments
Manage access controls, secrets, and security policies in collaboration with compliance and IT
Troubleshoot incidents, lead postmortems, and drive root-cause resolution
Work with U.S. and international teams to provide 24/7 coverage across time zones
Requirements
3–6 years of experience in DevOps, SRE, or backend engineering roles
Proficiency with tools such as Docker, Kubernetes, Terraform, GitLab, GitHub Actions, and Airflow
Strong scripting skills in Python or Bash and familiarity with Linux environments
Experience deploying and monitoring ML models or data pipelines in production
Knowledge of observability stacks (e.g., Prometheus, Grafana, ELK, Datadog)
Familiarity with cloud platforms (e.g., AWS, GCP, or Azure)
Strong documentation, problem-solving, and incident response skills
Preferred Qualifications
Experience supporting ML/AI workflows using Palantir Foundry
Exposure to compliance frameworks like SOC 2, ISO 27001, or financial regulations
Knowledge of MLOps frameworks (e.g., MLflow, Kubeflow, SageMaker Pipelines)
Ability to automate deployments, testing, and monitoring at scale
Tech Stack
Airflow
AWS
Azure
Cloud
Docker
Google Cloud Platform
Grafana
Kubernetes
Linux
Prometheus
Python
Terraform
Benefits
Work on real-world AI applications with high-impact clients
Collaborate with world-class data scientists, engineers, and product leaders
Flat org structure, high trust, high autonomy
Competitive salary + performance-based incentives