Fragomen is seeking a senior‑level engineer to lead the evolution of its global cloud, container, and automation platforms. The Senior DevOps / Platform Engineer will set technical direction, drive modernization initiatives, and ensure the performance, reliability, and scalability of systems that support Fragomen’s mission‑critical immigration services.

Responsibilities:

Lead the design, evolution, and stabilization of global cloud, container, and automation platforms
Own and operate production Docker Swarm environments, including: cluster sizing and capacity planning; scaling and service orchestration strategies; troubleshooting distributed system issues such as networking, scheduling, and performance
Drive modernization efforts for CI/CD pipelines and deployment workflows
Define and enforce standards for build, test, security, and release processes across engineering teams
Architect, implement, and support scalable cloud infrastructure in AWS and/or Azure
Optimize environments for reliability, performance, cost efficiency, and compliance
Perform advanced root‑cause analysis using logs, metrics, traces, and cloud platform telemetry
Partner with development, security, and operations teams to improve platform reliability and automation
Contribute to architecture decisions, platform strategy, and engineering best practices
Mentor engineers and provide technical leadership across teams

Requirements:

Extensive hands‑on experience operating and improving large‑scale, production‑grade infrastructure
Significant production experience with container platforms, ideally Docker Swarm, including: cluster design, scaling, and orchestration; diagnosis and remediation of complex distributed systems issues
Strong CI/CD engineering background, with experience designing and refining pipelines using tools such as GitLab CI, Jenkins, and Octopus Deploy
Experience implementing approval workflows, automated rollback strategies, artifact management, and modernizing legacy pipelines
Deep experience with cloud infrastructure in AWS and/or Azure, including compute, networking, IAM, storage, and monitoring services
Strong Infrastructure as Code experience; Terraform experience is highly preferred
Solid Linux systems administration skills
Strong understanding of networking fundamentals, including TLS, DNS, load balancers, and reverse proxies
Experience with secrets management and certificate lifecycle management
Strong ability to collaborate across cross‑functional teams
Comfortable mentoring and guiding engineers at varying levels of experience
Ability to communicate complex technical concepts clearly to both technical and non‑technical stakeholders
Demonstrated ability to influence platform strategy and drive engineering standards organization‑wide
Experience working in high‑security or regulated environments such as legal, financial, or enterprise‑grade organizations
Hands‑on experience with observability and monitoring platforms such as CloudWatch, ELK, Prometheus, and Splunk
Proven ability to apply monitoring and observability best practices to improve reliability and operational insight
