Coalfire is on a mission to make the world a safer place by solving clients’ hardest cybersecurity challenges. They are seeking a Technical Senior Manager of Site Reliability Engineering to implement and maintain scalable, secure, and high-performing systems, ensuring client infrastructures remain stable and resilient.
Responsibilities:
- Allocate approximately 70% of time to hands-on engineering tasks, such as developing new deployments, tooling, and automation scripts to address client needs
- Dedicate around 30% of time to leadership duties, including mentoring engineers, ensuring quality deliverables, and managing escalations
- Act as the primary escalation contact for complex technical issues, resolving them promptly to maintain high levels of client satisfaction
- Monitor and uphold quality standards for engineering work, confirming alignment with internal protocols, compliance regulations, and project milestones
- Identify and mitigate risks in partnership with consulting and solutions architecture teams, ensuring regulatory requirements and client expectations are fully addressed
- Coordinate day-to-day engineering activities using Agile practices, tracking progress and adjusting resources to meet project goals on schedule
- Help create and implement solutions that improve the practice
Requirements:
- 9+ years in Systems Engineering and Architecture: Including requirements definition, architecture development, systems integration, and testing
- 9+ years in Cloud Computing: Designing, implementing, operating, and automating environments within AWS, Azure, or GCP
- 9+ years with Infrastructure-as-Code: Hands-on proficiency in Terraform and Ansible for orchestration and automation
- SLA and Issue Management: Proven track record of meeting SLAs, particularly for availability, response times, and service posture, through effective collaboration and escalation processes
- Operational Excellence: Demonstrated success driving continuous improvement via KPIs and best practices for operational support
- Governance and Compliance: Experience guiding the creation of Infrastructure-as-Code solutions, governance models, and alignment with standards such as FedRAMP or other security frameworks
- Team Leadership: Proven track record of managing teams (6–8 contributors), focusing on career development, goal setting, project oversight, and daily guidance
- Regulatory Audit Prep: Prepared and coached teams for client-facing compliance audits with third-party auditors
- Project Definition and Documentation: Led efforts to define, plan, and document key Managed Services projects and initiatives; tracked outcomes against established goals
- Managed Services Expertise: Familiarity with ticket management systems and meeting SLA requirements in a managed services environment
- Cloud & Automation: Extensive experience with AWS, Azure, or GCP; deep knowledge of Terraform, Ansible, GitLab, and CI/CD technologies
- Technical Collaboration: Proven ability to collaborate with Site Reliability Engineers and cross-functional teams, facilitating team problem-solving and performance improvements
- Soft Skills: Strong interpersonal, organizational, and problem-solving skills; effective at building client trust
- Documentation & Communication: Capable of creating technical diagrams and comprehensive written documentation; able to convey complex ideas clearly
- Professionalism & Autonomy: Demonstrated ability to work both independently and as part of a team with a professional attitude and demeanor
- Security Mindset: Critical thinker capable of balancing stringent security and compliance requirements with mission objectives
- REQUIRED CERTIFICATIONS: Relevant Professional Cloud Certification (for example, AWS Solutions Architect, Azure Solutions Architect, or GCP equivalent)
- Additional advanced or specialty Cloud Certifications (for example, AWS DevOps Engineer, AWS Security Specialty, Azure Security Engineer)
- CISSP (Certified Information Systems Security Professional) or comparable cybersecurity certification
- Consulting Experience: Previous roles in technical consulting for external clients
- High-Availability Environments: Exposure to 24x7 operational settings or large-scale and high-availability system support
- Encryption and Hardening: Demonstrated expertise implementing SSL, PKI, FIPS 140-2, and enforcing security baselines such as CIS Benchmarks and DISA STIG
- Further Cloud and Security Specialization: Additional hands-on work with container orchestration (Kubernetes), advanced threat detection, or enterprise endpoint security