Collaborate on the design, evolution, and operational health of CCM’s AWS environment, including architectural decisions, standards, and best practices
Design, implement, and optimize AWS-based infrastructure using services such as EC2, ECS/EKS, Lambda, RDS, S3, CloudWatch, IAM, and VPC
Design and manage cloud infrastructure using Infrastructure as Code (e.g., Terraform, CloudFormation, or equivalent)
Lead new implementations and major reliability initiatives, serving as a subject matter expert for AWS and SRE best practices
Actively monitor, analyze, and optimize AWS spend, providing regular cost insights and recommendations that balance reliability, performance, and fiscal stewardship
Apply and mature site reliability principles to improve system availability, scalability, performance, security, and observability
Design, analyze, and implement automation to eliminate operational toil and improve system efficiency
Provide advanced operations and systems administration for cloud-hosted and hybrid platforms supporting CCM’s IT systems and services
Define and improve monitoring, alerting, logging, and incident response practices to proactively identify risks and minimize customer impact
Lead complex production incidents, perform root cause analysis, and drive corrective and preventive actions
Mentor and provide technical guidance to junior and mid-level engineers (the role carries no direct people-management responsibilities)
Collaborate with engineering, QA, security, and business teams to embed reliability throughout the SDLC
Ensure systems and data are handled in compliance with legal, regulatory, and organizational requirements
Develop and continuously improve production engineering processes, including change and configuration management, monitoring and observability, incident and emergency response, disaster recovery and business continuity, capacity planning and performance tuning, and infrastructure-as-code and deployment automation
Partner with leadership to establish and enforce consistent IT Production policies, standards, and tooling
Act as a change agent for long-term technical strategy, identifying risks, dependencies, and opportunities across systems and teams
Participate in a sustainable on-call rotation and contribute to ongoing improvements that reduce alert fatigue and operational overhead
Build strong cross-functional relationships to align reliability initiatives with business and ministry outcomes
Contribute to the exercise and expression of Christian Care Ministry’s Christian beliefs
Perform all other duties as assigned
Requirements
Bachelor’s degree or higher in a relevant field (computer science, information systems, or engineering) or equivalent combination of education and relevant experience required
7+ years of experience solving customer problems with technical solutions, including 3+ years of site reliability engineering experience required
Extensive hands-on experience designing, operating, and scaling production AWS environments required
Strong preference for AWS certifications, including AWS Certified SysOps Administrator, AWS Certified DevOps Engineer, and AWS Certified Solutions Architect
Experience with Agile and Scrum processes and complex IT projects
Knowledge of data protection operations and legislation (e.g., GDPR, HIPAA)
Experience in a financial or healthcare payer-related field a plus
Tech Stack
AWS
Cloud
EC2
SDLC
Terraform
Benefits
100% paid Medical for employees/99% for family
Generous employer Health Savings Account (HSA) contributions
Employer-paid Life Insurance (3x salary) and Long-term Disability Insurance
6 weeks of paid parental leave (for both mom and dad)
Dental: two plans to choose from
Vision
Short-term Disability
Accident, Critical Illness, Hospital Indemnity
401(k) – up to 4% match on Roth or Traditional contributions
Generous paid time off and 11 paid holidays
Wellness plan including Financial, Occupational, Mental/Spiritual, and Physical health incentives up to $50/mo
Employee Assistance Program including no-cost, in-person mental health visits and employee discounts