Coforma is a remote-first company dedicated to improving lives through ethical technology. They are seeking an experienced Principal Platform Engineer to join their innovation lab, focusing on building a state-of-the-art product and developer platform from the ground up while promoting best practices in platform reliability and security.
Responsibilities:
- Contribute to our core platform codebase by helping build and maintain shared services such as authentication and identity, API services, and deployment pipelines
- Help research and design a future internal platform portal to catalogue our products and services, and centralize self-service workflows
- Identify areas of developer toil and friction, and design self-service solutions to reduce repetitive tasks and frequent support requests
- Manage CI/CD pipelines and deployment automation for containerized and serverless applications
- Partner with Coforma's growth team to deliver innovative solutions and prototypes on our platform
- Promote a Platform-as-a-Product mindset to accelerate the delivery of solutions for all Coforma departments
- Serve as a Site Reliability Engineer (SRE) and DevOps lead for the innovation department, supporting product and platform vision by helping shape an in-house SRE and DevOps practice
- Implement and oversee an observability stack (Grafana, Datadog, OTEL, Sentry, Incident.io) to standardize the monitoring, alerting, incident response, and performance of our products and services
- Collaborate across departments as a subject matter expert (SME) to promote SRE best practices and principles within other Engineering and Delivery teams
- Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to help drive engineering and product excellence, and work cross-functionally to help define customer Service Level Agreements (SLAs)
- Lead and promote best practices of Incident Response and Management (IRM) using modern IRM tools
- Mentor junior Platform, DevOps, and SRE engineers
- Help manage our Google Cloud Platform (GCP) environment with a focus on performance, resiliency, and security
- Use and promote Infrastructure as Code (IaC) tools such as Terraform or Pulumi for consistent, secure deployments and change management
- Assist developers across engineering teams in their use of cloud infrastructure
- Contribute to the design of high-availability cloud architecture
- Implement shift-left security to help developers ship secure code with minimal friction
- Assist Platform leadership with compliance goals such as SOC 2, HIPAA, NIST 800-53, and other regulatory standards
- Promote secure SDLC and DevSecOps tools and practices
- Build relationships with Platform leadership, Innovation department leads, engineers, and others in the company with whom you will collaborate
- Audit current observability stack and SRE tooling and begin developing a knowledge of our tech stack
- Learn about the SRE, DevOps, and platform needs of other Engineering teams
- Begin developing an idea of what “platform engineering” means for the Innovation team and for the broader company
- Identify core platform components that need to be built to meet the Innovation and Engineering teams’ needs
- Manage day-to-day Platform, DevSecOps, and SRE tasks as our in-house Platform Lead, including owning observability, on-call, incident management, and developer relations
- Collaborate with developers and provide technical leadership and guidance to support the Innovation department’s vision from a platform perspective
- Begin contributing to our codebase, whether for IaC, platform services, or observability tooling
- Establish baselines and standards for monitoring dashboards, telemetry, alerts, and on-call that can be reproduced across diverse Engineering teams
- Ensure Innovation engineers have golden paths to deploy prototypes with minimal friction
- Work with Platform leadership to plan the tooling and architecture for our IDP portal
- Contribute to the broader internal launch of our platform to support company growth and solutions
- Deliver tangible improvements to product velocity and reliability (measured by time-to-deploy, time-to-debug, time-to-recovery, latency, uptime, etc.)
- Own SLOs/SLIs and incident management for platform services
- Make recommendations for the growth of an in-house platform practice across the company, and promote the Innovation department’s standards among other Engineering individual contributors
- Lead by example to help Coforma develop the best practices for a scalable and reliable platform of SaaS products
Requirements:
- Minimum of 8 years of experience in Platform Engineering, SRE, and DevOps roles
- 2+ years of experience in senior and technical lead roles in Engineering teams
- Expertise with our core Platform stack: GCP, GitHub, Datadog, Grafana
- Experience building containerized platform services (JavaScript, Go, Rust, Python)
- Experience developing and maintaining internal platform tools for developers, owning SLOs, and championing product performance and velocity
- Demonstrated ability to lead incident response, security reviews and remediation, and navigate compliance requirements
- Excellent collaboration, communication, and stakeholder management skills
- Comfortable with ambiguity and able to support experiments, measure outcomes, and adjust course quickly
- Full-time resident of the contiguous United States (must be legally authorized to work in the US now and in the future without sponsorship)
Experience with:
- Observability tools such as Grafana, Datadog, OTEL, Prometheus, Sentry, Incident.io
- IaC expertise (Terraform, Pulumi)
- Containers and orchestration platforms (Docker, Kubernetes), and serverless architectures
- Programming or scripting language (Python, JavaScript, Bash, Go, Rust)
- SRE principles, DevSecOps, and DevEx improvement practices
- Modern CI/CD pipelines (e.g., GitHub Actions, Google Cloud Deploy)
- High-traffic consumer web or multi-tenant SaaS environments preferred
- Building blocks of the web (HTTP, HTML, CSS, JS/TS, JSON, SSL) and strong first-principles systems engineering foundations
- Security and compliance implications (SOC 2, HIPAA) for SaaS-style or licensed products, and designing with them in mind