Jensen Hughes is a leader in fire protection engineering and risk-based fields, dedicated to making the world safe and secure. They are seeking a Platform Engineer to build and operate their cloud platform on AWS, focusing on infrastructure CI/CD and observability while enabling AI capabilities.

Responsibilities:

AWS platform engineering (multi-account)
Design, build, and operate secure, reliable foundations across a multi-account AWS environment (AWS Organizations / Control Tower where applicable), including networking, IAM, KMS, secrets, tagging, and shared services
Establish scalable patterns for compute, storage, and networking; enable repeatable environments across dev/stage/prod
Improve developer experience through standards, templates, and clear platform documentation
Own Terraform architecture end-to-end: module strategy, state design, environment separation, provider/version management
Build and maintain a production-grade Terraform SDLC:
PR-driven workflows with plan previews, approvals, and promotion across environments
Controlled apply mechanisms with audit trails and rollback plans
Drift detection and safe reconciliation strategy
Import/migration/refactor patterns without downtime
Implement baseline guardrails (tagging, encryption, access controls) as code wherever feasible
Implement PR-driven infrastructure delivery using GitOps principles (not Kubernetes-only):
Git as the source of truth; PRs as change requests
Automated validation/testing/security checks on every change
Safe promotion model (dev → stage → prod) with appropriate gates
Controlled applies for production (approval gates / break-glass procedures), with full traceability
Standardize pipelines in the team’s primary CI/CD platform (GitHub Actions) and integrate with existing enterprise tooling where needed
Establish repo structure, branching strategy, and operational runbooks for the infrastructure delivery workflow
Own the Splunk observability operating model: dashboards, alerting standards, SLOs/SLIs, runbooks, and on-call readiness
Build/operate telemetry pipelines for reliability and cost efficiency (noise reduction, sampling/cardinality strategies, retention and routing)
Partner with application teams to improve visibility, reduce MTTR, and drive incident learnings into platform improvements
Partner with engineering teams to enable agentic AI use cases using Amazon Bedrock and AgentCore (tool integration patterns, secure operation, production readiness)
Help establish foundational patterns for agent deployment and operations (environments, permissions, observability, and evaluation/reliability practices) aligned to enterprise controls
Participate in incident response; lead postmortems and drive systemic, preventive fixes
Measure and improve platform reliability, security posture, and cost efficiency over time

Requirements:

8–10 years of experience in Platform Engineering / SRE / DevOps (or equivalent experience delivering platform outcomes)
AWS expertise, including multi-account patterns (AWS Organizations / Control Tower preferred), networking, IAM/security, and operations
Terraform expert with proven ownership of org-scale infrastructure-as-code (modules, state, CI controls, large refactors)
Proven experience designing Infrastructure CI/CD and PR-driven infrastructure delivery (GitOps principles) for Terraform and cloud configuration
PR-based automation with plan previews and security/policy checks
Controlled apply processes with approvals and auditability
Environment promotion patterns and rollback strategies
Strong production experience with observability platforms such as Splunk, Datadog, Grafana, or Dynatrace, including building and operating dashboards, alerting standards, and telemetry pipelines (logs/metrics/traces)
Strong Linux and troubleshooting skills; proficiency in automation (Python or Go preferred)
Experience building agentic AI solutions using Amazon Bedrock Agents and/or Amazon Bedrock AgentCore (deployment/operations, tool integration patterns)
Experience with OpenTelemetry at scale (standards, collectors/gateways, sampling, correlation across logs/metrics/traces)
Policy-as-code experience (Conftest/Sentinel or similar) applied to Terraform and platform guardrails
Experience building an Internal Developer Platform (IDP) / self-service workflows (golden paths, templates, paved roads)
Experience supporting Databricks on AWS (workspace/cluster policies, reliability, cost controls; Unity Catalog familiarity a plus)
