Sagent is transforming the mortgage servicing industry by making loans and homeownership simpler and safer for consumers. The company is seeking a Senior Cloud Infrastructure Engineer to support its infrastructure and application development teams. The role focuses on maintaining and evolving GCP and Kubernetes infrastructure while partnering with development teams to improve delivery and troubleshoot production issues.
Responsibilities:
- Operate and improve multi-region GKE clusters hosting hundreds of microservices across multiple environments from development through production
- Manage the Kubernetes platform layer: Istio service mesh, cert-manager, external-dns, RBAC, HPA/KEDA autoscaling, HashiCorp Vault secret injection, and Helm-based deployments
- Develop and maintain Terraform modules across multiple IaC repositories covering GKE, networking (Shared VPC, Cloud NAT, Private Service Connect), Cloud SQL, Cloud Storage, Dataproc, Cloud Composer, Vault, and web hosting
- Maintain and extend Azure DevOps CI/CD pipelines using shared Terraform templates with multi-environment deployment workflows
- Support Confluent Kafka infrastructure including Connect workers with JDBC source connectors, consumer group health monitoring, and Kafka-lag-based autoscaling with KEDA
- Manage Redis Enterprise clusters on Kubernetes with operator-managed lifecycle and replication
- Operate the observability stack: Grafana Cloud (Alloy, Loki, Mimir, Tempo, Pyroscope via Private Service Connect), kube-prometheus-stack, Google Managed Prometheus, OpenTelemetry Operator/Collector, Beyla, and Kubecost
- Harden cluster security posture using NetworkPolicies, Pod Security Standards, admission policy enforcement, CrowdStrike Falcon, Lacework, kube-bench, and cert-manager with Let’s Encrypt ACME
- Support data infrastructure including Cloud SQL (PostgreSQL), Dataproc (Spark), Cloud Composer (Airflow), Matillion CDC pipelines, Snowflake, and BigQuery
- Manage DNS across multiple providers (Azure DNS, Cloudflare, GCP Cloud DNS) via external-dns, and support Azure APIM and Cloudflare CDN/WAF
- Partner directly with application development teams to troubleshoot deployment failures, tune resource limits and autoscaling, and resolve Kafka consumer lag and connectivity issues
- Contribute to the Internal Developer Portal (Backstage) and internal CLI tooling that enables self-service for product engineers
Requirements:
- 5+ years of cloud or infrastructure engineering experience, including 3+ years of hands-on GCP experience
- Strong production experience with GKE, VPC networking, IAM, Cloud SQL, Cloud Storage, and Artifact Registry
- Advanced Terraform experience, including reusable module design, state management, and multi-environment patterns
- Production Kubernetes expertise: Helm chart development and management, RBAC, resource tuning, and troubleshooting workloads at scale
- Hands-on experience with Istio service mesh: sidecar injection, mTLS, VirtualServices, AuthorizationPolicies, and traffic management
- Understanding of CNI fundamentals (Cilium/Dataplane V2), east-west traffic flows, and network segmentation
- Experience with CI/CD pipeline development (Azure DevOps YAML pipelines or equivalent) and trunk-based development workflows
- Hands-on experience with secrets management, including HashiCorp Vault (Kubernetes auth, agent injection) and GCP Secret Manager
- Proficiency in scripting (Bash, Python, or Go) with the ability to write production-quality automation and tooling
- Strong security mindset with experience implementing least-privilege IAM, certificate management, and policy-driven controls
- Clear and effective communicator able to work across infrastructure and application development teams
- Experience with event-driven architectures and Apache Kafka (Confluent Platform, Connect, consumer group management, KEDA-based scaling)
- Experience with Redis Enterprise on Kubernetes (operator-managed clusters, Active-Active replication)
- Familiarity with the Grafana Cloud observability stack (Alloy, Loki, Mimir, Tempo, Pyroscope) and OpenTelemetry
- Experience with GCP data services: Dataproc (Spark), Cloud Composer (Airflow), BigQuery, Pub/Sub
- Familiarity with OPA/Rego or Kyverno for policy enforcement
- Experience with Azure APIM, Cloudflare, or multi-cloud DNS management
- Familiarity with Matillion or similar ETL/CDC tooling and the Snowflake data warehouse
- Exposure to the financial services or mortgage/loan servicing domain and its associated compliance requirements
- Experience with Kubecost or similar FinOps tooling for cloud cost optimization
- Experience building or contributing to an Internal Developer Portal (Backstage)
- GCP Professional Cloud Architect or Certified Kubernetes Administrator (CKA) certification