Sagent is transforming the mortgage servicing industry by bringing the modern experience customers now expect from loan originations to loan servicing. The Cloud Engineering team is seeking a Cloud Infrastructure Engineer to support infrastructure and application development teams, focusing on maintaining and evolving GCP and Kubernetes infrastructure while partnering with development teams to improve deployment processes.
Responsibilities:
- Operate and improve multi-region GKE clusters hosting hundreds of microservices across multiple environments from development through production
- Manage the Kubernetes platform layer: Istio service mesh, cert-manager, external-dns, RBAC, HPA/KEDA autoscaling, HashiCorp Vault secret injection, and Helm-based deployments
- Develop and maintain Terraform modules across multiple IaC repositories covering GKE, networking (Shared VPC, Cloud NAT, Private Service Connect), Cloud SQL, Cloud Storage, Dataproc, Cloud Composer, Vault, and web hosting
- Maintain and extend Azure DevOps CI/CD pipelines using shared Terraform templates with multi-environment deployment workflows
- Support Confluent Kafka infrastructure including Connect workers with JDBC source connectors, consumer group health monitoring, and Kafka-lag-based autoscaling with KEDA
- Manage Redis Enterprise clusters on Kubernetes with operator-managed lifecycle and replication
- Operate the observability stack: Grafana Cloud (Alloy, Loki, Mimir, Tempo, Pyroscope via Private Service Connect), kube-prometheus-stack, Google Managed Prometheus, OpenTelemetry Operator/Collector, Beyla, and Kubecost
- Harden cluster security posture: NetworkPolicies, Pod Security Standards, admission policy enforcement, CrowdStrike Falcon, Lacework, kube-bench, and cert-manager with Let’s Encrypt ACME
- Support data infrastructure including Cloud SQL (PostgreSQL), Dataproc (Spark), Cloud Composer (Airflow), Matillion CDC pipelines, Snowflake, and BigQuery
- Manage DNS across multiple providers (Azure DNS, Cloudflare, GCP Cloud DNS) via external-dns, and support Azure APIM and Cloudflare CDN/WAF
- Partner directly with application development teams to troubleshoot deployment failures, tune resource limits and autoscaling, and resolve Kafka consumer lag and connectivity issues
- Contribute to the Internal Developer Portal (Backstage) and internal CLI tooling that enables self-service for product engineers
Requirements:
- 7+ years of cloud or infrastructure engineering experience, including 5+ years of hands-on Azure or GCP experience
- Strong production experience with GKE, VPC networking, IAM, Cloud SQL, Cloud Storage, and Artifact Registry
- Advanced Terraform experience, including reusable module design, state management, and multi-environment patterns
- Production Kubernetes expertise: Helm chart development and management, RBAC, resource tuning, and troubleshooting workloads at scale
- Hands-on experience with Istio service mesh: sidecar injection, mTLS, VirtualServices, AuthorizationPolicies, and traffic management
- Understanding of CNI fundamentals (Cilium/Dataplane V2), east-west traffic flows, and network segmentation
- Experience with CI/CD pipeline development (Azure DevOps YAML pipelines or equivalent) and trunk-based development workflows
- Hands-on experience with secrets management, including HashiCorp Vault (Kubernetes auth, agent injection) and GCP Secret Manager
- Proficiency in scripting (Bash, Python, or Go) with the ability to write production-quality automation and tooling
- Strong security mindset with experience implementing least-privilege IAM, certificate management, and policy-driven controls
- Clear and effective communicator able to work across infrastructure and application development teams
Preferred qualifications:
- Experience with event-driven architectures and Apache Kafka (Confluent Platform, Connect, consumer group management, KEDA-based scaling)
- Experience with Redis Enterprise on Kubernetes (operator-managed clusters, Active-Active replication)
- Familiarity with the Grafana Cloud observability stack (Alloy, Loki, Mimir, Tempo, Pyroscope) and OpenTelemetry
- Experience with GCP data services: Dataproc (Spark), Cloud Composer (Airflow), BigQuery, Pub/Sub
- Familiarity with OPA/Rego or Kyverno for policy enforcement
- Experience with Azure APIM, Cloudflare, or multi-cloud DNS management
- Familiarity with Matillion or similar ETL/CDC tooling and the Snowflake data warehouse
- Exposure to the financial services or mortgage/loan servicing domain and its compliance requirements
- Experience with Kubecost or similar FinOps tooling for cloud cost optimization
- Experience building or contributing to an Internal Developer Portal (Backstage)