Marqeta is seeking an experienced Senior Manager, Platform Engineering to lead its infrastructure software engineering team, which is responsible for a Kubernetes-based compute platform. The role drives platform modernization, cost optimization, and operational excellence while mentoring a team of engineers.
Responsibilities:
- Lead, mentor, and grow a team of infrastructure software engineers focused on Kubernetes platform engineering
- Build a culture of innovation, operational excellence, and customer-focused platform development
- Recruit top talent and develop career growth paths for team members
- Foster collaboration with application development teams, SRE, security, and other infrastructure teams
- Drive technical decision-making while empowering engineers to own their solutions
- Define and execute the technical roadmap for Marqeta's Kubernetes compute platform
- Drive continuous platform modernization to support evolving business needs and scale requirements
- Champion a platform-as-a-product mindset, treating internal engineering teams as customers
- Evaluate and integrate emerging technologies and AWS services to improve platform capabilities
- Lead architectural decisions for container orchestration, service mesh, observability, and developer tooling
- Develop and implement strategies to optimize Kubernetes infrastructure costs without compromising performance or reliability
- Monitor and analyze compute resource utilization, identifying opportunities for right-sizing and efficiency gains
- Implement FinOps practices including chargeback/showback models, budget alerting, and cost allocation
- Drive adoption of cost-effective AWS services and EC2 Spot Instances where appropriate
- Partner with engineering teams to optimize application resource requests and limits
- Ensure ultra-high availability of the production Kubernetes platform supporting payment processing workloads
- Establish SLOs/SLIs for platform reliability and performance
- Lead incident response for platform-level issues and drive continuous improvement through blameless postmortems
- Implement comprehensive monitoring, alerting, and observability solutions
- Balance innovation with stability through disciplined change management and deployment practices
- Design and implement CI/CD pipelines and deployment automation for platform infrastructure
- Apply software development best practices to infrastructure code (testing, code review, version control)
- Drive infrastructure-as-code initiatives using Terraform and other automation tools
- Collaborate with security teams to embed security into the platform and SDLC
- Enable developer productivity through self-service capabilities and golden paths
Requirements:
- 8+ years of experience in infrastructure engineering, platform engineering, or DevOps roles
- 5+ years of people management experience, leading technical teams through complex initiatives
- Deep expertise with Kubernetes in production environments at scale (architecture, operations, troubleshooting)
- Extensive knowledge of cloud fundamentals (AWS preferred, including EKS, EC2, VPC, IAM, and other core services)
- Proven track record of Kubernetes cost optimization and resource efficiency improvements
- Strong understanding of SDLC methodologies, CI/CD practices, and infrastructure-as-code
- Experience managing ultra-high availability systems (99.99%+ uptime) in production
- Proficiency with infrastructure-as-code tools (Terraform, CloudFormation, etc.)
- Hands-on experience with container technologies, service mesh, and cloud-native architectures
- Strong technical leadership skills, with the ability to influence without authority across the organization
- Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
- Experience in payments, financial services, or other highly regulated industries
- Familiarity with PCI-DSS, SOC 2, and compliance requirements in cloud environments
- AWS certifications (Solutions Architect, DevOps Engineer, or Security Specialty)
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
- Experience with service mesh technologies (Istio, Linkerd, AWS App Mesh)
- Background with observability platforms (Prometheus, Grafana, Datadog, New Relic, etc.)
- Knowledge of GitOps practices and tools (ArgoCD, Flux)
- Experience with multi-tenancy and platform security patterns
- Familiarity with FinOps frameworks and cloud cost management tools