Manage and evolve the end-to-end architectural blueprint for cloud-native applications at scale — covering compute, data, networking, security, and observability layers on AWS.
Define and promote agentic AI architecture patterns, including multi-agent systems, tool orchestration, memory and retrieval strategies, LLM routing, guardrails, and feedback loops.
Translate these patterns into relevant standards and reusable reference architectures.
Establish and govern DevOps standards across the organization: CI/CD pipeline design, GitOps practices, deployment strategies (blue/green, canary, progressive delivery), environment parity, and release engineering.
Design and oversee platform infrastructure, including ECS, Kubernetes clusters (EKS), service mesh, API gateway strategy, secrets management, IAM/RBAC governance, multi-account/multi-region AWS topology, and data stores such as DynamoDB, Aurora RDS, Postgres, and Elastic/OpenSearch.
Lead architecture reviews and provide binding technical decisions on high-risk changes — balancing velocity, reliability, cost, and security.
Introduce new technologies through structured PoC and risk assessment processes; build internal communities of practice around architectural patterns.
Produce architecture decision records (ADRs), system context diagrams, threat models, and runbooks that serve as the source of truth for platform design.
Serve as a technical advisor to Product and Partners, translating architectural constraints and capabilities into clear strategic options.
Mentor staff and senior engineers on architectural thinking, systems design, and engineering leadership; conduct design reviews that raise the technical bar across teams.
Requirements
B.S. or M.S. degree in Computer Science, Systems Engineering, or a related discipline.
8+ years of experience in software engineering and systems architecture, with 3+ years in a dedicated architect or principal engineer role.
5+ years of AWS architecture experience at scale, including multi-account organization design, landing zone/Control Tower, VPC design, IAM governance, and cost optimization.
Experience architecting distributed systems and services that sustain high availability (≥99.99%), low latency, and elastic scalability under variable production load.
Experience designing agentic AI architectures: orchestration layers, agent frameworks (LangGraph, AutoGen, Semantic Kernel, or equivalent), tool/API integration, evaluation pipelines, and operational monitoring of AI systems.
DevOps and platform engineering expertise: Kubernetes (EKS), Terraform/CDK, Helm, ArgoCD/Flux (GitOps), GitHub Actions/Jenkins/Harness CI/CD, and container security practices.
Background in API design and gateway patterns across REST, gRPC, and asynchronous event-driven interfaces.
Experience establishing SRE practices including SLO definition, error budgets, runbooks, chaos engineering, and game days.
Strong grasp of cloud security architecture: zero-trust networking, secrets management (Vault, AWS Secrets Manager), SIEM integration, and compliance frameworks (SOC 2, ISO 27001).
Experience working across data architecture concerns: streaming pipelines (Kafka, Kinesis), data lakes, OLAP/OLTP boundary design, and caching strategies (ElastiCache, Redis).