AWSCloudDistributed SystemsGoogle Cloud PlatformJavaPythonReactTerraformTypeScriptGoAIGenerative AILarge Language ModelsGCPGoogle CloudSaaS
About this role
Role Overview
Orchestrate the Fleet: Maintain and improve Portal’s SaaS infrastructure for reliability, security, and scalability. This covers the runtime environments supporting the platform and workflows powered by large language models.
Modern Infra-as-Code: Collaborate with senior engineers to build infrastructure on GCP and AWS using Terraform and emerging infrastructure-from-code patterns where agents assist in defining the stack.
Support Fullstack Systems: Operate in a modern web stack environment (TypeScript, React, Python). While this isn’t a frontend-heavy role, comfort with debugging fullstack systems and web infrastructure is key.
Reliability Engineering: Participate in on-call rotations to ensure systems meet reliability and availability goals, employing AI assistants to accelerate root cause analysis and incident resolution.
Collaborate & Innovate: Participate in the planning and delivery of technical projects, defining how infrastructure evolves to support the next wave of generative AI features.
Requirements
Cloud Native & AI Curious: Brings hands-on experience with cloud infrastructure (GCP or AWS) and IaC tools like Terraform, with an interest in LLMs, RAG, or agents in an operational context.
Systems Thinker: Understands distributed systems principles and how to operate them reliably at scale, specifically addressing the challenges posed by non-deterministic AI workloads.
Polyglot Practitioner: Experienced with at least one modern programming language (e.g., TypeScript, Java, Go, Python) and comfortable navigating codebases where AI-generated PRs are the norm.
Quality & Automation: Prioritizes code quality and reliability, looking for ways to build systems that test themselves and improve through automated feedback loops.
Growth Mindset: Eager to evolve as an engineer in a landscape where the definition of "operations" changes rapidly. Familiarity with open-source projects or building "coding assistant" bots is a plus.