OpenVPN Inc. is building out AI tooling across the organization and is seeking an AI Platform Engineer to define architectural standards and build the platform layer from the ground up. The role covers managing AI developer tooling, building internal AI-powered workflows, and ensuring compliance with governance and security standards.
Responsibilities:
- Own the rollout and operational management of AI-assisted development tools across engineering (e.g., Cursor, Copilot, Claude Code)
- Define and implement access controls, license management, and usage policies that satisfy SOC 2/ISO 27001 requirements
- Build cost tracking and reporting so leadership has visibility into AI tool spend and usage patterns across the org
- Reduce friction for engineers adopting these tools while maintaining security and auditability
- Partner with teams across the org to identify, build, and support internal AI applications such as RAG pipelines, agents, and automation workflows
- Evaluate and recommend tooling, frameworks, and patterns based on what teams actually need
- Define where the IaaS team’s responsibility ends and consuming teams’ begins; this boundary doesn’t exist yet, and you’ll help draw it
- Advise on data governance policies for LLM usage, including what data can go into which models, where outputs are stored, and how audit trails are maintained
- Ensure AI infrastructure and tooling meet existing SOC 2 and ISO 27001 controls and can be evidenced in audits
- Provide leadership with clear, regular reporting on AI adoption, cost, risk, and usage across the org
- Stand up and manage AI/ML infrastructure, primarily on GCP (Vertex AI) within OpenVPN’s existing environment
- Design the Terraform modules and IaC patterns for AI infrastructure that follow the team’s existing conventions (e.g., Atlantis-driven GitOps workflows)
- Build visibility into AI/ML infrastructure costs and implement controls (spot instances, auto-scaling policies, idle resource cleanup) consistent with how compute costs are managed elsewhere
- Evaluate build-vs-buy decisions for AI/ML infrastructure components and managed services with an eye toward operational fit within existing patterns
Requirements:
- Hands-on experience standing up and managing AI/ML infrastructure on Vertex AI or comparable platforms (SageMaker, Azure ML)
- Experience setting up AI developer tooling (Cursor, Copilot, etc.) at the org level, including rollout, access, and cost management
- Infrastructure-as-code fluency: Terraform is our primary tool, managed through Atlantis. You should be able to write modules that other teams consume through self-service
- Ability to work across teams and define boundaries for a new capability area that doesn't have established patterns yet
- Can communicate clearly with leadership about cost, risk, and value of AI investments
- Strong GCP experience
- Understanding of security and compliance in a regulated environment (SOC 2, ISO 27001); experience implementing controls or working within audit frameworks
- Platform engineering mindset: thinks about how to make things self-service, auditable, and repeatable rather than one-off
- Experience with model serving infrastructure: managed inference services (Vertex AI endpoints, Bedrock, Azure OpenAI) or self-hosted frameworks, including routing, scaling, monitoring, and cost controls
- Experience with RAG infrastructure components such as vector databases, embedding pipelines, and retrieval systems
- Familiarity with LLM operational patterns including prompt caching, batching strategies, model versioning, and monitoring/observability approaches specific to inference workloads
- Familiarity with Kubernetes/EKS since the rest of the team's infra lives there
- Experience with data infrastructure (BigQuery, Kafka, Cloud Composer)
- Configuration management experience with Ansible or Puppet
- Familiarity with AI/ML managed services on AWS (Bedrock, SageMaker) and how they fit alongside self-managed infrastructure
- Prior experience with an ownership transition or greenfield buildout; knows how to establish patterns from scratch without over-engineering