Harrison Clarke is a high-growth AI startup seeking AI Infrastructure Engineers to design and scale the cloud-native foundation powering its AI platform. The role centers on building robust, cost-efficient infrastructure for large-scale, real-time AI workloads, with end-to-end ownership of the infrastructure stack.
Responsibilities:
- Architect and scale infrastructure for low-latency, high-throughput AI inference
- Orchestrate GPUs and manage multi-tenant workloads using Kubernetes and service mesh technologies
- Build and operate core systems including infrastructure-as-code, observability, distributed storage, and networking
- Implement cross-platform capabilities such as authentication, rate limiting, monitoring, and telemetry
- Define the infrastructure roadmap and drive trade-offs across performance, reliability, and cost
- Collaborate closely with ML engineers to productionize and optimize model serving pipelines
Requirements:
- Strong background in infrastructure engineering, DevOps, or ML platform engineering
- Deep expertise in Kubernetes at scale, GPU orchestration, and cloud-native automation
- Experience designing highly available, globally distributed systems and global traffic routing
- Proficiency with infrastructure-as-code, CI/CD pipelines, and observability tooling
- Solid understanding of distributed systems, performance optimization, and cost management
- Experience with ML inference frameworks (e.g., Triton, ONNX Runtime, vLLM, TensorRT)
- Strong grasp of cloud security and data management for production AI workloads
- Entrepreneurial mindset with the ability to operate autonomously in fast-paced environments