Arcee AI is seeking a Compute Infrastructure Specialist to manage and scale the infrastructure that powers its AI workloads and customer deployments. This hands-on role involves coordinating across teams to ensure efficient GPU resource allocation and smooth deployments, and improving customers' experience with the infrastructure.
Responsibilities:
- Manage and track GPU/compute inventory across internal and customer environments
- Coordinate infrastructure provisioning for customer deployments and internal research workloads
- Monitor utilization, capacity, uptime, and cost efficiency across compute environments
- Work cross-functionally with Engineering, Research, Product, and GTM teams on deployment readiness and customer needs
- Support customer onboarding and infrastructure troubleshooting alongside Solutions and Customer Success teams
- Maintain documentation of infrastructure processes, environments, and deployment standards
- Help improve operational workflows around provisioning, monitoring, escalation management, and forecasting
- Partner with vendors and cloud providers as needed
- Assist with infrastructure planning to meet growing customer demand and support new product launches