Grafana Labs is a remote-first, open-source powerhouse with a user base of more than 20 million. The company is seeking a Staff AI Engineer to build high-performance AI features that enhance its observability tools, enabling users to detect and resolve incidents effectively through automation and collaboration.
Responsibilities:
- Build and deliver AI solutions: Take ownership of developing high-performance AI features to help users detect, triage, and resolve incidents using observability data and tools
- Rapid experimentation and iteration: Implement a highly iterative process where you quickly prototype, test, and validate with real users, including shipping and evolving LLM- or agent-powered workflows for incident lifecycle management and automated analysis tasks
- Collaborate cross-functionally: Work with data analysts, product managers, and designers to shape AI-driven product features, including integration of agentic components with internal tools, alerting systems, runbooks, and developer workflows
- Utilize AI tools effectively: Use AI and automation tools to enhance both product functionality and your own development workflows
- Effective communication: You’ll be working in a highly dynamic and collaborative environment, so we need someone who can communicate effectively and contribute across teams
- Ownership and impact: Take full ownership of the AI solutions you develop, ensuring they are not only innovative but also scalable, maintainable, and aligned with real user workflows
Requirements:
- Experience with LLMs, prompt engineering, and building applications powered by GenAI
- Proven track record of shipping software to production that is actively used
- Exposure to working in cloud-native environments (e.g., AWS, GCP, Azure)
- Experience using observability tools to understand and troubleshoot system behavior
- Experience building or working with agent frameworks or multi-agent workflows
- Experience with infrastructure/DevOps tooling such as Kubernetes, Docker, and Terraform for deployments
- Familiarity with model fine-tuning techniques
- Experience building observability tooling