Great Value Hiring is seeking a Grafana Expert to design expert-level evaluation tasks for AI agents. The role involves creating realistic workflows, implementing grading systems, and reviewing AI performance to ensure accurate task execution.
Responsibilities:
- Design realistic, multi-step Grafana workflows: dashboards, alerting rules, data source configuration, panel setup, and cross-module operations
- Perform each workflow yourself on a hosted Grafana instance to produce a reference trajectory
- Write clear, specific task prompts with measurable outcomes that can be verified programmatically
- Implement programmatic graders that check whether each instruction was completed correctly
- Review AI agent attempts at your tasks, identify where and why they fail, and tag root causes
- Calibrate task difficulty so tasks are challenging but solvable, iterating on prompts and constraints based on model performance
Requirements:
- 2+ years of daily, professional Grafana experience (SRE, Platform Engineering, Observability, or similar)
- Deep familiarity with PromQL, dashboard templating, alerting pipelines, and data source configuration (Prometheus, InfluxDB, etc.)
- Ability to describe workflows precisely enough that their outcomes can be verified programmatically
- Comfort writing basic grading scripts (Python; engineering support provided as needed)
- Experience with Grafana API automation
- Kubernetes/infrastructure monitoring background
- Familiarity with AI evaluation or benchmarking