Grafana Labs is a remote-first, open-source powerhouse whose visualization tool has over 20M users. They are seeking a Senior AI Engineer to build observability tools that leverage AI to help users manage complex data and improve their systems.
Responsibilities:
- Build and deliver AI solutions: Take ownership of developing high-performance AI features to help users detect, triage, and resolve incidents using observability data and tools
- Rapid experimentation and iteration: Implement a highly iterative process where you quickly prototype, test, and validate with real users, including shipping and evolving LLM- or agent-powered workflows for incident lifecycle management and automated analysis tasks
- Collaborate cross-functionally: Work with data analysts, product managers, and designers to shape AI-driven product features, including integration of agentic components with internal tools, alerting systems, runbooks, and developer workflows
- Utilize AI tools effectively: Use AI and automation tools to enhance both product functionality and your own development workflows
- Effective communication: You'll work in a highly dynamic, collaborative environment, so we need someone who communicates clearly and contributes across teams
- Ownership and impact: Take full ownership of the AI solutions you develop, ensuring they are not only innovative but also scalable, maintainable, and aligned with real user workflows
Requirements:
- Experience with LLMs, prompt engineering, and building applications powered by GenAI
- Proven track record of delivering software that reached production and is in active use
- Exposure to working in cloud-native environments (e.g., AWS, GCP, Azure)
- Experience using observability tools to understand and troubleshoot system behavior
- Strong engineering skills: Solid experience building production software systems (backend and/or full stack)
- AI experience with a practical mindset: You're familiar with AI technologies and frameworks, and you focus on delivering high-quality solutions that work in the real world, not just in theory
- Quick iteration and experimentation: You're comfortable releasing prototypes, collecting feedback, and iterating with a pragmatic mindset
- Proven initiative: You take ownership and drive projects forward, pushing boundaries to find the most impactful solutions
- Collaborative attitude: You communicate effectively with peers, product managers, and designers
- Experience building or working with agent frameworks or multi-agent workflows
- Experience with infrastructure and DevOps tooling for deployments: Kubernetes, Docker, Terraform, or similar
- Familiarity with model fine-tuning techniques
- Experience building observability tooling