RibbitZ LLC, a frontier AI research accelerator, is seeking a Senior Software Engineer to work on LLM benchmarking and the evaluation of AI-generated code. The role involves curating code examples, evaluating AI-generated code, and designing verification mechanisms for software engineering tasks.

Responsibilities:

Curate code examples and reference solutions
Evaluate and refine AI-generated code for scalability, reliability, and performance
Build agents that verify code quality and detect error patterns
Design automatic verification mechanisms for SWE tasks
Benchmark LLMs across architecture design, API design, production implementation, debugging, monitoring, and operational maintenance

Senior Software Engineer – LLM Evaluation

Responsibilities: