Alignerr is seeking experienced software engineers to evaluate and improve the performance of frontier AI models. The role involves assessing AI-generated code, identifying bugs, and providing expert feedback on model performance.
Responsibilities:
- Evaluate the performance of frontier language models on complex software engineering tasks
- Identify bugs, logical errors, hallucinations, and reliability issues in model outputs
- Design and review prompts, test cases, and evaluation scenarios for advanced coding workflows
- Provide precise written feedback explaining model strengths, weaknesses, and edge cases
- Work across multiple languages and codebases to assess generalization and correctness
Requirements:
- 3+ years of professional software engineering experience
- Strong proficiency in at least one of TypeScript, Ruby, Java, or C++
- Excellent written and spoken English
- Demonstrated ability to reason about complex systems and debug non-obvious issues
- Familiarity with the tooling used in modern AI / LLM-assisted development (Git, CLI workflows, testing frameworks, etc.)
- Ability to critically evaluate model behavior rather than simply use model outputs