Alignerr is seeking experienced software engineers to evaluate and improve frontier AI models. The role involves applying engineering expertise to identify failure modes, assess model limitations, and provide structured feedback on AI-generated code.
Responsibilities:
- Evaluate the performance of frontier language models on complex software engineering tasks
- Identify bugs, logical errors, hallucinations, and reliability issues in model outputs
- Design and review prompts, test cases, and evaluation scenarios for advanced coding workflows
- Provide precise written feedback explaining model strengths, weaknesses, and edge cases
- Work across multiple languages and codebases to assess generalization and correctness
Requirements:
- 3–4+ years of professional software engineering experience
- Strong proficiency in at least one of: TypeScript, Ruby, Java, or C++
- Excellent written and spoken English
- Demonstrated ability to reason about complex systems and debug non-obvious issues
- Familiarity with modern developer and AI/LLM tooling (Git, CLI workflows, testing frameworks, etc.)
- Ability to critically evaluate model behavior rather than simply accept model outputs at face value