Crossing Hurdles is seeking a Software Engineer to evaluate AI-generated responses to coding queries for accuracy and clarity. The role involves fact-checking, annotating model responses, and assessing code quality against software engineering best practices.

Responsibilities:

Evaluate AI-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness
Conduct fact-checking using trusted sources and validate outputs by executing code
Annotate model responses by identifying strengths, gaps, and inaccuracies
Assess code quality, readability, and algorithmic soundness
Ensure responses align with expected conversational behavior and engineering best practices
Apply consistent evaluation standards using defined taxonomies, benchmarks, and guidelines

Requirements:

Strong experience in software engineering, data science, or systems design
Strong academic background in computer science or a related technical field
Strong experience in at least two programming languages such as Python, Java, C++, JavaScript, Go, Rust, or similar
Strong problem-solving skills with the ability to handle complex coding challenges
High attention to detail with the ability to identify logical flaws and inefficiencies
Ability to explain complex technical concepts clearly
Experience contributing to open-source projects is a plus
Familiarity with AI coding tools and an understanding of their strengths and limitations
Strong experience in model evaluation, data annotation, or RLHF is a plus

Software Engineer | Remote
