Crossing Hurdles is seeking a Software Engineer to evaluate AI-generated responses to coding queries for accuracy and clarity. The role involves fact-checking, annotating model responses, and assessing code quality against software engineering best practices.
Responsibilities:
- Evaluate AI-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness
- Fact-check responses against trusted sources and validate code outputs by executing the code
- Annotate model responses by identifying strengths, gaps, and inaccuracies
- Assess code quality, readability, and algorithmic soundness
- Ensure responses align with expected conversational behavior and engineering best practices
- Apply consistent evaluation standards using defined taxonomies, benchmarks, and guidelines
Requirements:
- Strong experience in software engineering, data science, or systems design
- Strong academic background in computer science or a related technical field
- Strong experience in at least two programming languages such as Python, Java, C++, JavaScript, Go, Rust, or similar
- Strong problem-solving skills with the ability to handle complex coding challenges
- High attention to detail with the ability to identify logical flaws and inefficiencies
- Ability to explain complex technical concepts clearly
- Experience contributing to open-source projects is a plus
- Familiarity with AI coding tools and an understanding of their strengths and limitations
- Strong experience in model evaluation, data annotation, or reinforcement learning from human feedback (RLHF) is a plus