Crossing Hurdles is seeking a Software Engineering & Systems Design Expert to evaluate LLM-generated responses to software engineering queries. The role involves validating technical correctness, assessing code quality, and providing structured feedback on model outputs.
Responsibilities:
- Evaluate LLM-generated responses to software engineering and coding queries for accuracy, reasoning, clarity, and completeness
- Validate technical correctness by fact-checking claims and, where appropriate, executing code
- Assess code quality, readability, algorithmic efficiency, and adherence to engineering best practices
- Identify logical errors, edge cases, inefficiencies, and misleading explanations in model outputs
- Annotate responses with structured feedback highlighting strengths, gaps, and conceptual or factual inaccuracies
- Ensure outputs align with expected conversational behavior and system guidelines
- Apply consistent evaluation standards using defined taxonomies, benchmarks, and evaluation frameworks
Requirements:
- Bachelor's, Master's, or PhD degree in Computer Science or a closely related field
- Strong real-world experience in software engineering or related technical roles
- Expertise in at least one major programming language (e.g., Python, Java, C++, JavaScript, Go, Rust)
- Ability to independently solve Medium- to Hard-level HackerRank or LeetCode problems
- Experience contributing to well-known open-source projects, including merged pull requests
- Hands-on experience using LLMs during software development, with a clear understanding of their limitations
- Strong attention to detail and ability to evaluate complex technical reasoning and subtle bugs