Crossing Hurdles is seeking a Software Engineering & Systems Design Expert to evaluate LLM-generated responses to software engineering queries. The role involves validating technical correctness, assessing code quality, and providing structured feedback on model outputs.

Responsibilities:

Evaluate LLM-generated responses to software engineering and coding queries for accuracy, reasoning, clarity, and completeness
Validate technical correctness through fact-checking and by executing code where appropriate
Assess code quality, readability, algorithmic efficiency, and adherence to engineering best practices
Identify logical errors, edge cases, inefficiencies, and misleading explanations in model outputs
Annotate responses with structured feedback highlighting strengths, gaps, and conceptual or factual inaccuracies
Ensure outputs align with expected conversational behavior and system guidelines
Apply consistent evaluation standards using defined taxonomies, benchmarks, and evaluation frameworks

Requirements:

Bachelor's, Master's, or PhD degree in Computer Science or a closely related field
Strong real-world experience in software engineering or related technical roles
Expertise in at least one major programming language (e.g., Python, Java, C++, JavaScript, Go, Rust)
Ability to independently solve Medium- to Hard-level HackerRank or LeetCode problems
Experience contributing to well-known open-source projects, including merged pull requests
Hands-on experience using LLMs during software development and an understanding of their limitations
Strong attention to detail and ability to evaluate complex technical reasoning and subtle bugs

Software Engineer | $80/hr | Remote
