Keystone Recruitment is seeking a Software Engineer (Systems Design) to work with a leading AI research organization focused on developing advanced conversational AI systems. The role involves evaluating and improving the reasoning capabilities of large language models in software engineering and coding tasks.
Responsibilities:
- Evaluate AI-generated responses to software engineering and coding queries for correctness, clarity, and completeness
- Execute and test code to validate functionality, performance, and edge-case handling
- Perform fact-checking using authoritative technical references and public sources
- Annotate model outputs by identifying strengths, weaknesses, bugs, and conceptual gaps
- Assess code quality, readability, algorithmic soundness, and explanation quality
- Ensure outputs align with established conversational and technical guidelines
- Apply standardized evaluation rubrics and benchmarks consistently
Requirements:
- Bachelor's, Master's, or PhD in Computer Science or a closely related field
- Significant professional experience in software engineering or systems design
- Expert-level proficiency in at least one major programming language (e.g., Python, Java, C++, JavaScript, Go, Rust)
- Ability to independently solve medium-to-hard algorithmic problems
- Experience contributing to open-source projects with accepted pull requests
- Strong familiarity with using LLMs for coding and a clear understanding of their limitations
- Exceptional attention to detail and ability to detect subtle technical errors
- Prior experience with RLHF, model evaluation, or technical data annotation
- Background in competitive programming or algorithmic problem solving
- Experience reviewing or maintaining production-level code
- Familiarity with multiple programming paradigms and technology stacks
- Ability to explain complex technical topics to non-technical audiences