24-MAG is offering a specialized part-time consulting opportunity for PhD-level engineers to support the development and evaluation of advanced AI systems. The role involves evaluating AI-generated responses for technical accuracy and improving the reasoning capabilities of AI systems in engineering contexts.
Responsibilities:
- Write and refine prompts that guide AI models in engineering-related scenarios
- Evaluate AI-generated responses for technical accuracy and applied reasoning
- Verify engineering claims using domain expertise and authoritative sources
- Annotate responses by identifying strengths, gaps, and conceptual inaccuracies
- Assess clarity, structure, and appropriateness of explanations for different audiences
- Apply structured evaluation guidelines and benchmarking standards
- Ensure model responses align with expected system behavior and engineering logic