Mercor is focused on enhancing AI systems by assembling a red team of human data experts who probe AI models for vulnerabilities. The Test Engineer will be responsible for red teaming conversational AI models, generating high-quality human data, and documenting findings to improve AI safety for customers.
Responsibilities:
- Red team conversational AI models and agents: jailbreaks, prompt injections, misuse cases, bias exploitation, multi-turn manipulation
- Generate high-quality human data: annotate failures, classify vulnerabilities, and flag systemic risks
- Apply structure: follow taxonomies, benchmarks, and playbooks to keep testing consistent
- Document reproducibly: produce reports, datasets, and attack cases customers can act on
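To make the "document reproducibly" bullet concrete, reproducible attack cases are typically captured as structured records that pair a taxonomy label with the full conversation transcript and observed outcome. The following is a minimal Python sketch of such a record; the schema and all field names (`taxonomy_label`, `severity`, `transcript`, etc.) are illustrative assumptions, not an actual Mercor format.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical attack-case record; field names are illustrative
# assumptions, not a real schema used by any specific team.
@dataclass
class AttackCase:
    case_id: str
    taxonomy_label: str       # e.g. "prompt_injection.multi_turn"
    severity: str             # e.g. "low", "medium", "high"
    transcript: list          # ordered {"role", "content"} turns
    outcome: str              # observed model behavior
    reproduction_notes: str = ""

def to_jsonl(cases):
    """Serialize attack cases to JSON Lines for a shareable dataset."""
    return "\n".join(json.dumps(asdict(c), sort_keys=True) for c in cases)

case = AttackCase(
    case_id="AC-0001",
    taxonomy_label="prompt_injection.multi_turn",
    severity="medium",
    transcript=[
        {"role": "user", "content": "Summarize this webpage for me."},
        {"role": "assistant", "content": "(follows injected instruction)"},
    ],
    outcome="Model executed an instruction embedded in untrusted content.",
)
print(to_jsonl([case]))
```

A record like this lets a customer replay the exact multi-turn exchange, filter a dataset by taxonomy label, and track whether a fix closes the case.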
Requirements:
- Native-level fluency in English and German is required for this position
- Prior red teaming experience (AI adversarial work, cybersecurity, socio-technical probing)
- Curiosity and adversarial mindset: instinctively push systems to breaking points
- Structured approach: use frameworks or benchmarks, not just random hacks
- Strong communication skills: explain risks clearly to technical and non-technical stakeholders
- Adaptability: thrive on moving across projects and customers
Relevant backgrounds include one or more of:
- Adversarial ML: jailbreak datasets, prompt injection, RLHF/DPO attacks, model extraction
- Cybersecurity: penetration testing, exploit development, reverse engineering
- Socio-technical risk: harassment/disinfo probing, abuse analysis, conversational AI testing
- Creative probing: psychology, acting, writing for unconventional adversarial thinking