MLabs is the fastest-growing online cybersecurity training platform, serving millions of users and global businesses. They are seeking an AI Engineer to develop a fully autonomous AI pentesting agent that can handle complex offensive security tasks with precision.
Responsibilities:
- Contribute to the design, development, and optimization of an autonomous AI pentesting agent, focusing on core logic and decision paths
- Implement agent functions such as reasoning, complex planning, tool orchestration, and structured memory
- Build and maintain secure environments to execute, test, and benchmark agent behaviors against offensive security scenarios
- Assist in evaluating and comparing various Large Language Models (including Claude, OpenAI's GPT models, Mistral, and Llama) to find the best fit for specific agent tasks
- Build UI components and dashboards using React and support browser automation workflows using Playwright for agent evaluation
- Support the iterative improvement of the agent through experimentation, observability, and rigorous lab testing
- Work closely with offensive security researchers to align agent behaviors with real-world attacker workflows and vulnerability exploitation methodologies
Requirements:
- 2+ years of software development experience with a high level of proficiency in Python
- Proven experience building AI agents utilizing frameworks such as LangChain, CrewAI, or similar SDKs
- Hands-on experience with reasoning patterns, tool orchestration, memory management, and structured outputs
- Proficiency in prompt engineering, Retrieval-Augmented Generation (RAG), chain-of-thought prompting, and few-shot prompting
- Experience with SQL/NoSQL databases, data modeling, Docker, AWS, cloud deployment, and shell scripting
- Experience using React for developing frontends and analytical dashboards
- A demonstrable interest in cybersecurity; while deep expertise is not required, curiosity and a passion for the field are essential
- Our client is currently unable to provide visa sponsorship for this position
Nice to have:
- Familiarity with the OWASP Top 10 vulnerabilities
- Experience in model training and fine-tuning (e.g., LoRA, PEFT) and evaluation
- Practical cybersecurity expertise in pentesting methodologies or CTF platforms
- Experience using Playwright for browser automation in the context of agent workflows