Blankfactor is dedicated to engineering impact, building high-quality tech solutions for fast-moving industries like payments and banking. They are seeking an AI Engineer Lead to bridge the gap between high-level architecture and production-grade execution, transforming designs into scalable code while mentoring a team to deliver high-quality LLM applications.
Responsibilities:
- Convert architecture into implementation: Translate high-level technical designs into actionable patterns, preventing fragmented development and ensuring a unified codebase
- Drive code coherence: Act as the primary owner for code quality, conducting rigorous reviews and establishing "done-done" definitions that include comprehensive testing, evaluation, and logging
- Set the standard: Develop reference implementations and templates that the team can leverage to accelerate delivery without sacrificing quality
- Master LangGraph workflows: Establish team-wide patterns for state strategy, routing, and subgraphs to ensure consistency across complex agentic graphs
- Engineer for resilience: Implement specialist-level reliability patterns, including retries, fallbacks, idempotency, and safe degradation to handle the inherent unpredictability of LLMs
- Manage prompt and version discipline: Maintain strict version control over prompts and model configurations to ensure reproducible and predictable outputs
- Own the feedback loop: Design and maintain offline evaluation suites and regression gates to measure quality before deployment
- Monitor runtime health: Establish observability signals for latency, failure rates, and hallucination indicators to ensure production stability
- Determine optimization paths: Personally evaluate when to use fine-tuning (SFT/PEFT) versus RAG or prompt engineering to achieve the best performance-to-cost ratio
- Partner with stakeholders: Work closely with architects and product teams to resolve technical ambiguity and move features through the lifecycle at pace
- Raise the bar: Pair with and mentor team members through active code reviews and collaborative problem-solving to improve the overall engineering maturity of the group
Requirements:
- Total Experience: 7+ years in software engineering, including at least 5 years specifically in ML/AI engineering
- GenAI Expertise: 2–4 years of hands-on delivery with LLMs and agentic systems
- Deep LangGraph Proficiency: Practical leadership experience building and reviewing complex graphs, state management, and tool-calling patterns
- Fine-tuning: Proven experience executing at least one end-to-end fine-tuning run (SFT or PEFT/LoRA)
- Reliability Engineering: Expert knowledge of timeouts, regression gates, and error handling in non-deterministic AI environments
- Model Context Protocol (MCP): Experience designing or consuming MCP tools/servers and managing safe tool-access patterns
- Regulated Domain Expertise: Knowledge of building for environments with strict PII boundaries, audit logging, and traceability requirements