Fastino Labs develops specialized, efficient AI and is seeking an AI Engineer to design and deploy high-performance agentic systems. The role involves collaborating with engineering teams to improve model performance and keeping inference pipelines stable for enterprise customers.
Responsibilities:
- Innovate at the edge of efficiency by designing and deploying high-performance agentic systems that leverage Fastino’s optimized model architectures to outperform traditional LLMs on benchmarks
- Bridge the gap between research and production by collaborating with engineering teams to turn novel architectural breakthroughs into scalable, low-latency solutions for enterprise customers
- Drive rapid, iterative prototyping of AI functionalities, refining model performance and task accuracy based on real-world telemetry to ensure specialized models meet rigorous developer standards
- Own the stability and throughput of inference pipelines, proactively solving scalability bottlenecks to ensure models deliver consistent, reliable performance under massive operational loads
- Architect large-scale data and fine-tuning strategies to continuously improve the precision and domain-specific reliability of the Fastino models
Requirements:
- 2+ years of hands-on experience in AI/ML engineering roles
- Demonstrated proficiency with LLMs and a track record of applying AI/ML techniques to solve complex, unstructured problems
- Comfort working across the stack, from prompt engineering and vector DB tuning to Kubernetes deployment and API design
- Experience building microservices that handle high-concurrency agentic workloads
- Familiarity with GLiNER or other information extraction architectures