NextGen Coding Company is hiring a Senior AI / LLM Full-Stack Engineer to design, build, and deploy production-grade AI systems for enterprise and regulated clients. The role focuses on embedding large language models into existing applications, building robust RAG pipelines, and delivering reliable, cost-efficient AI-powered features at scale.
Responsibilities:
- Embed large language models into existing enterprise applications and workflows
- Design, build, and maintain Retrieval-Augmented Generation (RAG) pipelines
- Integrate AI services with internal APIs, databases, and data platforms
- Develop backend services and supporting infrastructure for AI-driven features
- Implement light frontend components as needed to support AI functionality
- Optimize prompt quality, system latency, and inference cost across deployments
- Collaborate closely with client engineers, architects, and product owners
- Support production monitoring, debugging, and iterative improvement of AI systems
Requirements:
- 5+ years of professional software engineering experience
- 2+ years of hands-on AI / LLM work in production environments (3+ preferred)
- Strong proficiency in Python
- Experience working with LLMs such as GPT-4, Claude, or equivalent models
- Practical experience building RAG systems using LangChain or similar frameworks
- Experience with vector databases such as Pinecone, FAISS, or Weaviate
- Backend development experience with FastAPI, Flask, or Node.js
- Frontend familiarity with React or equivalent frameworks (not design-heavy)
- Cloud experience with AWS or Azure
- Full availability during Eastern Time Zone business hours
- Experience supporting enterprise or regulated clients (strongly preferred)