Own the full LLM pipeline, from data preparation to real-world production use.
Design, iterate on, and optimize prompts (zero-/few-shot, chain-of-thought, tool-calling, etc.) to maximize model utility and safety across products and languages.
Build and maintain Retrieval-Augmented Generation (RAG) QA/search systems that connect to multi-source knowledge bases.
Deploy and operate LLM services in multi-GPU or cluster environments using inference frameworks such as vLLM and SGLang.
Design, implement, and operate multi-agent LLM architectures (e.g., LangGraph, CrewAI, AutoGen), including task decomposition, agent orchestration, memory sharing, and tool-calling workflows.
Develop evaluation pipelines (automatic metrics & human feedback) to measure prompt and model quality, bias, and hallucination rates.
Collaborate with product and CS teams to integrate AI models into conversational chatbots across different scenarios.
Track cutting-edge research, author tech blogs, and continuously improve the current architecture.
Requirements
Master’s degree or higher in Computer Science, Data Science, or a related field.
At least 2 years of deep-learning/NLP experience, including 1+ years of practical LLM work (SFT, DPO, RAG, quantization, inference optimization, etc.).
Practical experience building and deploying multi-agent LLM workflows, with an understanding of agent-orchestrator patterns, shared memory, long-horizon planning, and guardrail design.
Proficient in both English and Chinese communication for efficient cross-team collaboration.
Benefits
Competitive salary and company benefits
Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team)