Welocalize is seeking a Prompt Engineer who will own the end-to-end technical migration workflow for transitioning templates to LLM autoraters. The role involves applying prompt engineering techniques with internal tools to maximize model performance and ensure high-quality outputs.
Responsibilities:
- Utilize Automatic Prompt Generation (APG) tools to create baseline prompts for complex parent-child template clusters
- Run and supervise the Automated Prompt Optimization (APO) tool, review its outputs, and flag when the APO reaches deadlocks or plateaus
- Manually draft, test, and refine prompts to navigate complex template architectures, overcome anti-patterns, and solve edge-case scenarios where tooling is lacking or broken
- Monitor shadowbot runs to ensure that a sufficient number of disagreements between human and LLM ratings are generated, registered, and tracked
- Run prompt versions against established gold data to continuously measure autorater quality against the human crowd baseline, calculating accuracy metrics such as precision, recall, and F1 score
- Draft technical launch readiness justifications (Launch Certification Documentation) for final approval
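For candidates unfamiliar with the evaluation step above, the gold-data comparison amounts to scoring autorater labels against human labels. A minimal sketch in Python (function and label names are hypothetical, not Welocalize tooling):

```python
def binary_metrics(human_labels, llm_labels, positive="violation"):
    """Compare LLM autorater labels against human gold labels and
    compute precision, recall, and F1 for the positive class."""
    tp = fp = fn = 0
    for human, llm in zip(human_labels, llm_labels):
        if llm == positive and human == positive:
            tp += 1          # autorater and human agree on positive
        elif llm == positive:
            fp += 1          # autorater flagged, human did not
        elif human == positive:
            fn += 1          # human flagged, autorater missed it
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical gold set: human crowd baseline vs. autorater output
gold = ["violation", "ok", "violation", "ok", "violation"]
auto = ["violation", "violation", "ok", "ok", "violation"]
print(binary_metrics(gold, auto))
```

In practice these metrics would be tracked per prompt version so that regressions against the human baseline surface before launch certification.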
Requirements:
- Native fluency in English
- Must be based in the United States
- Bachelor's, Master's, or Doctorate degree in Computer Science, Data Science, Computational Linguistics, Human-Computer Interaction (HCI), Cognitive Science, or a related analytical field
- At least 4 years' experience as a Prompt Engineer, with proven experience tuning Large Language Models (LLMs) for strict, structured outputs and complex classification tasks, and familiarity with chain-of-thought and few-shot prompting
- Strong proficiency in identifying error patterns, analyzing model performance, and using SQL or other data analytics tools
- Ability to quickly learn and master proprietary tools with minimal supervision
- Excellent verbal and written communication skills
- Familiarity with enterprise-grade LLM interfaces like the Goose API
- Experience in AI model evaluation, data science, computational linguistics, or software engineering
- Hands-on experience with Automated Prompt Optimization (APO) systems or tuning workflows
- Linguistic expertise, including an understanding of semantics and logic