NumPy, pandas, Python, scikit-learn, SQL, R, AI, Machine Learning, Natural Language Processing (NLP), Large Language Models, MLOps, XGBoost, LightGBM, Data Engineering, Git, Version Control, Communication
About this role
Role Overview
Design, develop, and optimize prompts for Large Language Models (LLMs)
Collaborate with data scientists to define, test, and optimize prompts that guide our AI systems toward accurate, informative, and creative outputs
Create and improve AI models and algorithms, and maintain prompt libraries for natural language processing (NLP) applications
Stay abreast of the most recent developments in large language models
Develop, validate, and maintain supervised machine learning models including linear and logistic regression, decision trees, random forests, gradient boosting methods (XGBoost, LightGBM), and support vector machines
Apply sound practices around feature engineering, hyperparameter tuning, cross-validation, and model selection to ensure robust, generalizable solutions
Partner with data engineering teams to source, clean, and transform structured and semi-structured datasets
Conduct thorough exploratory data analysis to surface patterns, anomalies, and opportunities that inform both modeling strategy and business decisions
Apply appropriate evaluation metrics (such as RMSE, MAE, AUC-ROC, precision-recall, F1, R², and lift curves) to assess model performance in context
Leverage model explainability techniques (e.g., SHAP values, partial dependence plots) to communicate findings clearly to both technical and non-technical audiences
Collaborate with engineering and MLOps teams to package and deploy models into production environments
Establish monitoring frameworks to track model drift, data quality issues, and performance degradation over time, and lead remediation efforts when needed
Translate complex analytical findings into clear, compelling narratives for business stakeholders
Contribute to project scoping discussions, help define success metrics, and proactively surface risks or limitations in proposed analytical approaches
Contribute to the team's collective growth by participating in code reviews, documenting your work thoroughly, and sharing learnings through internal presentations or knowledge repositories
Requirements
3 to 5 years of hands-on experience in data science or a closely related quantitative role
Strong proficiency in Python, including libraries such as scikit-learn, pandas, NumPy, and matplotlib
Experience with Domino Data Lab is a plus
Demonstrated experience building and deploying regression and classification models in a business context
Solid understanding of statistical fundamentals including probability, hypothesis testing, and model assumptions
Experience working with SQL for data extraction and transformation
Familiarity with version control using Git and collaborative development practices
Strong written and verbal communication skills with the ability to present technical work to diverse audiences