Responsibilities
Contribute to the full data & model lifecycle: problem framing, EDA, feature engineering, modeling, evaluation, deployment, and monitoring
Develop production solutions in Python (pandas, PySpark) on Databricks (Delta Lake, Unity Catalog, MLflow, Workflows/Jobs); apply CI/CD and reproducible experiments
Employ automated testing for code, data, and models (unit tests, data validation, drift/bias checks); ensure repeatable pipelines
Identify and address data/model technical debt; improve schemas, data contracts, performance, and cost
Deliver well‑defined work items; clarify requirements through thoughtful questions and proposals
Mentor data scientists and engineers; lead code/design reviews and foster a collaborative culture
Own the quality, maintainability, security, governance, and total cost of ownership of delivered solutions
Help shape elements of the data/ML platform strategy and drive execution within your domain
Balance strategic and pragmatic concerns; proactively surface risks and resolve issues
Requirements
BS/MS in a quantitative field (CS, Statistics, Applied Math, Engineering, Economics) or equivalent
5+ years of experience in data science/ML
Experience with automated pipelines, testing for data/models, and model lifecycle management (registry, versioning, monitoring)
Solid understanding of cloud‑native architectures (AWS)
Excellent communication and collaboration skills; adept at asking incisive questions and aligning diverse stakeholders
Experience with AI/GenAI (LLMs, RAG, prompt design/evaluation) (Preferred)
Strong Python and Spark expertise; hands‑on Databricks experience (Delta Lake, Unity Catalog, MLflow, Jobs/Workflows) (Preferred)