The Voleon Group is a technology company that specializes in applying AI and machine learning techniques to finance. They are seeking a Data Scientist for their Feature Engineering team to turn complex datasets into predictive signals for machine learning models. The role spans the full process, from data sourcing through feature construction and validation.

Responsibilities:

Explore, profile, and curate complex and often messy datasets from third-party vendors and internal sources, developing a deep understanding of what each dataset can and cannot tell us
Harness financial intuition, academic research, and statistical rigor to inform the design and implementation of predictive features in a collaborative setting
Validate features through a disciplined, test-driven framework — including cross-sectional analysis, stationarity testing, and point-in-time correctness — to ensure signals are real and not artifacts of data issues
Build and maintain data pipelines that bring features from prototype to production, with monitoring for data health and correctness along the way
Communicate your findings clearly — both the signal you've found and the story of how the data produces it — to researchers and leadership
Proactively investigate anomalies in data feeds and production behavior, performing root-cause analysis and surfacing issues to relevant stakeholders
Leverage AI tools to accelerate exploration, coding, and analysis — and share what you learn about effective workflows with the team

Requirements:

2 years of applied industry experience (including internships) working end-to-end with complex datasets: curation, querying, aggregation, exploratory analysis, and visualization
Experience using statistical methods to analyze data, identify patterns, conduct root-cause analysis, and translate findings into actionable insights
Ability to frame and answer questions mathematically
Ability to infer useful forward-looking directions from the results of retrospective analysis
Fluency in managing, processing, and visualizing tabular data using SQL and Python (Pandas or Polars)
Basic software development skills and experience with bash, Linux/Unix, and git
Ability to refine ambiguous requests into well-scoped analyses and communicate results with clarity and precision
Bachelor's degree in a quantitative discipline (statistics, data science, computer science, economics, physics, or a related field)
Master's degree in a quantitative discipline
Prior industry experience or demonstrated interest in finance — academic projects, coursework in financial engineering, or industry internships
Familiarity with financial datasets such as Compustat, IBES, or similar vendor data
Experience developing in a production-facing environment with standard tooling (CI/CD, git, workflow orchestration)
Hands-on experience with AI coding assistants or LLM-based tools in a data science or engineering workflow
A track record of curiosity-driven exploration — side projects, Kaggle competitions, research papers, or anything that shows you can't leave an interesting dataset alone

