Role Overview

Develop and implement advanced analytics on top of noisy, heterogeneous RiskOS data to understand user behavior, product usage, fraud patterns, and workflow effectiveness; translate findings into concrete product and risk strategy improvements.
Architect and build scalable data pipelines and production ML workflows, collaborating with data engineering to ensure robust, reliable, and efficient data processing for both batch and streaming use cases.
Lead the design, execution, and analysis of experimentation frameworks to optimize user journeys, feature adoption, and workflow performance across the RiskOS platform.
Lead the creation and evaluation of Generative AI solutions (LLMs, agents, prompt‑based tools) that automate analytics, power case review and investigation assistants, streamline documentation, and enhance RiskOS workflows and reporting.
Define rigorous evaluation frameworks for GenAI solutions, including offline benchmarks, human‑in‑the‑loop review, safety and hallucination checks, and impact measurement in production.
Partner with platform and engineering teams to define and build core RiskOS data science infrastructure, including feature stores, model‑serving APIs, evaluation services, and monitoring frameworks for both traditional ML and GenAI systems.
Own end‑to‑end deployment of production‑grade solutions: packaging models and GenAI workflows, integrating with RiskOS services, establishing SLAs, and instrumenting telemetry, alerting, and feedback loops.
Develop and automate tools for model evaluation, stress testing, backtesting, and adversarial scenario simulation to ensure robustness and operational resilience, especially in high‑risk fraud and compliance contexts.
Enable product and risk teams through self‑serve analytics and tools: build dashboards, template analyses, and GenAI‑driven assistants that help non‑technical users explore RiskOS data, tune workflows, and debug decisions.
Collaborate cross‑functionally with product, engineering, risk, solution consulting, and customer‑facing teams to translate business requirements into data‑driven solutions and actionable insights, particularly for fraud and risk use cases on RiskOS.
Mentor and provide technical guidance to other data scientists and analysts, demonstrating best practices in experimentation, software engineering hygiene, GenAI safety, and rigorous model evaluation.
Ensure all solutions adhere to best practices in data privacy, security, and compliance, especially when handling sensitive PII and financial data in regulated fintech and public‑sector environments.
Contribute to company‑wide standards for ML and GenAI explainability, risk evaluation, feature logging, and documentation, helping raise the bar for AI quality across Socure.
Communicate complex technical concepts and findings clearly to both technical and non‑technical stakeholders, including executive leadership and external partners.

Requirements

Master’s or PhD in Computer Science, Machine Learning, Statistics, Engineering, or a related quantitative field, or equivalent professional experience.
6+ years of hands‑on experience in data science, machine learning, or high‑scale data engineering roles, with a proven track record in fraud prevention, risk analytics, or complex decisioning systems.
Strong experience applying Generative AI in production or near‑production contexts, including:
  Building and evaluating LLM‑based applications or agents (e.g., retrieval‑augmented generation, workflow assistants, data‑insight copilots).
  Prompt design and optimization, safety and guardrail techniques, and quantitative/qualitative evaluation of LLM outputs.
Deep proficiency in Python and SQL, with hands‑on experience using ML frameworks such as scikit‑learn, XGBoost, TensorFlow, or PyTorch, plus modern GenAI/LLM tooling (e.g., OpenAI/Anthropic APIs, Hugging Face ecosystems, orchestration frameworks).
Demonstrated experience building and maintaining scalable data pipelines and deploying ML models in production environments, ideally involving streaming or near‑real‑time data and modern data platforms (e.g., Databricks, Spark, PySpark, BigQuery, or similar).
Solid understanding of data engineering concepts, including ETL, data warehousing, schema design, and distributed computing.
Experience with platform‑oriented data science: working with feature stores, model‑serving infrastructure, CI/CD for ML, automated monitoring, and feedback collection workflows.
Hands‑on experience wrangling messy, high‑volume datasets: designing robust cleaning, normalization, and quality‑control processes; reasoning under missing or biased data; and building reusable data abstractions for other users.
Familiarity with privacy‑preserving ML techniques, secure data handling, and regulatory requirements in fintech, credit, or public‑sector environments is strongly preferred.
Proven ability to collaborate effectively in cross‑functional, fast‑paced teams; strong communication skills with comfort presenting trade‑offs and recommendations to senior stakeholders.
Product‑minded and outcome‑oriented: you care about how models and GenAI tools are used, how they shape user experience and risk posture, and how to measure their real‑world impact.

Tech Stack

BigQuery
ETL
PySpark
Python
PyTorch
Spark
SQL
TensorFlow

Benefits

Offers Equity
Offers Bonus

Staff Data Scientist – RiskOS
