Design, develop, and maintain robust pipelines to collect, transform, and store data used in model monitoring workflows (e.g., scoring data, performance metrics, outcomes)
Build scalable data architectures to support real-time and batch monitoring, including data ingestion, enrichment, and retention practices
Develop reusable monitoring components (e.g., performance drift detectors, threshold-based alerts, metric repositories) that support various model types and regulatory needs
Integrate data pipelines with model lifecycle platforms, MLOps tools, and observability solutions to ensure seamless model performance tracking
Partner with model risk and compliance teams to ensure data lineage, audit trails, and documentation are preserved and accessible for regulatory reviews (e.g., SR 11-7 compliance)
Collaborate with data scientists, model validators, and product managers to align monitoring data infrastructure with evolving model monitoring requirements
Work closely with the model monitoring and strategy analytics teams within MO&A to ensure the monitoring data infrastructure adapts to changing analytics and monitoring needs
Enable visualization and reporting capabilities through dashboards (e.g., Power BI, Tableau) that summarize model health, stability, and issue alerts
Design and maintain high-performance data pipelines that ingest, transform, and version datasets for Model and Strategy Monitoring
Optimize data storage and compute performance for large-scale monitoring use cases involving high-frequency scoring or model ensembles
Requirements
Bachelor’s degree in a quantitative, technical, or data-focused field (e.g., Statistics, Mathematics, Computer Science, Data Science, Engineering) and 5+ years of relevant work experience, or, in lieu of a degree, 7+ years of relevant work experience in data engineering or related roles in the financial services or regulated analytics domain
Strong proficiency with data engineering tools and frameworks (e.g., Apache Spark, Airflow, Kafka, dbt, PySpark)
Proficient in programming languages such as SAS, Python, and SQL for building monitoring pipelines and validation checks
Experience with cloud-based data infrastructure (e.g., AWS, Azure, GCP) and data warehousing (e.g., Snowflake, Redshift, BigQuery)
Familiarity with MLOps practices, model metadata tracking (e.g., MLflow), and monitoring toolkits (e.g., Evidently AI, WhyLabs, Prometheus)
Understanding of model risk governance requirements and the role of data engineering in ensuring compliant model monitoring
Ability to work in an agile environment and deliver high-quality, production-grade code in collaboration with DevOps and platform engineering teams
Tech Stack
Airflow
Amazon Redshift
AWS
Azure
BigQuery
Google Cloud Platform
Kafka
Prometheus
PySpark
Python
Spark
SQL
Tableau
Benefits
Best-in-class employee benefits and programs that cater to work-life integration and overall well-being