Block builds simple, powerful tools that make progress towards an economy that’s truly open to all. The Senior Machine Learning Engineer in Model Risk Management will lead the evaluation of models to ensure they are sound enough for customer and regulatory use, while also developing tools for validating AI outputs.
Responsibilities:
- Independently challenge model owners across lending, fraud, and AML: reproduce their results, set and defend the acceptance thresholds, and own the call on whether a model is sound
- Hunt the silent errors that make metrics lie, and prove them out before they reach production
- Choose evaluation methods that hold up under real conditions: rare events, shifting populations, and drift that only shows up after launch
- Work hands-on in codebases you did not write, learning their data, configs, and conventions, and ship production code in the tooling you build to validate the models
- Build the agentic validation tooling the team depends on, orchestrating agents that run in parallel
- Reason about ML systems end to end (how features, training, serving, monitoring, and scale fit together) to evaluate and challenge an owner's design
- Tie explainability and fair-lending findings on consumer credit models back to the model and product decisions that follow
- Help define how Block validates the systems at the frontier of production AI, setting standards where none exist yet
Requirements:
- A quantitative degree or equivalent experience, and senior-IC depth building or validating models in a high-stakes domain such as credit, fraud, or financial crime
- Command of effective-challenge methodology: reproduction, conceptual-soundness review, benchmarking, stress testing, and outcomes analysis, with an eye for how a model holds up after launch and where its assumptions break
- Deep applied ML and statistics across model families, from regression and tree ensembles to deep learning, with sound judgment about evaluation, calibration, and generalization
- Experimentation and statistical rigor: holdout and experiment design, reasoning about uncertainty, and evaluating a model beyond aggregate accuracy
- Solid software and data engineering: production-quality Python, SQL on large datasets, and reproducible, tested code
- Fluency with modern AI: building with LLMs and agentic tools, and the judgment to know when their output can be trusted
- Familiarity with model risk management frameworks and fair-lending standards; the specifics can be learned on the job
- The communication skills to explain and defend your conclusions to model owners and senior stakeholders, and the independence to operate under ambiguity