Role Overview

The Core Foundation & Integration (CFI) Tribe implements and delivers the Card Issuing core applications, from the card issuing lifecycle to transaction clearing and settlement, for hundreds of banks.
With over 150 million cards and 1.16 billion monthly transactions, our platforms operate at the forefront of capacity and performance.
We are looking for an engineer to be the architect and guardian of our Card Issuing solutions’ performance and reliability.
This is a strategic, hands-on engineering role for someone obsessed with making large-scale distributed systems fast, resilient, and efficient by design.
You will be the technical cornerstone of our platform’s stability, with the mandate to influence architecture, write code, and solve the most complex performance challenges.

Responsibilities

Review critical solution designs and challenge architectures that present unmitigated performance or reliability risks.
Architect and operate our centralized performance and load testing platform, offering it as a self-service capability to development squads.
Design and lead critical end-to-end performance tests and Chaos Engineering exercises to identify system weaknesses proactively.
Analyze long-term performance metrics against SLOs to identify slow regressions and silent bottlenecks, and drive resolution before they become incidents.
Lead war rooms for performance incidents, using advanced tools (profilers, distributed tracing) to diagnose systemic root causes.
Recommend solutions, prototype fixes, optimize critical code paths, and demonstrate the results to development teams.
Liaise with SRE and operations teams of internal and external clients, building deep technical credibility.

Requirements

5+ years of proven experience, with deep, hands-on expertise in a language such as Java or Python.
Knowledge of legacy and public cloud environments (GCP, AWS).
Deep understanding of large-scale distributed systems: concurrency, queuing, caching, consistency, and failure modes.
Expertise across communication patterns: synchronous/high-throughput APIs, event-driven architectures (Kafka), and large-scale batch processing.
Expertise in Application Performance Monitoring, database performance optimization (Oracle, PostgreSQL), network analysis, and low-level system performance.
Proficiency with observability tools: Prometheus, Grafana, and distributed tracing frameworks.
Strong technical leadership: ability to mentor senior engineers and influence architecture decisions across multiple teams.
Excellent communication skills — able to explain complex technical concepts clearly to diverse audiences.
Strong ownership mindset with a bias for action and measurable delivery.
Full professional proficiency in English.
Experience in the payments or card issuing domain.
Background in chaos engineering tooling or reliability frameworks (e.g., SRE practices, SLO/SLA management).
Experience mentoring or coaching engineering teams on performance best practices.

Tech Stack

AWS
Cloud
Distributed Systems
Google Cloud Platform
Grafana
Java
Kafka
Oracle
PostgreSQL
Prometheus
Python

Benefits

Holiday Voucher
Private medical insurance
Performance bonus
Easter and Christmas bonus
Employee referral bonus
Bookster subscription
7card
Work from home options depending on project

Performance & Reliability Engineer – Banking
