Build and sustain an autonomous crew that owns systems end-to-end, from data pipelines to APIs, including production operations, incident management, and on-call, while removing organisational blockers to maximise delivery focus.
Champion SLO-driven reliability (via Datadog), with robust runbooks, blameless post-mortems, and a strong emphasis on proactive observability to detect issues before users do.
Structure and prioritise work with clear milestones and a consistent delivery cadence, balancing short-term operational needs with longer-term strategic initiatives; use metrics (e.g. velocity, cycle time, effort allocation) to inform decisions and ensure transparency.
Maintain high engineering standards by creating space for testing, automation, observability, documentation, and technical debt reduction; guide architectural decisions across Python and TypeScript systems.
Oversee AWS infrastructure (Terraform-managed) and platform components (ArgoCD, Airflow), ensuring scalability, reliability, and cost efficiency while supporting expansion into new commodity domains.
Develop and retain engineers through clear goals, regular feedback, and career development, fostering high performance while addressing challenges with empathy and clarity.
Partner with recruitment to maintain a strong hiring bar and ensure the team is appropriately staffed and balanced in skills and seniority.
Collaborate closely with Product, Research, and Data teams, communicating progress, managing risks, and increasing the visibility and impact of the crew’s work.
Leverage retrospectives, health checks, and existing metrics to drive continuous improvement, building on an already strong data-driven culture.
Share and scale best practices across the wider engineering organisation, contributing to broader technical excellence and consistency.
Requirements
Overall 8+ years of engineering management experience, including at least 3 years leading teams of 5+ engineers on production, data-intensive systems.
Proven success building autonomous, high-performing teams that own systems end-to-end, from data ingestion to client-facing APIs.
Experience managing distributed or remote teams across multiple time zones.
Track record of recruiting, developing, and retaining engineering talent, with confidence in performance management and career growth conversations.
Strong experience managing teams that build and operate scalable data pipelines or ETL systems using tools like Airflow, Dagster, or Prefect.
Hands-on background with Python backend systems (FastAPI, pandas/polars, SQLAlchemy), as an IC or a manager.
Proficient with cloud-native infrastructure on AWS (or equivalent), including RDS/Aurora PostgreSQL, S3, Kubernetes/ECS, IAM, and Terraform.
Familiarity with CI/CD pipelines, GitOps practices, and deployment automation (GitHub Actions, ArgoCD).
Strong operational mindset: SLOs/SLIs, incident management, on-call rotations, and post-mortems.
Tech Stack
Airflow
AWS
Cloud
ETL
Kubernetes
Pandas
Postgres
Python
Terraform
TypeScript
Benefits
Don’t let the confidence gap stand in your way; we’d love to hear from you! We understand that experience comes in many different forms and are dedicated to adding new perspectives to the team.
Kpler is committed to providing a fair, inclusive and diverse work environment. We believe that different perspectives lead to better ideas, and better ideas allow us to better understand the needs and interests of our diverse, global community.
We are here to help: we are accessible, friendly, and supportive towards colleagues and clients.