Fundraise Up is a global fundraising platform that aims to make donating to nonprofits fast and accessible. The DevOps Engineer / SRE will be responsible for the stability, performance, and security of the server infrastructure, working with on-premise systems and driving automation projects.

Responsibilities:

Work primarily with on‑premise infrastructure (bare metal and VMs): setup, maintenance, troubleshooting
Drive clarity in ambiguous situations by defining requirements, assumptions, and next steps
Own automation projects end‑to‑end (design → rollout → maintenance)
Improve how we operate: harden and tune systems, and raise the team's operational hygiene
Keep the platform stable, fast, and secure: servers, web servers, databases, queues
Investigate production incidents across OS / networking / infrastructure layers, apply temporary mitigations, coordinate with developers, and participate in post‑mortems
Participate in on‑call rotations
Use AI in all aspects of day‑to‑day work: researching, troubleshooting, developing

Requirements:

4+ years as a DevOps Engineer / SRE (or very close responsibilities)
Real, hands-on experience with servers (VMs, bare metal) at the OS level and below: configuring, troubleshooting, digging into 'why it's broken'
Confident Linux skills (we use Ubuntu). We expect you to be comfortable with the core tools from Linux Crisis Tools
Solid understanding of networking basics; ability to configure and troubleshoot iptables
Ansible + Git
Experience with Bash or Python scripting for automation/observability
Production/on-call experience: diagnosing incidents, restoring service, participating in post-mortems
Ownership and attention to detail. Downtime is expensive: five years ago, 10 minutes of downtime cost us $100k, and today it costs even more
ClickHouse, MongoDB: what each database is used for, monitoring, troubleshooting performance and slow queries, sharding
Kafka: operating clusters at scale (topic moves, broker replacements, tuning)
Redis: high-load tuning, replication, sharding, performance monitoring
Elasticsearch: configuration, scaling, sharding/cluster management
HAProxy / Nginx: load balancing, SSL/TLS, caching, reverse proxying, performance monitoring
OS tuning: kernel/network stack/filesystem parameters for high-load systems
Full Disk Encryption on LVM: We use Clevis + Tang in production
Infrastructure Security: Teleport, HashiCorp Vault
