Increase automation of operational activities to reduce the risk of downtime, in collaboration with Platform Engineering and Domain Squads.
Drive systemic improvements across engineering teams based on root cause analyses (RCA) of incidents and telemetry insights.
Implement non-functional improvements (such as resilience, performance and reliability) directly in code, with review and approval from Domain Squads.
Promote SRE best practices across development teams — including integration patterns, monitoring, alerts and real-time tracing.
Provide cross-platform observability capabilities, complementing and enhancing what Domain Squads already provide.
Investigate issues and incidents, proposing and implementing necessary changes.
Continuously review logs, metrics and alerts to identify and implement ongoing improvements.
Design non-functional tests and run them continuously to ensure quality at all stages, including production.
Requirements
Advanced English, with strong comprehension and communication skills;
Strong scripting skills using Python and Bash;
Experience working with cloud environments;
Knowledge of Grafana, Application Insights, OpenTelemetry and Prometheus;
Experience as a DBA, including creating and maintaining databases in SQL Server, MongoDB or PostgreSQL;
Understanding of APIs and asynchronous distributed software architectures;
Practical experience with AI-assisted development tools such as VS Code and Claude Code, among others;
Experience applying AI in Site Reliability Engineering (SRE);
Familiarity with process automation tools, such as n8n.
Tech Stack
Cloud
Grafana
MongoDB
Postgres
Prometheus
Python
SQL
Benefits
CAJU Flex card (meal or grocery): R$ 880.00 per month;
Health insurance with copayment for the employee;
Dental plan without copayment for the employee;
Personal life insurance with no payroll deduction;
Wellhub/Gympass;
3 days off per year, in line with internal policy and subject to prior alignment with leadership;
Parking available for visits to the São Paulo office;
Performance-based bonus;
Childcare allowance: R$ 474.50 per month for parents of children up to 1 year old who are regularly enrolled in a private daycare or school.