ECLARO is seeking a Senior Performance Engineer to be a technical leader within the Group Analytics Platform Agile Release Train. The role involves establishing and leading performance engineering practices, partnering with teams to ensure performance objectives are met, and evaluating AI-assisted performance tooling.
Responsibilities:
- Define and own the performance engineering strategy for the GAP ART, including standards, best practices, and success metrics
- Build and mature a repeatable performance engineering practice spanning design-time analysis, pre-release testing, and production monitoring
- Establish performance SLAs/SLOs, performance budgets, and measurable acceptance criteria aligned with business outcomes
- Act as the primary performance subject matter expert for the ART, advising architects and teams during design and planning
- Work across multiple Scrum teams to design, execute, and interpret load, stress, spike, and endurance tests for APIs, batch applications, and data pipelines
- Guide teams in diagnosing and resolving performance bottlenecks across application, data, and infrastructure layers
- Embed performance considerations into Agile ceremonies, PI planning, and architectural reviews
- Coach and mentor engineers on performance fundamentals, tooling, and diagnostic techniques
- Apply expert-level understanding of system performance fundamentals, including: Latency vs. throughput vs. concurrency, Back-pressure and flow control, CPU, memory, disk I/O, and network behavior, Horizontal and vertical scaling strategies
- Analyze and optimize performance of data-intensive workloads, including: Query execution and optimization (Redshift, Spark, PostgreSQL), Ingestion pipelines and batch processing jobs, API-based data access patterns
- Perform capacity modeling and growth forecasting to proactively identify scalability risks
- Define and evolve observability standards using metrics, logs, and distributed tracing
- Leverage golden signals to detect, diagnose, and communicate performance issues
- Lead root cause analysis efforts for performance-related incidents and near misses
- Partner with platform and SRE teams to improve production monitoring and alerting
- Evaluate emerging AI-assisted performance, testing, and observability tools
- Lead proofs of concept to assess value, accuracy, and applicability to the ART's ecosystem
- Present findings, recommendations, and tradeoffs to technical and leadership audiences
- Drive adoption of approved tools and integrate them into performance workflows and CI/CD pipelines
- Communicate performance risks, findings, and recommendations clearly to engineers, architects, and leadership
- Translate technical performance data into business impact and decision-ready insights
- Provide regular updates on the maturity, roadmap, and effectiveness of the performance engineering practice
Requirements:
- Extensive experience in performance engineering for complex, distributed, and data‐intensive systems
- Expert‐level knowledge of: System performance fundamentals (latency, throughput, concurrency, back‐pressure, resource utilization)
- Performance testing and modeling (load, stress, spike, capacity planning, growth forecasting)
- Observability and diagnostics (metrics, logs, traces, golden signals, RCA)
- Proven experience performance testing APIs and batch applications
- Strong hands-on experience with performance and observability tools such as JMeter and New Relic
- Ability to work effectively across multiple teams and technologies in a scaled Agile environment
- Exceptional communication, leadership, and stakeholder management skills
- Experience with AWS-based data platforms, including services commonly used in lakehouse architectures
- Knowledge of query engine behavior and optimization, including Redshift, Spark, and PostgreSQL
- Experience with event- or messaging-based integrations (Kafka or similar) is a plus
- Familiarity with CI/CD pipelines and integrating performance testing into delivery workflows
- Experience evaluating or using AI-assisted tooling for performance analysis, testing, or observability