Distributed Systems, Java, Kafka, Postgres, AI, Claude, PostgreSQL, GitHub, Event Streaming, Remote Work
About this role
Role Overview
Design, evolve, and maintain core libraries, framework extensions, and shared components used across all engineering teams
Take ownership of system performance, scalability, and high-load architecture, ensuring core services can handle massive throughput with low latency
Proactively monitor production behavior, investigate complex application/database performance bottlenecks, and introduce advanced metrics, dashboards, and alerting systems
Establish and advocate for best practices in concurrency, memory management, and resource utilization across the backend ecosystem
Collaborate with cross-functional teams including backend, web, mobile, QA, and product teams
Take ownership of specific functional areas within our large-scale architecture
Leverage AI tools and assistants as integral parts of your development workflow—from system design to code review
Contribute to maintaining high test coverage, predictable production behavior, and seamless scalability
Drive engineering excellence through modern practices and continuous improvement
Requirements
Deep, low-level understanding of Java internals (JVM, GC behavior, memory management, concurrency/multithreading, and performance profiling)
Advanced knowledge of relational databases (PostgreSQL), including query optimization, indexing strategies, locking mechanisms, and connection pooling under heavy load
Advanced knowledge of distributed systems architecture, specifically high-throughput event streaming with Kafka (partitioning, replication, consumer group rebalancing, and tuning)
Strong system design, troubleshooting, and debugging skills
Comfortable and proactive in using AI coding assistants (GitHub Copilot, Claude Code, etc.) as productivity multipliers
Practical experience validating, refining, and taking ownership of AI-generated output