CodeGeniusRecruit is seeking a Backend Engineer to design, build, and optimise distributed infrastructure for AI agents. The role involves developing core backend systems, collaborating with research and applied AI teams, and ensuring system performance and reliability.

Responsibilities:

Design, build, and optimise distributed infrastructure for training, deploying, and scaling AI agents across high-performance compute environments
Develop core backend systems (services, APIs, and orchestration layers) that support agent lifecycles, tool execution, memory access, and multi-agent coordination
Collaborate closely with research and applied AI teams to integrate model-serving pipelines, agent reasoning loops, memory stores, and planning components into production systems
Build and maintain agent runtime infrastructure, including task scheduling, state management, inter-agent communication, and execution reliability
Implement monitoring, observability, and fault-tolerance mechanisms for long-running agent processes and distributed workflows
Evaluate and improve system performance across compute, networking, storage, and inference layers, identifying and resolving bottlenecks
Participate in synchronous collaboration sessions (4-hour windows, 2–3 times per week) to review architecture decisions, troubleshoot distributed systems, and iterate on design improvements

Requirements:

Strong foundation in Computer Science, Software Engineering, or Systems Design, with experience building large-scale distributed systems
Proficiency in one or more backend or systems programming languages such as Go, Rust, Python, C++, Java, Scala, C#, Kotlin, or TypeScript/JavaScript
Experience with cloud infrastructure (AWS, GCP, or Azure) and containerisation/orchestration tools such as Docker and Kubernetes
Strong experience designing production-grade backend services, APIs, and distributed systems
Knowledge of networking, data streaming, caching, and performance optimisation in distributed systems
Excellent collaboration and communication skills
Ability to commit 30–40 hours per week, including required synchronous collaboration sessions
Familiarity with LLM inference pipelines, agent frameworks, multi-agent architectures, or reinforcement learning environments is a strong plus

Backend Engineer - Remote
