Docker, Inc. is a leading company in app development, aiming to enhance developer experiences. The company is seeking a Senior Software Engineer to join its AI Developer Tools team, focusing on building AI-powered tools that improve developer productivity and streamline workflows.
Responsibilities:
- Build AI-Powered Developer Tools: Design, implement, and ship production-ready AI agents and tools that accelerate developer productivity, such as code review and refactoring assistants, automated test generators, local environment setup tools, deployment pipeline diagnostic agents, and on-call assistance tools
- Implement LLM Integrations: Build robust, production-grade integrations with LLM APIs (OpenAI, Anthropic, etc.), including prompt engineering, response parsing, error handling, rate limiting, cost management, and performance optimization
- Develop Agent Orchestration Systems: Create agent frameworks and orchestration systems that enable complex multi-step workflows, tool calling, context management, and agent-to-agent communication
- Contribute to Platform Infrastructure: Build self-service platform capabilities, such as deployment pipelines, observability integration, security controls, and operational tooling, that enable teams across Docker to rapidly deploy and operate their own AI developer tools
- Drive Adoption of AI-Native Development: Build tools and programs that accelerate adoption of AI developer tools such as Claude Code, Cursor, and Warp across Docker's engineering organization
- Ensure Production Quality: Write well-tested code with strong test coverage (unit, integration, end-to-end); establish monitoring, alerting, and operational excellence for AI systems
- Collaborate Cross-Functionally: Partner with the Principal Engineer on architecture, work with product and design teams on features and UX, and collaborate with platform teams (Infrastructure, Security, Data) on integrations
- Participate in Operations: Take part in the on-call rotation for AI developer tools; respond to incidents, debug production issues, and drive continuous improvement of system reliability
- Mentor and Share Knowledge: Guide other engineers through code reviews, pair programming, and technical discussions; document patterns and best practices for AI tool development
- Measure and Iterate: Instrument AI tools to measure adoption, effectiveness, and developer productivity impact; iterate based on data and user feedback to continuously improve developer experience
Requirements:
- 5+ years building production-grade backend systems or developer-facing tools
- Hands-on production experience with AI/ML technologies, such as LLM APIs (OpenAI, Anthropic, etc.), prompt engineering, or AI agent development
- Proficiency in Go (preferred), Rust, Java, or Python with strong software engineering fundamentals
- Experience designing and building distributed systems, microservices, or platform infrastructure
- Strong understanding of cloud-native systems (AWS, GCP, or Azure), APIs, and data stores
- Solid grasp of CI/CD, automated testing, code review practices, and modern development workflows
- Product-minded approach to building developer tools, with a focus on user experience and measurable outcomes
- Excellent communication skills in remote, asynchronous environments, with the ability to document technical decisions clearly
- Ownership mentality with a bias for action and iterative delivery
- Comfortable working autonomously across distributed teams and navigating ambiguity
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
- Experience with AI agent frameworks (LangChain, LangGraph, CrewAI, or similar)
- Contributions to open source AI tools, developer tooling, or platform engineering projects
- Experience with MCP (Model Context Protocol) or similar AI agent integration standards
- Background in developer productivity, DevOps, SRE, or platform engineering domains
- Experience with Kubernetes, Docker, and container orchestration
- Knowledge of developer tools ecosystems (IDEs, CI/CD platforms, observability tools)
- Experience with infrastructure-as-code (Terraform, Pulumi) and GitOps deployment patterns (ArgoCD, FluxCD)
- Understanding of security, compliance, and operational best practices for production AI systems