Abnormal AI is seeking a Senior Software Engineer for its Dev Accelerator team, which builds and operates the internal developer platform. The role involves designing and evolving the tools and infrastructure that enable engineers to develop products efficiently and safely.
Responsibilities:
- Design and evolve the internal developer platform that underpins virtually all backend development
- Own high-leverage projects across CLI tooling, CI/CD, shared libraries, and test infrastructure
- Work across Python, Go, Bazel, Kubernetes, AWS, and CI systems to make our Golden Path fast, reliable, and intuitive
- Shape abstractions and workflows that are AI-native, consumable by both humans and AI agents
- Build and evolve developer CLI tooling:
  - Extend our primary developer CLI to scaffold new component types and services end-to-end (service manifests, container/build configs, deployment charts, build files, API definitions, starter code, alerts, runbooks)
  - Improve environment and credentials tooling to make local development setup fast and reliable
- Own core CI/CD and linting infrastructure:
  - Design and maintain the CI workflow used by backend services
  - Evolve linting, formatting, and typing tools to enforce architectural and code-quality guardrails across the monorepo
  - Debug and fix CI issues that block engineers, and proactively reduce flakiness and runtime
- Steward shared Go and Python ecosystems:
  - Own key shared Go libraries (auth, caching, clients, configuration, cryptography, logging, metrics, domain/realm, server, etc.) and their usage across many applications
  - Maintain and evolve shared Python libraries and frameworks, including core libraries, gRPC helpers, utilities, and standardized components
- Strengthen test and release safety:
  - Extend automated canary analysis with new metric types, backtesting, and safer defaults
  - Build and improve test automation tooling, bad-test detection dashboards, and dependency-analysis utilities to keep main green and tests reliable
  - Contribute to automation that classifies CI failures and summarizes them for engineers (including LLM-assisted workflows)
- Drive platform-level design and abstractions:
  - Design abstractions that balance simplicity for product engineers with enough power for advanced use cases
  - Collaborate with PM/TPM, infrastructure, and product teams to scope and deliver multi-team initiatives (e.g., prompt-to-product workflows, typing and linting initiatives, test automation)
Requirements:
- Strong experience with Python 3.x:
  - CLI development (e.g., Click or similar)
  - YAML/Jinja2-style templating
  - Modern type hints and typing discipline (e.g., typing, dataclasses/attrs, Pydantic-style patterns)
  - Testing with pytest or similar
- Solid experience with Go:
  - Shared library and service development (gRPC/HTTP)
  - CLI patterns (e.g., Cobra/Viper or equivalents)
  - Testing with Go testing frameworks (e.g., Ginkgo/Gomega or the standard library)
- Protobuf/gRPC:
  - Schema design and evolution
  - Cross-language client/server generation and integration
- Bazel in a large monorepo:
  - BUILD rules and dependency management
  - Working with code generation for APIs and clients
- Docker image builds
- Kubernetes concepts (Helm-style values, service deployments, readiness/liveness/health checks)
- Experience with a meaningful subset of AWS: object storage, relational databases (e.g., Postgres), key–value/document stores, search, streaming/ingest services, Kafka, Redis, IAM
- Authoring non-trivial CI pipelines (matrix builds, reusable workflows, secrets/permissions)
- Comfortable working with service manifests (YAML), Terraform/Terragrunt-like patterns, or internal equivalents
- Hands-on experience configuring and tuning linting and typing tools (e.g., pylint, ruff, mypy, golangci-lint)
- Monorepo experience:
  - Worked in a large, shared codebase with complex dependency graphs and shared frameworks
  - Familiar with dependency graph analysis and strategies to keep builds/tests fast
- Built or maintained CLI tools, scaffolding systems, or internal frameworks used by other engineers
- Thoughtful about ergonomics, documentation, and guardrails
- Exposure to canary analysis / progressive rollout systems (e.g., Prometheus/PromQL, Grafana, automated deployment checks)
- Experience with test data management, integration/E2E test infrastructure, or bad-test detection
- Experience with Kafka (topic design, producers/consumers, observability, error handling)
- Experience or strong interest in using LLMs to improve developer workflows (e.g., failure summarization, smart code generation, AI-native CLIs)