Scorpion is the leading provider of technology and services that help local businesses thrive. The Senior AI Software Engineer, Python will be responsible for making Scorpion’s Conversational Intelligence (CI) Platform production-grade by building reliable, observable, and scalable infrastructure for real-time AI workloads.
Responsibilities:
- Own the reliability of the event-driven messaging layer, including backpressure management, idempotency, dead-letter handling, and retry strategies
- Build and operate the infrastructure that runs LLM orchestration workloads at scale
- Own the operational data layer for the CI runtime, including state management, session persistence, and real-time data access patterns
- Own observability for the CI platform, including structured logging, distributed tracing (OpenTelemetry), and error tracking (Sentry)
- Maintain and harden the interfaces between CI and downstream platforms, including contract testing, versioning, and failure handling
- Conduct code reviews and mentor team members on Python engineering practices and production readiness
- Own production support for CI infrastructure, including on-call responsibilities and incident response
Requirements:
- 5+ years of experience, or equivalent relevant experience
- Experience developing production-grade Python services at scale
- Experience operating real-time or high-throughput infrastructure supporting AI/ML workloads
- Experience designing and operating distributed, event-driven systems in production environments
- Deep command of Python internals: async/await lifecycle, event loop mechanics, GIL implications for concurrency strategy, memory profiling
- Production experience with Pydantic, type systems, and structured data modeling in high-throughput services
- Strong opinions on code organization, error handling patterns, and testability in long-lived Python codebases
- Hands-on experience operating LLM inference infrastructure at scale
- Deep experience with NoSQL data modeling: partition strategy, consistency tradeoffs, query cost optimization, and hot-partition avoidance
- Experience with event-driven architecture in production: backpressure, idempotency, dead-letter handling, retry strategies
- Proficiency with observability tooling: distributed tracing (OpenTelemetry), structured logging, error tracking (Sentry)
- Experience with Azure cloud platform services
- Ability to negotiate technical boundaries with teams that own upstream and downstream services
- Clear, direct communication in design discussions, incident response, and code review