Newfold Digital is a company that has been trusted for decades to help people get online and stay ahead. As a Senior Software Engineer - AI, you will design and scale APIs, implement workflows with AI models, and maintain CI/CD processes to support a robust AI platform.
Responsibilities:
- Design & scale async REST/WebSocket APIs with Python 3.11+ and FastAPI, using dependency injection, type hints, and clean vertical-slice architecture
- Implement multi-agent workflows with Semantic Kernel (handoff, sequential, concurrent) to route traffic among specialised LLM agents
- Integrate LLM providers (OpenAI GPT-4.1/mini, Google Gemini 2.5 Flash) behind a provider-agnostic layer for A/B and cost-aware routing
- Deliver Retrieval-Augmented Generation with vector stores such as Azure AI Search, pgvector, or Chroma
- Expose tool-using agents via OpenAI Assistants (Code Interpreter) for data-analysis and file-manipulation tasks
- Evolve schemas with SQLModel / SQLAlchemy 2 & Alembic; tune Postgres for high-concurrency async access
- Maintain robust CI/CD pipelines (Bitbucket, Jenkins) that lint, type-check, test, package (Docker), and deploy
- Instrument services with structlog JSON logs, OpenTelemetry traces, and cost/latency metrics; hold p95 latency under 100 ms
- Champion AI-assisted development (GitHub Copilot, Cursor) and share pragmatic problem-solving practices with the team
Requirements:
- 5+ years building production APIs in Python; 2+ years with FastAPI (or a similar async stack)
- Deep knowledge of async I/O, Pydantic v2, DI, and observability
- Hands-on with Semantic Kernel or comparable agent frameworks
- Practical RAG implementations using Azure AI Search, pgvector, or Chroma
- Strong Postgres skills, including SQLModel/SQLAlchemy 2 and Alembic migrations
- Proven integrations or side projects with LLM APIs (OpenAI, Gemini) and structured-output design
- Dependency management via Poetry and virtual-env isolation
- End-to-end CI/CD ownership (build → scan → test → deploy)
- Excellent analytical and problem-solving ability
- Remote work readiness with daily overlap of at least 09:00 – 13:00 EST
- Experience with event/message queues (RabbitMQ, Azure Service Bus, Kafka)
- Experience with observability stacks (Grafana, LangFuse) for LLM cost governance