Q6 Cyber is seeking a specialized Infrastructure Engineer to bridge the gap between large data repositories and the evolving world of Large Language Models (LLMs). The role involves building the infrastructure that enables effective use of AI across that data, including deploying MCP servers and managing scalable environments for AI agents.
Responsibilities:
- AI Architecture Guidance: Guide the architecture that allows us to leverage AI tools against our large existing data stores and incoming streams of real-time intelligence
- Cross-Team Integration: Work closely with other infrastructure engineers and software development teams to integrate AI tools into existing systems
- MCP Ecosystem Management: Design, deploy, and maintain Model Context Protocol (MCP) servers to allow LLMs to securely interact with our internal databases, APIs, and external tooling
- Agentic Infrastructure: Build and orchestrate sandboxed, scalable environments (e.g., using Docker or specialized runtimes) where users can safely build and execute AI agents
- Internal RAG Platform: Develop and manage the infrastructure for our internal RAG (Retrieval-Augmented Generation) pipeline, including vector database management (e.g., Pinecone, Weaviate, or pgvector) and automated embedding pipelines
- Deployment & Scaling: Utilize Kubernetes (K8s) and Infrastructure as Code (Terraform/Pulumi) to deploy LLM-related tools, ensuring high availability and low latency for model inference and data retrieval
- Security & Governance: Implement strict guardrails for data privacy within LLM workflows, ensuring internal datasets remain secure while being accessible to authorized AI tools
Requirements:
- 5+ years of experience in DevOps, Platform Engineering, or SRE, with at least 1-2 years specifically focused on AI/ML infrastructure
- Proven track record of building production-grade RAG pipelines or LLM-integrated applications
- Ability to thrive in 'day zero' environments where tools and protocols (such as MCP) evolve weekly
- Deep understanding of the security implications of LLMs (prompt injection, data leakage, and secure tool execution)
- Experience working with substantial datasets (over 1 billion objects; tens to hundreds of terabytes) and familiarity with the challenges of applying AI tools to data at that scale
- Bachelor's degree, or equivalent experience, in computer science or a related field
Tech Stack:
- Cloud & Orchestration: AWS/GCP/Azure, Kubernetes, Terraform, Helm
- AI Frameworks: LangChain, LlamaIndex, LangGraph
- Data & Vectors: Pinecone, Milvus, Qdrant, or pgvector; Apache Kafka/Pulsar; Elasticsearch/OpenSearch; traditional SQL RDBMS
- Languages: Python (Expert), TypeScript/Node.js (for MCP development), Go
- AI Protocols: Model Context Protocol (MCP), REST/gRPC