PointClickCare is focused on delivering innovative healthcare solutions and is seeking a Principal AI Platform Engineer to build the infrastructure that connects AI systems with existing products. This role involves designing and maintaining core infrastructure for GenAI products and ensuring seamless delivery of AI-generated insights into workflows.
Responsibilities:
- Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools
- Implement secure access controls and authentication mechanisms integrated by default into the AI platform components
- Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure
- Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications
- Optimize infrastructure for scalability, high availability, and cost efficiency for production workloads
Requirements:
- Extensive experience building and maintaining AI platform infrastructure, Kubernetes, and container security
- Demonstrated expertise in observability and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLflow)
- Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs
- Familiarity with vLLM, SGLang, or a similar framework for hosting LLM inference workloads
- Experience with CI/CD pipelines and automation for AI model deployment and platform operations
- Strong knowledge of authentication and authorization frameworks integrated into AI platforms