CrowdStrike is a global leader in cybersecurity, dedicated to stopping breaches with its AI-native platform. The company is seeking a Principal Data Engineer to design and build data infrastructure for AI-driven security products, with a focus on LLM integration and scalable solutions.

Responsibilities:

Architect, implement, and optimize data platforms and pipelines designed to support LLMs, Retrieval-Augmented Generation (RAG), and sophisticated agentic AI systems at exabyte scale
Drive the adoption and deployment of agentic workflows and agent harnessing techniques to create autonomous, data-driven security features
Design and implement highly scalable, fault-tolerant, and cost-effective data solutions, emphasizing rapid iteration and high-quality deployment
Write elegant, production-ready code with a focus on performance, maintainability, and testing rigor, ensuring the ability to ship fast without compromising quality
Provide technical leadership and deep expertise in data modeling, normalization, and semantic cataloging for AI/ML workloads
Establish best practices for MLOps/DataOps surrounding LLMs, including monitoring, observability, and zero-touch recovery mechanisms for AI services
Actively mentor engineers, conducting technical workshops, leading design reviews, and strengthening the team's knowledge in cutting-edge AI platform technologies
Collaborate across the organization with Data Scientists, Product Managers, and other engineering teams to transform research prototypes into robust, production-grade services
Own the end-to-end lifecycle of critical data services: development, testing, deployment, and monitoring

Requirements:

Master's degree or PhD in Computer Science, Data Engineering, or a related STEM field, or equivalent practical experience
10+ years of progressive experience in Data Engineering/Platform Engineering, with at least 3 years focused on architecting and building platforms for AI/ML or Data Science at massive scale
Demonstrable hands-on experience in LLM engineering (fine-tuning, prompt engineering, deployment), RAG, and developing agentic workflows
Proven track record of designing and delivering large-scale distributed systems (sharding, partitioning, concurrency)
Exceptional ability to write clean, elegant, performant, and well-tested code, coupled with a proactive mindset for delivering results quickly
A thorough understanding of engineering practices, including effective peer code reviews, resilient architecture design, and comprehensive testing paradigms
Prior experience in a Principal or Staff level engineering role, demonstrating technical leadership and mentorship capabilities
Direct experience building, deploying, and managing LLMs in a production environment
Prior experience in the cybersecurity, intelligence, or high-compliance industries
Contributions to open-source projects related to data or AI/ML

Principal Data Engineer, LLM/AI Platforms (Remote)
