Abnormal AI is focused on building and supporting world-class data pipelines for its AI-native security platform. The Senior Software Engineer - Data Engineering will establish the technical foundation for data excellence, ensure the reliability of critical data pipelines, and collaborate with teams across the company to keep data infrastructure aligned with business needs.
Responsibilities:
- Own mission-critical pipeline reliability: Take end-to-end ownership of our production data pipelines, which process billions of messages weekly, ensuring 99.9% uptime for revenue-critical pipelines that directly power sales and customer-facing AI products
- Build self-healing pipelines: Design and implement automated monitoring, testing, and recovery systems for data pipelines that eliminate manual intervention and reduce mean time to recovery (MTTR) from hours to minutes
- Accelerate development velocity: Deploy CI/CD pipelines and self-service platforms that reduce deployment time from 3-5 days to under 2 hours, enabling Data Scientists to safely deploy models without engineering bottlenecks
- Architect for scale: Optimize data pipelines that handle exponential annual data growth, implementing cost-effective solutions that support regional expansion and compliance requirements (GDPR, FedRAMP, SOC 2)
- Bridge technical and business domains: Partner with Sales, Finance, and Product teams to ensure data infrastructure aligns with business needs, making critical trade-off decisions when pipelines impact revenue
- Establish data engineering excellence: Define best practices for dbt, Airflow, Spark usage, PII anonymization, and cross-divisional data sharing while mentoring embedded Data Guild team members on these practices
- Enable AI and accessible data consumption: Design and maintain an accessible semantic layer that provides consistent, trustworthy definitions and abstractions, making it easy for stakeholders to consume data and incorporate AI-driven insights into their workflows
Requirements:
- 6+ years of software engineering experience in backend, distributed systems, or data-focused roles
- Proven experience designing and running large-scale, production-grade data pipelines
- Proficiency in our stack: Python, Spark/PySpark, Airflow, SQL, dbt, Databricks, Snowflake, AWS
- Proven track record of driving pipeline reliability to 99%+ uptime, including SLAs, observability tooling, and automated recovery patterns
- Strong systems-thinking skills with the ability to debug complex distributed systems, optimize for performance and cost, and make architectural decisions balancing short-term needs with long-term scalability
- Demonstrated ownership mindset and ability to drive projects from conception to production independently, including on-call responsibilities for critical systems
- Experience collaborating with Data Science, Analytics, Product, Finance, Marketing, and Sales teams, along with the ability to communicate technical decisions clearly to non-technical stakeholders and executives
- Bachelor's degree in Computer Science, Applied Sciences, Information Systems, or another related quantitative field
Nice to have:
- Experience building or operating AI/ML data pipelines, including data readiness for training and evaluation
- Background in high-growth environments where data volume doubles annually, requiring frequent re-architecture and optimization
- Experience with compliance frameworks such as GDPR, SOC 2, and FedRAMP, plus familiarity with PII handling and anonymization
- Knowledge of multi-region data architectures, cellular/multi-tenant systems, or related large-scale distributed design patterns
- Background in cybersecurity, threat detection, or email security
- Experience building internal developer tools for data scientists and analysts
- Track record of mentorship, technical leadership, and driving cross-functional initiatives
- Advanced degree in Computer Science or a related field