Anaplan is a leader in AI-infused scenario planning and analysis platforms, helping global companies optimize their business decision-making. They are seeking a Senior Data Engineer who will set the technical direction for data ingestion, transformation, storage, and governance, building robust data pipelines for business users and supporting advanced analytics initiatives.
Responsibilities:
- Lead the data architecture, design, and deployment of scalable, high-throughput Big Data systems into production environments
- Architect, deploy, and manage the foundational data systems that underlie modern AI infrastructure, including vector, NoSQL, and document databases
- Develop end-to-end data engineering solutions, including robust ETL/ELT pipelines, API services, and data ingestion frameworks
- Design and build the storage and processing layers powering our analytics workloads: data lakes, data warehouses, distributed file systems, and real-time streaming architectures
- Engineer feature-rich context pipelines that process large-scale enterprise data, balancing batch and streaming patterns
- Optimize and scale large distributed queries and data transformations to ensure high performance and low latency for end users
- Implement data quality frameworks to measure and ensure data integrity, reliability, and governance across all data assets
- Collaborate with analytics, product, and platform teams to build data models that capture the semantics of customer metrics, hierarchies, and relationships
- Stay current with the modern data stack and big data landscape, evaluating new tools, distributed computing frameworks, and database technologies for potential adoption
Requirements:
- 7+ years of dedicated data engineering experience, with a strong track record of hands-on execution and delivery in complex data environments
- Deep practical understanding of the database ecosystems that power AI and machine learning infrastructure (e.g., vector databases, NoSQL, and document stores)
- Hands-on experience building, scaling, and shipping large-scale data platforms in production
- Deep practical experience with distributed data processing frameworks (e.g., Apache Spark, Flink, Hadoop)
- Strong expertise in message brokers and event streaming platforms (e.g., Apache Kafka, Kinesis)
- End-to-end experience with the data pipeline development lifecycle, including extensive use of workflow orchestration tools (e.g., Apache Airflow, Dagster)
- Hands-on expertise with cloud data warehouses (e.g., Snowflake, BigQuery, Redshift) and data lake architectures (e.g., Databricks, Delta Lake, Apache Iceberg)
- Advanced SQL skills and proficiency in Python, Scala, or Java
- Strong background in modern software development practices (testing, code review, CI/CD, Infrastructure as Code)
- Extensive experience, with increasing responsibility, leading technical projects and mentoring engineering teams
- Hands-on experience with cloud-native infrastructure (AWS, GCP, or Azure)
- Deep understanding of dimensional data modeling and warehouse optimization techniques
- Experience implementing data observability, monitoring, and alerting frameworks at scale
- Background in enterprise software, planning, or financial analytics applications
- Familiarity with Anaplan or similar enterprise planning platforms