Chainalysis is a global organization focused on building trust in cryptocurrencies through innovative products. They are seeking a Senior Data Platform Engineer to lead the development and optimization of their data storage, processing, and querying platforms, enabling high-performance analytics and real-time data applications.
Responsibilities:
- Help define the technical vision of the team and organization, articulating how the data platform and architecture could evolve
- Design, implement, and optimize our high-performance, scalable data-serving platform that enables data querying and consumption across the organization and in external-facing data products
- Design, implement, and optimize our high-performance, scalable data storage and transformation platform that supports both batch and stream processing, handling 100+ million updates per day on datasets of more than 100 billion rows
- Build seamless integrations between Data Cloud and various relational and NoSQL OLTP databases
- Build batch and streaming data pipelines for core blockchain datasets widely used across the company
- Deploy cloud infrastructure at scale with enterprise-grade reliability, implement and maintain infrastructure automation and self-service, and create robust CI/CD pipelines
- Establish and maintain observability, security, and data governance solutions to ensure high quality, efficiency, and reliability of data pipelines
Requirements:
- 6+ years of experience as a Data Platform Engineer, Data Engineer, or Data Infrastructure Engineer, with hands-on expertise in building and maintaining cloud-based data platforms at large scale
- Passion for leading and contributing to the technical vision of the team and organization, strong ownership of mission-critical systems, and dedication to honing your craft while mentoring others
- Experience building and maintaining both batch and streaming data pipelines using tools such as DBT, Databricks, Apache Spark, or Apache Flink, along with a deep understanding of data architecture and data modeling best practices
- Expertise with AWS services, cloud architecture, fault-tolerant distributed data systems, and proficiency with Terraform for provisioning and managing cloud infrastructure
- Deep understanding of modern data lakehouse architectures and their ecosystem, including Kafka, Flink, Spark, Databricks, Snowflake, DBT, Airflow, Debezium, Delta Lake/Iceberg, StarRocks, and ClickHouse, plus proficiency with Python/Java and SQL
- Experience building tools and frameworks that accelerate the development of data pipelines, and familiarity with data governance, data quality, and observability best practices
- Exposure to or interest in the cryptocurrency technology ecosystem
- Experience working with different blockchain technologies is a plus