Consensys is the leading blockchain and web3 software company, founded by Joe Lubin, CEO of Consensys and co-founder of Ethereum. The Senior Data Engineer will design, build, and maintain robust data pipelines, ensuring data quality and governance while collaborating with stakeholders across the organization to support its data needs.
Responsibilities:
- Design, build, and maintain robust data pipelines that integrate sources across the business
- Collaborate closely with analysts, business stakeholders, and other engineering teams to gather requirements, align timelines, discuss architecture, and ensure successful delivery
- Document data pipelines, best practices, and processes to support onboarding and knowledge sharing within the team
- Develop and optimize data models to deliver trusted, structured, and business-ready data
- Ensure data quality, security, and governance are embedded in all pipelines and systems
- Orchestrate and monitor pipeline execution to ensure reliability and scalability
- Deploy and manage infrastructure as code
- Build and tune big data pipelines using SQL, Python, and distributed processing frameworks
- Work with cloud data warehouses to enable insights and analytics
- Maintain and update reporting solutions and user dashboards
- Automate workflows and improve CI/CD pipelines to reduce manual processes
Requirements:
- Over 6 years of experience as a Data Engineer
- Strong SQL skills and experience with cloud warehouses (e.g., Snowflake, BigQuery, Redshift)
- Comfort with Python or other scripting languages for ETL and automation
- Experience deploying and managing infrastructure as code (e.g., Terraform, Pulumi)
- Experience with big data and distributed processing (e.g., Apache Spark, AWS EMR, S3)
- Experience maintaining and improving reporting solutions and dashboards (e.g., Preset/Superset, Cube.dev)
- Familiarity with CI/CD practices and automation (e.g., GitHub Actions)
- A collaborative mindset and eagerness to work with both technical and non-technical colleagues
- Experience using Trusted Execution Environments (TEEs) to securely process sensitive user data, ensuring it remains protected and only aggregated insights are released
- Hands-on experience with transformation and orchestration tools (e.g., dbt, Airflow, Dagster)
- Familiarity with data governance and metadata management (e.g., DataHub)
- Exposure to data integration and ingestion tools (e.g., Airbyte, Segment)
- Exposure to open source projects