Consensys is the leading blockchain and web3 software company, founded by Ethereum co-founder Joe Lubin, who serves as its CEO. The Senior Data Engineer will design and maintain data pipelines, ensuring data quality and accessibility while collaborating with stakeholders across the organization to support its data needs.
Responsibilities:
- Design, build, and maintain robust data pipelines that integrate sources across the business
- Collaborate closely with analysts, business stakeholders, and other engineering teams to gather requirements, align timelines, discuss architecture, and ensure successful delivery
- Document data pipelines, best practices, and processes to support onboarding and knowledge sharing within the team
- Develop and optimize data models to deliver trusted, structured, and business-ready data
- Ensure data quality, security, and governance are embedded in all pipelines and systems
- Orchestrate and monitor pipeline execution to ensure reliability and scalability (see the sketch after this list)
- Deploy and manage infrastructure as code
- Build and tune big data pipelines using SQL, Python, and distributed processing frameworks
- Work with cloud data warehouses to enable insights and analytics
- Maintain and update reporting solutions and user dashboards
- Automate workflows and improve CI/CD pipelines to reduce manual processes
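For flavor, here is a minimal sketch of the kind of orchestration work described above, written with Airflow's TaskFlow API (one of the orchestrators named in the requirements). The pipeline, source, and column names are hypothetical placeholders, not actual Consensys systems:

```python
# Minimal Airflow 2.x TaskFlow sketch of a monitored, retryable pipeline.
# All names here are illustrative assumptions.
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
)
def daily_revenue_pipeline():
    @task
    def extract() -> list[dict]:
        # Stand-in for pulling raw records from an upstream source.
        return [{"order_id": 1, "amount_usd": 42.0}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Stand-in for data-quality checks: drop invalid rows.
        return [r for r in rows if r["amount_usd"] > 0]

    @task
    def load(rows: list[dict]) -> None:
        # Stand-in for a warehouse write (Snowflake, BigQuery, ...).
        print(f"loading {len(rows)} validated rows")

    load(transform(extract()))


daily_revenue_pipeline()
```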
Requirements:
- Over 6 years of experience as a Data Engineer
- Strong SQL skills and experience with cloud data warehouses (e.g., Snowflake, BigQuery, Redshift)
- Hands-on experience with transformation and orchestration tools (e.g., dbt, Airflow, Dagster)
- Comfort with Python or other scripting languages for ETL and automation
- Familiarity with data governance and metadata management (e.g., DataHub)
- Experience deploying and managing infrastructure as code (e.g., Terraform, Pulumi)
- Exposure to data integration and ingestion tools (e.g., Airbyte, Segment)
- Experience with big data and distributed processing (e.g., Apache Spark, AWS EMR, S3); a short sketch follows this list
- Experience maintaining and improving reporting solutions and dashboards (e.g., Preset/Superset, Cube.dev)
- Familiarity with CI/CD practices and automation (e.g., GitHub Actions)
- A collaborative mindset and eagerness to work with both technical and non-technical colleagues
- Experience using Trusted Execution Environments (TEEs) to securely process sensitive user data, ensuring it remains protected and only aggregated insights are released
- Exposure to open source projects
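Likewise, a minimal PySpark sketch of the distributed-processing experience listed above (Spark on EMR reading from S3). The bucket paths and column names are illustrative assumptions only:

```python
# Minimal PySpark sketch: aggregate raw events into a business-ready
# daily metrics table. Paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-metrics").getOrCreate()

# Read raw event data (e.g., Parquet on S3 when running on EMR).
events = spark.read.parquet("s3://example-bucket/events/")

# Derive a date column, then aggregate per day.
daily = (
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date")
    .agg(
        F.countDistinct("user_id").alias("active_users"),
        F.sum("amount_usd").alias("revenue_usd"),
    )
)

# Write the mart back to S3, partitioned for downstream consumers.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/marts/daily_metrics/"
)
```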