Elmwood Park, New Jersey, United States of America
Full Time
1 day ago
$140,000 - $160,000 USD
H-1B Sponsor
Key skills
Airflow, AWS, Azure, Cloud, ETL, Google Cloud Platform, SQL, ELT, Data Engineering, Snowflake, dbt, GCP, Google Cloud, S3, GCS, Communication, Remote Work
About this role
Role Overview
Define strategy for legacy system decommissioning, including data extraction, migration, translation, validation, and long-term archival access.
Architect ETL/ELT data pipelines from legacy electronic medical record (EMR/EHR) systems (e.g., Cerner, eClinicalWorks (eCW), NextGen), ingesting both structured and unstructured content into the LKOasis platform leveraging Snowflake.
Own and architect the Medallion architecture (Bronze/Silver/Gold layers), establishing standards for ingestion, curation, quality, lineage, and consumption-ready datasets.
Build a strategy to identify change records (CDC/incremental loads) and implement it within the Medallion (Bronze/Silver/Gold) design so downstream tables always reflect the latest changes.
Design, build, and optimize ETL/ELT pipelines for batch and incremental processing.
Orchestrate pipelines using tools such as Airflow, ADF, dbt or similar schedulers/orchestrators.
Establish and enforce best practices for performance and cost optimization (warehouse sizing, workload isolation, clustering, query tuning).
Implement Snowflake capabilities such as stages, file formats, Snowpipe, Streams/Tasks, stored procedures, and secure data sharing as needed.
Create data quality, testing, and monitoring frameworks (validation checks, SLAs, alerting, incident triage).
Partner with stakeholders to translate requirements into curated datasets/data products and documentation.
Lead implementation of Snowpark Container Services (SPCS) to run and manage containerized workloads that support data ingestion, processing, and platform integrations within Snowflake.
Mentor data engineers and contribute to architecture reviews, design patterns, and technical roadmap.
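For candidates less familiar with the CDC-into-Medallion pattern named in the responsibilities above, here is a minimal illustrative sketch in plain Python (names, record shapes, and the `op`/`updated_at` fields are assumptions for illustration, not the employer's actual implementation): Bronze holds raw change records, and the Silver layer keeps exactly one current row per key by applying changes in order.

```python
# Toy sketch of CDC/incremental merging into a Silver layer (last write wins).
# Record shapes ("id", "updated_at", "op") are illustrative assumptions only.

def merge_changes(silver: dict, bronze_changes: list) -> dict:
    """Apply ordered Bronze change records to the Silver layer."""
    for change in sorted(bronze_changes, key=lambda c: c["updated_at"]):
        key = change["id"]
        if change.get("op") == "DELETE":
            silver.pop(key, None)          # remove deleted rows
        else:                              # INSERT or UPDATE
            silver[key] = {k: v for k, v in change.items() if k != "op"}
    return silver

silver = {1: {"id": 1, "name": "Ada", "updated_at": 1}}
changes = [
    {"id": 2, "name": "Grace", "updated_at": 2, "op": "INSERT"},
    {"id": 1, "name": "Ada L.", "updated_at": 3, "op": "UPDATE"},
    {"id": 2, "updated_at": 4, "op": "DELETE"},
]
merge_changes(silver, changes)
```

In Snowflake this same "latest change wins" merge is typically expressed with Streams/Tasks feeding a `MERGE` statement rather than application code.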
Requirements
7+ years of experience in data engineering and/or data architecture, including 3+ years working with Snowflake.
Proven experience building and orchestrating ETL/ELT pipelines in production environments.
Strong expertise in Snowflake architecture and development (SQL, performance tuning, Snowpipe, Streams/Tasks, UDFs/stored procedures).
Hands-on experience with at least one orchestration/transformation stack; Airflow and/or dbt preferred.
Strong SQL skills and programming experience.
Experience with cloud platforms and storage (AWS S3, Azure ADLS, or GCP GCS).
Strong understanding of data integration patterns (CDC, incremental loads, semi-structured data like JSON/Parquet).
Experience implementing data governance and security controls in cloud data platforms.
Strong communication skills and ability to lead cross-functional technical discussions.