TEKsystems, a leading provider of business and technology services, is seeking a Senior Big Data Engineer to drive the design, development, and operational excellence of its data platform. The role involves developing scalable data pipelines, optimizing Spark jobs, and establishing best practices for data management.
Responsibilities:
- Design and implement scalable data pipelines using Delta Lake and manage enterprise-wide data access, security, and lineage using Unity Catalog
- Optimize large-scale Spark jobs (PySpark/SQL) and cluster configurations (Photon) to meet stringent SLA and cost performance targets across all workflows
- Build resilient data scheduling via Databricks Workflows (Jobs) and establish automated CI/CD pipelines for reliable code promotion across Dev, Staging, and Prod workspaces
- Migrate data and models from relational databases to Databricks
- Ensure best practices for development using industry standard development patterns
- Monitor data pipeline performance
- Partner with Data Management and Full Stack development engineers to operationalize models with current applications and processes
- Support existing data pipelines to ensure business continuity
- Stay updated with the latest trends and technologies in data engineering and cloud computing
- Perform other related duties as assigned
Requirements:
- 10+ years of experience as a Senior Data Engineer / Big Data Engineer, with the ability to develop and build data pipelines
- 2-3+ years of experience working on the Databricks platform and with the Medallion Architecture, performing duties such as: Auto Loader ingestion; Lakeflow connections; managing clusters, catalogs, and workspaces; building CI/CD pipelines; data masking; and data streaming
- 5+ years of experience working on the AWS cloud platform and with AWS technologies (Azure experience would also suffice)
- 5+ years of experience working with the Python programming language, PySpark, and PySQL
- Ability to work with cross-functional teams and business and IT stakeholders
- Strong business acumen within the healthcare domain (health insurance, pharmacy, etc.), with knowledge of HIPAA guidelines and PHI/PII handling for data masking
- Bachelor's degree is required
- 5+ years of experience in Data Engineering, with a significant focus on data warehousing, ETL/ELT development, and distributed systems
- 3+ years of hands-on experience developing enterprise solutions on the Databricks platform
- Expertise in PySpark and high-performance SQL
- Deep understanding of Delta Lake architecture and optimal maintenance practices
- Experience with cloud platforms (AWS preferred) and integrating Databricks with native cloud services (S3, Secrets Manager, IAM)
- Solid experience implementing CI/CD for Databricks notebooks and associated libraries
- Strong development experience with Python, Spark, or similar technologies
- Excellent verbal communication, negotiation, and presentation skills
- Strong analytical and problem-solving skills
- Ability to work autonomously while collaborating across teams to deliver projects on time
- Ability to explain complex concepts in simple terms
- Dedicated, hardworking employee who achieves maximum efficiency and productivity
- Strong knowledge of domain-driven design, data modeling, and data structures
- Strong knowledge of best practices in data management
- Healthcare experience is preferred
- Candidates without healthcare/PHI experience must have financial services experience working with PII data