McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. The role of Databricks Data Engineer involves designing and operating reliable, scalable data workflows on the Databricks platform, focusing on data process monitoring, job optimization, and data quality.
Responsibilities:
- Build, optimize, and maintain batch and streaming data pipelines using Databricks, Apache Spark, and Delta Lake for cloud analytics workloads
- Monitor, troubleshoot, and report on the status and health of data pipelines and processing jobs using Databricks-native tools, logs, and dashboards to ensure timely and reliable data delivery
- Analyze and resolve job failures, resource bottlenecks, and data quality issues, escalating problems as needed and providing root-cause analysis
- Apply strong SQL and data modeling expertise, drawing on experience with Oracle, PostgreSQL, and MongoDB, when creating, transforming, and validating large data sets to support a variety of business and analytics use cases
- Implement and enforce data security controls, encryption, and access policies within Databricks, following industry best practices and healthcare compliance requirements
- Work with data governance, compliance, and IT security teams to continuously evaluate and improve system security, privacy, and regulatory alignment
- Document pipeline architecture, monitoring processes, and standard operating procedures for the data engineering team and other stakeholders
- Collaborate with business intelligence, analytics, and data operations teams to deliver high-quality data with consistent performance and availability
Requirements:
- Degree or equivalent, typically with 4+ years of relevant experience
- 4+ years of hands-on experience with Databricks and Apache Spark in a cloud or enterprise setting
- Experience with data process monitoring tools, alerting automation, and dashboarding inside Databricks
- Advanced knowledge of Databricks jobs, including job monitoring, error handling, and performance metrics tooling
- Good understanding of database fundamentals, including SQL, table design, indexing, and troubleshooting (Oracle, PostgreSQL, MongoDB)
- Experience building, documenting, and supporting reliable production-grade data workflows
- Proficient in Python and SQL for data engineering and automating monitoring/reporting tasks
- Bachelor's degree in Computer Science, Information Systems, Engineering, or a related discipline
- Candidates must be authorized to work in the United States; sponsorship is not available for this role
- Understanding of Delta Lake and the lakehouse architecture
- Experience with Azure cloud security, job orchestration, and production-support best practices