EPAM Systems, a leading global provider of digital platform engineering and development services, is seeking an innovative Principal Data Platform Engineer to join its team. The role involves leading the design and scaling of secure platforms on Azure, Databricks, and Snowflake, driving best practices in data governance, and enabling AI-driven analytics.
Responsibilities:
- Build and enhance Databricks-based data pipelines, prototypes, and production-ready solutions across structured, semi-structured, and unstructured data
- Lead architecture and development of a cloud-native data and artificial intelligence platform on Azure Databricks Lakehouse and Snowflake
- Design and implement API-first data platform capabilities for secure, governed data sharing across teams and partners
- Architect and maintain experimentation platforms for evidence-based decisions and composable frameworks for exploration and insights
- Implement and optimize Unity Catalog, Delta Lake, Delta Live Tables, Databricks SQL, Photon, MLflow, Workflows, and Lakeview dashboards
- Operationalize RBAC, access controls, and governance standards, and automate infrastructure provisioning using Terraform
- Develop and deploy RAG/LLM sandboxes, implement vector search, and design document ingestion and hybrid search patterns using Azure AI Search
- Lead data governance and stewardship practices using Collibra, including cataloging, lineage, metadata management, and quality frameworks
- Conduct PHI/PII scanning, implement data masking and anonymization, and enforce RBAC/ABAC access patterns
- Build operational dashboards for audit, data sensitivity, performance monitoring, and FinOps
- Develop lightweight data portals for self-service cataloging, search, and discovery
- Partner with internal engineers through code reviews, workshops, and hands-on knowledge sharing
Requirements:
- 12+ years in platform engineering, data engineering, or cloud data development roles
- 3+ years of hands-on Databricks experience in a senior, lead, or principal capacity
- Deep practical expertise with Unity Catalog, Delta Lake, Delta Live Tables, Databricks SQL, Photon, MLflow, Workflows, and Vector Search
- Experience building pipelines for structured, semi-structured, and unstructured data, including document parsing and natural language processing
- Strong SQL and Python skills, with experience in Azure Databricks, ADLS, ADF, Azure ML, Event Hubs, and Azure OpenAI
- Experience designing API-driven data platforms and working with Delta, Apache Iceberg, and UniForm
- Experience with Collibra or equivalent governance platforms
- Familiarity with Terraform, Docker, or Kubernetes for reusable deployment standards
- Databricks Associate or Professional certification(s) required
- Strong engineering mindset: prototype fast, productionize well, create repeatable patterns others can follow
- Experience with dbt Cloud, medallion architecture, Fivetran, or Snowpark
- Familiarity with Databricks Genie, MCP/server-based integrations, or self-service AI/BI enablement
- Experience with Salesforce integration patterns
- Prior experience in nonprofit, foundation, healthcare, or mission-driven organizations
- Working knowledge of Snowflake (Iceberg, Horizon Catalog, Cortex AI) in a modern lakehouse environment