Define and execute the group's data technical roadmap, aligning with Infra, DevOps, and Performance teams
Design and maintain flexible ETL/ELT frameworks for ingesting, transforming, and classifying cluster verification and telemetry data
Build and optimize streaming pipelines using Apache Spark, Kafka, and Databricks, ensuring high throughput, reliability, and adaptability to evolving data schemas
Ensure data quality and pipeline health through observability standards, schema validation, lineage tracking, monitoring, and alerting
Deliver reliable insights for cluster performance analysis, telemetry visibility, and end-to-end test coverage
Support self-service analytics for engineers and researchers via Databricks notebooks, APIs, and datasets
Drive best practices in data modeling, code quality, and operational excellence; collaborate with cross-functional teams to support data-driven decision-making
Contribute to the development of AI Agents that enhance the visibility and accessibility of insights and data for our users
Requirements
B.Sc. or M.Sc. in Computer Science, Data Science, or a related field
5+ years of hands-on experience in data engineering
Strong practical experience with Apache Spark (PySpark or Scala) and Databricks
Proficiency in Python and SQL for data transformation, automation, and pipeline logic
Experience with Apache Kafka, including stream ingestion and event processing
Experience with schema evolution, data versioning, and validation frameworks (e.g., Delta Lake, Apache Iceberg, or Great Expectations)
Strong problem-solving skills and ability to debug and troubleshoot complex data-related issues
Strong communication skills and ability to work effectively across teams
Tech Stack
Apache Kafka
Apache Spark
ETL
PySpark
Python
Scala
SQL
Benefits
NVIDIA is committed to fostering a diverse work environment