Design and develop high‑performance data converters for multi‑sensor autonomous‑driving data (camera, lidar, radar), ensuring accurate time alignment and robust handling of raw sensor logs.
Design, build, and optimize large‑scale ingestion and transformation pipelines (ETL/ELT) capable of processing petabyte‑scale autonomous‑driving sensor data, and automate them for reliable, production‑grade deployment.
Work with data formats such as ROS bags, MCAP, and custom binary encodings; establish standards for schema evolution and metadata integrity.
Implement automated data validation, quality checks, and lineage tracking to ensure reliability of production datasets.
Collaborate closely with ML, annotation, simulation, and perception teams to ensure cross‑team ownership of data products and deliver datasets that are consistent, semantically correct, and ready for downstream consumption.
Proactively assess current capabilities to identify areas for improvement, proposing solutions that align with core strategy and operations.
Requirements
Bachelor's or Master's degree in a STEM-related field and 5+ years of working experience with cloud technologies and data operations.
Experience building or maintaining converters, decoders, or transformation pipelines for sensor‑rich data (e.g., lidar point clouds, camera streams, radar detections).
Understanding of multimodal data synchronization, timestamp alignment, and multi‑sensor calibration workflows.
Experience with distributed compute frameworks (Ray, Spark, Beam) and cloud‑based platforms like Anyscale and Databricks for large‑scale data‑pipeline execution.
Experience with high‑performance computing techniques, including vectorized data processing (NumPy), multithreaded or parallel execution, and GPU‑accelerated compute for optimizing large‑scale sensor‑data workloads.
Proficiency in Python, SQL, and shell scripting.
Experience with major cloud providers such as AWS, Google Cloud Platform (GCP), or Azure.
Operates with broad autonomy, leading complex technical work and driving alignment across team boundaries.
Owns key data‑pipeline and converter solutions end‑to‑end, setting direction and building consensus.
Provides project leadership and mentors less‑experienced engineers to ensure high‑quality execution.
Tech Stack
AWS
Azure
Cloud
ETL
Google Cloud Platform
NumPy
Python
Ray
Shell Scripting
Spark
SQL
Benefits
A competitive compensation package that includes a bonus component and stock options
100% paid medical, dental, and vision premiums for full-time employees
401(k) plan with a 6% employer match
Flexibility in schedule and generous paid vacation (available immediately after start date)