Set technical strategy and oversee development of a high-scale, reliable data platform to manage, visualize, and serve large-scale datasets for ML model training and validation.
Build the data lakehouse for autonomous driving scene datasets, including sensor data, calibration data, and annotation data.
Drive development of the Autonomous Driving Data SDK, including scene data search, dataset preparation, and dataset loading.
Dig into performance bottlenecks throughout the data processing pipelines, from data processing latency and data search latency to Test Procedure (TP) coverage.
Bootstrap and maintain infrastructure for Data Platform components: Data Processing Pipeline, Database, Data Lakehouse, and Data Serving.
Collaborate with cross-functional teams, including ML algorithm, ML application, and Cloud Infra, to align ML Platforms with the overall Autonomous Driving System Architecture.
Requirements
Bachelor's degree or higher in Computer Science, Engineering, Robotics, or a similar technical field.
Minimum of 7 years of experience in Data Engineering or ML Platform roles.
Expert-level proficiency in Python and solid experience in Python SDK development.
Solid working experience with databases (e.g., MongoDB, PostgreSQL).
Strong understanding of modern AI frameworks (e.g., PyTorch, TensorFlow), especially the principles of distributed data loading for model training.
Hands-on experience orchestrating data pipeline jobs with Databricks Workflows or Apache Airflow, and integrating data pipelines with machine learning models.
Extensive experience with data technologies and architectures such as data warehouses (e.g., Hive) or lakehouses (e.g., Delta Lake).
Experience with Apache Spark or other big data computing engines.
Excellent leadership and communication skills, with a demonstrated ability to lead technical projects.
Tech Stack
Airflow
Apache
Cloud
MongoDB
Postgres
Python
PyTorch
Spark
TensorFlow
Benefits
In accordance with fair hiring practices, do not include any personal information unrelated to your job qualifications (e.g., Social Security Number, family relations, marital status, age, photo, physical condition, place of birth, etc.) in your resume.
All documents must be submitted in PDF format and under 30MB in size.
If you experience issues uploading your resume, please send it along with the job posting URL to recruit@42dot.ai.
We strongly encourage applications from U.S. veterans and candidates eligible for employment preference under applicable laws.
Qualified individuals with disabilities are encouraged to apply and will receive consideration under the Americans with Disabilities Act (ADA).
42dot does not accept unsolicited resumes and will not pay fees for any such submissions.
Equal Opportunity Statement
42dot is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees, regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or veteran status.