Take ownership of the infrastructure in support of developing and deploying Machine Learning models for Autonomous Vehicles
Architect and deploy cloud and on-prem ML training and evaluation infrastructure
Own the data management pipelines, from ingestion and storage to model training and evaluation, that span vehicle compute, cloud, and on-prem
Change model training code to take advantage of the better data storage techniques and formats you propose
Evaluate and implement methods, software, and hardware for model deployment onto the test and production vehicles
Develop systems and processes to improve the transition of models from research to production while balancing cost
Participate in model design and research, and set model design requirements that ensure successful deployment
Own and deliver projects end-to-end
Optional: be able to hire, manage, or at least mentor other engineers who join this project when growth is needed
Requirements
Experience in architecting and implementing data engineering solutions for a small engineering team / product (1-20 people)
2+ years of software engineering experience in any of the following: ML Infrastructure, Data Engineering, Platform Engineering, Distributed Systems
Either existing experience with ML Infrastructure as described below, or strong expertise in non-ML Data infrastructure combined with a strong desire to learn ML Infra specifics
Production ML experience with at least one of the following:
(1) Model conversion and optimization for production (e.g. ONNX, TensorRT), (2) Model deployment on specialized hardware (e.g. Jetson), or (3) Model monitoring and MLOps
Ability to programmatically access cloud services using Python, Node.js, or equivalent
Knowledge of or experience with data management solutions, such as
(1) Workflow orchestration pipelines (e.g. Argo, Airflow, Kubernetes) or (2) Managed large-scale data processing systems (e.g. Spark, Dataproc, Databricks)