Torc Robotics is a leader in autonomous driving technology, focused on developing software for automated trucks. The Senior Software Engineer - Data Pipeline will design and develop high-performance data converters and large-scale ingestion pipelines to support the company's autonomous driving stack, ensuring the reliability and quality of production datasets.
Responsibilities:
- Design and develop high‑performance data converters for multi‑sensor autonomous‑driving data (camera, lidar, radar), ensuring accurate time alignment and robust handling of raw sensor logs (a brief illustrative sketch follows this list)
- Design, build, and optimize large‑scale ingestion and transformation pipelines (ETL/ELT) capable of processing petabyte‑scale autonomous‑driving sensor data, and automate them for reliable, production‑grade deployment
- Work with data formats such as ROS bags, MCAP, and custom binary encodings; establish standards for schema evolution and metadata integrity
- Implement automated data validation, quality checks, and lineage tracking to ensure reliability of production datasets
- Collaborate closely with ML, annotation, simulation, and perception teams to ensure cross‑team ownership of data products and deliver datasets that are consistent, semantically correct, and ready for downstream consumption
- Proactively assess current capabilities to identify areas for improvement, proposing solutions that align with core strategy and operations
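To give a concrete flavor of the converter and time-alignment work above, here is a minimal, illustrative sketch of vectorized timestamp matching in NumPy. It is not Torc code; the function name, the 50 ms tolerance, and the sample values are all assumptions made for the example:

```python
import numpy as np

def align_nearest(cam_ts_ns: np.ndarray, lidar_ts_ns: np.ndarray,
                  tol_ns: int = 50_000_000) -> np.ndarray:
    """For each camera timestamp, return the index of the nearest lidar
    timestamp, or -1 if none falls within tol_ns. Both inputs are sorted
    int64 nanosecond arrays; lidar_ts_ns needs at least two entries."""
    # Insertion points; the nearest neighbor is that index or the one before it.
    idx = np.clip(np.searchsorted(lidar_ts_ns, cam_ts_ns), 1, len(lidar_ts_ns) - 1)
    left, right = lidar_ts_ns[idx - 1], lidar_ts_ns[idx]
    nearest = np.where(cam_ts_ns - left < right - cam_ts_ns, idx - 1, idx)
    # Reject pairs outside the tolerance window (e.g., dropped frames).
    return np.where(np.abs(lidar_ts_ns[nearest] - cam_ts_ns) <= tol_ns, nearest, -1)

# Hypothetical timestamps for a three-frame drive-log snippet.
cam = np.array([0, 100_000_000, 205_000_000], dtype=np.int64)
lidar = np.array([2_000_000, 99_000_000, 201_000_000], dtype=np.int64)
print(align_nearest(cam, lidar))  # [0 1 2]
```

A converter in this vein would run per drive log to pair each camera frame with the nearest lidar sweep, dropping frames with no match inside the tolerance window.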
Requirements:
- Bachelor's or Master's degree in a STEM‑related field with 5+ years of working experience in cloud technologies and data operations
- Experience building or maintaining converters, decoders, or transformation pipelines for sensor‑rich data (e.g., lidar point clouds, camera streams, radar detections)
- Understanding of multimodal data synchronization, timestamp alignment, and multi‑sensor calibration workflows
- Experience with distributed compute frameworks (Ray, Spark, Beam) and cloud‑based platforms like Anyscale and Databricks for large‑scale data‑pipeline execution
- Experience with high‑performance computing techniques, including vectorized data processing (NumPy), multithreaded or parallel execution, and GPU‑accelerated compute for optimizing large‑scale sensor‑data workloads
- Proficiency in Python, SQL, and shell scripting
- Experience with major cloud providers such as AWS, Google Cloud Platform (GCP), or Azure
- Ability to operate with broad autonomy, leading complex technical work and driving alignment across team boundaries
- Ownership of key data‑pipeline and converter solutions end to end, setting direction and building consensus
- Project leadership experience, including mentoring less‑experienced engineers to ensure high‑quality execution
- Working experience with design patterns and framework development for ML and operational data pipelines in the cloud
- Familiarity with 3D labeling and computer‑vision (CV) annotation workflows
- Experience optimizing I/O‑heavy workloads, including columnar formats (Parquet, Arrow); a brief example follows this list
- Knowledge of orchestration tools (Airflow, Argo, Prefect)
- Hands‑on experience designing CI/CD automation for data services, including GitHub Actions, Databricks pipelines, and cloud‑native deployment workflows
- Background in Agile engineering practices
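As an illustration of the columnar-format optimization mentioned above, the following sketch writes and selectively reads a small detections table with PyArrow. The schema, values, and file name are hypothetical, chosen only to show the technique:

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical radar-detections table; a columnar layout keeps scans of
# a single field (e.g., timestamps) cheap compared with row-oriented logs.
table = pa.table({
    "stamp_ns": pa.array([1_700_000_000_000_000_000,
                          1_700_000_000_100_000_000], type=pa.int64()),
    "sensor": ["radar_front", "radar_front"],
    "range_m": pa.array([42.7, 43.1], type=pa.float32()),
})

# Dictionary-encode low-cardinality strings and compress with Zstandard.
pq.write_table(table, "detections.parquet",
               compression="zstd", use_dictionary=True)

# Column projection on read: load only the field needed downstream.
stamps = pq.read_table("detections.parquet", columns=["stamp_ns"])
print(stamps.num_rows, stamps.column_names)  # 2 ['stamp_ns']
```

Dictionary encoding, compression, and column projection on read are typical levers for the I/O‑bound pipeline stages this role covers.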