Stack is developing revolutionary AI and advanced autonomous systems designed to enhance safety, reliability, and efficiency in the trucking transportation industry. The Staff Software Engineer for the ML Platform will focus on building multimodal data mining and semantic search solutions to support autonomous vehicle product development, while collaborating across teams to improve data understanding and infrastructure.
Responsibilities:
- Build state-of-the-art multimodal data mining and semantic search solutions to power AV product development
- Develop data understanding platform infrastructure for real-time querying/vector databases and batch/stream processing, using technologies such as Ray, Spark, or Lance
- Deliver end-to-end data mining solutions that span onboard (C++) and offboard (ML & Data Infra) infrastructure to accelerate AV product development
- Develop end-to-end solutions for real-time semantic search services (text, images, and videos) and vector databases
- Discover and identify key issues in existing ML infra and proactively improve system performance
- Build low-latency, high-throughput batch or stream processing pipelines
- Drive technical discussions across multiple orgs and deliver solutions in a timely manner
- Architect and tune ETL pipelines to maximize GPU/CPU/RAM utilization
- Write readable and high-performance Python/C++ code
Requirements:
- Experience with both ML platforms and building ML-based applications (modeling experience is a bonus)
- Proven track record of building scalable, reliable infrastructure in a fast-paced environment
- Ability to collaborate effectively across teams
- Experience building or using ML infrastructure for a large number of customer teams
- Deep understanding of design trade-offs with the ability to articulate those trade-offs and achieve alignment with others
- Experience with model training, model optimization, or large data processing pipelines
- 6+ years of experience with multimodal data indexing and inference pipelines
- 6+ years of experience building semantic search services, embedding generation for video and images, and vector databases
- 6+ years of experience with large-scale ML pipelines (Airflow/Flyte) and model optimization
- Experience in building ML models or infrastructure in domains such as autonomous vehicles, perception, and decision-making (desirable but not required)
- Prior experience in autonomous vehicles (AV) is a plus