Seer is a research-driven AI company building scalable intelligent systems capable of robust operation in dynamic environments. The company is hiring Senior and Staff-level Data Infrastructure Machine Learning Engineers to scale the systems behind its ML training data platform, with an emphasis on building and optimizing high-throughput data infrastructure and large-scale indexing and retrieval systems.

Responsibilities:

Architect, build, and operate distributed data infrastructure capable of processing and managing billions of video and multimodal data samples
Design systems with strong guarantees around reliability, latency, scalability, and cost efficiency
Optimize cloud object storage, metadata systems, databases, and large-scale distributed storage architectures
Build efficient indexing and retrieval systems to support rapid dataset querying, filtering, and iteration
Improve data access patterns and retrieval performance for research and production ML workflows
Design scalable metadata and search infrastructure for multimodal datasets
Develop monitoring, alerting, failure recovery, and performance optimization frameworks for large-scale data pipelines
Build tooling to identify bottlenecks and improve operational visibility across distributed systems
Optimize workload balancing and throughput across distributed compute and storage infrastructure
Build systems for artifact management, dataset versioning, lineage tracking, and reproducibility across training workflows
Ensure traceability and consistency across evolving datasets and training runs
Develop lightweight internal tooling enabling engineers and researchers to explore and analyze data at scale
Integrate and scale vision-language model (VLM) inference within distributed data pipelines
Support automated enrichment, filtering, metadata generation, and preprocessing workflows
Collaborate closely with ML systems and research teams to improve data quality and training velocity

Requirements:

5+ years of experience in data infrastructure, distributed systems, ML infrastructure, or related fields
Strong experience building and operating large-scale distributed data pipelines
Deep understanding of distributed systems architecture, databases and metadata systems, indexing and retrieval strategies, and cloud storage architectures
Experience optimizing throughput, workload balancing, and cost-performance tradeoffs in cloud environments
Hands-on experience with distributed processing frameworks such as Ray or Spark
Strong experience with observability, monitoring, and production reliability
Strong software engineering fundamentals with the ability to own systems end-to-end
Experience managing large multimodal datasets
Familiarity with ML training workflows and data lifecycle management
Experience running large-scale ML inference workloads in distributed or cloud environments
Familiarity with vision-language models (VLMs)
Experience working with real-world sensor data such as video, telemetry, or time-series streams
Familiarity with data warehouse technologies such as Snowflake, BigQuery, or Redshift
Experience with data versioning and lineage systems such as DVC, Delta Lake, or similar tooling

Research Engineer - Data Infrastructure
