About this role

Eventual builds data processing infrastructure for Physical AI systems. As a Research Engineer on the Visual Understanding team, you will develop and implement methods that make vast amounts of video data easy to query and efficient to use for customers' model training.

Responsibilities:

Own the visual understanding roadmap end-to-end: from picking the model family for a customer's taxonomy to landing it in production inference at corpus scale
Train, fine-tune, and evaluate VLMs, VQA models, embedding models, and convolutional perception models against customer datasets and benchmarks
Drive down per-clip annotation cost — model selection, distillation, batching, decode pipelining — so "annotate every clip in a 10K-hour corpus" stays economical
Build the rich, queryable datasets that customers train on: design taxonomies with researchers, instrument quality, version the outputs
Partner with the dataloading and storage teams so visual understanding outputs flow into the index and on to the GPU without re-engineering
Work directly with researchers at our partner labs — your shortest feedback loop is their next training iteration

