TwelveLabs is pioneering the development of multimodal foundation models that understand video the way humans do. As a Senior Backend Software Engineer, you will build the server-side infrastructure for a new application layer, focusing on video processing workflows and on integrating machine learning models into scalable backend services.
Responsibilities:
- Design and build backend services for video processing workflows — ingestion, transcoding, 4K export, metadata extraction, and timeline operations
- Architect scalable, high-availability systems to support enterprise-grade video workloads across cloud-native infrastructure (AWS, GCP)
- Build and optimize APIs that power real-time and async frontend workflows, including streaming data delivery and long-running job orchestration
- Own performance and reliability for distributed video processing pipelines with low-latency, high-throughput requirements
- Collaborate closely with frontend engineers on API design, data models, and streaming strategies
- Integrate and run inference on computer vision and related media models for tasks like video resizing, scene detection, automatic audio noise cleaning, and visual analysis
- Deploy and serve ML models on cloud-based or cloud-native platforms, evaluating build-vs-buy trade-offs for model serving, including SaaS alternatives
- Work with the research team to productionize model outputs into reliable, scalable backend services
- Build pipelines that bridge TwelveLabs’ foundation models with third-party CV models to power intelligent video workflows