Twelve Labs is pioneering multimodal foundation models for video understanding. The Principal Software Engineer, Video Engineering will own the architecture and implementation of our video processing pipelines, optimizing for AI model performance and cost-efficiency at scale.
Responsibilities:
- Own the video pipeline end-to-end: Architect and implement ingestion → decode → chunking → storage → retrieval → playback, across batch and streaming modes, serving both AI/ML and media application workflows
- Deep codec & decode mastery: Drive decisions on decode strategies (hardware vs. software, GPU-accelerated pipelines), container format handling (fMP4, CMAF, MKV, TS), and codec support (H.264, H.265, VP9, AV1) with pragmatic cost/quality tradeoffs
- Semantic & heuristic chunking: Work with our ML Research Scientists to design and implement intelligent video segmentation that goes beyond fixed-interval splitting — scene boundary detection, shot change analysis, content-aware chunking that optimizes downstream AI model performance
- Streaming ingestion: Architect low-latency streaming pipelines (HLS, DASH, LL-HLS, WebRTC ingest) that process video in near-real-time, including streaming decode and incremental chunking
- Video storage architecture: Design storage tiers and retrieval patterns optimized for AI workloads — balancing hot/warm/cold access, frame-level random access, and cost at petabyte scale
- Playback & delivery: Ensure video can be served back to users with accurate temporal navigation, supporting time-coded references from AI analysis results
- FFmpeg & media toolchain expertise: Be the internal authority on FFmpeg, libav, and related tooling. Build and maintain custom processing pipelines, filters, and integrations
- Cost engineering: Quantify and optimize cost-per-hour-of-video-processed. Drive decode efficiency through hardware acceleration (NVDEC, VA-API), pipeline parallelism, and intelligent resource allocation
- Cross-team technical leadership: Partner with ML teams on how video is preprocessed for model consumption, with platform teams on infrastructure, and with product on customer-facing media capabilities
- Standards & best practices: Establish video engineering standards, author reference implementations, and mentor engineers across teams on media fundamentals
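To make the segmentation responsibility above concrete: content-aware chunking can be reduced to a boundary-selection pass over per-frame scene-change scores, with a fixed-interval fallback for long static segments. The sketch below is illustrative only; the function name, thresholds, and flat score list are hypothetical, and in practice the scores would come from a real detector (for example, FFmpeg's `select` filter exposes a per-frame `scene` score).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Chunk:
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive

def chunk_by_scene_scores(scores: List[float], fps: float,
                          threshold: float = 0.4,
                          min_len: float = 2.0,
                          max_len: float = 30.0) -> List[Chunk]:
    """Split a video into chunks at scene-change peaks.

    `scores` holds one scene-change score in [0, 1] per frame. Frames
    scoring at or above `threshold` are candidate boundaries; a
    boundary is kept only if it falls at least `min_len` seconds after
    the previous one, and any segment longer than `max_len` is split
    at fixed intervals as a fallback.
    """
    duration = len(scores) / fps
    # Candidate cut points, in seconds, where the score crosses the threshold.
    cuts = [i / fps for i, s in enumerate(scores) if s >= threshold]
    # Enforce the minimum chunk length by dropping cuts too close together.
    bounds = [0.0]
    for t in cuts:
        if t - bounds[-1] >= min_len:
            bounds.append(t)
    bounds.append(duration)
    # Fixed-interval fallback: subdivide any over-long segment.
    chunks: List[Chunk] = []
    for start, end in zip(bounds, bounds[1:]):
        t = start
        while end - t > max_len:
            chunks.append(Chunk(t, t + max_len))
            t += max_len
        if end > t:
            chunks.append(Chunk(t, end))
    return chunks
```

The same skeleton accepts incremental score streams for the streaming-ingest case; only the final `duration` boundary needs to be deferred until the stream ends.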
Requirements:
- 12+ years in software engineering, including 7+ years focused on video/media engineering in production systems that process video at scale
- Deep FFmpeg expertise: Not just CLI usage — understanding of libavcodec, libavformat, filter graphs, custom demuxers/decoders, and performance tuning
- Codec internals knowledge: H.264/H.265 bitstream structure, AV1 adoption tradeoffs, hardware decode paths, quality metrics (VMAF, SSIM, PSNR)
- Streaming protocol fluency: HLS, DASH, LL-HLS, WebRTC
- Experience with live/real-time ingest pipelines
- Systems engineering depth: Comfortable in C/C++, Rust, or Go for performance-critical media code and Python for pipeline orchestration; able to reason about memory layout, SIMD, and GPU pipelines
- Storage & retrieval at scale: Experience designing video storage systems — object stores, frame-indexed access patterns, tiered storage strategies
- Content-aware processing: Experience with scene detection, shot boundary analysis, temporal segmentation, or perceptual quality optimization
- Production instincts: Incident response, observability for media pipelines, debugging decode failures at scale, handling format edge cases gracefully
- AI/ML integration experience (strongly preferred): Has worked with teams consuming video frames for model training/inference and understands how preprocessing decisions (resolution, frame rate, chunking strategy) impact model quality
- Made major contributions to FFmpeg, GStreamer, or open-source media projects
- Deep familiarity with GPU-accelerated video processing (e.g., NVDEC/NVENC)
- Experience running media pipelines in constrained environments such as on-prem or edge settings
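As one concrete example of the quality metrics named above, PSNR is simply a log-scaled mean squared error between a reference frame and its decoded counterpart. A minimal sketch over 8-bit luma samples (the function name and flat-list frame representation are illustrative, not a real library API):

```python
import math
from typing import Sequence

def psnr(ref: Sequence[int], dec: Sequence[int], max_val: int = 255) -> float:
    """Peak signal-to-noise ratio, in dB, between two equal-length
    sequences of samples in [0, max_val]. Higher is better; identical
    frames yield infinity."""
    assert len(ref) == len(dec) and len(ref) > 0
    # Mean squared error over all sample pairs.
    mse = sum((a - b) ** 2 for a, b in zip(ref, dec)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)
```

VMAF and SSIM follow the same shape (reference vs. distorted comparison) but weight errors perceptually rather than uniformly, which is why they track viewer opinion better than PSNR at equal bitrates.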