Paramount is on a mission to unleash the power of content and is seeking a Senior Machine Learning Engineer to lead the development of multimodal embedding and retrieval systems. This role owns the full lifecycle of those systems, which shape how users discover and engage with video content.
Responsibilities:
- Design and build embedding pipelines for video content metadata and clip-level representation
- Design collection and vector schemas to shape data structure, indexing behavior, and retrieval performance under scale and modality complexity
- Lead the transition from traditional feature engineering to a vector-centric, "context-first" architecture by supporting compositional queries and designing high-dimensional hyper-vector representations that unify visual, textual, and behavioral signals
- Design offline/online evaluation frameworks (e.g., nDCG, MRR, Recall@K) specifically for multimodal alignment, ensuring content embeddings match search intent
- Build hybrid retrieval systems that combine vector similarity search with lexical search and reranking layers to deliver fast, accurate results at production scale
- Engineer the retrieval layer to capture nuanced user-content relationships that model training alone cannot surface, combining multimodal embeddings to improve recommendation depth at scale
- Implement query-time optimizations including caching, filtering, and index sharding strategies
- Tune vector quantization strategies (PQ, SQ, Binary Quantization) to reduce memory footprint and improve search throughput without compromising retrieval precision
- Own performance SLAs and monitor retrieval systems for latency, throughput, recall, and cost efficiency
- Build and maintain scalable batch and streaming pipelines, with logging, metrics, and alerting to surface anomalies and maintain observability
- Process content at scale using distributed frameworks such as Spark or Ray
- Architect and build scalable integration layers on top of vector databases, exposing robust APIs and services for similarity search, hybrid retrieval, and metadata filtering
- Own model versioning and embedding migration strategies, building compatibility tooling that prevents embedding drift from degrading retrieval quality across model upgrades
- Collaborate with backend and platform teams to ensure interoperability with upstream data pipelines and integration with downstream personalization and discovery surfaces
- Communicate technical system behavior, tradeoffs, and recommendations clearly to both technical and non-technical stakeholders
- Mentor direct reports, providing technical guidance in multimodal ML, vector retrieval, and production systems design
- Take ownership of project outcomes from scoping through delivery in a dynamic environment, proactively identifying and mitigating risks across video processing, metadata, and indexing workflows
Requirements:
- 5–8+ years of experience in machine learning engineering, with a focus on production ML systems
- Expertise in multimodal ML, including experience with video, image, and/or audio embedding models
- Deep knowledge of vector embedding generation, storage, and retrieval, with preference for hands-on Qdrant experience (FAISS, Pinecone, pgvector, AlloyDB, or similar also considered)
- Strong Python proficiency; Java is a plus
- Demonstrated experience building and operating data pipelines at scale, including batch and streaming ingestion workflows
- Solid understanding of hybrid retrieval systems: vector search, lexical search, and reranking
- Proven ability to communicate technical concepts clearly and partner effectively with product and engineering teams
- Track record of mentoring engineers and leading technical decisions in a team setting
- Experience with agentic systems and multi-agent orchestration
- Knowledge of diversity and relevance algorithms such as Maximal Marginal Relevance (MMR) within the reranking phase
- Background in video codecs, FFmpeg, or low-level video processing pipelines
- Familiarity with retrieval-augmented generation (RAG) systems