Build scalable, robust ingestion pipelines for processing clickstream data, handling both participant-level (full-day employee activity) and case-level (end-to-end case lifecycle) data streams.
Design and implement orchestration on a streaming platform to coordinate pipelines, ensuring reliable data flow and processing guarantees.
Deploy and manage containerized services on enterprise container platforms, implementing CI/CD pipelines and infrastructure-as-code practices.
Implement comprehensive observability for LLM and VLM pipelines, including performance monitoring and metrics collection, distributed tracing for multi-model pipelines, logging and alerting for model inference, and cost tracking and optimization.
Build fault-tolerant systems with retry mechanisms, circuit breakers, dead letter queues, and graceful degradation patterns.
Work with VLMs to process clickstream video data and generate high-quality transcripts for downstream analysis.
Ensure seamless integration with the downstream Analysis & Evaluation workstream.
Requirements
5+ years of Specialty Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.
2+ years of experience with streaming platforms for data orchestration.
2+ years of experience with containerization and container orchestration platforms.
2+ years of experience deploying and operating ML/AI models in production environments.
3+ years of experience programming in Python.
Tech Stack
Python
Benefits
Health benefits
401(k) Plan
Paid time off
Disability benefits
Life insurance, critical illness insurance, and accident insurance