Catalyst Labs is a leading talent agency specializing in Applied AI, Machine Learning, and Data Science. They are seeking a Tech Lead, Data & Inference Engineer to design, develop, and scale a data platform that transforms diverse data sources into reliable business insights, while mentoring engineers and promoting best practices across the organization.
Responsibilities:
- Lead the design, development and scaling of an end-to-end data platform, from ingestion to insights, ensuring that data is fast, reliable and ready for business use
- Build and maintain scalable batch and streaming pipelines that turn diverse data sources and third-party APIs into trusted, low-latency systems
- Take full ownership of reliability, cost and service level objectives (SLOs). This includes achieving 99.9% uptime, maintaining minute-level latency and optimizing cost per terabyte
- Conduct root cause analysis and deliver long-lasting fixes
- Operate inference pipelines that enrich data, covering enrichment, scoring and quality assurance with large language models (LLMs) and retrieval-augmented generation (RAG)
- Manage version control, caching and evaluation loops
- Work across teams to deliver data as a product by creating clear data contracts, ownership models, lifecycle processes and usage-based decision making
- Guide architectural decisions across the data lake and the entire pipeline stack
- Document lineage, trade-offs and reversibility while making practical build-versus-buy decisions
- Scale integrations with APIs and internal services while ensuring data consistency, high data quality and support for both real-time and batch-oriented use cases
- Mentor engineers, review code and raise the overall technical standard across teams
- Promote data-driven best practices throughout the organization