The Know is a venture-backed, early-stage company that provides AI-driven decision-making tools for corporate executives. They are seeking a Senior Data Scientist / Applied ML Engineer to own their text intelligence stack, improve NLP systems, and translate customer needs into actionable insights.

Responsibilities:

Own our text intelligence stack end-to-end: improve and scale topic classification, complex opinion extraction, and emotion detection
Build modern NLP/LLM systems in production: tokenization to embeddings/vectorization to retrieval to classification/generation, with rigorous evaluation and monitoring
RAG + vector space reasoning: design retrieval strategies, embedding/index choices, chunking/metadata schemes, and confidence and explainability methods for customer-facing outputs
Emerging topic discovery: clustering + dimensionality reduction + labeling workflows to identify new topics on the fly and consolidate them into useful meta-topics
Performance + cost discipline: optimize throughput, latency, and cloud spend while maintaining the integrity of results
Data storytelling: turn messy, high-volume text streams into clear narratives and visualizations customers can trust
Partner on product + customer needs: translate customer goals into measurable system improvements and ship features from idea to production to iteration
Mentor junior teammates: code reviews, light technical leadership, raising the bar without slowing shipping

Requirements:

Strong applied NLP background, including text preprocessing/tokenization; embeddings/vectorization and transformer models; clustering, topic modeling, and dimensionality reduction; and feature engineering, model selection, tuning, and evaluation design
Production mindset: shipped ML/NLP systems that run reliably, with monitoring, drift, and failure modes in mind
Comfort building in real codebases: Python is a must; familiarity with JavaScript or other backend languages is highly preferred
Data + systems fundamentals: database schema design and querying, efficient data pipelines, Git, basic Unix tooling
Communication: explain methods, limitations, and confidence clearly to non-technical stakeholders
Experience with AWS (serverless architecture, hosted ML models, non-relational databases)
Experience with OpenAI / Claude / Gemini APIs, model routing
Hugging Face ecosystem fluency (Transformers, Datasets, fine-tuning/inference)
Some frontend familiarity (React) or strong data visualization chops (Plotly, D3)
Experience in domains like news intelligence, social listening, elections/policy monitoring, trust/safety, public opinion, or risk analytics
