Cisco is a leading technology company focused on revolutionizing how data and infrastructure connect and protect organizations in the AI era. The Principal Machine Learning Engineer will define and champion the strategic vision for AI and foundation models, leading the research, design, and deployment of large-scale models targeting machine-generated data while mentoring senior technical talent and fostering innovation.
Responsibilities:
- Set and Drive Vision: Define and champion the strategic vision for AI and foundation models across Splunk and Cisco platforms, shaping the research and technology roadmap to anticipate and address industry‑defining challenges
- Architect and Lead Breakthroughs: Lead the end‑to‑end lifecycle of research, design, and deployment for large‑scale foundation models targeting machine‑generated data, with deep focus on logs and complementary modalities (time series, traces, events)
- Influence at Scale: Partner with executive leadership, engineering, product, and data science teams to ensure AI solutions align with broader organizational objectives, product strategies, and customer needs
- Mentorship and Thought Leadership: Cultivate organizational excellence by mentoring senior technical talent, fostering research communities, and driving best practices in AI across global teams
- Foster Innovation: Embed cutting‑edge research and technological advances into products, driving sustained competitive advantage and transformation at enterprise scale
Requirements:
- PhD in Computer Science or a related quantitative field, plus 7+ years of industry research experience
- Proven track record in at least one of the following areas: large language modeling for both structured and unstructured data, deep learning‑based time series modeling, advanced anomaly detection, or multi-modality modeling
- Solid proficiency in Python and deep learning frameworks (e.g., PyTorch, TensorFlow)
- Experience translating research ideas into production systems
- Deep NLP & Domain‑Adapted LLMs: Background in building and adapting large‑scale language models (e.g., T5, BERT, LLaMA, GPTs) for specialized domains including structured/unstructured logs, text, and event sequences
- Log Analytics Expertise: In‑depth knowledge of structured/unstructured system logs, event sequence analysis, anomaly detection, and root cause identification
- Advanced Anomaly Detection: Experience creating robust, scalable approaches (statistical, deep learning, or hybrid) for high‑volume, real‑time log data
- Multi‑Modal AI Modeling: Strong track record fusing logs, time series, traces, tabular data, and graphs in foundation models that deliver complex operational insights
- Large‑Scale Training & Optimization: Experience optimizing model architectures, distributed training pipelines, and inference efficiency to minimize cost and latency while preserving accuracy
- MLOps & Continuous Learning: Fluency in automated retraining, drift detection, incremental updates, and production monitoring of ML models
- Strong Research Track Record: Publications in top AI/ML conferences or journals (e.g., NeurIPS, ICML, ICLR, AAAI, CVPR, ACL, KDD) demonstrating contributions to state‑of‑the‑art methods and real‑world applications