SAP Taulia, a fintech company within the SAP group, is seeking an AI Data Engineer to ensure its AI agent ecosystem is powered by high-quality data. The role involves building data pipelines, managing data quality, and integrating data from various business systems to optimize AI performance.
Responsibilities:
- Building and maintaining the custom connectors (APIs, ETL pipelines) required to extract data from core business systems for use in AI tools
- Working with system owners to unlock "siloed" data that is currently inaccessible to our AI ecosystem
- Designing and maintaining the data pipelines that feed our AI knowledge base
- Optimizing "Retrieval-Augmented Generation" (RAG) performance by improving how documents are chunked, tagged, and indexed to reduce hallucinations
- Ensuring data freshness so agents never act on obsolete information
- Creating and maintaining the "AI Data Library" - a comprehensive technical map of where our enterprise data lives, how it is structured, and who owns it
- Working with business teams to identify "Dark Data" (valuable data trapped in PDFs or on individual desktops) and bringing it into the AI ecosystem
- Implementing automated checks to monitor data quality and completeness
- Ensuring sensitive data (PII) is properly flagged and excluded from general AI access where appropriate
Requirements:
- Demonstrated experience integrating data with enterprise search engines and AI agents, or with RAG-based systems
- Practical experience preparing data specifically for consumption by Large Language Models and agentic orchestration tools
- 5+ years in data engineering, ETL development, or database management
- Strong experience building custom API connectors and data ingestion scripts
- Experience working with unstructured data (text, documents) and NLP concepts
- Strong proficiency in SQL and Python
- Meticulous attention to detail—you care deeply about data cleanliness
- Understanding of enterprise knowledge management challenges
- Ability to audit data sources and identify gaps
- Strong communication skills to work with business owners on data access and cleanup