ECI Software Solutions is rethinking how enterprise software is built and operated, using an AI-native model. The VP of Data Engineering designs and owns the infrastructure that enables AI agents to reason over complex ERP data, leads a team that builds a robust data platform from scratch, and ensures that data quality and governance standards are met.
Responsibilities:
- Design and own the retrieval systems that allow AI agents to reason over ERP data with zero hallucinations
- Build and scale the vector infrastructure — pgvector, Qdrant, or equivalent — with production-grade embedding and reranking pipelines
- Own the hybrid search strategy: semantic retrieval layered on top of SQL-scoped financial data
- Drive context window optimization — packing the most relevant financial 'truth' into each LLM call efficiently
- Lead the Master Data Management strategy: golden record survivorship, identity resolution, and deduplication across ERP entities
- Build the knowledge graph that maps relationships between Vendors, Purchase Orders, Invoices, GL Entries, and Inventory so agents understand meaning, not just rows
- Own the semantic layer: translate a 500-table legacy schema into a structured, LLM-readable ontology
- Define data quality standards and automated validation pipelines that enforce them continuously
- Build the core data platform from scratch: ingestion, transformation, storage, and serving layers
- Own the modern data stack — dbt, Airflow or equivalent, Postgres/SQL Server — with an AI-augmented workflow throughout
- Implement data-centric evals: 'Judge Agents' that verify AI output against ground-truth SQL
- Build synthetic data generation pipelines that produce high-fidelity, relationally consistent ERP data for agent training and testing
- Own the Data Builder squad: hire, develop, and hold the team to Builder-level output standards
- Partner with the Dev and QA Builder leads to ensure data systems are the right interface for agentic tool-calling
- Run the Data track of the Builder Bootcamp — define the curriculum, set the graduation bar, make the calls
- Partner with product and engineering on AI feature data requirements — you are the upstream dependency for almost everything
- Define data governance policies for AI-consumed data: lineage, access control, PII handling, audit trails
- Own compliance requirements relevant to financial data in an ERP context — SOC 2, data residency, retention policies
- Build the observability layer: OpenTelemetry, Weights & Biases, or equivalent for embedding quality and retrieval performance
Requirements:
- You have built and led a data engineering team before — you know how to hire, structure, and technically lead a team that ships production data systems
- Knowledge graph or MDM at scale: you have designed entity resolution, survivorship rules, and ontologies for complex relational domains — not just prototyped them
- AI/ML platform or LLMOps experience: you have operated embedding pipelines, vector stores, and LLM-integrated data systems in production — you understand latency, cost, and quality trade-offs
- You think in systems: schema design, retrieval architecture, and data contracts are your native language
- You are comfortable in ambiguity — greenfield means no existing patterns to follow and no team to hand things off to on day one
- Production RAG pipelines over structured or financial data — you have gone beyond demos and operated retrieval systems with real precision/recall requirements
- ERP, financial, or supply chain data domain — you understand what makes a General Ledger different from a web analytics event stream
- Modern data stack depth: dbt, Airflow, Postgres, SQL Server — you have opinions about transformation layer design and know when to break the rules
- Experience working across time zones with an offshore engineering team (experience with India-based teams is a plus)