Innodata Inc. is a global data engineering company focused on the responsible advancement of artificial intelligence. The company is seeking a Data Engineer to design and build enterprise data warehouses and data pipelines that enable data-driven decision-making for supply chain and real estate operations.

Responsibilities:

Design and implement data-driven solutions on GCP, including BigQuery, Cloud Storage, Dataflow, Pub/Sub, and Looker/BI
Build ETL scripts using SQL and Python to extract, clean, and transform structured and unstructured data from ERP, procurement, logistics, and facility management systems
Develop and optimize data pipelines for ingestion, transformation, and loading into enterprise data lakes and warehouses
Build and extend end-to-end data and BI solutions, spanning extraction, storage, transformation, and visualization layers
Partner with supply chain, real estate, and AI/ML teams to provide pipelines for AI solutions (e.g., RAG ingestion, Copilot integration, multi-agent workflows)
Ensure data governance, lineage, and compliance across supply chain datasets
Continuously optimize query performance, ETL processes, and pipeline reliability

Requirements:

Advanced proficiency in SQL (complex queries, optimization) and Python (data engineering, scripting, APIs)
Experience building ETL/ELT pipelines operating on structured and unstructured data sources
Knowledge of enterprise data warehouse and data lake architectures
Exposure to data pipelines for AI/ML (vector DB ingestion, embeddings, RAG pipelines, copilots, agents)
Strong hands-on expertise with GCP services: BigQuery, Dataflow, Pub/Sub, Cloud Storage, and Looker/BI (experience with these or similar services preferred)
Familiarity with supply chain or data center operations data is a strong plus
Bonus: experience with ML engineering, data visualization tools (Looker, Tableau, Power BI), and MLOps practices
