Tata Consultancy Services is seeking a Data & AI Engineer specializing in Microsoft Fabric and experimentation data. The role involves designing and maintaining datasets, conducting data gap analyses, and building AI-powered agents to enhance experimentation workflows.
Responsibilities:
- Experimentation data enablement (Silver layer ownership)
  - Own the design, build, and maintenance of curated Silver-layer datasets in Microsoft Fabric to support experimentation reporting and analysis
  - Partner with the Data Reporting/BI team to identify required dimensions, metrics, and joins (visitor/session, variant, campaign/flight, geo, device, channel, funnel steps, conversion events) and ensure these are available in Silver
  - Translate experimentation team needs into standardized, reusable data products (tables/views) that can be consumed consistently for scorecards, dashboards, and ad hoc analysis
  - Ensure Silver-layer outputs are analysis-ready (cleaned, conformed, deduplicated, and aligned to agreed definitions)
- Data gap analysis and assessment
  - Conduct regular gap assessments between:
    - Experimentation requirements (scorecards/KPIs)
    - Existing Silver layer availability, and
    - Upstream telemetry/source systems
  - Identify missing/incorrect fields, inconsistent definitions, data latency issues, or join-key problems; document:
    - Business impact
    - Severity/priority
    - Remediation approach
    - Timelines and dependencies
  - Provide recommendations on data model improvements (facts/dimensions, grain, surrogate keys, conformance rules) to reduce recurring data quality issues
- Gold layer requirements and stakeholder requirement gathering
  - Lead requirement workshops with stakeholders (experimentation, measurement, BI/reporting, engineering) to define Gold layer outputs:
    - KPI definitions and calculation logic
    - Experiment attribution rules
    - Scorecard structure
    - Segmentation needs and slicing dimensions
    - Governance and refresh SLAs
  - Produce clear functional + technical specifications: source-to-target mappings, data dictionary, metric definitions, validation rules, and acceptance criteria
  - Drive alignment on single source of truth definitions to avoid mismatch across CJA/Power BI/scorecards
- Data pipeline engineering (1DS + Fabric pipelines / ADF)
  - Build and operate robust pipelines using Microsoft Fabric Pipelines and/or ADF to ingest and transform data into Silver and Gold layers
  - Understand and work with 1DS (telemetry) pipelines (or equivalent) to ensure required events and attributes flow correctly into Fabric
  - Implement reliable orchestration, incremental loads, error handling, and monitoring to meet experimentation reporting timelines
- Data validation and reconciliation (CJA included)
  - Perform data validation and reconciliation between Silver/Gold datasets and Customer Journey Analytics (CJA):
    - Event counts, session/user logic, conversions
    - Experiment/variant attribution consistency
    - Time window alignment and filtering rules
  - Create validation checks and automated routines for:
    - Missing data detection
    - Duplicate events
    - Schema drift
    - Metric anomalies (sudden drops/spikes)
    - SRM-supporting signals (where applicable from data)
  - Document issues and coordinate fixes with upstream owners (telemetry, tagging, product engineering, reporting teams)
- Experimentation lifecycle and scorecard readiness
  - Support the experimentation lifecycle by ensuring datasets are ready for:
    - Pre-launch readiness checks
    - Launch measurement
    - Scorecard generation
    - Ongoing health checks
    - Post-test learnings/archives
  - Enable consistent scorecard outputs by curating:
    - Experiment metadata (test IDs, start/end dates, allocations)
    - KPI metrics (primary/secondary), and
    - Slicing dimensions required by experimentation stakeholders
- AI agent design & build for experimentation team
  - Design and build AI-powered agents (Fabric Data Agents / Copilot / Azure OpenAI) to accelerate experimentation workflows, such as:
    - Automated scorecard creation and narrative summaries
    - Self-serve Q&A over experimentation datasets
    - Anomaly explanations and investigation guidance
    - Metric definition assistant / data dictionary lookup
    - Pipeline health and data quality assistant
  - Define the agent’s:
    - Scope, personas, and usage scenarios
    - Grounding data sources (Silver/Gold tables, metadata, documentation)
    - Security model (RBAC, data access boundaries)
    - Evaluation metrics (accuracy, timeliness, adoption)
  - Partner with experimentation and reporting teams to iterate through pilot → feedback → rollout
- Documentation, governance, and operational excellence
  - Maintain documentation for:
    - Dataset definitions (Silver/Gold)
    - Transformation logic
    - Metric calculation rules
    - Pipeline design and dependencies
    - Validation checklists and runbooks
  - Establish best practices for:
    - Naming conventions
    - Semantic consistency
    - Versioning and backward compatibility
    - Cost/performance optimization in Fabric
  - Provide operational support: monitoring, troubleshooting, incident triage, and continuous improvement
Requirements:
- 6-8 years of experience
- Core Data Engineering Competencies
- Data Concepts & Data Modelling
- Big Data Platforms
- Data pipeline design
- Microsoft Fabric Data Agents
- Azure AI Services
- AI and ML Integration
- Analytical and problem-solving skills
- Performance tuning and monitoring
- PySpark
- E-commerce domain knowledge
- Adobe Analytics
- Customer Analytics
- Adobe Customer Journey Analytics (CJA)
- Clickstream data