Lextech is a group of inventive thinkers who care deeply about our clients, and we are looking to add a Data Engineer to the team. The core deliverable for this role is a comprehensive end-to-end data pipeline audit across our platforms, culminating in a prioritized remediation plan within the first 30 days.
Responsibilities:
- Conduct a comprehensive end-to-end data pipeline audit across CosmosDB, Azure Data Factory, Azure SQL, and Power BI, and deliver a prioritized remediation plan within the first 30 days
- Perform systematic root cause analysis on the multi-campaign attribution issue (sessions incorrectly assigned to multiple campaigns)
- Identify any other data quality issues not yet discovered
- Assess current data validation and monitoring gaps
- Evaluate database performance issues (e.g., a query that takes 9 minutes over 5M records, suggesting possible indexing problems)
- Deliver a prioritized remediation roadmap with level of effort estimates
- Recommend proactive monitoring strategy (dashboards, alerts, health checks)
Deliverables:
- Data flow diagram mapping every pipeline from source to report
- Complete issue registry with severity, affected pipeline, root cause analysis, and evidence
- Prioritized remediation plan with wave-based rollout, fix specifications, testing plans, and rollback plans
- Monitoring framework recommendation for ongoing data quality assurance
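As a hypothetical illustration of the attribution issue the audit will need to diagnose, the sketch below flags sessions assigned to more than one campaign. The table shape and field names (`session_id`, `campaign_id`) are assumptions for illustration, not taken from the actual schema.

```python
# Minimal sketch: detect sessions incorrectly attributed to multiple campaigns.
# Field names (session_id, campaign_id) are illustrative assumptions.
from collections import defaultdict

def find_multi_campaign_sessions(rows):
    """Return {session_id: sorted campaign list} for any session
    attributed to more than one campaign."""
    by_session = defaultdict(set)
    for session_id, campaign_id in rows:
        by_session[session_id].add(campaign_id)
    return {s: sorted(c) for s, c in by_session.items() if len(c) > 1}

# Example: session "s2" appears under two campaigns and is flagged.
rows = [("s1", "spring"), ("s2", "spring"), ("s2", "summer"), ("s3", "fall")]
print(find_multi_campaign_sessions(rows))  # {'s2': ['spring', 'summer']}
```

In practice the same grouping check could be run as a SQL `GROUP BY session_id HAVING COUNT(DISTINCT campaign_id) > 1` against the staging layer; the Python version is just a self-contained sketch.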
Requirements:
- 5+ years in data engineering or database architecture
- Strong SQL and relational database optimization (indexing, query performance)
- Experience with ETL pipelines and data quality frameworks
- Proven track record in systematic root cause analysis
- Familiarity with BI tools (Power BI preferred)
- Experience implementing data monitoring and alerting systems
- Ability to communicate technical findings clearly to non-technical stakeholders
- Azure data stack experience (CosmosDB, Azure Data Factory, Azure SQL)
- Ability to articulate a phased audit methodology (discovery, documentation, issue identification, root cause analysis, prioritized remediation) without prompting
- Understanding that pipelines implement business logic, not just move data — ability to identify when a transformation is technically correct but produces wrong output due to incorrect or outdated business rules
- Systematic over reactive — works from architecture down to specifics, not from tickets up. Traces full pipelines from source to report rather than picking off backlog items
- Fast discovery instinct — inspects schemas, runs exploratory queries, and checks field-level join logic before scheduling meetings. Forms firsthand understanding of the data independently
- Owns the full vertical — can trace a number from a Power BI dashboard back through the view, the fact table, the staging table, and the source event log
- Proactive communicator — surfaces findings, blockers, and next steps without being asked. Does not wait for standup to share status
- Healthcare or pharmaceutical industry experience
- Experience with marketing/campaign analytics data
- Experience building scalable, centralized data layers or APIs
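To give a sense of the proactive monitoring the role calls for, here is a minimal sketch of the kind of automated health check a monitoring framework might standardize per pipeline run. The metric names and thresholds are illustrative assumptions, not an existing system.

```python
# Minimal health-check sketch for one pipeline run.
# Metric names and thresholds are illustrative assumptions.
def run_health_checks(row_count, null_fraction,
                      max_null_fraction=0.05, min_rows=1):
    """Return a list of (check_name, passed) results."""
    return [
        ("row_count_nonzero", row_count >= min_rows),
        ("null_fraction_within_limit", null_fraction <= max_null_fraction),
    ]

# Example: a run with 12% nulls fails the null-fraction check.
results = run_health_checks(row_count=5_000_000, null_fraction=0.12)
failed = [name for name, ok in results if not ok]
print(failed)  # ['null_fraction_within_limit']
```

Checks like these would typically feed dashboards and alerts rather than print to stdout; the point is that each pipeline run emits pass/fail signals that can be acted on before a bad number reaches a report.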