Data Scientist at EXL | JobVerse
EXL
Data Scientist
Bengaluru, Karnataka, India
Full Time
1 week ago
No Sponsorship
Key skills
Hadoop
NumPy
Pandas
Python
scikit-learn
SQL
ML
NLP
About this role
Role Overview
Design and implement entity resolution and record linkage pipelines across multiple data sources
Build and evaluate matching algorithms using classical ML, statistical scoring, and fuzzy string-matching techniques
Develop attribute fusion logic to construct canonical golden records from conflicting multi-source data
Analyze data quality issues, document findings, and propose remediation strategies
Assess new external data sources (open and commercial) for coverage, quality, and applicability to Customer Master use cases
Produce structured evaluation reports with recommendations for adoption or rejection
Profile source datasets and track match quality metrics (precision, recall, F1, coverage)
Build dashboards and analytical summaries to communicate pipeline performance to stakeholders
Document data lineage, matching logic, and provenance for audit and reproducibility
Requirements
Python: Pandas, NumPy, scikit-learn, rapidfuzz / jellyfish
SQL: Complex queries, window functions, aggregations; Hadoop/Hive or Presto/Trino
Classical ML & Statistics: Supervised/unsupervised models, probabilistic scoring, clustering, feature engineering
String matching & NLP: Fuzzy matching (Jaro-Winkler, Levenshtein, TF-IDF), text normalization, tokenization
Entity Resolution: Record linkage concepts: blocking, scoring, deduplication, cluster evaluation
Data Quality Assessment: Completeness, consistency, coverage metrics; source profiling
Data Analysis: Exploratory analysis, hypothesis testing, statistical reasoning
Tech Stack
Hadoop
NumPy
Pandas
Python
scikit-learn
SQL