PySpark, Python, Spark, SQL, R, Machine Learning, ML, NLP, Natural Language Processing, Large Language Models, Data Mining
About this role
Role Overview
Design, develop, and deploy machine learning models to solve complex business problems in the AdTech domain
Analyze large datasets to generate actionable insights and improve product performance
Build and maintain scalable data pipelines using big data tools and frameworks
Perform data preprocessing, cleaning, feature engineering, and model evaluation
Collaborate with cross-functional teams including engineers, analysts, and product managers
Ensure model accuracy, reliability, and scalability through rigorous testing and validation
Stay updated on emerging data science techniques, tools, and best practices
Contribute to team discussions, process improvements, and knowledge sharing
Requirements
4+ years of experience in data science projects applying machine learning, statistical modeling, and data mining techniques
Hands-on experience working with Natural Language Processing (NLP) and Large Language Models (LLMs), including applying them to real-world business problems
Strong understanding of statistics and ability to apply statistical methods to analyze data and validate model performance
Proficiency in Python or R for data analysis and model development
Strong SQL skills and experience working with big data technologies such as Hive and Spark
Hands-on experience with PySpark or SparkR for large-scale data processing
Experience building, evaluating, or optimizing ML/NLP models in production or near-production environments