Apply statistical techniques (regression, distribution analysis, hypothesis testing) to derive insights from data, and create advanced algorithms and statistical models such as simulation, scenario analysis, and clustering
Explain complex models (e.g., RandomForest, XGBoost, Prophet, SARIMA) in an accessible way to stakeholders
Visualize and present data using tools such as Power BI, ggplot, and matplotlib
Explore internal datasets to extract meaningful business insights, communicate results effectively, and write efficient, reusable code for data improvement, manipulation, and analysis
Manage project codebase using Git or equivalent version control systems
Design scalable dashboards and analytical tools for central use
Build strong collaborative relationships with stakeholders across departments to drive data-informed decision-making, and help identify opportunities to leverage data for business insights
Enable quick prototype creation for analytical solutions and develop predictive models and machine learning algorithms to analyze large datasets and identify trends
Communicate analytical findings in clear, actionable terms for non-technical audiences
Mine and analyze data to improve forecasting accuracy, optimize marketing techniques, and inform business strategies; develop and manage tools and processes for monitoring model performance and data accuracy
Work cross-functionally to implement and evaluate model outcomes
Requirements
Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, Engineering, or related technical field
3+ years of relevant experience in data science and analytics, with proficiency in building and deploying time series models
Familiarity with project management tools such as Jira, along with experience in cloud platforms and services such as Databricks or AWS
Proficiency in Python programming and with version control systems such as Bitbucket
Experience with big data frameworks such as PySpark, along with strong knowledge of data cleaning packages (pandas, numpy)
Proficiency in machine learning libraries (statsmodels, prophet, mlflow, scikit-learn, pyspark.ml)
Knowledge of statistical and data mining techniques such as GLM/regression, random forests, boosting, and text mining
Competence in SQL and relational databases along with experience using visualization tools such as Power BI
Strong communication and collaboration skills, with the ability to explain complex concepts to non-technical audiences