Tags: AWS, Cloud, Elasticsearch, Neo4j, NoSQL, NumPy, pandas, Python, SQL, TensorFlow, Machine Learning, Deep Learning, NLP, XGBoost, Analytics, Data Mining, SageMaker, Remote Work
About this role
Role Overview
Run CRISP-DM projects with a team of three data scientists, covering the following:
Selecting features, and building and optimizing classifiers (propensity models based on logistic regression or random forests) and recommenders using machine learning techniques
Data mining using state-of-the-art methods
Automating scoring using machine learning techniques, building recommendation systems, and improving and extending the features used by the recommendation and propensity modeling algorithms
Extending the company's data with third-party sources of information when needed
Enhancing data collection procedures to include information that is relevant for building analytic systems
Processing, cleansing, and verifying the integrity of data used for analysis, and performing in-depth exploratory data analysis (EDA); we create our own training data for our models
Doing ad-hoc analysis and presenting results in a clear manner
Creating recommendation and propensity models and tracking their performance, especially to compensate for concept drift
Requirements
5+ years of experience
Hands-on experience with machine learning techniques and algorithms, such as k-NN, Naive Bayes, and ensemble methods (XGBoost, decision forests), and working towards deep learning methods using TensorFlow, especially for NLP tasks such as word embeddings and topic discovery
Hands-on experience with common data science toolkits, such as Python, scikit-learn, NumPy, pandas, Plotly, TensorFlow, and Elasticsearch, and with AutoML platforms such as AWS SageMaker
Solid proficiency in using query languages such as SQL
Experience with NoSQL databases, such as Elasticsearch, and graph databases, such as Neo4j
Good understanding of applied statistics, such as distributions, statistical testing, and regression, along with strong EDA skills
Data-oriented personality with a strong appreciation for the business domain of your work
Interest in working in the tech domain (e.g. data center analytics) and an understanding of macro factors in the market for computing, software, and cloud adoption