About this role
Role Overview
Develop and optimize ETL pipelines, ensuring high-quality, reliable data
Design and conduct statistical studies and data analyses to evaluate the impact of internally adopted AI tools, research, and engineering results, producing interpretable insights that support data-driven decisions
Curate and maintain datasets to support the development, evaluation, and deployment of AI models
Provide technical leadership, mentorship, and guidance to the AI team and on internal research projects, fostering a culture of innovation and excellence
Partner with machine learning engineers, product managers, and executives to translate data insights into tangible business and product improvements
Develop scalable algorithms and automated data processing frameworks to optimize analytics workflows
Requirements
PhD or MS in Computer Science, Data Science, Statistics, or a related quantitative field, with a scientific background and 5+ years of relevant experience
Strong expertise in data science, Bayesian modeling, probabilistic programming, and uncertainty quantification
Hands-on experience with neural network analysis, deep learning frameworks (e.g., TensorFlow, PyTorch), and model evaluation
Proficiency in Python, R, SQL, and data engineering tools such as Spark or Apache Beam, plus experience designing, executing, and analyzing A/B tests
Ability to develop and optimize ETL pipelines for large-scale data processing
Solid understanding of causal inference, time series forecasting, and statistical modeling
Hands-on experience with cloud computing platforms (e.g., AWS, GCP, Azure) and big data tools
Knowledge of natural language processing (NLP), reinforcement learning, and graph analytics is preferred