Work with implementation teams from concept to operations, providing deep technical subject-matter expertise for successfully deploying large-scale data solutions in the enterprise, using modern data and analytics technologies on premises and in the cloud
Build and implement solution architectures; provision infrastructure and deliver secure, reliable data-centric services and applications in GCP
Work with the data team to use cloud infrastructure efficiently to analyze data, build models, and generate reports and visualizations
Integrate massive datasets from multiple data sources for data modeling
Design, build, implement, and manage APIs and API proxies
Organize and implement all API development processes, both internal and external
Ensure that APIs meet business requirements, including features, infrastructure, and systems
Troubleshoot and test all features, systems, and functionality of end products
Implement automation across all delivery components to minimize manual effort in development and production
Formulate business problems as technical data problems, collaborating with product management to ensure key business drivers are captured
Apply knowledge of machine learning algorithms, especially recommender systems
Extract, load, transform, clean, and validate data
Design pipelines and architectures for data processing
Create and maintain machine learning and statistical models
Query datasets, visualize query results, and create reports
Requirements
BS, MS, or PhD in Computer Science, Engineering, Economics, Business, or Mathematics.
3+ years of experience designing and optimizing data models on GCP using GCP data stores such as BigQuery and Bigtable.
3+ years of experience analyzing, re-architecting, and re-platforming on-premises data warehouses onto GCP data platforms using GCP and third-party services.
3+ years of hands-on experience architecting and designing data lakes on GCP that serve analytics and BI integrations, and implementing scalable API solutions at production scale.
Minimum of 3 years of experience performing detailed assessments of current-state data platforms and creating appropriate transition paths to GCP.
Minimum of 3 years of experience designing and building production data pipelines, from ingestion to consumption, within a hybrid big data architecture using Java, Python, Scala, etc.
Hands-on experience with Spark, Cloud Dataproc, Cloud Dataflow, Apache Beam, Cloud Bigtable, BigQuery, Cloud Pub/Sub, Cloud Functions, etc.
Experience architecting and implementing metadata management, data governance, and security for data platforms on GCP.
Experience designing operations architecture and conducting performance engineering for large-scale data lakes in a production environment.
Experience architecting and operating large production Hadoop/NoSQL clusters on premises or using cloud services.