Build GenAI and AI solutions, including but not limited to analytical model development and implementation, prompt engineering, general-purpose programming (e.g., Python), testing, communication of results, front-end and back-end integration, and iterative development with business subject matter experts.
Design AI/GenAI architectures for business teams, specifically for plugin-based solutions and custom AI/GenAI application builds.
Partner with data engineers and architects to design and implement scalable database solutions and data models, organizing both structured and unstructured data.
Develop processes for data cleansing, enrichment, and validation to improve quality and reliability.
Collaborate with data scientists to develop and deploy machine learning models in the cloud, following best practices.
Monitor, troubleshoot, and optimize data pipelines for performance and efficiency.
Partner with IT and technology teams to develop data architecture designs, integration procedures, and data management workflows.
Work with data and platform partners to implement best practices for managing data and platforms, supporting AI development and deployment at scale.
Develop automation scripts and tools to ease the deployment, scaling, and management of data systems within the cloud environment.
Continuously monitor and improve the performance of AI solutions through data analysis and testing.
Communicate clearly to help business partners better understand the use of data, ML models, and AI solutions.
Requirements
Experienced in developing and deploying models using AI and GenAI technologies in the cloud.
Experienced in data modeling, ETL/ELT development, and data warehousing.
Master's or PhD degree in a quantitative field such as Data Science, Engineering, Computer Science, or a related technical field.
Hands-on experience in developing models solving NLP tasks, including Document Classification, Entity Extraction, Entity Relation Extraction, etc.
Hands-on experience in processing documents (PDF/Images) using vision and/or language models, as well as OCR techniques.
Proven experience with Databricks and Azure cloud services and data solutions, including but not limited to Azure SQL Data Warehouse, Azure Cosmos DB, Azure Data Lake Storage, Azure Functions, and Azure Data Factory.
Proficiency with big data processing frameworks such as Apache Spark and experience with programming languages like Python, Scala, or SQL.
Skilled in the machine learning modeling life cycle, including exploratory data analysis, data cleansing, feature engineering, model building, deployment and monitoring.
Good understanding of vectorization and embeddings, prompt engineering, RAG, and multi-agent techniques.
Experience in developing and deploying models in cloud-based environments, specifically Microsoft Azure and Databricks, following MLOps and DevOps best practices.
Experience with Git version control, unit/integration/end-to-end testing, CI/CD, release management, etc.
Excellent problem-solving skills and the capacity to work under tight deadlines in a fast-paced environment.
Previous work in agile development environments and knowledge of project management tools such as JIRA.
Tech Stack
Apache
Azure
Cloud
ETL
Python
Scala
Spark
SQL
Benefits
Health insurance
Dental insurance
Mental health benefits
Vision insurance
Short- and long-term disability insurance
Life and AD&D insurance coverage
Adoption/surrogacy and wellness benefits
Employee/family assistance plans
Retirement savings plans (including pension/401(k) savings plans and a global share ownership plan with employer matching contributions)
Financial education and counseling resources
Generous paid time off program (including up to 11 paid holidays, 3 personal days, 150 hours of vacation, and 40 hours of sick time)