DeepRec.ai is seeking an experienced AI Data Engineer to build scalable data pipelines and deploy production-ready machine learning solutions. The role owns end-to-end data and machine learning workflows, with a focus on reliability and scalability, built on Snowflake and Python.
Responsibilities:
- Build and maintain data ingestion pipelines into Snowflake (structured and time-series data)
- Prepare ML-ready datasets (feature engineering, aggregations, train/test splits)
- Develop, train, and deploy ML models using Python (scikit-learn, XGBoost, LightGBM)
- Operationalise ML workflows in Snowflake using Snowpark
- Write model outputs back to Snowflake for downstream use
- Monitor pipelines and models, including data quality checks and retraining triggers
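The dataset-preparation and modelling steps above can be sketched as follows. This is a minimal illustration, assuming scikit-learn and pandas are available; it synthesizes a time-series frame in place of a Snowflake query result, and all column names are hypothetical:

```python
# Minimal sketch of ML-ready dataset preparation and model training.
# In production the frame would come from a Snowflake query (e.g. via
# Snowpark) rather than being synthesized here.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "ts": pd.date_range("2024-01-01", periods=n, freq="h"),
    "value": np.sin(np.arange(n) / 24) + rng.normal(0, 0.1, n),
})

# Feature engineering: lagged values and a rolling aggregate.
for lag in (1, 24):
    df[f"lag_{lag}"] = df["value"].shift(lag)
df["roll_24_mean"] = df["value"].shift(1).rolling(24).mean()
df = df.dropna().reset_index(drop=True)

# Chronological train/test split -- a random split would leak future
# information when the data are time-ordered.
split = int(len(df) * 0.8)
features = ["lag_1", "lag_24", "roll_24_mean"]
X_train, y_train = df.loc[:split - 1, features], df.loc[:split - 1, "value"]
X_test, y_test = df.loc[split:, features], df.loc[split:, "value"]

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)
preds = model.predict(X_test)
mae = mean_absolute_error(y_test, preds)
```

The prediction frame (`ts`, `preds`) would then be written back to a Snowflake table for downstream consumers.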
Requirements:
- Hands-on experience with the responsibilities above: building ingestion pipelines into Snowflake, preparing ML-ready datasets, developing and deploying ML models, operationalising workflows with Snowpark, writing model outputs back to Snowflake, and monitoring pipelines and models
- Proficiency in Snowflake (SQL, Streams, Tasks, Snowpipe)
- Proficiency in Python for data engineering and ML (including Snowpark)
- Experience with ML frameworks: scikit-learn, XGBoost, LightGBM
- Experience with time-series data processing
- Familiarity with Azure Data Lake / Microsoft Fabric
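The monitoring requirement typically reduces to scheduled checks over each fresh batch. A minimal sketch, assuming pandas; the function name, thresholds, and sample data are illustrative assumptions, not part of the posting:

```python
# Minimal sketch of data-quality checks feeding a retraining trigger.
# Thresholds are illustrative; in practice they would be tuned per table
# and evaluated by a scheduled job (e.g. a Snowflake Task).
import pandas as pd

def quality_report(df: pd.DataFrame, column: str, baseline_mean: float,
                   max_null_rate: float = 0.05, max_drift: float = 0.25) -> dict:
    """Return simple quality metrics for one batch plus a retrain flag."""
    null_rate = df[column].isna().mean()
    # Relative drift of the batch mean against a stored baseline.
    drift = abs(df[column].mean() - baseline_mean) / (abs(baseline_mean) or 1.0)
    return {
        "null_rate": float(null_rate),
        "drift": float(drift),
        "retrain": bool(null_rate > max_null_rate or drift > max_drift),
    }

batch = pd.DataFrame({"value": [1.0, 1.1, None, 0.9, 1.05]})
report = quality_report(batch, "value", baseline_mean=1.0)
```

Here the batch's 20% null rate exceeds the threshold, so the report flags retraining even though the mean has barely drifted.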