CloudIngest is seeking a Data Scientist with expertise in AI and machine learning. The role involves building end-to-end machine learning models, fine-tuning large language models, and developing scalable data pipelines using the tools and frameworks listed below.
Requirements:
- Strong experience with Python and ML frameworks (TensorFlow, PyTorch, scikit-learn)
- End-to-end model building: data prep, training, evaluation, deployment
- Experience with NLP, embeddings, transformer architectures, and LLM fine-tuning
- Fine-tuning or prompting GPT-based LLMs
- Experience building Retrieval-Augmented Generation (RAG) systems (a minimal retrieval sketch follows this list)
- Knowledge of vector databases
- Understanding of agentic frameworks (CrewAI, LangChain agents, AutoGen, etc.)
- Developing multi-agent systems with tools, memory, and planning loops (see the agent-loop sketch after this list)
- Airflow for workflow orchestration (a minimal DAG sketch follows this list)
- Docker for containerization
- Kubernetes for scalable deployment
- CI/CD for model deployment
- Model monitoring and drift detection (a drift-check sketch follows this list)
- API development with FastAPI or Flask (a minimal serving-endpoint sketch follows this list)
- Experience with SQL and NoSQL data stores
- Must have experience in:
  - Palantir Foundry (data ingestion, transformations, ontology modeling, analytics workflows)
  - Snowflake (advanced SQL, performance tuning, analytical schema design)
  - Building scalable ETL/ELT data pipelines (batch and near-real-time)
  - Layered data modeling for analytics and reporting (e.g., staging, intermediate, and mart layers)
  - Working with relational and NoSQL databases
  - Python for data processing and automation
  - Data quality, governance, and lineage practices
  - Unsupervised clustering and classification of both structured data (in a DBMS or files) and unstructured data (a clustering sketch follows this list)
  - Creating multi-stage analysis/transformation pipelines, such as Contour workflows in Palantir Foundry
  - Distributed data processing frameworks such as Spark (a PySpark sketch follows this list)
  - Querying with GraphQL, Scalding, etc.
  - Creating apps, reports, and dashboards similar to those built in Palantir Foundry
- Good to have:
  - dbt, Airflow, Spark, or similar orchestration/processing frameworks
  - BI & visualization tools (Tableau, Power BI, Looker, etc.)
  - Streaming data platforms (Kafka/Kinesis)
  - Cloud platforms (AWS/Azure/GCP)
  - ML feature engineering or analytics support
  - Agentic AI skills
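
The sketches below are illustrative only, built on stand-in names and synthetic data; none of them represents CloudIngest's actual stack. First, the retrieval step of a RAG system: the query and each document are embedded into vectors, and the most similar documents are returned to be prepended to the LLM prompt. The embed() function here is a hypothetical stand-in for a real embedding model, and the in-memory array stands in for a vector database.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in: a pseudo-embedding keyed on the text hash.
    # A real system would call an embedding model instead.
    return np.random.default_rng(abs(hash(text)) % (2**32)).standard_normal(8)

documents = [
    "Snowflake stores data in micro-partitions.",
    "Airflow schedules DAGs of tasks.",
    "RAG retrieves context before generation.",
]
doc_vectors = np.stack([embed(d) for d in documents])  # stand-in for a vector DB

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    # Cosine similarity between the query vector and every document vector.
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

# The retrieved passages would be prepended to the LLM prompt.
print(retrieve("How does retrieval-augmented generation work?"))
```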
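A toy agent loop in the spirit of CrewAI, LangChain agents, or AutoGen: a planner picks a tool, the tool's output is appended to memory, and planning repeats until the planner decides the goal is met. The stub_planner below is a hypothetical stand-in for an LLM-driven planner so the control flow runs as-is.

```python
def search_tool(query: str) -> str:
    return f"results for {query!r}"  # stub search

def calc_tool(expr: str) -> str:
    return str(eval(expr, {"__builtins__": {}}))  # toy only; never eval untrusted input

TOOLS = {"search": search_tool, "calc": calc_tool}

def stub_planner(goal: str, memory: list[str]):
    # Hypothetical plan: search once, compute once, then declare the goal met.
    if not memory:
        return ("search", goal)
    if len(memory) == 1:
        return ("calc", "6 * 7")
    return None

def run_agent(goal: str) -> list[str]:
    memory: list[str] = []  # observations fed back into the next planning step
    while (step := stub_planner(goal, memory)) is not None:
        tool, arg = step
        memory.append(f"{tool}({arg}) -> {TOOLS[tool](arg)}")
    return memory

print(run_agent("average order value"))
```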
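A minimal Airflow DAG sketch, assuming Airflow 2.4+ is installed; the DAG id, task names, and the extract/transform split are illustrative.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data")

def transform():
    print("clean and aggregate")

with DAG(
    dag_id="example_etl",            # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # the 'schedule' argument needs Airflow 2.4+
    catchup=False,
):
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t1 >> t2                         # run transform after extract
```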
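One common drift check: a two-sample Kolmogorov-Smirnov test comparing a feature's training distribution against recent production values. The alerting threshold below is illustrative and would be tuned per feature.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
training = rng.normal(0.0, 1.0, size=5_000)    # reference window
production = rng.normal(0.3, 1.0, size=1_000)  # shifted live data

stat, p_value = ks_2samp(training, production)
if p_value < 0.01:  # hypothetical alerting threshold
    print(f"drift suspected: KS={stat:.3f}, p={p_value:.4f}")
```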
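A minimal model-serving endpoint with FastAPI; the scoring rule is a stub standing in for a loaded model artifact, and the field names are hypothetical.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Features(BaseModel):
    amount: float       # hypothetical input features
    tenure_months: int

@app.post("/predict")
def predict(features: Features) -> dict:
    # Stub scoring rule standing in for model.predict(...).
    score = 0.8 if features.amount > 100 else 0.2
    return {"score": score}

# Serve with: uvicorn app:app --reload  (assuming this file is app.py)
```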
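A small unsupervised-clustering sketch with scikit-learn; the synthetic matrix stands in for structured rows pulled from a DBMS or flat files.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
# Three synthetic blobs standing in for structured rows from a database.
X = np.vstack([rng.normal(m, 0.5, size=(50, 2)) for m in (0, 3, 6)])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X)  # scale features before clustering
)
print(np.bincount(labels))  # row counts per cluster
```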
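A minimal PySpark aggregation, assuming pyspark is installed and a local session suffices; the table and column names are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

orders = spark.createDataFrame(
    [("books", 12.0), ("books", 8.5), ("toys", 30.0)],
    ["category", "amount"],  # hypothetical columns
)
revenue = orders.groupBy("category").agg(F.sum("amount").alias("revenue"))
revenue.show()
spark.stop()
```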