Ingest and clean incident data in Palantir Foundry from structured fields and free‑text narratives.
Label and curate training datasets for NLP extraction and classification tasks.
Build and validate prototypes: NER for incident narratives, baseline CPUC/SIF triage models, and interactive dashboards.
Create reproducible ETL pipelines and data quality checks in Foundry.
Produce clear documentation and handoffs for Data Scientists and operations teams.
Support human‑in‑the‑loop workflows for rapid CPUC reporting (within 2 hours during work hours, 4 hours outside work hours).
Present findings and demos to stakeholders and incorporate feedback.
Requirements
Currently enrolled full-time in a Bachelor's or Master's program studying Data Science, Statistics, Computer Science, Engineering, or a related field at an accredited university.
Students must be returning to school in the fall of 2026 in a full-time academic capacity.
Experience with NLP tasks (NER, text classification) or time-series/geospatial analytics.
Exposure to safety, utilities, or regulatory environments.
Familiarity with data quality frameworks and human‑in‑the‑loop labeling tools.
Hands‑on experience with Python and common libraries (pandas, scikit‑learn, spaCy or similar).
Familiarity with SQL and data modeling concepts.
Comfort working with free text and structured incident data.
Strong communication skills and ability to document reproducible workflows.
Willingness to learn Palantir Foundry; prior Foundry and Git experience is a plus.
PG&E is unable to provide visa sponsorship to students on an F-1, J-1, or other student visa for this position.
Tech Stack
ETL
Pandas
Python
SQL
Data Analytics Intern at Pacific Gas and Electric Company | JobVerse