Samsara is a pioneer of the Connected Operations™ Cloud, focusing on improving the safety, efficiency, and sustainability of physical operations. As a Data Engineer II, you will own the data platforms that power Samsara’s GTM AI engine, building and optimizing data pipelines while partnering with data scientists and AI engineers to deliver innovative solutions.
Responsibilities:
- Build and maintain ETL/ELT data pipelines in Databricks and Spark, ensuring data is ingested, transformed, and delivered reliably for analytics and AI use cases
- Develop and evolve logical and physical data models to support reporting, experimentation, and advanced workflows (e.g., scoring models, signal generation)
- Implement monitoring, alerts, and testing for data quality, timeliness, and lineage to ensure trustworthy data delivery
- Orchestrate data workflows at scale using Databricks Jobs, dbt, or equivalent scheduling tools
- Contribute to data pipelines and tooling that support retrieval-augmented generation (RAG), vector integrations, or embedding workflows
- Design and optimize bulk GenAI data pipelines in Databricks to support generative AI applications at scale
- Partner with AI engineers and data scientists to enable experimentation, model training, and production-grade deployments
- Develop frameworks for data ingestion, transformation, governance, and monitoring across CRM, sales, and revenue systems
- Work with RevOps, sales, and customer success stakeholders to translate business needs into data requirements and stable technical implementations
Requirements:
- 2-3 years of industry experience in data engineering, including hands-on experience building large-scale data platforms
- Hands-on experience with modern data stack technologies, such as Databricks, dbt, Redshift, RDS, Snowflake, or similar solutions
- Proficiency in Python and SQL, with experience in designing robust ETL/ELT pipelines
- Experience orchestrating data workflows at scale and enabling machine learning or AI use cases
- Strong understanding of data modeling, performance optimization, and cost-efficient infrastructure design
- Located in and authorized to work in the United States (this is a fully remote role)
- Experience enabling generative AI workflows in Databricks or similar platforms
- Familiarity with vector databases, embeddings, and retrieval systems
- Experience with Salesforce, Gainsight, Gong, Outreach, or other CRM/enablement tools as data sources
- Proven ability to automate repetitive tasks, improve data hygiene, and enable experimentation across GTM data use cases, in line with the emerging discipline of GTM engineering, in which clean, reliable GTM data foundations enable high-leverage automation and insight generation
- Exposure to observability, monitoring, and governance best practices for data and AI systems
- Ability to collaborate closely with AI/ML teams while driving technical excellence in data engineering