Design and implement semantically consistent, scalable 360 data models that integrate data across domains.
Build and maintain transformation pipelines that apply cleansing, standardization, enrichment, and derived logic to domain datasets.
Write production-quality, testable code in SQL and Python (or equivalent), delivering performant and maintainable data assets.
Work closely with domain experts, data scientists, and product stakeholders to translate business concepts into interpretable, decision-ready data models.
Implement logic for classifications, KPIs, scoring algorithms, and business rules, ensuring traceability and data lineage.
Help define and enforce standards for data modeling, documentation, and governance within the semantic layer.
Collaborate across teams to integrate with ingestion, MDM, and data product layers.

8+ years of experience in data engineering or software engineering with a focus on data transformation, modeling, or analytics platforms.
Strong proficiency in SQL and at least one general-purpose language such as Python or Scala.
Experience building and scaling wide, entity-based tables and modeling domain concepts (e.g., customer, fleet, provider) into durable data objects.
Solid understanding of data quality practices, including validation, enrichment, schema enforcement, and business rule encoding.
Experience working with large-scale datasets and optimizing transformation pipelines for performance and maintainability.
Comfort operating in a collaborative, cross-functional environment, balancing business logic with platform scalability.
A mindset for traceability, reproducibility, and semantic clarity: you build data models others can trust and reuse.
Bachelor's degree in Computer Science, Software Engineering, or a related field; a Master's or PhD in Data Science, Machine Learning, Artificial Intelligence, Computer Science, or Statistics is a big plus.

Staff Software Engineer – Semantic Data Lake

Key skills