Clear Fracture is building AI-driven data integration systems that enable organizations to connect, transform, and reason over complex data using agentic workflows. The Data Engineer will focus on data modeling and system design, building core data infrastructure and developing real-world use cases on the platform that improve how users interact with data.
Responsibilities:
- Design and implement logical and physical data models for complex, evolving datasets
- Define schemas and access patterns that support multi-tenant usage and application-level workflows
- Balance normalization, performance, and flexibility across different storage systems
- Partner with product and engineering teams to translate requirements into scalable data designs
- Develop real-world data use cases on top of the platform to validate and extend its capabilities
- Design and build data interfaces and abstractions that help users understand and work with data
- Contribute to systems such as data glossaries, semantic layers, and metadata and schema discovery tools
- Help define how users explore, model, and interact with data within the platform
- Translate complex data structures into intuitive, usable representations
- Build backend services and APIs that expose and operate on data models
- Implement data access layers that are reliable, maintainable, and performant
- Contribute to core application architecture where data and services intersect
- Write clean, testable, production-grade code
- Design and implement pipelines for ingesting, transforming, and validating data
- Support both batch and near-real-time processing workflows
- Build systems that handle structured, semi-structured, and unstructured data
- Enable data flows that support AI-driven and agent-based workflows
- Work with embeddings, context retrieval, and data representations used in modern AI systems
- Help design systems that make data accessible and useful for autonomous agents
- Implement validation, monitoring, and testing for data systems
- Ensure correctness, consistency, and observability of data pipelines and services
- Diagnose and resolve data-related issues in production environments
Requirements:
- Bachelor's degree in Computer Science, Engineering, or related field, or equivalent practical experience
- 6+ years of professional experience in software engineering and/or data engineering roles
- Due to the nature of the work, U.S. citizenship and the ability to obtain a Secret clearance are required
- Strong programming skills in Python (or similar backend language)
- Experience designing and implementing data models for production systems, with advanced knowledge of dimensional modeling (e.g., slowly changing dimensions) and entity-relationship diagrams
- Proficiency in SQL and experience with relational databases (e.g., PostgreSQL)
- Experience building backend services or APIs that interact with data systems
- Experience designing and operating data pipelines (ETL/ELT)
- Familiarity with NoSQL databases and different data storage paradigms
- Experience working with large datasets and performance optimization
- Experience with Docker and containerized development workflows
- Familiarity with Kubernetes-based environments
- Strong understanding of software engineering fundamentals (testing, version control, system design)
- Experience building multi-tenant data systems
- Familiarity with semantic layers, data catalogs, or data discovery systems
- Experience designing data-facing user interfaces or developer tooling
- Experience with streaming systems (e.g., Kafka)
- Experience with orchestration tools (e.g., Airflow, Dagster, Prefect)
- Experience working with AI/ML data pipelines or agent-based systems
- Experience supporting on-prem or hybrid deployments
- Exposure to data governance, access control, and metadata systems
- Experience with cloud platforms (AWS, Azure, GCP)
- Familiarity with vector databases (e.g., Pinecone, ChromaDB) and embedding-based retrieval