Baseten powers mission-critical inference for dynamic AI companies by providing flexible infrastructure and developer tooling. The company is seeking a Data Engineer to build and scale its internal data platform, transforming raw data into reliable datasets that teams across the company can use for decision-making.
Responsibilities:
- Design and maintain core data models and semantic layers
- Develop and orchestrate batch and streaming data pipelines using technologies such as Apache Beam, Kafka, Airflow, or similar frameworks
- Analyze inference and infrastructure telemetry, including data from OpenTelemetry, Grafana, and other observability tools
- Define and maintain company-wide metrics across product usage, performance, and customer lifecycle
- Enable self-service analytics through agents and tools, with well-structured semantic layers and context
- Ensure data reliability and quality through testing, documentation, and governance
Requirements:
- Understanding of inference metrics such as latency, throughput, token usage, and model performance
- Experience supporting B2B SaaS and/or consumption-based platforms
- Application of forecasting and predictive modeling (e.g., ARIMA, Prophet) to business processes