Obsidian Security is the leading SaaS security platform, trusted by global enterprises. As a Data Engineer, you will own and evolve the data pipelines behind one of Obsidian's core products, keep customer analytics datasets accurate and reliable, and work with product managers and backend engineers to improve the data attribution engine.
Responsibilities:
- Own and evolve the data pipelines behind one of Obsidian's core products
- Build and maintain data transformations and orchestrated jobs that turn raw signal into attributed, enriched data at scale
- Keep customer analytics datasets accurate, fresh, and reliable across multiple data stores
- Extend and improve our data attribution engine, including its rule-based and LLM-assisted curation stages
- Make occasional changes to the Go service that exposes this data through our APIs, working alongside backend engineers
- Collaborate with product managers, backend engineers, and the teams who turn this data into customer-facing functionality
- Champion data correctness, observability, and clean, testable transformations
- Participate in code reviews and technical discussions to continuously raise engineering standards
Requirements:
- 3–6 years of experience in a data engineering or software engineering role
- Strong SQL and hands-on experience building production data pipelines
- Experience with a modern data orchestrator such as Dagster, Airflow, or similar
- Proficiency with a data transformation framework such as dbt
- Proficiency in Python
- Familiarity with Git and CI/CD tooling such as GitLab CI/CD
- Familiarity with relational databases (e.g., Postgres) and cloud data warehouses
- A bias toward data quality, testing, and maintainable, well-documented work
- Experience collaborating in a team environment and adapting to changing requirements
- Experience with Databricks/Spark or other large-scale analytics platforms
- Exposure to event/streaming systems such as Kafka
- Prior Go experience, for occasional changes to the API service
- Familiarity with containerization and orchestration (Docker, Kubernetes)
- Exposure to LLM-assisted data workflows
- Experience with cloud platforms (AWS or GCP) and object storage (S3/GCS)
- Exposure to observability tooling such as Grafana or Prometheus