Outrider is a software company automating distribution yards with electric, self-driving trucks. The company is seeking a Data Engineer to design, build, and maintain scalable data pipelines and infrastructure for analytics products, ensuring data availability and quality at scale.
Responsibilities:
- Design, build, and maintain scalable ETL/ELT pipelines that ingest, transform, and deliver data across the organization
- Develop and optimize distributed data processing jobs using Python for large-scale data transformation and aggregation
- Architect and manage PostgreSQL schemas, tables, indexes, and query performance to support downstream analytics and reporting
- Build and maintain Python-based data workflows to orchestrate, validate, and deliver data reliably across environments
- Monitor and improve data quality, freshness, and completeness through automated checks, alerting, and observability tooling
- Design and manage cloud-based data infrastructure on AWS
- Partner with data analysts and stakeholders to translate requirements into well-modeled, maintainable data products
- Maintain documentation for pipelines, data models, data lineage, and infrastructure
- Troubleshoot pipeline failures and data issues, providing timely root-cause analysis and remediation
Requirements:
- 3+ years of professional experience in data engineering
- Strong PostgreSQL skills: schema design, indexing strategies, query optimization, and performance tuning
- Strong Python skills: pipeline development, data validation, and orchestration frameworks
- Hands-on production experience with distributed processing and storage tools such as AWS Athena and Apache Spark
- Proven experience designing and implementing ETL/ELT pipelines in production
- Experience with AWS cloud services (S3, EKS, Glue, Athena)
- Data modeling experience: dimensional modeling, data warehousing patterns, and reproducible transformations
- Sound engineering practices: Git workflows, code reviews, testing, and CI/CD
- Ability to use LLM-based AI agents effectively to increase productivity
- Experience with workflow orchestration tools (e.g., AWS Step Functions, Prefect, Dagster)
- Familiarity with modern data stack concepts: ELT patterns, data lakehouse architecture, semantic layers, and governance
- Experience implementing automated data quality frameworks and pipeline observability
- Exposure to streaming or near-real-time data processing (e.g., Kafka, Spark Streaming, Pub/Sub)