Clever Real Estate is a venture-backed real estate technology company on a mission to revolutionize the way people buy, sell, and manage real estate. Clever is seeking a Data Pipeline Engineer to own and evolve the data infrastructure that powers its core data products, ensuring the reliable data operations that revenue-driving products depend on.
Responsibilities:
- Own and operate Clever's data platform infrastructure. Administer AWS services (EC2, RDS, VPC, S3, MWAA), Terraform-managed infrastructure as code, and Databricks. You are the go-to person for keeping these systems running securely and cost-effectively
- Maintain and improve data pipeline reliability. The Airflow/MWAA orchestration layer automates all ETL/ELT jobs feeding Databricks. You'll monitor, triage, and resolve pipeline failures, and proactively improve reliability so failures happen less often
- Build and extend data ingestion pipelines. Design and implement ingestion for new operational data sources (e.g., telephony, CRM, transaction data) that directly support Clever's speed-to-match initiative (a P1 company priority)
- Manage database infrastructure. Administer PostgreSQL/RDS instances, including replica promotion, security group configuration, and VPC peering. Ensure databases are performant, secure, and properly networked
- Support security and compliance. Maintain infrastructure aligned with SOC-2 requirements, including VPN management (Pritunl), SSO configuration, and access controls. Respond to audit findings that require infrastructure changes
- Collaborate with the data engineering team. Partner closely with data engineers and data analysts to ensure smooth handoffs between infrastructure and pipeline/transformation work. Provide technical mentorship on infrastructure best practices
- Drive infrastructure strategy. Evaluate opportunities to reduce complexity (e.g., consolidating orchestration, optimizing cloud spend) and propose a forward-looking platform roadmap
Requirements:
- 4+ years of experience in data engineering or data platform roles
- Databricks administration and lakehouse architecture
- Apache Airflow administration and DAG development (MWAA preferred)
- AWS (EC2, RDS, VPC, S3, MWAA): Clever's AWS environment is central to all data operations and requires deep, hands-on expertise
- Terraform (infrastructure as code): managing multiple repos across RDS, CDC (change data capture), and orchestration infrastructure
- Python (PySpark, general scripting)
- PostgreSQL / RDS management, including replica promotion, security group configuration, VPC peering
- VPN setup and management (Pritunl on EC2)
- SSO configuration
- Excellent communication skills, with the ability to influence across teams and levels
- Experience with SOC-2 compliance infrastructure requirements is a plus
- Real estate, fintech, or marketplace industry experience is a plus