Clever Real Estate is a venture-backed real estate technology company on a mission to revolutionize the way people buy, sell, and manage real estate. Clever is seeking a Senior Data Pipeline Engineer to own and evolve the data infrastructure that powers its core data products, keeping data pipelines reliable and collaborating with teams across the company to support its top priorities.
Responsibilities:
- Own and operate Clever's data platform infrastructure. Administer AWS services (EC2, RDS, VPC, S3, MWAA), Terraform-managed infrastructure as code, and Databricks. You are the go-to person for keeping these systems running securely and cost-effectively
- Maintain and improve data pipeline reliability. The Airflow/MWAA orchestration layer automates all ETL/ELT jobs feeding Databricks. You'll monitor, triage, and resolve pipeline failures — and proactively improve reliability so failures happen less often
- Build and extend data ingestion pipelines. Design and implement ingestion for new operational data sources (e.g., telephony, CRM, transaction data) that directly support Clever's speed-to-match initiative (P1 company priority)
- Manage database infrastructure. Administer PostgreSQL/RDS instances, including replica promotion, security group configuration, and VPC peering. Ensure databases are performant, secure, and properly networked
- Support security and compliance. Maintain infrastructure aligned with SOC 2 requirements, including VPN management (Pritunl), SSO configuration, and access controls. Respond to audit findings that require infrastructure changes
- Collaborate with the data engineering team. Partner closely with data engineers and data analysts to ensure smooth handoffs between infrastructure and pipeline/transformation work. Provide technical mentorship on infrastructure best practices
- Drive infrastructure strategy. Evaluate opportunities to reduce complexity (e.g., consolidating orchestration, optimizing cloud spend) and propose a forward-looking platform roadmap
Requirements:
- 8+ years of experience in data engineering or data platform roles
- Databricks administration and lakehouse architecture
- Apache Airflow administration and DAG development (MWAA preferred)
- AWS (EC2, RDS, VPC, S3, MWAA): our AWS environment is central to all data operations and requires deep, hands-on expertise
- Terraform (infrastructure as code): managing multiple repos across RDS, CDC, and orchestration infrastructure
- Python (PySpark, general scripting)
- PostgreSQL / RDS management, including replica promotion, security group configuration, and VPC peering
- VPN setup and management (Pritunl on EC2)
- SSO configuration
- Excellent communication skills, able to influence across teams and levels
- Experience with SOC 2 compliance infrastructure requirements is a plus
- Real estate, fintech, or marketplace industry experience a plus