Geolava is a company focused on bringing AI to the physical world through spatial intelligence systems. They are seeking a Senior Data Engineer to build and scale the database infrastructure that powers their products and AI research. The role involves designing and implementing secure data pipelines and storage systems.
Responsibilities:
- Drive the technical direction for database solutions used across Product and Research
- Design and implement database solutions that scale to support millions of users across Geolava's product ecosystem
- Build and scale database systems through 100x+ growth while maintaining reliability and performance
- Architect data storage solutions that work seamlessly across GCP, AWS, first-party deployments, third-party deployments, and other environments
- Develop database infrastructure that serves both product and research workloads with different performance characteristics
- Partner with product and research teams to understand data requirements and build infrastructure that accelerates innovation
- Work collaboratively with platform, infrastructure, and application engineers, as well as AI researchers, to build next-generation data platform products and services
- Optimize database performance, reliability, and cost efficiency at massive scale
- Build scalable data pipelines for sourcing, transforming, and publishing data assets for AI use cases
- Ship high-quality, well-tested, secure, and maintainable code
- Navigate roadblocks and deliver solutions quickly and iteratively
- Thrive in a fast-paced, design-driven product development cycle
Requirements:
- 5+ years of experience (excluding internships/co-ops) in data engineering, building distributed systems and scaling data-intensive systems
- Expertise in distributed database architectures at scale
- Experience at a high-growth company building platform infrastructure relied upon by other teams
- Ability to balance startup speed with production reliability
- Strong technical leadership and cross-functional collaboration skills
- Passion for building data layers that enable next-generation AI
- Extensive experience with Python and PySpark (or Spark)
- Deep expertise in scaling relational and non-relational databases (e.g., PostgreSQL, AWS Aurora, RDS, DynamoDB; GIS experience is a plus)
- Experience with vector databases and async job processing frameworks
- Knowledge of database orchestration and automation at scale
- Experience with security and compliance (ITGC, GDPR, financial controls)
- Background in data warehousing, ETL/ELT pipelines, and analytics infrastructure for ML pipelines
- Proven track record improving data reliability, availability, or cost efficiency
- Security engineering experience focused on data protection and access control