Lead the design and development of scalable data pipelines and infrastructure
Optimize the performance of data systems and troubleshoot complex data issues
Collaborate with data scientists to ensure data is structured and available for advanced analytics
Ensure data security, privacy, and compliance with organizational policies
Manage the integration of new data sources and ensure data quality
Continuously evaluate and adopt new technologies to improve data engineering practices
Contribute to architecture decisions
Requirements
High level of proficiency in Python and SQL
Good knowledge of Linux and Unix shell scripting
Experience with GCP or other cloud providers
Experience with Airflow or similar workflow orchestration tools
Experience with columnar and/or transactional databases
Understanding of distributed systems and scalable data architectures
Experience writing clean, maintainable, and well-tested code
Strong sense of delivery ownership, balancing speed of execution with adherence to established engineering standards, best practices, and quality requirements
Ability to actively drive alignment and collaboration across cross-functional teams