Role Overview
- Design, develop and support robust data infrastructure solutions with a focus on scalability, security and efficiency.
- Lead the implementation and maintenance of ecosystems such as Trino, AWS Glue, Redshift, Airflow, Kafka and Spark on Kubernetes (K8s); a brief sketch of this kind of orchestration follows this list.
- Collaborate with engineering, data science and product teams to ensure seamless integration between systems.
- Diagnose and resolve performance, scalability and reliability incidents.
- Apply and promote best practices in architecture, data governance and monitoring.
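A minimal sketch of the orchestration work described above: an Airflow DAG that submits a Spark application to Kubernetes through the Spark Operator. This assumes the `apache-airflow-providers-cncf-kubernetes` package and the Spark Operator are installed; the DAG id, namespace and YAML spec path are hypothetical placeholders, not part of this posting.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import (
    SparkKubernetesOperator,
)

with DAG(
    dag_id="daily_events_spark_job",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Submits a SparkApplication custom resource to the cluster; the
    # referenced YAML (hypothetical path) defines the image and the
    # driver/executor specs.
    submit_spark_job = SparkKubernetesOperator(
        task_id="submit_spark_job",
        namespace="spark-jobs",  # hypothetical namespace
        application_file="specs/daily_events.yaml",
        kubernetes_conn_id="kubernetes_default",
    )
```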
Requirements
- Strong experience across the data infrastructure ecosystem: Trino, AWS Glue, Terraform, CI/CD, Apache Airflow, Kafka and Spark on Kubernetes.
- Proven experience with cloud computing (AWS).
- Advanced proficiency in Python (including PySpark) and object-oriented programming.
- Demonstrated experience with distributed systems, data warehouses and large-scale ETL/ELT pipelines (a minimal pipeline sketch follows this list).
- Hands-on knowledge of data governance, security and storage (e.g., S3, Redshift).
- Analytical problem-solving skills with a focus on operational efficiency.
- Strong communication skills and experience working collaboratively.
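A minimal PySpark sketch of the ETL/ELT work listed above: read raw events from S3, deduplicate, and write curated Parquet partitioned by date. Bucket paths, column names and the dedup key are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events_etl").getOrCreate()

# Hypothetical raw zone; s3a:// assumes the hadoop-aws connector.
raw = spark.read.json("s3a://raw-bucket/events/")

curated = (
    raw.dropDuplicates(["event_id"])                    # hypothetical key
       .withColumn("event_date", F.to_date("event_ts")) # hypothetical column
       .filter(F.col("event_date").isNotNull())
)

# Date-partitioned Parquet in the curated zone keeps downstream scans
# (e.g., Trino or Redshift Spectrum over a Glue catalog) cheap.
(curated.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3a://curated-bucket/events/"))       # hypothetical path
```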
Preferred Qualifications
- Certifications such as AWS Certified Solutions Architect or similar.
- Familiarity with Trino for ad-hoc analytics (see the sketch after this list).
- Experience with monitoring tools such as Datadog or New Relic.
- Knowledge of AI applied to data, especially Retrieval-Augmented Generation (RAG), for analytics and intelligent insight-generation projects.
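A minimal sketch of ad-hoc analytics against Trino using the trino-python-client (`pip install trino`). The coordinator host, catalog, schema and queried table are hypothetical placeholders.

```python
import trino

conn = trino.dbapi.connect(
    host="trino.internal",  # hypothetical coordinator host
    port=8080,
    user="analyst",
    catalog="hive",         # e.g., a Glue-backed catalog
    schema="curated",
)
cur = conn.cursor()
cur.execute(
    """
    SELECT event_date, count(*) AS events
    FROM events
    GROUP BY event_date
    ORDER BY event_date DESC
    LIMIT 7
    """
)
for event_date, events in cur.fetchall():
    print(event_date, events)
```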
Tech Stack
- Airflow
- Amazon Redshift
- AWS
- ETL
- Kafka
- Kubernetes
- PySpark
- Python
- Spark
- Terraform
Benefits
- Competitive compensation based on experience
- Opportunities for career growth and participation in strategic projects
- Dynamic and challenging work environment
- Opportunity to work at a fast-growing company