Be a thought leader and forward thinker: help shape an innovative vision for our products and platforms, design and launch strategic machine learning (ML) solutions, and drive business-wide innovation.
Lead the end-to-end software development lifecycle, encompassing design, testing, deployment, and operations; lead technical discussions and strategy; and participate hands-on in design reviews, code reviews, and implementation.
Design high-performance big data lakehouse architectures built on table formats such as Apache Hudi, Delta Lake, or Apache Iceberg.
Develop a big data platform that supports large-scale data pipelines and processing.
Mentor and develop other engineers on the team, establish technical direction and foster team culture.
Uphold the highest standards of technical rigor in engineering and operational excellence, build highly resilient and scalable systems, and champion operational and process improvements.
Requirements
Degree in mathematics, computer science, or a related discipline.
5+ years of experience in the complete software development lifecycle including design, coding, code reviews, testing, build processes, deployments, and operations.
4+ years of experience in Spark/PySpark with an in-depth knowledge of its advanced features and libraries.
3+ years of experience with data lakehouse solutions such as Apache Hudi, Apache Iceberg, or Delta Lake, with in-depth knowledge of their advanced features and libraries.
2+ years of experience leading the design and architecture of large distributed systems, preferably on cloud platforms (e.g., AWS, Azure, Google Cloud).
Proficient in Docker, Kubernetes, and modern CI/CD practices.
MS or PhD in Computer Science or equivalent experience in ML preferred.
Experience working with big data infrastructure such as AWS EMR preferred.
Experience with NoSQL and document databases preferred.
Proven ability to handle big data, optimize workflows, and improve system performance preferred.