Wizard AI is a leading AI shopping agent known for delivering high-quality products with accuracy and trust. The company is seeking a Staff Data Engineer with PySpark expertise to build scalable data systems, mentor engineers, and shape data strategy while enabling advanced analytics and supporting AI initiatives.
Responsibilities:
- Design and evolve scalable, distributed data infrastructure across cloud platforms
- Build and maintain real-time and batch data processing pipelines supporting analytics and AI/ML workloads
- Develop and manage integrations with third-party e-commerce platforms to expand our data ecosystem
- Ensure data availability, reliability, and quality through monitoring and automated auditing
- Partner with engineering, AI, and product teams on data solutions for business-critical needs
- Mentor and support data engineers, establishing best practices and code quality standards
Requirements:
- 5+ years of software development and data engineering experience, with demonstrated ownership of production-grade data infrastructure
- Bachelor's degree in Computer Science or a related field, or equivalent practical experience
- Deep expertise in scaling Spark in production (e.g., Databricks, EMR)
- Strong understanding of distributed computing and modern data modeling for scalable systems
- Proficient in Python with experience implementing software engineering best practices
- Hands-on experience with both relational (e.g., MySQL, PostgreSQL) and NoSQL (e.g., MongoDB, DynamoDB, Cassandra) databases
- Strong communicator with experience influencing cross-functional stakeholders
- Experience working in early-stage, high-growth environments
- Familiarity with MLOps pipelines and integrating ML models into data workflows
- Passionate about problem-solving with a proactive approach to finding innovative solutions
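To give candidates a concrete sense of the "automated auditing" work mentioned above, here is a minimal, purely illustrative sketch (not part of the actual codebase): a per-column null-rate audit over a batch of records. The record shape, column names, and threshold are hypothetical; in production this kind of check would typically run as a PySpark job over a DataFrame rather than plain Python lists.

```python
from typing import Any

def null_rates(records: list[dict[str, Any]]) -> dict[str, float]:
    """Fraction of records in which each column is missing or None."""
    if not records:
        return {}
    columns = {col for rec in records for col in rec}
    return {
        col: sum(1 for rec in records if rec.get(col) is None) / len(records)
        for col in sorted(columns)
    }

def failing_columns(records: list[dict[str, Any]], threshold: float = 0.1) -> list[str]:
    """Columns whose null rate exceeds the audit threshold."""
    return [col for col, rate in null_rates(records).items() if rate > threshold]

# Hypothetical batch of e-commerce order records for illustration.
batch = [
    {"order_id": 1, "sku": "A1", "price": 9.99},
    {"order_id": 2, "sku": None, "price": 4.50},
    {"order_id": 3, "sku": "B2", "price": None},
    {"order_id": 4, "sku": "C3", "price": 2.25},
]
print(failing_columns(batch, threshold=0.2))  # ['price', 'sku']
```

The same logic maps naturally onto Spark (e.g., aggregating `count(when(col(c).isNull(), 1))` per column), with alerting wired to the pipeline scheduler.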