Wizard AI is a leading AI shopping agent known for delivering high-quality products with accuracy and trust. The company is seeking a Staff Data Engineer with PySpark expertise to build scalable data systems, mentor engineers, and shape data strategy while enabling advanced analytics and supporting AI initiatives.
Responsibilities:
- Design and evolve scalable, distributed data infrastructure across cloud platforms
- Build and maintain real-time and batch data processing pipelines supporting analytics and AI/ML workloads
- Develop and manage integrations with third-party e-commerce platforms to expand our data ecosystem
- Ensure data availability, reliability, and quality through monitoring and automated auditing
- Partner with engineering, AI, and product teams on data solutions for business-critical needs
- Mentor and support data engineers, establishing best practices and code quality standards
Requirements:
- 5+ years of software development and data engineering experience, with demonstrated ownership of production-grade data infrastructure
- Bachelor's degree in Computer Science or a related field, or equivalent practical experience
- Deep expertise in scaling Spark in production (e.g., Databricks, EMR)
- Strong understanding of distributed computing and modern data modeling for scalable systems
- Proficient in Python with experience implementing software engineering best practices
- Hands-on experience with both relational (e.g., MySQL, PostgreSQL) and NoSQL (e.g., MongoDB, DynamoDB, Cassandra) databases
- Strong communicator with experience influencing cross-functional stakeholders
- Experience working in early-stage, high-growth environments
- Familiarity with MLOps pipelines and integrating ML models into data workflows
- Passionate about problem-solving with a proactive approach to finding innovative solutions
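To give candidates a concrete sense of the "automated auditing" work mentioned above, here is a minimal, purely illustrative sketch (not part of the actual codebase): a per-column null-rate audit over a batch of records. The record shape, column names, and threshold are hypothetical; in production this kind of check would typically run as a PySpark job over a DataFrame rather than plain Python lists.

```python
from typing import Any

def null_rates(records: list[dict[str, Any]]) -> dict[str, float]:
    """Fraction of records in which each column is missing or None."""
    if not records:
        return {}
    columns = {col for rec in records for col in rec}
    return {
        col: sum(1 for rec in records if rec.get(col) is None) / len(records)
        for col in sorted(columns)
    }

def failing_columns(records: list[dict[str, Any]], threshold: float = 0.1) -> list[str]:
    """Columns whose null rate exceeds the audit threshold."""
    return [col for col, rate in null_rates(records).items() if rate > threshold]

# Hypothetical batch of e-commerce order records for illustration.
batch = [
    {"order_id": 1, "sku": "A1", "price": 9.99},
    {"order_id": 2, "sku": None, "price": 4.50},
    {"order_id": 3, "sku": "B2", "price": None},
    {"order_id": 4, "sku": "C3", "price": 2.25},
]
print(failing_columns(batch, threshold=0.2))  # ['price', 'sku']
```

The same logic maps naturally onto Spark (e.g., aggregating `count(when(col(c).isNull(), 1))` per column), with alerting wired to the pipeline scheduler.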