Mapbox is the leading real-time location platform for location-aware businesses, providing tools for navigation and data processing. The Software Development Engineer II on the Search Data team will work with geospatial datasets, build data-ingestion systems, promote operational excellence, and mentor other engineers.
Responsibilities:
- Work with specialized telemetry and geospatial data sets including addresses, road networks, buildings, and points of interest (POIs)
- Build and support our batch and streaming ingestion systems that ingest terabytes of data per day
- Interface with engineers from other teams to understand their needs for geospatial data and provide solutions
- Simplify and strengthen Mapbox’s processes and tools for designing, deploying, and monitoring data processing and querying workloads on AWS
- Document your work and decision-making processes, and lead presentations and discussions in a way that is easy for others to understand
- Mentor other software engineers to develop all aspects of their engineering skill sets, including participating in design and code reviews
- Promote a culture of operational excellence by meticulously testing and monitoring our systems and code, writing documentation, and being on-call to support the health of our services
- Reduce technical debt, share your knowledge, and invest in your teammates’ health and happiness, while optimizing application performance and accelerating feature velocity
- Uphold a culture of collaboration, transparency, creativity, inclusion, and data-driven decisions
Requirements:
- 5+ years of experience building scalable backend systems and data pipelines
- Hands-on experience with AWS technologies like Lambda, S3, Athena, Glue, and EMR
- Strong proficiency in SQL and Python
- Proficiency in at least one modern programming language (Node.js, Scala, or Java) suitable for backend services and data processing
- Demonstrated history of designing batch and real-time data processing systems, and the judgment to implement new data pipelines and establish best practices around them
- Familiarity with Apache Spark or other Hadoop-based technologies
- Familiarity with CI/CD processes
- Experience introducing quality and operational metrics into ETL pipelines
- Experience integrating data with APIs and querying data through APIs
- Experience with AI tools in the software development lifecycle
- Experience with geospatial data analysis and processing
- Experience with Docker
- Experience with machine learning infrastructure