Edmunds is a company dedicated to simplifying the car buying process, leveraging innovative technology and a strong employee culture. They are seeking a Data Engineer to architect and scale data platforms that support analytics and AI/ML initiatives, enabling real-time decision-making and insights for growth.
Responsibilities:
- Create and maintain scalable, maintainable, and reliable data pipelines that process very large volumes of structured and unstructured data in both batch and real time
- Enhance and maintain the data lakehouse that powers the core of the company's decision-making process
- Work hands-on with our transactions and pricing pipelines, building and optimizing data workflows using Spark, Databricks, SQL, Scala, and Python
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
- Support infrastructure and build processes, working within AWS to deploy and maintain reliable, scalable data systems
- Work with stakeholders including the Executive, Product, and Data teams to assist with data-related technical issues and support their data infrastructure needs
- Design solutions, troubleshoot pipeline issues, and ensure data quality across critical business systems, collaborating with team members to deepen your knowledge of the systems and to ensure code changes meet business goals and technology best practices
Requirements:
- High proficiency in at least one object-oriented or functional programming language (almost all of our codebase is in Scala and Python)
- Fluency in SQL and demonstrated experience writing ETL jobs and working with data at scale
- Experience writing and maintaining real-time/streaming data pipelines
- Familiarity with some of the following: Spark, Scala, Python, AWS, Databricks, Airflow
- Demonstrated ability to design and write maintainable software, paired with an understanding of software engineering best practices, object-oriented analysis and design, and design patterns and algorithms
- Experience enhancing and evolving existing systems
- Demonstrated problem-solving, troubleshooting, and communication skills, especially in hybrid and remote environments
- Familiarity with cloud-based data platforms (particularly Databricks and AWS) and CI/CD build pipelines
- Desire to learn new technologies
- Exposure to AI/ML workflows or an interest in integrating machine learning into data pipelines