Horizon Industries Limited is a dynamic IT and Management Consulting firm based in the Washington, DC area. They are seeking a Databricks Data Engineer to build end-to-end ETL/ELT pipelines and support various clients with data engineering solutions.
Responsibilities:
- Build end-to-end ETL/ELT pipelines, applying efficient data transformation and ingestion patterns to move raw data from data producers into an enterprise data ecosystem, with a focus on performance and reliability
- Assess and understand the ETL jobs, workflows, BI tools, and reports
- Address technical inquiries concerning customization, integration, enterprise architecture, and general features and functionality of data products
- Experience crafting database / data warehouse solutions in the cloud (preferably AWS or Azure; alternatively GCP)
- Key must-have skills: Python, SQL, Databricks, AWS Data Services
- Experience with message queuing, stream processing, and highly scalable ‘big data’ data stores
- Experience manipulating, processing, and extracting value from large, disconnected datasets
- Experience manipulating structured and unstructured data for analysis
- Experience with data modeling tools and processes
- Experience aggregating and transforming data from multiple datasets to create data products
- Support an Agile software development lifecycle
Requirements:
- Ability to hold a position of public trust with the US government
- B.S. in Computer Science or equivalent
- 4+ years' experience in data engineering and big data, including at least 2 years of professional services experience interacting directly with clients
- Big data tools: Hadoop, Spark, Kafka, etc.
- Experience working with relational SQL and NoSQL databases
- AWS cloud services: EC2, S3, RDS, Glue, Step Functions, Lambda, EMR, DynamoDB, DocumentDB, Redshift, Aurora, Athena
- Data Platforms: Databricks
- Batch and stream processing systems: Kafka, Storm, Spark Streaming, etc.
- Languages: Python, R, Scala, Go
- Ability to inspect existing data pipelines, discern their purpose and functionality, and re-implement them efficiently in Databricks
- Extensive knowledge of data warehousing concepts and hands-on experience deploying pipelines using Databricks a must
- Data modeling and database design skills and knowledge of version control
- Excellent verbal and written communication skills
- Experience architecting scalable and fault-tolerant data solutions across Azure, AWS, and Databricks
- Databricks Data Engineer Professional certification a plus; preference for candidates who hold it