We are expanding into complementary data technologies for analytics and decision support, with a focus on ingesting and processing large data sets.
Our interest is in enabling data science and search-based applications on large, low-latency data sets, with processing in both batch and streaming contexts.
To that end, this role will engage with team counterparts in exploring, developing, and deploying technologies for creating data sets through a combination of batch and streaming transformation processes.
These data sets support both off-line and in-line machine learning training and model execution.
Other data sets support search engine based analytics.
These exploration and deployment activities include identifying opportunities that impact business strategy, selecting data solution software, and defining hardware requirements based on business needs.
Responsibilities also include coding, testing, and documenting new or modified scalable analytic data systems, including automation for deployment and monitoring.
This role works with team counterparts to architect an end-to-end framework built on a group of core data technologies.
Other aspects of the role include developing standards and processes for data engineering projects and initiatives.
Requirements
5-7 years of software engineering experience, including Java, Scala, and Python, required
5-7 years of experience processing large data sets with Kafka, RabbitMQ, Flume, Hadoop, HBase, Cassandra, Spark, or similar distributed systems required
3-5 years of hands-on scripting experience with Bash, Perl, and Ruby required
3-5 years of hands-on development / processing experience with Kafka, HBase, Solr, and Hue required
2-4 years of hands-on experience with ETL and Business Intelligence technologies such as Informatica, DataStage, Ab Initio, Cognos, BusinessObjects, or Oracle Business Intelligence required
2-3 years of hands-on experience with SQL, data modeling, and relational databases such as Oracle, DB2, and Postgres required
Proven track record with NoSQL data stores such as MongoDB, Cassandra, HBase, Redis, or Riak, or with technologies that embed NoSQL with search, such as MarkLogic or Lily Enterprise, required
0-2 years of management experience with a data engineering team preferred
High School Diploma or equivalent required
Bachelor’s Degree in related field or equivalent work or military experience required.
Tech Stack
Cassandra
Cognos
ETL
Hadoop
HBase
Informatica
Java
Kafka
MongoDB
NoSQL
Oracle
Perl
Postgres
Python
RabbitMQ
Redis
Ruby
Scala
Spark
SQL
Benefits
Generous benefits package available on day one, including: 401K matching, bonding leave for new parents (12 weeks, 100% paid), tuition assistance, training, GM employee auto discount, community service pay, and nine company holidays.