Data Engineer at Yahara Software | JobVerse
Yahara Software
Data Engineer
Wisconsin, United States of America
Full Time
1 month ago
No H-1B sponsorship
Key skills
AWS
Azure
Cloud
Distributed Systems
ETL
Google Cloud Platform
HDFS
Java
MapReduce
Python
Scala
Shell Scripting
Spark
SQL
ELT
Data Engineering
Git
Version Control
Performance Optimization
Agile
About this role
Role Overview
Design and maintain enterprise-scale data pipelines on Cloudera Data Platform (CDP) and related big data tooling.
Build scalable ETL/ELT workflows for structured and unstructured data.
Develop distributed processing jobs using big data framework components.
Design data storage solutions balancing performance and cost.
Collaborate with analysts, scientists, and developers to deliver data solutions.
Develop technical documentation for pipelines and architectures.
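As a rough illustration of the ETL/ELT work described above, here is a minimal extract-transform-load pass. The table and column names are hypothetical, and Python's stdlib sqlite3 stands in for the CDP/Spark stack the role actually uses:

```python
import sqlite3

def run_etl(conn):
    """Minimal ETL sketch: extract raw rows, clean and aggregate, load a curated table."""
    cur = conn.cursor()
    # Extract: pull raw event rows (table and columns are illustrative).
    rows = cur.execute("SELECT user_id, amount FROM raw_events").fetchall()
    # Transform: reject invalid amounts and aggregate totals per user.
    totals = {}
    for user_id, amount in rows:
        if amount is None or amount < 0:
            continue  # drop bad records
        totals[user_id] = totals.get(user_id, 0.0) + amount
    # Load: write the curated aggregate table.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS user_totals (user_id TEXT PRIMARY KEY, total REAL)"
    )
    cur.executemany(
        "INSERT OR REPLACE INTO user_totals VALUES (?, ?)", sorted(totals.items())
    )
    conn.commit()

# Demo on an in-memory database with a few sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (user_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [("a", 10.0), ("a", 5.0), ("b", -1.0), ("b", 3.0)],
)
run_etl(conn)
print(dict(conn.execute("SELECT user_id, total FROM user_totals")))
```

In a production CDP environment the same extract-transform-load shape would typically be expressed as a Spark job reading from HDFS and writing Parquet, rather than SQLite.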
Requirements
5–7 years in data engineering with big data or distributed systems.
Experience with CDP, CDH, or similar enterprise big data platforms.
Degree in CS, Data Science, Information Systems, or equivalent experience.
Strong background in distributed data processing.
Ability to obtain and maintain Public Trust clearance.
Self-starter with a passion for data engineering.
Strong analytical and problem-solving skills.
Enthusiastic about big data technologies and performance optimization.
Detail-oriented with a commitment to accuracy and reliability.
Ability to translate business requirements into effective solutions.
Collaborative, able to recognize blockers and leverage team strengths.
Experience with Agile development environments.
Proven experience designing and implementing production pipelines.
Experience in biohealth, laboratory, or scientific data environments is a plus.
Familiarity with HIPAA, FDA, or GxP preferred but not required.
Cloudera ecosystem experience: CDP, HDFS, Hive/Impala, Spark.
Programming: Python, Scala, or Java.
Advanced SQL and distributed compute (Spark, MapReduce).
Shell scripting and version control (Git).
Data storage formats: Parquet, Avro, ORC.
Workflow orchestration and scheduling.
Cloud experience (Azure, AWS, or GCP) and understanding of hybrid patterns.
Benefits
20+ days of PTO accruable in the first year!
Comprehensive health insurance (Medical, Dental, Vision) with HMO and PPO options
Health Savings Account (HSA) with annual employer contributions
401(k) with guaranteed company match (Traditional and Roth options)
100% company-paid short-term and long-term disability
100% company-paid life insurance with option to increase coverage
100% company-paid identity theft protection
On-site gym with basketball court
Hybrid/remote schedule with home office stipend
Fresh fruit, healthy snacks, and beverages provided daily
Bonus certification program (Microsoft, AWS, PMP, IIBA, etc.)
Employee Assistance Program (counseling, legal, financial services)
Monthly and Quarterly Recognition Awards with spot bonuses
Company-supported community outreach and volunteer opportunities
Employee-run committee involvement opportunities
Collaborative culture founded on lived values and incredible people