Avahi is a premier cloud-first consulting company recognized for its people, culture, and innovative solutions. The company is seeking a Data Engineer to design, build, and maintain scalable AWS data platforms that ensure high data quality and reliable analytics.
Responsibilities:
- Design, build, and maintain scalable AWS data platforms supporting batch and streaming pipelines, analytics, and AI/ML workloads, aligned with AWS Well-Architected best practices
- Build and operate data ingestion, transformation, and enrichment pipelines from internal systems and external APIs, handling structured, semi-structured, unstructured, and graph data
- Implement data normalization workflows to ensure consistent schemas and high data quality, supporting reliable analytics, BI, and ML use cases
- Design and enforce data governance including cataloging, lineage, access control, and auditability
- Build and maintain knowledge graphs to model relationships across core business entities, enabling advanced analytics and inference
- Identify data gaps, inconsistencies, and missing relationships using strong analytical and inference skills
- Integrate data from enterprise platforms such as CRM and ERP systems (Salesforce, HubSpot, SAP, NetSuite, Dynamics 365, Workday)
- Design secure data access layers for analytics, BI, ML, and downstream applications
- Implement monitoring, observability, and data quality checks covering freshness, completeness, and pipeline health (see the sketch after this list)
- Optimize data architectures for performance and cost efficiency using partitioning, indexing, compression, and storage tiering
- Build internal tooling, dashboards, and standardized scaffolding to improve visibility, maintainability, and onboarding
- Collaborate with cross-functional teams to deliver high-impact data solutions and share best practices, documentation, and technical guidance
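As a rough illustration of the data quality and monitoring work described above, the sketch below checks the freshness of an S3-hosted dataset and publishes the result as a custom CloudWatch metric. The bucket, prefix, SLA threshold, and metric namespace are hypothetical placeholders, not details taken from the role description.

```python
# Minimal sketch of a dataset freshness check for an S3-based pipeline.
# Bucket, prefix, SLA, and namespace below are illustrative assumptions.
from datetime import datetime, timezone

import boto3

BUCKET = "example-data-lake"      # hypothetical bucket
PREFIX = "curated/orders/"        # hypothetical dataset prefix
FRESHNESS_SLA_HOURS = 6           # assumed SLA for demonstration


def hours_since_last_object(s3_client, bucket: str, prefix: str) -> float:
    """Return hours elapsed since the most recently written object under a prefix."""
    newest = None
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if newest is None or obj["LastModified"] > newest:
                newest = obj["LastModified"]
    if newest is None:
        return float("inf")  # no data at all counts as maximally stale
    return (datetime.now(timezone.utc) - newest).total_seconds() / 3600


def publish_freshness_metric() -> None:
    """Check dataset freshness and publish it as a custom CloudWatch metric."""
    s3 = boto3.client("s3")
    cloudwatch = boto3.client("cloudwatch")
    age_hours = hours_since_last_object(s3, BUCKET, PREFIX)
    cloudwatch.put_metric_data(
        Namespace="DataPlatform/Quality",  # hypothetical namespace
        MetricData=[{
            "MetricName": "DatasetAgeHours",
            "Dimensions": [{"Name": "Dataset", "Value": PREFIX}],
            "Value": age_hours,
            "Unit": "None",
        }],
    )
    if age_hours > FRESHNESS_SLA_HOURS:
        raise RuntimeError(f"{PREFIX} is stale: {age_hours:.1f}h old")


if __name__ == "__main__":
    publish_freshness_metric()
```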
Requirements:
- Strong experience designing and operating AWS data platforms, including S3, Glue, Lake Formation, Athena, Redshift, EMR, Kinesis/MSK, DynamoDB, OpenSearch, and Neptune
- Strong Python skills for data engineering, focused on modular, testable, and maintainable code
- Solid understanding of distributed data systems, including batch and streaming pipelines, fault tolerance, idempotency, and event-driven architectures
- Experience with data warehouse and lakehouse architectures, ETL/ELT pipelines, and analytical query engines
- Hands-on experience with Spark, Hadoop, Hive, or Flink
- Strong data modeling skills, including normalized, denormalized, and graph-based models, with safe schema evolution
- Advanced SQL skills for analytics and data engineering, including window functions, CTEs, and query optimization (see the example at the end of this list)
- Experience integrating external APIs and enterprise systems, especially CRM and ERP platforms
- Knowledge of data governance, security, and compliance, including encryption, access control, and audit logging
- Experience implementing monitoring, observability, and data quality checks using CloudWatch and CloudTrail
- Comfort with Infrastructure as Code using CloudFormation or Terraform
- Strong end-to-end ownership mindset, with a focus on scalability, reliability, and long-term maintainability
- Professional-level English communication skills, able to explain data architectures and trade-offs to technical and non-technical stakeholders
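As a loose illustration of the SQL and Python expectations listed above, the sketch below runs a query combining a CTE and window functions through Athena from Python. The database, table, and S3 output location are hypothetical placeholders, and the polling loop is deliberately simplified.

```python
# Minimal sketch of running an analytical query (CTE + window functions) on Athena
# from Python. Database, table, and output location are hypothetical placeholders.
import time

import boto3

QUERY = """
WITH daily_revenue AS (                      -- CTE: aggregate raw orders per day
    SELECT order_date, SUM(amount) AS revenue
    FROM sales.orders                        -- hypothetical table
    GROUP BY order_date
)
SELECT
    order_date,
    revenue,
    SUM(revenue) OVER (ORDER BY order_date) AS running_total,      -- window function
    AVG(revenue) OVER (ORDER BY order_date
                       ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7d_avg
FROM daily_revenue
ORDER BY order_date
"""


def run_athena_query() -> list:
    """Submit the query, wait for completion, and return the result rows."""
    athena = boto3.client("athena")
    execution = athena.start_query_execution(
        QueryString=QUERY,
        QueryExecutionContext={"Database": "sales"},                        # hypothetical
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until the query finishes (simplified; production code would add timeouts).
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]


if __name__ == "__main__":
    for row in run_athena_query():
        print([col.get("VarCharValue") for col in row["Data"]])
```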