About this role
Role Overview
Build high-performing cloud data solutions to meet our analytical and BI reporting needs
Design, build, and operate secure, scalable cloud data pipelines that support analytics and data products
Integrate new data sources into the central data warehouse and deliver curated data to applications and other downstream destinations
Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability
Build and enhance a shared data lake that powers decision-making and model building
Implement data governance best practices across pipelines and datasets, including data quality checks, metadata management, lineage, and access controls
Partner with teams across the business to understand their needs and develop end-to-end data solutions
Collaborate with analysts to perform exploratory analysis and troubleshoot issues
Manage and model data in visualization tools to provide the company with a collaborative data analytics platform
Build tools and processes to help make the correct data accessible to the right people
Participate in a rotational production support role, during or after business hours, to support business continuity
Collaborate on technical decisions with other engineers
Requirements
5+ years of development experience with Python and SQL, plus cloud experience (Azure preferred, or AWS)
Hands-on experience with data security and cloud security methodologies
Experience in configuration and management of data security to meet compliance and CISO security requirements
Experience creating and maintaining data intensive distributed solutions (especially involving data warehouse, data lake, data analytics) in a cloud environment
Hands-on experience with modern data analytics architectures, including data warehouses and data lakes, designed and engineered in a cloud environment
Demonstrated experience applying data governance best practices, including data quality validation, cataloging/metadata, lineage, and role-based access controls
Hands-on expertise with LLMs and generative AI workflows
Familiarity with AI governance and privacy-preserving techniques
Proven professional working experience in Event Streaming Platforms and data pipeline orchestration tools like Apache Kafka, Fivetran, Apache Airflow, or similar tools
Proven professional working experience with any of the following: Databricks, Snowflake, BigQuery, Spark (any flavor), Hive, Hadoop, Cloudera, or Redshift
Experience developing in containerized local environments such as Docker, Rancher, or Kubernetes (preferred)
Experience with or knowledge of Agile Software Development methodologies