Skills: Airflow, Amazon Redshift, Apache, AWS, Azure, BigQuery, Cloud, Docker, Hadoop, Kafka, Kubernetes, MongoDB, Oracle, Postgres, Python, Spark, SQL, AI, Generative AI, Data Lake, Analytics, BI, Snowflake, Redshift, Databricks, Apache Kafka, Apache Airflow, Fivetran, Rancher, Cosmos DB, PostgreSQL, Event Streaming, Agile, Decision Making, Collaboration, Cloud Security
Role Overview
Build high-performing cloud data solutions to meet our analytical and BI reporting needs.
Design, build, and operate secure, scalable cloud data pipelines that support analytics and data products.
Integrate new data sources into the central data warehouse and deliver curated data to applications and other downstream destinations.
Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.
Build and enhance a shared data lake that powers decision-making and model building.
Implement data governance best practices across pipelines and datasets, including data quality checks, metadata management, lineage, and access controls.
Partner with teams across the business to understand their needs and develop end-to-end data solutions.
Collaborate with analysts to perform exploratory analysis and troubleshoot issues.
Coach and develop the team by providing hands-on mentorship to engineers, lead code and design reviews, create training/onboarding content, and run knowledge-sharing sessions.
Manage and model data using visualization tools to provide the company with a collaborative data analytics platform.
Build tools and processes to help make the correct data accessible to the right people.
Participate in a rotational on-call support role for production systems, during or after business hours, to support business continuity.
Engage in collaboration and decision making with other engineers.
Lead presentations regularly for stakeholders.
Requirements
5+ years of development experience with Python and SQL, plus cloud experience (Azure preferred, or AWS).
Hands-on experience with data security and cloud security methodologies.
Experience configuring and managing data security controls to meet compliance and CISO requirements.
Experience creating and maintaining data-intensive distributed solutions (especially data warehouses, data lakes, and data analytics) in a cloud environment.
Proven experience with relational and non-relational databases (e.g., PostgreSQL, Oracle, Azure Cosmos DB, MongoDB), including schema design, indexing, query optimization, and operational management in a cloud environment.
Hands-on experience with modern data analytics architectures, encompassing data warehouses, data lakes, and related systems designed and engineered in a cloud environment.
Demonstrated experience applying data governance best practices, including data quality validation, cataloging/metadata, lineage, and role-based access controls.
Hands-on expertise with LLMs and generative AI workflows.
Familiarity with AI governance and privacy-preserving techniques.
Proven professional working experience with event streaming platforms and data pipeline orchestration tools such as Apache Kafka, Fivetran, Apache Airflow, or similar tools.
Proven professional working experience with any of the following: Databricks, Snowflake, BigQuery, Spark (any distribution), Hive, Hadoop, Cloudera, or Redshift.
Experience developing in a containerized local environment such as Docker, Rancher, or Kubernetes preferred.
Experience with or knowledge of Agile Software Development methodologies.