Microsoft is seeking a talented data professional to join a dynamic team responsible for building and maintaining cutting-edge data infrastructure. In this role, you will design, develop, and optimize scalable data platforms and pipelines that power critical business insights and decision-making across the organization.
Responsibilities:
- Design, build, and maintain scalable data platforms and pipelines using Python, SQL, Airflow, and Spark
- Collaborate with stakeholders to understand and translate business requirements into technical specifications
- Develop and implement data models that support analytics and reporting needs
- Ensure data accuracy, consistency, and reliability by implementing robust data validation and quality checks
- Work with cross-functional teams, including data analysts, data scientists, and business leaders, to deliver high-quality data solutions
- Continuously monitor and optimize data pipelines for performance, scalability, and cost-efficiency
- Implement monitoring and observability metrics to ensure data quality and detect anomalies in data pipelines
- Maintain clear and comprehensive documentation of data processes and communicate technical concepts effectively to non-technical stakeholders
- Debug and resolve issues related to data ingestion and data warehouse operations
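To illustrate the data validation and quality-check work described above, here is a minimal sketch in plain Python. The function, field names, and rules are purely hypothetical examples, not part of the role or any Microsoft codebase; in practice checks like these would typically run inside an Airflow task or Spark job before records reach the warehouse.

```python
from datetime import datetime

def validate_rows(rows, required=("user_id", "event_ts", "amount")):
    """Split incoming records into valid and rejected sets.

    Flags records with missing required fields, a non-numeric amount,
    or an unparseable ISO-8601 timestamp. Rejected records carry a
    list of error tags so anomalies can be monitored downstream.
    """
    valid, rejected = [], []
    for row in rows:
        errors = [f for f in required if row.get(f) in (None, "")]
        if "amount" not in errors and not isinstance(row.get("amount"), (int, float)):
            errors.append("amount:type")
        if "event_ts" not in errors:
            try:
                datetime.fromisoformat(row["event_ts"])
            except (TypeError, ValueError):
                errors.append("event_ts:format")
        if errors:
            rejected.append((row, errors))
        else:
            valid.append(row)
    return valid, rejected
```

Counting the rejected records per run (and alerting when the reject rate spikes) is one simple way the anomaly-detection responsibility above is often fulfilled.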
Requirements:
- Minimum of 2 years of experience in data engineering and infrastructure
- Demonstrated history of stable employment
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field
- Proficiency in Python programming
- Strong SQL expertise
- Experience with Kubernetes
- Hands-on experience with Apache Airflow
- Knowledge of Scala
- Data warehouse management capabilities
- Expertise in Apache Spark
- Strong experience building and maintaining robust data pipelines and ETL processes
- Analytical skills with the ability to gather business requirements and troubleshoot complex data issues
- Excellent verbal and written communication skills with the ability to convey technical information to non-technical audiences
- Proven ability to work effectively in a collaborative, cross-functional environment
- Experience with cloud platforms such as AWS, GCP, or Azure
- Familiarity with data warehousing technologies such as Delta Lake, Microsoft Fabric, Snowflake, Redshift, or BigQuery
- Knowledge of data governance and data security best practices