Design, develop, and maintain robust, scalable, and reliable data pipelines using Python, PySpark, and Databricks.
Develop complex SQL queries, optimize data models, and manage large datasets within our PostgreSQL databases.
Build and deploy ETL/ELT processes to ingest, transform, and load data from a wide variety of sources.
Implement and manage CI/CD pipelines (e.g., Azure DevOps, Jenkins) for automated testing and deployment of data applications.
Ensure data quality, reliability, and observability by implementing comprehensive logging, monitoring, and alerting.
Collaborate with data scientists, analysts, and other engineers to understand data needs and deliver effective solutions.
Contribute to the evolution of our platform standards and best practices for data engineering, code quality, and testing.
Create and maintain clear technical documentation for data pipelines, architecture, and processes.
Requirements
Bachelor’s degree in Computer Science, Engineering, or a related technical field.
4+ years of professional experience in a data engineering role.
Expert-level programming skills in Python and its data ecosystem.
Demonstrated expertise in large-scale data processing using PySpark and the Databricks platform.
Advanced proficiency in SQL, with hands-on experience in database design and performance tuning, specifically with PostgreSQL.
Experience with AWS services such as S3 for data storage, Lambda for serverless functions, and EKS for container orchestration.
Solid understanding of CI/CD principles and experience with tools like Azure DevOps or Jenkins.
Proficiency in writing comprehensive tests (unit, integration) using frameworks like pytest.
Strong debugging, performance tuning, and automation skills.
Prior contributions to open-source projects, a technical blog, or a public GitHub profile.
Familiarity with the machine learning (ML) lifecycle and MLOps practices, and an understanding of AI concepts.
Experience integrating models, including GenAI APIs, is a strong plus.
Tech Stack
AWS
Azure
ETL
Jenkins
Postgres
PySpark
Python
SQL
Benefits
Health & Wellness: Health care coverage designed for the mind and body.
Flexible Downtime: Generous time off helps keep you energized for your time on.
Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
Family Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
Beyond the Basics: From retail discounts to referral incentive awards, small perks can make a big difference.