Candidates must have previously worked at Capital One.
Overview
We are seeking a highly skilled Data Engineer to join a growing team focused on building scalable, modern data solutions. In this role, you will design, develop, and optimize data pipelines and architectures that support advanced analytics and business insights.
The ideal candidate has strong experience working with large-scale data environments, cloud platforms, and modern data tools, with a passion for building efficient, reliable data systems.
Responsibilities
- Design, build, and maintain scalable data pipelines and ETL processes
- Develop and optimize complex SQL queries for large datasets
- Implement data transformations and workflows using Databricks and PySpark
- Build and manage data ingestion frameworks for structured and unstructured data
- Work within a cloud-based environment (AWS) to support data storage and processing
- Orchestrate workflows and pipelines using Apache Airflow
- Ensure data quality, integrity, and performance across systems
- Collaborate with cross-functional teams including analytics, product, and engineering
- Leverage modern tools and AI-assisted development platforms to improve productivity and code quality
Required Skills & Qualifications
- SQL (Expert Level): Extensive experience writing complex queries and optimizing their performance as part of end-to-end data engineering work
- Python: Proven experience in data engineering, including scripting, automation, and pipeline development
- Databricks: Hands-on experience with ETL pipelines, transformations, and scalable processing
- PySpark: Strong experience processing large datasets within Databricks
- AWS: Experience with S3 is required; familiarity with additional AWS services is a plus
- Data Lakes & Warehousing: Experience with modern data architectures, including Snowflake and scalable ingestion patterns
- Apache Airflow: Experience building, scheduling, and monitoring data workflows
Preferred Qualifications
- Experience using AI-assisted coding tools such as GitHub Copilot or Claude
- Familiarity with modern software development best practices (CI/CD, version control, testing)
- GitHub profile or portfolio showcasing relevant data or AI-related work
What Sets You Apart
- Ability to work in fast-paced, evolving environments
- Strong problem-solving and analytical skills
- Passion for building efficient, scalable data solutions
- Experience working with large, complex datasets in enterprise environments