The College Board is a mission-driven, not-for-profit organization dedicated to excellence in education. It is seeking a Data Engineer to design and build scalable data platforms that support analytics and AI/ML use cases, collaborating closely with Data Science and AI teams.
Responsibilities:
- Design, build, and maintain scalable batch and streaming data pipelines using AWS services such as S3, Glue, Lambda, Kinesis, Step Functions, Redshift, Athena, and DynamoDB
- Develop and optimize data models and complex SQL queries to support analytics, reporting, and downstream consumers
- Build and operate serverless ETL frameworks for automated ingestion, transformation, and loading of structured and semi-structured data
- Implement cloud-first, microservices-based architectures, ensuring high availability, performance, and cost efficiency
- Ensure data quality, reliability, and observability through automated testing, validation, monitoring, and alerting
- Integrate BI and analytics tools such as QuickSight to enable real-time and self-service analytics
- Contribute to CI/CD pipelines, infrastructure automation, and secure development practices to deliver production-grade data systems
- Partner with Data Science and AI teams to productionize ML-ready datasets, including training, evaluation, and inference data pipelines
- Build and maintain feature pipelines and embedding workflows that support ML models and experimentation
- Support MLOps/LLMOps workflows, including dataset versioning, experiment tracking, and capturing inference data for continuous improvement
- Enable AI use cases such as recommendation systems, personalization, and retrieval-augmented generation (RAG) through robust data foundations
- Apply a thoughtful approach to AI feasibility, fairness, and effectiveness, especially when working with sensitive or regulated data
- Participate actively in Agile/Scrum ceremonies, design reviews, and peer code reviews
- Collaborate cross-functionally with Product, UX, Infrastructure, and Security teams
- Mentor junior engineers by providing guidance on data architecture, coding standards, and best practices
- Produce clear documentation, runbooks, and technical guides to support long-term platform sustainability
Requirements:
- 4+ years of experience in Data Engineering or Software Engineering in a production environment using AWS services such as S3, Glue, Lambda, Athena, DynamoDB, Step Functions, Redshift, and Kinesis
- Strong proficiency in Python and SQL, including performance tuning for large datasets
- 1+ years of hands-on experience designing, building, and deploying production-grade ML and generative AI solutions using AWS SageMaker and Amazon Bedrock
- Experience designing and operating ETL/ELT pipelines, data models, and analytics-ready datasets
- Solid understanding of cloud computing, DevOps, CI/CD, and microservices architectures
- Strong security and privacy mindset, especially when working with sensitive data
- Demonstrated interest in continuous learning, including keeping up with evolving data engineering and AI/ML best practices
- Excellent communication skills with the ability to explain technical concepts to both technical and non-technical stakeholders
- A passion for expanding educational and career opportunities and mission-driven work
- Authorization to work in the United States for any employer
- Curiosity and enthusiasm for emerging technologies, including a willingness to experiment with and adopt new AI-driven solutions and to learn and apply new digital tools independently and proactively
- Clear and concise communication skills, written and verbal
- A learner's mindset and a commitment to growth: welcoming diverse perspectives, giving and receiving timely, respectful feedback, and continuously improving through iterative learning and user input
- A drive for impact and excellence: solving complex problems, making data-informed decisions, prioritizing what matters most, and continuously improving through learning, user input, and external benchmarking
- A collaborative and empathetic approach: working across differences, fostering trust, and contributing to a culture of shared success
- Experience with event-driven architectures and real-time analytics
- Front-end or API experience (e.g., React, Node.js) is a plus
- Exposure to observability and monitoring for data pipelines, including freshness, volume, and performance metrics
- Experience collaborating with product managers and analytics partners to translate business requirements into well-designed data solutions