The Motley Fool is a purpose-driven financial services company focused on making the world smarter, happier, and richer. They are seeking a Senior Data Engineer to design and manage the data infrastructure that powers investment operations, ensuring reliable and timely data for stakeholders.
Responsibilities:
- Design, build, and maintain robust ETL/ELT pipelines using Apache Airflow (MWAA). Author DAGs that handle complex dependencies across external data vendors, internal models, and downstream consumers
- Ingest data from diverse sources including SFTP feeds, REST APIs, flat files, and third-party financial data providers. Normalize and conform data into a consistent analytical model
- Build 'circuit breakers' into pipelines: automated data quality checks that halt downstream processing and alert the team via CloudWatch and Slack when anomalies are detected
- Implement AWS Lambda functions for lightweight, event-driven tasks such as triggering ingestion when files land in S3 or validating data payloads before loading
- Maintain and document the data catalog so institutional knowledge lives in the system, not in your head
- Serve as the subject-matter expert for Snowflake. Design schemas, manage data loading via Stages and Snowpipe, and implement role-based access controls
- Write advanced analytical SQL (window functions, CTEs, recursive queries, pivots) to support investment reporting, performance attribution, and ad-hoc analysis
- Profile and optimize slow-running queries. Leverage clustering keys, micro-partition pruning, materialized views, and result caching to minimize compute cost and maximize performance
- Define and deploy cloud resources using Terraform or AWS CDK. Treat infrastructure as software with version control, peer review, and automated testing
- Help design and maintain CI/CD workflows with GitHub Actions for automated testing, linting, and deployment of data pipelines, infrastructure, and application code
- Partner with investment and business teams to translate questions into data models and Tableau dashboards and reports that drive strategic decisions
- Design and build automated pipelines that pull data from source systems and render it into production-ready marketing outputs: one-pagers, pitch decks, email campaigns, and social content
- Support software engineers in building an AI layer on top of existing data infrastructure, enabling LLMs to securely query fund performance data via APIs and answer natural-language questions for internal stakeholders
Requirements:
- 5+ years of professional Python development
- Comfortable with object-oriented design and data manipulation libraries (pandas, NumPy)
- Familiarity with financial research data vendors and feed/API products such as CapIQ Xpressfeed, FactSet, Bloomberg, Thomson Reuters/Refinitiv/LSEG, Russell, or MSCI
- Familiarity with financial business data and feed/API products from Broadridge, Morningstar, custodian banks, and fund administrators
- Proven experience designing and operating ETL/ELT pipelines
- Deep expertise in Snowflake architecture (clustering keys, micro-partitions, Snowpipe, Stages)
- Able to write complex analytical SQL (window functions, CTEs, recursive queries) and optimize it for cost and performance
- Hands-on experience with AWS CDK or Terraform
- Exposure to LLM integration patterns: markdown files and prompt engineering
- Working knowledge of the AWS ecosystem: Lambda, ECS Fargate, Step Functions, S3, EventBridge, CloudWatch, and RDS
- Experience with data visualization tools (Tableau, Streamlit, or similar) for self-service analytics
- Background in data governance, data cataloging, or data lineage tooling
- Demonstrated experience maintaining CI/CD pipelines for automated testing and deployment of data engineering and application code using GitHub Actions and Terraform
- Experience profiling and optimizing queries across both OLAP (Snowflake) and OLTP (PostgreSQL/Aurora) systems
- Familiarity with EXPLAIN plans, indexing strategies, and database-level performance tuning
- Practical experience building, evaluating, and deploying ML models
- Familiarity with common frameworks (scikit-learn, XGBoost) and an understanding of when and how to apply ML to business problems