Lighthouse Technology Services is partnering with a client to fill a Senior Data Engineer position. The role involves designing, building, and optimizing scalable data ingestion pipelines within a cloud data platform, focusing on Python development and Snowflake data warehouse administration.
Responsibilities:
- Design, develop, and maintain scalable data ingestion pipelines that move data from various source systems into Snowflake
- Build ingestion pipelines using Python, SQL, and modern ingestion tools such as Informatica IDMC and Fivetran
- Connect to external APIs using Python, designing custom ingestion frameworks where pre-built tools are not sufficient
- Implement data validation logic and quality checks within ingestion pipelines to ensure the integrity of replicated data
- Administer and optimize the Snowflake data warehouse environment, including performance tuning and cost optimization
- Analyze and optimize query performance and data models
- Apply Snowflake scaling best practices, including when to scale vertically (larger warehouse sizes) versus horizontally (multi-cluster warehouses)
- Monitor and improve warehouse performance to ensure pipelines run efficiently with minimal runtime and cost
- Build solutions that replicate data from SQL Server and other enterprise source systems into Snowflake
- Design ingestion frameworks that replicate data from external APIs into Snowflake using Python and external network access
- Leverage ingestion tools such as Fivetran where appropriate while also developing custom Python-based pipelines when needed
- Build and orchestrate data workflows using AWS services, including Amazon Managed Workflows for Apache Airflow (MWAA) for pipeline orchestration, S3 for storage, and CloudWatch for monitoring and debugging
- Implement robust deployment practices and integrate pipelines with CI/CD processes
- Write clean, maintainable, and well-documented Python code including clear functions, comments, and modular design
- Ensure pipelines are scalable, performant, and fault tolerant
- Document all pipeline designs, ingestion logic, and system integrations
Requirements:
- 10+ years of professional experience in Data Engineering
- Strong Python development expertise with the ability to design, build, and optimize ingestion frameworks
- Advanced SQL skills, including writing efficient and performant queries
- Deep experience working with Snowflake, including administration, performance tuning, and cost optimization
- Hands-on experience building scalable data ingestion pipelines
- Experience with Informatica IDMC for data ingestion
- Experience using Fivetran or similar ingestion tools, with the ability to speak to hands-on usage
- Experience replicating data from SQL Server or similar relational systems into Snowflake
- Experience ingesting data from external APIs using Python
- Experience working within AWS cloud environments
- Experience using Amazon Managed Workflows for Apache Airflow (MWAA) for scheduling and orchestration
- Familiarity with AWS CloudWatch for monitoring and debugging
- Experience with CI/CD pipeline implementation and deployment strategies
- Strong understanding of ETL workflows and pipeline design
- Expertise in performance tuning for pipelines and data warehouses
- Knowledge of Snowflake scaling strategies (vertical scaling via warehouse sizing vs. horizontal scaling via multi-cluster warehouses)
- Experience implementing data quality validation within ingestion pipelines
- Experience documenting engineering solutions and maintaining technical documentation
- Hands-on experience with Fivetran ingestion architecture
- Experience with AWS Lambda or other AWS services
- Experience improving existing ingestion frameworks and platform architecture