Serv, a global executive recruitment partner, is hiring on behalf of Mercator.ai, a company focused on building scalable data infrastructure for data-driven decision making. The Staff Data Engineer will lead the design and development of distributed data pipelines, ensuring scalability and reliability while integrating modern tools to enhance engineering output.
Responsibilities:
- Lead the architecture and evolution of scalable, distributed data pipelines, ensuring high availability and performance at scale
- Design and implement robust data models to support reporting and advanced data applications
- Build and maintain distributed web scraping systems using tools such as Playwright, Selenium, and BeautifulSoup
- Develop systems capable of handling anti-scraping measures, proxy rotation, and high-volume data extraction
- Integrate AI and LLMs into engineering workflows for code generation, automation, and optimization
- Apply prompt engineering techniques to improve data processing, documentation, and troubleshooting
- Identify and implement system and process improvements to optimize performance and efficiency
- Manage and scale cloud-based data infrastructure, including data warehouses, object storage, and search systems
- Deploy and maintain containerized workloads using Kubernetes
- Implement data quality monitoring and governance processes to ensure accuracy and reliability
- Mentor junior engineers through code reviews, documentation, and knowledge sharing
- Communicate technical concepts clearly and provide business context for engineering decisions
Requirements:
- 5+ years of experience in Data Engineering with a track record of scaling systems
- Expert proficiency in Python and advanced SQL, including performance tuning and optimization
- Strong experience with workflow orchestration tools such as Airflow or Prefect, and with transformation tools such as dbt
- Proven experience building resilient web scraping systems using Playwright, Selenium, and BeautifulSoup
- Deep understanding of relational and NoSQL databases, including Postgres, MongoDB, and Elasticsearch
- Experience working with large-scale analytical data systems such as BigQuery
- Strong proficiency with CI/CD pipelines, Git, and Docker
- Experience designing and maintaining distributed systems with high availability and fault tolerance
- Experience with GCP or AWS and Kubernetes for infrastructure management
- Familiarity with LLMs such as ChatGPT, Claude, or Gemini for engineering workflows
- Experience with prompt optimization and AI-assisted development