Mindrift is looking for highly skilled Python Data Scraping Engineers to join the Tendem project and drive specialized data scraping workflows within its hybrid AI + human system. In this freelance role, you will handle data scraping tasks that require technical precision in web extraction and processing, ensuring accurate and reliable delivery of structured datasets.
Responsibilities:
- Own end-to-end data extraction workflows across complex websites, ensuring complete coverage, accuracy, and reliable delivery of structured datasets
- Leverage internal tools (Apify, OpenRouter) alongside custom workflows to accelerate data collection, validation, and task execution while meeting defined requirements
- Ensure reliable extraction from dynamic and interactive web sources, adapting approaches as needed to handle JavaScript-rendered content and changing site behavior
- Enforce data quality standards through validation checks, cross-source consistency controls, adherence to formatting specifications, and systematic verification prior to delivery
- Scale scraping operations for large datasets using efficient batching or parallelization, monitor failures, and maintain stability against minor site structure changes
Requirements:
- At least 3 years of relevant experience in data engineering, web scraping, automation, or software development
- Strong experience in Python web scraping (BeautifulSoup, Selenium, or similar), including handling dynamic content (JS, AJAX, infinite scroll) and accessing APIs through proxies
- Proven ability to extract data from complex structures (hierarchies, archived pages, inconsistent HTML)
- Solid background in data cleaning, normalization, and validation, delivering structured datasets (CSV, JSON, Google Sheets)
- Hands-on experience with LLMs and AI frameworks to enhance automation and problem-solving
- Strong attention to detail and commitment to data accuracy
- Self-directed work ethic with ability to troubleshoot independently
- English proficiency: Upper-intermediate (B2) or above
- Bachelor's or Master's degree in Engineering, Applied Mathematics, Computer Science, or a related technical field
- A link to your GitHub profile