Researching new data sources and designing collection strategies for them
Developing and supporting parsing services across web, mobile, and API targets
Reverse-engineering web and mobile applications and their APIs to enable reliable data collection
Building anti-bot bypass logic and resilient request pipelines
Using LLMs (OpenAI, Anthropic, Gemini) to extract and structure unstructured data
Performing code review, refactoring, and writing technical documentation
Owning a defined area of services end-to-end, from research to production
Requirements
Python: 3+ years of commercial development, strong OOP fundamentals
Web frameworks: 2+ years with FastAPI and/or Flask; solid grasp of REST and GraphQL
Async & background jobs: Production experience with Celery
Web scraping: Hands-on experience with BeautifulSoup, Scrapy, and Playwright (or similar headless tooling)
Web & API reverse engineering: Comfortable inspecting web traffic, replaying and reconstructing private API calls
OSINT: Practical experience applying OSINT methodologies and tools
Anti-bot bypass: Real-world experience defeating anti-bot and anti-scraping protections
Networking: Deep understanding of HTTP; confident with regular expressions
AI for data extraction: Production experience with LLM APIs (OpenAI GPT, Anthropic Claude, Google Gemini) and prompt engineering for extracting and structuring data
AI-assisted parser development: Practical experience using LLMs to design, write, and accelerate the development of parsers themselves
Databases: SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Redis, Elasticsearch); confident with ORMs
Testing: pytest; unit and integration testing as a habit
Russian: Advanced level or higher
English: B1 or higher
Tech Stack
Elasticsearch
Flask
GraphQL
MongoDB
MySQL
NoSQL
PostgreSQL
Python
Redis
SQL
Benefits
Remote-first setup: work from anywhere in the world (excluding Russia and Belarus)
Ownership of meaningful services from day one and a clear path to grow into a Lead role
Work on a fast-growing, internationally recognized product used by law enforcement and Fortune 500 companies
A collaborative team environment where your work has direct, visible impact on the product
Modern tech stack with active use of LLMs, headless browsers, and reverse-engineering tooling in production