PortPro is seeking a Director of Software Engineering with expertise in Node.js and large-scale web scraping. The role involves leading the engineering team that designs and optimizes high-performance web scraping systems, ensuring compliance with legal standards, and mentoring and growing the team.
Responsibilities:
- Architect, develop, and maintain scalable and distributed web scraping systems using Node.js
- Design and implement data extraction pipelines to process large volumes of structured and unstructured data
- Develop solutions to bypass anti-bot mechanisms, including CAPTCHA handling, session management, fingerprinting, and IP rotation
- Optimize scraping processes for performance, reliability, and efficiency while managing proxy services (residential, datacenter, rotating)
- Oversee data storage and processing strategies, ensuring high availability and consistency
- Collaborate with Product, DevOps, and Data Science teams to integrate extracted data into analytics and business applications
- Implement best practices for microservices, API integrations, and real-time data streaming
- Lead the transition to cloud-native, containerized, and serverless architectures for web scraping
- Ensure compliance with legal and ethical standards (robots.txt, GDPR, CCPA, etc.)
- Optimize cloud resources (AWS, GCP, or Azure) to support high-throughput scraping
- Manage real-time monitoring and alerting systems to detect scraping failures, IP bans, or performance bottlenecks
- Work closely with DevOps teams to optimize CI/CD pipelines, automated deployments, and system scalability
- Lead, mentor, and grow a high-performance engineering team
- Define and execute the technology roadmap, aligning with business objectives
- Foster a culture of continuous learning, collaboration, and innovation
- Implement agile development methodologies (Scrum, Kanban) to optimize project execution
- Ensure code quality, security, and best practices across all engineering efforts
Requirements:
- 10+ years of experience in software engineering, with at least 5 years in web scraping and large-scale data extraction
- Strong hands-on expertise in Node.js, Puppeteer, Playwright, Cheerio, Selenium, and headless browser automation
- Extensive experience in handling CAPTCHAs, IP rotation, session management, and anti-bot evasion techniques
- Deep knowledge of proxy management (residential, datacenter, and rotating proxies, as well as VPNs)
- Experience with NoSQL/SQL databases (MongoDB, PostgreSQL, Redis, Elasticsearch, etc.)
- Familiarity with data processing and messaging frameworks (Kafka, RabbitMQ, Spark, Airflow, etc.)
- Strong experience with CI/CD, containerization (Docker, Kubernetes), and cloud deployment (AWS/GCP/Azure)
- Proven track record of scaling engineering teams and leading complex projects
- Strong problem-solving and debugging skills, especially for scraping challenges and performance bottlenecks
- Excellent communication and stakeholder management skills
- Passion for mentorship, team development, and continuous learning
- Experience with machine learning for data extraction and NLP
- Knowledge of browser fingerprinting and bot detection mechanisms
- Familiarity with enterprise-scale web crawling frameworks (Scrapy, Colly, Apify, etc.)
- Prior leadership experience in data-driven businesses or web scraping startups