Engineering | Jan 5, 2025

Scaling web automations to millions of pages

Marco Rivera·10 min read

When you go from scraping hundreds of pages to millions, everything changes. In this post, we share the architectural decisions and lessons learned from running large-scale web automations on Intuned.

Concurrency management

The naive approach of spinning up thousands of concurrent browsers hits resource limits fast. We built an adaptive concurrency system that scales based on available resources and target website behavior.
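
One common way to build this kind of feedback loop is additive-increase/multiplicative-decrease (AIMD): raise the browser limit slowly while pages succeed, and cut it sharply when the host or the target site shows strain. The sketch below is illustrative only and is not Intuned's actual implementation; the `AdaptiveLimit` class, its defaults, and the choice of pressure signals (for example an HTTP 429 or memory nearing its cap) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveLimit:
    """Hypothetical AIMD controller for the number of concurrent browsers."""

    limit: int = 10        # current concurrency ceiling (assumed starting point)
    min_limit: int = 1     # never drop below one browser
    max_limit: int = 500   # hard cap set by available resources (assumed)
    backoff: float = 0.5   # multiplicative decrease factor

    def on_success(self) -> None:
        # Additive increase: allow one more concurrent browser, up to the cap.
        self.limit = min(self.max_limit, self.limit + 1)

    def on_pressure(self) -> None:
        # Multiplicative decrease: shrink the ceiling when a pressure
        # signal arrives (rate limiting, blocks, memory or CPU near the cap).
        self.limit = max(self.min_limit, int(self.limit * self.backoff))
```

A scheduler would read `limit` before launching each browser and call `on_success` or `on_pressure` as results come in. The asymmetry is the point: ramping up is cautious, while backing off is fast enough to relieve a struggling target site or worker.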

Fault tolerance

At scale, failures are inevitable. Individual pages will time out, proxies will get blocked, and browsers will crash. Our job system automatically retries failed operations with exponential backoff and different proxy configurations.
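
A minimal sketch of retrying with exponential backoff and a different proxy on each attempt might look like the following. It is not Intuned's job system: `run_with_retries`, `backoff_delay`, the proxy names, and the defaults (1 s base, 60 s cap, 5 attempts, random jitter) are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Upper bound on the wait after failed attempt `attempt` (0-based)."""
    return min(cap, base * 2 ** attempt)


def run_with_retries(
    task: Callable[[str], T],
    proxies: Sequence[str],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task(proxy)`, rotating proxies and backing off between failures."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        # Use a different proxy configuration on each attempt (round-robin).
        proxy = proxies[attempt % len(proxies)]
        try:
            return task(proxy)
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
            if attempt + 1 < max_attempts:
                # Random jitter spreads retries out so that many failed
                # jobs do not all hit the target again at the same moment.
                sleep(random.uniform(0, backoff_delay(attempt)))
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. In practice the retry budget and the set of retryable errors would depend on the target site and the kind of failure.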
