Scaling from 10 to 100 accounts: when the stack breaks
At 10 accounts, you can manage almost anything manually. you refresh each profile, paste in credentials, rotate proxies by hand, and keep notes in a spreadsheet. it’s tedious but survivable. somewhere between 30 and 50 accounts, that approach collapses. tasks that took an afternoon now take two days, and you’re making mistakes you didn’t make before because the cognitive load is simply too high.
this tutorial is for operators who have proven the model at small scale and want to push to 100 accounts without losing their minds or their accounts. i’ll walk through exactly what tends to break, in what order, and how i rebuilt my stack in Singapore when i hit that wall in early 2025. i’m not going to promise any income numbers, because the return depends entirely on what you’re farming or managing, but i will tell you the infrastructure decisions that made the difference between 15% daily failure rates and under 3%.
the outcome you should expect from following this: a more automated, more observable, and more recoverable account operation. you’ll have a clear picture of where your bottlenecks are, and a working pattern for scaling without linearly increasing your time investment.
what you need
- antidetect browser: Multilogin X or AdsPower. Multilogin X runs around $99/month for 100 browser profiles as of Q1 2026. AdsPower is cheaper at $59/month for the same count. both have well-documented APIs for programmatic profile control.
- proxy pool: minimum 50 residential IPs, ideally one per account or close to it. expect to pay $3-8 per GB for residential traffic.
- VPS or dedicated server: at 100 accounts running any meaningful automation, you need at least 8 vCPU and 32GB RAM. Hetzner CPX51 (€28.49/month as of May 2026) is what i use for the orchestration layer.
- Python 3.11+ and Playwright: for browser automation against the antidetect profiles.
- Redis: for task queuing and session state. free to self-host.
- PostgreSQL: for account metadata, proxy assignments, and health tracking.
- a structured logging setup: Loki + Grafana, or even just structured JSON logs shipped to a file. you cannot debug 100 accounts by reading terminal output.
- time: expect 2-3 weeks to migrate and stabilize if you’re doing this while running existing operations.
step by step
step 1: audit what’s actually failing at your current scale
before you rebuild anything, instrument what you have. add logging to every account action that records: account id, proxy used, action taken, outcome, timestamp. run your current setup for 48 hours and pull the data.
what you’re looking for: failure clustering by proxy subnet, by time of day, by account age, or by action type. in my case, 70% of failures were on a single /24 residential subnet that had been flagged. i didn’t know that until i looked at the data.
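import json
import logging
from datetime import datetime, timezone

# the root logger defaults to WARNING, so info-level entries need this to show up
logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_action(account_id, proxy, action, outcome):
    # one JSON object per line, so the log can be parsed and grouped later
    entry = {
        "account_id": account_id,
        "proxy": proxy,
        "action": action,
        "outcome": outcome,
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    logging.info(json.dumps(entry))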
if it breaks: if you have no logging at all, start with even a CSV append. perfect is the enemy of started.
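once the 48 hours of logs exist, the clustering check can be a few lines. this is a minimal sketch, not the script used above: it assumes the "proxy" field holds a bare IPv4 address and that failed actions are logged with the outcome "fail". both are assumptions, so adjust them to whatever you actually write.

```python
import ipaddress
import json
from collections import Counter

def subnet_24(ip: str) -> str:
    """Return the /24 network that contains an IPv4 address."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))

def failures_by_subnet(log_lines, failure_outcome="fail"):
    """Count failed actions per /24 subnet, most affected subnet first."""
    counts = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        if entry.get("outcome") == failure_outcome:
            counts[subnet_24(entry["proxy"])] += 1
    return counts.most_common()
```

the same pattern works for the other groupings: swap the subnet key for the hour of "ts" or for "action" to see clustering by time of day or action type.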
step 2: move account metadata to a database
a spreadsheet stops working at around 30 accounts. move everything to PostgreSQL. the schema doesn’t need to be complex to start.
-- proxies must be created first: accounts.proxy_id references it
create table proxies (
    id serial primary key,
    host text not null,
    port integer not null,
    username text,
    password text,
    type text default 'residential',
    last_used_at timestamptz,
    failure_count integer default 0
);
create table accounts (
    id serial primary key,
    platform text not null,
    username text not null,
    proxy_id integer references proxies(id),
    profile_id text,
    status text default 'active',
    last_action_at timestamptz,
    created_at timestamptz default now()
);
populate this from your spreadsheet once and never go back. every script reads from and writes to this database from this point forward.
if it breaks: if your accounts have inconsistent data (missing fields, duplicate usernames), clean before you import. a dirty database is worse than a spreadsheet because you’ll trust it more.
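a pre-import cleaning pass might look like the sketch below. it assumes the spreadsheet is exported as CSV with "platform" and "username" columns (matching the schema above) and that a duplicate means the same username on the same platform, compared case-insensitively. the function name and the rejection reasons are illustrative.

```python
import csv
import io

REQUIRED = ("platform", "username")

def clean_rows(csv_text):
    """Split spreadsheet rows into importable rows and rejects with a reason."""
    good, rejected, seen = [], [], set()
    for raw in csv.DictReader(io.StringIO(csv_text)):
        # strip whitespace; cells missing from a short row come back as None
        row = {k: (v or "").strip() for k, v in raw.items() if k is not None}
        missing = [f for f in REQUIRED if not row.get(f)]
        key = (row.get("platform", "").lower(), row.get("username", "").lower())
        if missing:
            rejected.append((row, "missing " + ", ".join(missing)))
        elif key in seen:
            rejected.append((row, "duplicate username"))
        else:
            seen.add(key)
            good.append(row)
    return good, rejected
```

import only the first list, and fix the second by hand before it goes anywhere near the database.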
step 3: assign proxies deterministically, not randomly
random proxy assignment sounds smart but it creates correlated failures. when assignment is random, accounts end up sharing proxies without you knowing which ones, and if two accounts share a proxy and that proxy gets flagged, both accounts go down at the same time.
assign one proxy per account, statically, and record the assignment. rotate the proxy only on explicit failure, not on a schedule. residential proxy rotation is a deeper topic, but the core rule is: stable assignment first, rotation as recovery.
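def get_proxy_for_account(account_id, db):
    # db is assumed to be a connection whose execute() returns a cursor (e.g. psycopg 3)
    # returns (host, port, username, password), or None if no proxy is assigned
    row = db.execute(
        "select p.host, p.port, p.username, p.password "
        "from proxies p join accounts a on a.proxy_id = p.id "
        "where a.id = %s", (account_id,)
    ).fetchone()
    return row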
if it breaks: if you run out of proxies (fewer proxies than accounts), you’ll need to share, but track shared accounts in a group and quarantine the group on failure.
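the group-and-quarantine rule can be sketched in plain Python. this assumes the assignments have been loaded into a dict of account id to proxy id, and it uses a "quarantined" status that is not in the schema above (an assumption, rename it to fit your own status values).

```python
from collections import defaultdict

def group_by_proxy(assignments):
    """Map proxy_id -> sorted account ids, given {account_id: proxy_id}."""
    groups = defaultdict(list)
    for account_id, proxy_id in assignments.items():
        groups[proxy_id].append(account_id)
    return {pid: sorted(ids) for pid, ids in groups.items()}

def quarantine_group(failed_account_id, assignments, statuses):
    """Mark every account sharing the failed account's proxy as quarantined."""
    proxy_id = assignments[failed_account_id]
    affected = group_by_proxy(assignments)[proxy_id]
    for account_id in affected:
        statuses[account_id] = "quarantined"  # hypothetical status value
    return affected
```

with one proxy per account the group has a single member, so the same code path covers both the shared and the unshared case.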
step 4: replace manual browser sessions with API-driven profile launches
both Multilogin and AdsPower expose REST APIs to start a browser profile and return a debugging port. Playwright can attach to that port. this is how you drive 100 browser profiles from a single script without clicking anything.
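import requests
from playwright.sync_api import sync_playwright

def launch_profile(profile_id: str) -> int:
    # endpoint and response shape differ between vendors and versions; check your browser's API docs
    resp = requests.get(
        "http://localhost:35000/api/v1/profile/start",
        params={"profile_id": profile_id},
        timeout=60,
    )
    resp.raise_for_status()  # fail loudly instead of parsing an error page
    data = resp.json()
    return data["data"]["debug_port"]

def get_browser(port: int):
    # attach Playwright to the already-running profile over the Chrome DevTools Protocol
    p = sync_playwright().start()
    browser = p.chromium.connect_over_cdp(f"http://localhost:{port}")
    return browser, p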
launch, act, close. don’t leave profiles open between tasks. open profiles consume memory fast and antidetect software has limits on concurrent sessions at each pricing tier.
if it breaks: if the API returns a port but Playwright can’t connect, check that the antidetect browser is actually running and not backgrounded into a tray state. this is a common issue on linux servers. run the browser in a persistent screen or tmux session.
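one cheap guard against the "port returned but nothing is listening yet" case is to poll the port before attaching. the helper below is a standard-library sketch. the name wait_for_port and the default timings are assumptions, not part of either vendor’s API.

```python
import socket
import time

def wait_for_port(port, host="127.0.0.1", timeout=20.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # a successful connect means the debug port is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet, retry until the deadline
    return False
```

call it between launch_profile and get_browser, and treat a False result as a failed launch rather than letting Playwright raise a connection error.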
step 5: build a task queue with Redis
at 100 accounts you can’t run tasks synchronously. you need a queue so work is distributed and retried automatically. RQ is the simplest Python-native option, built on Redis.
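pip install rq redis

from redis import Redis
from rq import Queue, Retry

# hypothetical module: your task function must be importable by the worker processes
from tasks import run_account_task

q = Queue(connection=Redis())

def enqueue_daily_tasks(accounts):
    for acct in accounts:
        # retry up to 3 times, waiting 10s, 30s, then 60s between attempts
        # (check whether your RQ version needs workers started with --with-scheduler for intervals)
        q.enqueue(run_account_task, acct["id"], retry=Retry(max=3, interval=[10, 30, 60]))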
run workers as separate processes. on my Hetzner box i run 8 workers in parallel, which handles 100 accounts in about 40 minutes for light tasks.
if it breaks: if jobs pile up and never complete, check worker logs. common causes are the antidetect browser API being unavailable, or a database connection pool being exhausted.
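the 8-worker, 40-minute figure above implies roughly three minutes per account, and the same arithmetic tells you when to add workers. a rough sketch, assuming every task takes about the same time and nothing is retried:

```python
import math

def batch_minutes(accounts, workers, minutes_per_task):
    """Wall-clock estimate: tasks run in ceil(accounts / workers) rounds."""
    rounds = math.ceil(accounts / workers)
    return rounds * minutes_per_task

def implied_minutes_per_task(accounts, workers, total_minutes):
    """Invert the estimate: average task length implied by an observed batch time."""
    return total_minutes / math.ceil(accounts / workers)
```

for example, implied_minutes_per_task(100, 8, 40) is a little over 3 minutes, and batch_minutes(1000, 8, 3) comes out at 375 minutes. that is why throughput becomes the bottleneck well before 1000 accounts.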
step 6: implement a health check loop
every account should be checked on a schedule, not just when you’re running tasks. a health check is a lightweight read, not a write. it tells you which accounts are logged out, flagged, or returning errors before you attempt something important.
import logging

def health_check(account_id, db):
    # get_profile_id is assumed to return accounts.profile_id for this account
    browser = p = None
    try:
        port = launch_profile(get_profile_id(account_id, db))
        browser, p = get_browser(port)
        # use the profile's own context so its cookies apply; a fresh context starts logged out
        page = browser.contexts[0].new_page()
        page.goto("https://platform.example.com/api/me", timeout=15000)
        status = page.evaluate("() => document.body.innerText")
        db.execute("update accounts set status=%s, last_action_at=now() where id=%s",
                   ("active" if "user" in status else "logged_out", account_id))
    except Exception:
        logging.exception("health check failed for account %s", account_id)
        db.execute("update accounts set status='error', last_action_at=now() where id=%s",
                   (account_id,))
    finally:
        # either may be None if the launch or the connection failed
        if browser:
            browser.close()
        if p:
            p.stop()
run health checks at off-peak hours. i run mine at 3am Singapore time.
if it breaks: if health checks themselves trigger rate limits, reduce frequency or randomise the timing per account within a window.
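spreading checks across a window can be done without storing a schedule. the sketch below derives a per-account, per-day offset from a hash, so each account lands at a different point in the window and the point moves from day to day. the function names and the one-hour default window are assumptions.

```python
import hashlib
from datetime import datetime, timedelta, timezone

def check_offset_seconds(account_id, day, window_seconds=3600):
    """Pseudo-random offset in [0, window_seconds), stable for one account on one day."""
    seed = f"{account_id}:{day.isoformat()}".encode()
    digest = hashlib.sha256(seed).digest()
    return int.from_bytes(digest[:8], "big") % window_seconds

def next_check_time(account_id, window_start, window_seconds=3600):
    """Time inside the window at which this account's health check should run."""
    offset = check_offset_seconds(account_id, window_start.date(), window_seconds)
    return window_start + timedelta(seconds=offset)
```

because the offset is derived rather than stored, a restarted scheduler recomputes the same times and doesn’t re-check accounts it has already handled that day.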
step 7: build a recovery workflow
accounts will fail. the question is whether you catch it in 10 minutes or 3 days. when an account hits a failure threshold (i use 3 consecutive failures), it should automatically: pause further tasks, flag for review, and rotate its proxy assignment. a re-login attempt with stored credentials is the optional fourth step.
don’t automate that re-login without careful thought. on some platforms, repeated login attempts from new IPs accelerate bans. if you’re not sure how the platform reacts, log the failure and queue a manual review task instead.
if it breaks: if your recovery workflow itself errors, you’ll have zombie accounts sitting in an undefined state. make sure the status field always gets written even when the recovery fails.
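the threshold logic, including the rule that a status is always written, fits in one function. this is a sketch over a plain dict rather than the database: the consecutive_failures counter, the recover callback, and the status names "paused", "needs_review", and "recovery_failed" are all assumptions layered on the schema above.

```python
FAILURE_THRESHOLD = 3

def record_outcome(account, succeeded, recover):
    """Update an account record after a task and pause it at the threshold."""
    if succeeded:
        account["consecutive_failures"] = 0
        return account
    account["consecutive_failures"] = account.get("consecutive_failures", 0) + 1
    if account["consecutive_failures"] >= FAILURE_THRESHOLD:
        # written before recovery starts, so the account never sits in an undefined state
        account["status"] = "paused"
        try:
            recover(account)  # e.g. rotate the proxy and queue a manual review
            account["status"] = "needs_review"
        except Exception:
            account["status"] = "recovery_failed"
    return account
```

whatever happens inside recover, the account ends in one of three known states, and anything still marked "paused" points to a crash in the middle of recovery.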
step 8: set up observability
at 100 accounts you need a dashboard. i use Grafana with a PostgreSQL data source because i already have the database. three panels cover 80% of what i need to see: account status breakdown (active / error / logged_out), task success rate over the last 24 hours, and proxy failure count by subnet.
Grafana’s PostgreSQL integration takes about 20 minutes to configure if you already have the data in the right shape.
if it breaks: if queries are slow, add indexes on status and last_action_at. a table scan across 100 rows is fast, but once you’re at 1000 it starts to matter.
common pitfalls
1. scaling proxies slower than accounts. operators often buy 20 proxies and stretch them across 60 accounts. this creates correlated failure, and when one /24 gets flagged you lose a third of your accounts at once. proxy count should stay close to account count, or you need explicit grouping and isolation.
2. not tracking proxy health separately. a proxy can be fine for general browsing but flagged on a specific platform. track failure counts per proxy per platform, not just overall.
3. running everything on one machine without process supervision. if a worker crashes at 2am, your tasks just don’t run. use systemd to manage worker processes so they restart automatically on failure.
4. no staging environment. every change you make at scale hits 100 accounts simultaneously. test automation changes on 5-10 accounts in a separate pool before rolling out. this is especially important for antidetect browser profile configurations. if you’re looking for comparisons on antidetect tooling, antidetectreview.org/blog/ has detailed breakdowns of how different browsers handle fingerprint injection.
5. treating all accounts equally. some accounts are older, warmer, and more valuable. apply more conservative automation to high-value accounts and treat newer ones as disposable. this is a simple flag in your accounts table, but most operators don’t add it until after they’ve burned a valuable account with aggressive automation.
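pitfall 2 is mostly a data-structure change: key the failure counter on the proxy and the platform together instead of on the proxy alone. a minimal in-memory sketch follows. the class name and the threshold of 3 are illustrative, and in practice this would live in a table next to proxies.

```python
from collections import Counter

class ProxyHealth:
    """Track consecutive failures per (proxy, platform) pair."""

    def __init__(self, max_failures=3):
        self.failures = Counter()
        self.max_failures = max_failures

    def record(self, proxy_id, platform, ok):
        key = (proxy_id, platform)
        if ok:
            self.failures[key] = 0  # a success resets the streak
        else:
            self.failures[key] += 1

    def usable(self, proxy_id, platform):
        # a proxy flagged on one platform stays usable on the others
        return self.failures[(proxy_id, platform)] < self.max_failures
```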
scaling this
10 to 100 accounts: the changes above get you here. the main investment is infrastructure time, not money. total monthly cost at this scale is roughly $200-300 (antidetect subscription, proxies, VPS).
100 to 1000 accounts: the bottleneck shifts from logic to throughput. you’ll need to shard your worker pool, possibly across multiple machines. Multilogin X’s team tier (around $299/month) supports more concurrent profiles. proxy costs become significant: $500-1000/month for residential at this scale. you’ll also need to think about credential storage properly, which means a secrets manager like HashiCorp Vault rather than a plaintext database field.
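one simple way to shard a worker pool is to hash each account id to a fixed queue name and have each machine’s workers listen on only their own queues. a sketch, assuming a fixed shard count and a hypothetical "accounts-N" naming scheme:

```python
import zlib

def shard_for(account_id, shard_count):
    """Stable shard index: the same account always maps to the same shard."""
    return zlib.crc32(str(account_id).encode()) % shard_count

def queue_name(account_id, shard_count, prefix="accounts"):
    """Name of the queue that should receive this account's jobs."""
    return f"{prefix}-{shard_for(account_id, shard_count)}"
```

with RQ, the name goes in as the first argument when you create the queue, for example Queue(queue_name(acct_id, 4), connection=Redis()). changing the shard count remaps accounts, so pick a number with headroom.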
1000+ accounts: at this point you’re dealing with enterprise infrastructure questions. horizontal scaling for workers, read replicas for your database, and likely a dedicated proxy provider relationship rather than retail plans. the automation logic itself rarely changes, but the operational overhead of managing failures at scale becomes its own full-time function. see our guide on automating account recovery workflows for what that looks like in practice.
the other thing that changes at 1000+ is detection surface. research from Princeton’s Web Transparency and Accountability Project has documented how platforms use behavioural signals, not just fingerprints, to identify automation. your browser fingerprint can be perfect and you can still get flagged for interaction patterns that don’t match human behaviour. the higher you go, the more you need to think about timing distributions, scroll patterns, and session length variance.
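as an illustration of what a "timing distribution" means in code: instead of a fixed sleep between actions, sample each pause from a right-skewed distribution so that most pauses are short and a few are long. the log-normal shape and every number below are assumptions for the sketch, not measured human behaviour, and varied timing alone does not make automation undetectable.

```python
import math
import random

def human_delay(rng, median_seconds=2.0, sigma=0.6, floor=0.3, cap=30.0):
    """Sample a pause in seconds: clustered near the median with a long right tail."""
    # for a log-normal distribution the median is exp(mu), so mu = ln(median)
    delay = rng.lognormvariate(math.log(median_seconds), sigma)
    return min(max(delay, floor), cap)

# usage: one generator per worker, one sample per action
rng = random.Random()
pause = human_delay(rng)
```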
where to go next
- how to set up proxy rotation without losing sessions: covers rotating residential proxies in a way that doesn’t log accounts out mid-session, including sticky session configuration.
- antidetect browser comparison: Multilogin vs AdsPower vs Gologin: a detailed breakdown of profile API quality, fingerprint coverage, and pricing at different account counts.
- /blog/: full index of operator guides on this site.
Written by Xavier Fok
disclosure: this article may contain affiliate links. if you buy through them we may earn a commission at no extra cost to you. verdicts are independent of payouts. last reviewed by Xavier Fok on 2026-05-19.