# CAPTCHAs

CAPTCHA pages mean the target site detected something off about your request — not necessarily a "bot," just that something doesn't add up. helodata supplies the IP; the rest is your client's job.

## Common detection signals (in order of importance)

1. **IP reputation** — has this IP been hammering the target recently?
2. **Header consistency** — UA, Accept-Language, Sec-CH-UA-\* must agree with each other and with the IP's country
3. **TLS fingerprint** — JA3 / JA4 hash matching a known bot library
4. **Behavior** — request pattern, timing, mouse / scroll on JS pages

## Quick wins

Roughly ordered by payoff for the effort involved:

### 1. Use a session, not rotation, for multi-step flows

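Bot detection tracks how requests are distributed across IPs, and whether one visitor's flow stays on the same IP. A sticky session looks like one user moving through the flow; rotating on every request looks like a million one-page users, and a login or checkout that hops IPs between steps is a classic bot signal.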

```
helo_s1a2b3c4d5e-type-res-region-us-session-job1-sesstime-30
```
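As a minimal sketch, the session string above can be assembled in code so that every step of a flow reuses the same session label. The key order is copied from the examples on this page; `build_username` and `proxies_for` are hypothetical helpers, and the gateway address and password are placeholders rather than real helodata values, so take the real ones from your dashboard.

```python
from typing import Optional
from urllib.parse import quote


def build_username(account: str, ip_type: str, region: str, session: str,
                   sesstime: int, asn: Optional[int] = None) -> str:
    """Assemble a proxy username in the key-value order shown on this page."""
    parts = [account, "type", ip_type, "region", region]
    if asn is not None:
        parts += ["asn", str(asn)]
    parts += ["session", session, "sesstime", str(sesstime)]
    return "-".join(parts)


def proxies_for(username: str, password: str, gateway: str) -> dict:
    """Build a requests-style proxies mapping for one sticky session."""
    # Percent-encode the credentials so special characters survive in the URL.
    url = f"http://{quote(username, safe='')}:{quote(password, safe='')}@{gateway}"
    return {"http": url, "https": url}


# Hypothetical password and gateway address, for illustration only.
username = build_username("helo_s1a2b3c4d5e", "res", "us", "job1", 30)
proxies = proxies_for(username, "PASSWORD", "gateway.example.com:8000")
```

Pass the same `proxies` mapping to every request in the flow (for example via `requests.Session().get(url, proxies=proxies)`) so all steps share one session label.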

### 2. Align headers with the IP geo

If your exit IP is in Germany, your headers should look German:

```
Accept-Language: de-DE,de;q=0.9,en;q=0.8
User-Agent: Mozilla/5.0 (... a modern browser UA from a major OS ...)
```

Mismatched `Accept-Language` (e.g. `zh-CN` from a US IP) is a strong signal.
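One way to keep the two in step is to derive `Accept-Language` from the same country code you request from the proxy. The sketch below assumes a small hand-maintained table keyed on a two-letter country code; the table contents and the helper names are illustrative, not part of any helodata API.

```python
# Illustrative mapping only: extend it with the countries you actually exit from.
ACCEPT_LANGUAGE_BY_COUNTRY = {
    "de": "de-DE,de;q=0.9,en;q=0.8",
    "us": "en-US,en;q=0.9",
    "fr": "fr-FR,fr;q=0.9,en;q=0.8",
    "jp": "ja-JP,ja;q=0.9,en;q=0.8",
}


def accept_language_for(country: str, default: str = "en-US,en;q=0.9") -> str:
    """Return an Accept-Language value matching the exit IP's country code."""
    return ACCEPT_LANGUAGE_BY_COUNTRY.get(country.lower(), default)


def headers_for_exit(country: str, user_agent: str) -> dict:
    """Pair a caller-supplied User-Agent with a language that fits the exit country."""
    return {
        "User-Agent": user_agent,
        "Accept-Language": accept_language_for(country),
    }
```

Falling back to a generic English value for unknown countries is a design choice for this sketch; failing loudly may be safer if a wrong language is worse than no request.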

### 3. Send a complete browser header set

A bare `requests.get()` sends fewer headers than any real browser. Pad it:

```python
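# Keep every header consistent with the browser named in the User-Agent.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ...",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,*/*;q=0.8",
    # Match the language to the exit IP's country (see the previous step).
    "Accept-Language": "en-US,en;q=0.9",
    # Only advertise "br" if a Brotli decoder (the `brotli` or `brotlicffi`
    # package) is installed; otherwise the body arrives undecoded.
    "Accept-Encoding": "gzip, deflate, br",
    "DNT": "1",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
    # Sec-Fetch-* values below describe a top-level navigation typed by the user.
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
}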
```
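A small pre-flight check can catch a thin header set before it reaches the target. The sketch below compares a header dict against a list of headers a desktop browser typically sends on a top-level navigation; that list is an assumption for illustration, not an authoritative or exhaustive set, and `missing_headers` is a hypothetical helper.

```python
# Headers a desktop browser typically sends on a top-level navigation.
# Assumed for this sketch; adjust to the browser you are imitating.
EXPECTED_NAVIGATION_HEADERS = (
    "User-Agent",
    "Accept",
    "Accept-Language",
    "Accept-Encoding",
    "Upgrade-Insecure-Requests",
    "Sec-Fetch-Dest",
    "Sec-Fetch-Mode",
    "Sec-Fetch-Site",
    "Sec-Fetch-User",
)


def missing_headers(headers: dict) -> list:
    """Return expected header names absent from `headers` (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_NAVIGATION_HEADERS if name.lower() not in present]


# A deliberately thin header set, to show what the check reports.
thin = {"user-agent": "example-client/1.0", "Accept": "*/*"}
print(missing_headers(thin))
```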

### 4. Use a JS engine for JS-heavy sites

Targets that fingerprint the browser or its behavior (`navigator.webdriver`, canvas, mouse movements) need a real browser. Use [Playwright](/integrations/scraping-tools/playwright.md) or [Puppeteer](/integrations/scraping-tools/puppeteer.md) with stealth plugins.
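As a rough sketch of the browser route, the snippet below drives headless Chromium through a proxy with Playwright's Python sync API. It omits any stealth plugin, and the gateway address and password are placeholders rather than real helodata values; `proxy_settings` and `fetch_html` are hypothetical helper names.

```python
def proxy_settings(username: str, password: str, gateway: str) -> dict:
    """Build the `proxy` argument accepted by Playwright's `launch()`."""
    return {"server": f"http://{gateway}", "username": username, "password": password}


def fetch_html(url: str, proxy: dict, locale: str = "en-US") -> str:
    """Load `url` in headless Chromium through the proxy and return the rendered HTML."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True, proxy=proxy)
        try:
            # `locale` sets navigator.language and the Accept-Language header.
            context = browser.new_context(locale=locale)
            page = context.new_page()
            page.goto(url, wait_until="domcontentloaded")
            return page.content()
        finally:
            browser.close()


# Hypothetical gateway address and password, for illustration only.
proxy = proxy_settings(
    "helo_s1a2b3c4d5e-type-res-region-us-session-job1-sesstime-30",
    "PASSWORD",
    "gateway.example.com:8000",
)
```

Calling `fetch_html("https://example.com", proxy, locale="en-US")` would then return the page after its scripts have run; pick the `locale` to match the exit IP's country.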

### 5. Switch to mobile

If the target trusts mobile devices more (a common pattern), switch from `type-res` to `type-mob`. Pair with a carrier ASN:

```
helo_s1a2b3c4d5e-type-mob-region-us-asn-6167-session-x-sesstime-60
```

## What about CAPTCHA solvers?

helodata doesn't solve CAPTCHAs. Common tools to pair with it:

* **2Captcha / Anti-Captcha / CapSolver**: pay-per-solve services that return the token
* **`puppeteer-extra-plugin-recaptcha`**: Puppeteer plugin that wires the token in
* **Antidetect browsers** ([AdsPower](/integrations/antidetect-browsers/adspower.md), [Multilogin](/integrations/antidetect-browsers/multilogin.md)): these handle the fingerprint side, leaving you to fetch the token

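For Cloudflare's invisible JS challenge, a real Chromium with stealth almost always beats a CAPTCHA solver. Error `1020` (Access denied) is a firewall-rule block rather than a challenge, so there is no token to solve; change the IP or the request signature instead.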

## When the target *just* hates the IP

If you've matched headers, fingerprint, and behavior and still see a CAPTCHA — that exit IP is burnt for this target. **Rotate**:

* Gateway: drop the session, get a fresh IP
* ISP: switch to another IP from your batch; consider requesting a replacement if it persists across multiple IPs from the same batch ([replace IP](/products/overview-2/ip-management.md#replace-a-dead-ip))
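For the gateway case, the rotate-on-CAPTCHA loop can be sketched as below. It assumes that a new `session` label makes the gateway assign a different exit IP, and that you supply a `fetch` function which performs the request with that label and returns the status code and body. The marker strings are heuristics, not a complete detector.

```python
import uuid
from typing import Callable, Optional, Tuple

# Heuristic markers only; real challenge pages vary by vendor and change over time.
CHALLENGE_MARKERS = ("g-recaptcha", "h-captcha", "cf-turnstile")


def looks_like_captcha(status: int, body: str) -> bool:
    """Guess whether a response is a CAPTCHA or block page rather than content."""
    lowered = body.lower()
    return status in (403, 429) or any(m in lowered for m in CHALLENGE_MARKERS)


def fetch_with_rotation(
    fetch: Callable[[str], Tuple[int, str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call `fetch(session_id)` with a fresh session label until a clean page comes back.

    `fetch` is a caller-supplied function that performs the request through the
    gateway using the given session label and returns (status_code, body).
    """
    for _ in range(max_attempts):
        # A new label is assumed to map to a fresh exit IP.
        session_id = uuid.uuid4().hex[:8]
        status, body = fetch(session_id)
        if not looks_like_captcha(status, body):
            return body
    # Every attempt was challenged, so the IP is probably not the only problem.
    return None
```

Capping the attempts matters: if several fresh IPs in a row are challenged, the cause is more likely headers, fingerprint, or behavior than IP reputation.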

## Coming soon

helodata's [Web Unblocker](/products/overview-4.md) handles all of the above transparently — JS rendering, header rotation, CAPTCHA, retries. Join the waitlist if this is the pattern you keep solving.

