
Introduction: The Modern Web Scraping Arms Race 🏹
You’ve been there. You meticulously craft a web scraper or browser automation script using Playwright, Puppeteer, or maybe good ol’ Selenium. It purrs like a kitten on your local machine, grabbing the data you need. Then, you deploy it to the wild, and… BAM! 🧱 403 Forbidden. CAPTCHA walls spring up. Cryptic messages demand you “Please prove you are human” or “Press and Hold”. Your bot, once a nimble data ninja, is now stuck in digital quicksand. Sound familiar? 😩

Welcome to the modern web scraping arms race. Websites aren’t just checking IP addresses or basic User-Agent strings anymore. They’re deploying sophisticated defense systems, with industry heavyweights like PerimeterX (now part of HUMAN Security) leading the charge. These systems employ advanced techniques like browser fingerprinting and behavioral analysis to distinguish legitimate human users from automated bots.
The game has changed significantly from the days when simple IP rotation and User-Agent spoofing were enough. Modern anti-bot systems delve deep into the characteristics and actions of the client connecting to the server. This evolution necessitates a more sophisticated approach from developers building scrapers and automation tools. Older, simpler evasion tactics are increasingly proving inadequate against these advanced defenses.
But don’t despair! This post is your field guide to navigating this complex battlefield. We’ll dissect how these detection systems work, focusing on fingerprinting techniques and the methods used by PerimeterX/HUMAN. Most importantly, we’ll explore practical strategies and code examples to help your scrapers blend in and avoid detection. Let’s dive in! 🏊‍♂️
What is Browser Fingerprinting? (And Why Websites Use It) 🤔
At its core, browser fingerprinting is a technique used by websites to identify and track users without relying on traditional methods like cookies. Instead of storing a unique identifier on your machine (like a cookie), fingerprinting collects a diverse set of characteristics that your browser and device naturally reveal. Think of things like your operating system, browser version, installed fonts, screen resolution, language settings, time zone, and even subtle details about how your hardware renders graphics or processes audio.

The magic happens when these individual pieces of information, often not unique on their own, are combined. Just like a human fingerprint is unique due to the specific pattern of ridges and whorls, the combination of dozens of browser and device characteristics can create a highly distinctive digital “fingerprint” or “hash”. It’s statistically rare for two different users on different devices to have the exact same combination of all these attributes.
The process typically works like this:
- A visitor lands on a webpage.
- A script, usually written in JavaScript, runs silently in the background.
- This script collects various data points exposed by the browser’s APIs or through specific tests (like asking the browser to render a hidden image).
- The collected data is often processed through a hashing function to generate a compact, unique identifier.
- This fingerprint hash is stored server-side and used to recognize the browser on subsequent visits or across different sites.
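To make that flow concrete, here’s a deliberately naive sketch of the collect-and-hash step: a handful of easily readable signals joined together and digested with the Web Crypto API. Real sensors gather far more data points; the function name and signal selection here are illustrative only.
Code (Browser JS - Naive Fingerprint Sketch):
// Gather a few high-level signals and hash them into a compact identifier.
async function naiveFingerprint() {
  const signals = [
    navigator.userAgent,
    navigator.language,
    navigator.platform,
    `${screen.width}x${screen.height}x${screen.colorDepth}`,
    Intl.DateTimeFormat().resolvedOptions().timeZone,
    navigator.hardwareConcurrency,
  ].join('||');
  // SHA-256 via the Web Crypto API (requires a secure context)
  const bytes = new TextEncoder().encode(signals);
  const digest = await crypto.subtle.digest('SHA-256', bytes);
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

naiveFingerprint().then((hash) => console.log('fingerprint:', hash));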
This “stateless” nature is what makes fingerprinting fundamentally different from cookies. Because no persistent data needs to be stored on the user’s device, it happens covertly, often without the user’s knowledge or explicit consent. Users generally lack easy controls to view, clear, or block this data collection, unlike cookies.
Why Websites Use Fingerprinting
Websites employ fingerprinting for several reasons:
- User Tracking & Analytics: Identifying unique and returning visitors for website analytics, even if they clear cookies or use private browsing modes.
- Personalization & Marketing: Tailoring website content, offers, or advertisements based on the inferred user profile or browsing history.
- Security & Fraud Prevention: Detecting malicious activities like account takeover attempts, payment fraud, or identifying users trying to circumvent restrictions by creating multiple accounts.
- Bot Detection: Our primary focus – distinguishing automated scripts (like scrapers) from genuine human users to protect resources and data.
This dual nature of fingerprinting—serving both tracking/marketing and security/anti-bot purposes—creates a complex landscape. Techniques designed to enhance user privacy by modifying or blocking fingerprinting signals can sometimes make a browser (or a bot mimicking one) stand out more if not implemented carefully, as they deviate from the expected patterns of typical users. Simply disabling fingerprinting APIs is often not a viable evasion strategy; the goal is usually to mimic a common, consistent, and realistic fingerprint.
Furthermore, browser fingerprints aren’t always perfectly unique or eternally stable. Browser updates, operating system patches, or changes in hardware configuration can alter a device’s fingerprint over time. Consequently, detection systems often rely on statistical identification rather than exact matches. Many advanced systems, like PerimeterX, calculate a “trust score” based on the fingerprint and other factors, rather than making a simple block/allow decision. This probabilistic approach means evasion isn’t necessarily about creating a single, perfect, unchanging fake fingerprint, but rather maintaining one that appears consistent enough and falls within the bounds of expected variations for a legitimate user profile.
Key Fingerprinting Vectors You Need to Know 🕵️‍♀️
A browser fingerprint isn’t monolithic; it’s a composite score built from numerous individual signals or data points. Anti-bot systems scrutinize these signals, looking for inconsistencies or characteristics typical of automated browsers. Understanding the most common vectors is crucial for effective evasion.

Here’s a breakdown of the heavy hitters:
User-Agent & HTTP Headers
What: The User-Agent string is a standard HTTP header identifying the browser, version, and operating system. Other headers like Accept-Language, Accept-Encoding, Connection, Referer, etc., provide additional context about the request and browser capabilities.
Why It Matters: Using default User-Agents from HTTP libraries (like python-requests or node-fetch) is an immediate red flag. Missing standard headers or inconsistencies (e.g., a Chrome UA with Firefox-specific headers) are easily detected.
Evasion: Rotate through a list of realistic, up-to-date User-Agent strings from common browsers. Crucially, ensure all associated headers are present and match the profile implied by the User-Agent. Tools like httpbin.org can help compare your scraper’s headers against a real browser’s.
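To see what consistent headers look like in practice, here’s a minimal Playwright sketch that pins a User-Agent with a matching locale and Accept-Language, then echoes back what the server actually received. The UA string is a stand-in; rotate current strings from your own list.
Code (Playwright - Header Consistency Sketch):
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const context = await browser.newContext({
    // Stand-in UA; rotate real, current strings in practice
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36',
    locale: 'en-US', // drives navigator.language / navigator.languages
    extraHTTPHeaders: {
      'Accept-Language': 'en-US,en;q=0.9', // must agree with the locale above
    },
  });
  const page = await context.newPage();
  await page.goto('https://httpbin.org/headers'); // echoes the headers the server saw
  console.log(await page.textContent('pre'));
  await browser.close();
})();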
Canvas Fingerprinting
What: Exploits the HTML5 <canvas> element. JavaScript instructs the browser to draw specific 2D graphics or text onto a hidden canvas. Minor variations in the underlying graphics hardware (GPU), drivers, operating system, and browser rendering engine cause the resulting image to differ slightly from device to device. The script then reads the pixel data of this rendered image (often using the toDataURL() method) and generates a hash from it.
Why It Matters: Highly effective due to its sensitivity to hardware and software stack variations, making it a popular and potent fingerprinting vector.
Evasion: Simply blocking the canvas API can be detected. Common techniques involve either adding random “noise” to the canvas pixel data before it’s read or intercepting the toDataURL() call and returning a predefined, fake image data string.
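For context, the collection side looks roughly like this: draw text and shapes on an off-screen canvas, then read the pixels back with toDataURL() for hashing. This is an illustrative sketch of the technique, not any vendor’s actual script.
Code (Browser JS - Canvas Collection Sketch):
// Draw text/shapes whose exact pixels vary across GPU/driver/OS/browser stacks
const canvas = document.createElement('canvas');
canvas.width = 200;
canvas.height = 50;
const ctx = canvas.getContext('2d');
ctx.textBaseline = 'top';
ctx.font = '14px Arial';
ctx.fillStyle = '#f60';
ctx.fillRect(0, 0, 100, 30);
ctx.fillStyle = '#069';
ctx.fillText('Hello, fingerprint! 😃', 2, 15); // emoji rendering differs per OS
const dataUrl = canvas.toDataURL(); // base64 pixel data; typically hashed server-side
console.log(dataUrl.slice(0, 64) + '...');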
WebGL Fingerprinting
What: Uses the Web Graphics Library (WebGL) API, designed for rendering 3D graphics in the browser. Similar to canvas fingerprinting, scripts instruct the browser to render complex 3D scenes off-screen. The way these scenes are rendered reveals detailed information about the GPU, graphics drivers, and other hardware capabilities. Specific parameters like UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL can expose the exact GPU model and vendor.
Why It Matters: Provides deep hardware-level insights with high entropy (uniqueness), making it a powerful identifier.
Evasion: Very challenging due to the complexity of the rendering pipeline. Disabling WebGL is an option but easily detectable and breaks sites that require it. Effective spoofing typically requires specialized tools or plugins that can provide consistent, realistic WebGL parameters matching a specific device profile. Manually patching requires extreme care.
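To appreciate what’s being spoofed, note that reading the unmasked GPU strings takes only a few lines on the detection side (illustrative sketch):
Code (Browser JS - WebGL Probe Sketch):
// What a detector sees: exact GPU vendor and renderer strings
const gl = document.createElement('canvas').getContext('webgl');
const ext = gl && gl.getExtension('WEBGL_debug_renderer_info');
if (ext) {
  console.log('Vendor:  ', gl.getParameter(ext.UNMASKED_VENDOR_WEBGL));
  console.log('Renderer:', gl.getParameter(ext.UNMASKED_RENDERER_WEBGL));
}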
Font Fingerprinting
What: Websites use JavaScript to detect the list of fonts installed on the user’s system. This can be done by iterating through a predefined list of font names and checking if the browser can render text using them, or by measuring the dimensions of rendered text. Operating systems come with default fonts, but users often install additional custom fonts.
Why It Matters: The specific combination of installed system and user fonts can be highly unique across different devices.
Evasion: Requires presenting a font list that is consistent with the operating system claimed in the User-Agent. Anti-detect browsers or stealth plugins often manage this by hiding unique fonts or providing a standard list. Avoid having unusual fonts in the scraper’s execution environment.
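The classic width-measurement probe shows why font sets leak: text rendered in a candidate font measures differently from the monospace fallback if, and only if, the font is installed. A sketch (hasFont is an illustrative helper name):
Code (Browser JS - Font Probe Sketch):
function hasFont(fontName) {
  const probe = document.createElement('span');
  probe.textContent = 'mmmmmmmmmmlli'; // wide glyphs amplify width differences
  probe.style.cssText =
    'position:absolute;left:-9999px;font-size:72px;font-family:monospace';
  document.body.appendChild(probe);
  const baselineWidth = probe.offsetWidth; // width with the fallback font
  probe.style.fontFamily = `'${fontName}', monospace`; // falls back if absent
  const installed = probe.offsetWidth !== baselineWidth;
  probe.remove();
  return installed;
}

console.log('Calibri installed:', hasFont('Calibri'));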
AudioContext Fingerprinting
What: Leverages the Web Audio API. A script generates a specific audio signal (often inaudible), processes it using an AudioContext object, and analyzes the resulting output waveform or frequency data. Subtle differences in the audio hardware (sound card, drivers) and software stack (OS, browser implementation, CPU architecture) lead to minute variations in the processed audio signal, which can be hashed into a fingerprint.
Why It Matters: Captures unique nuances of the device’s audio processing capabilities, adding another layer to the fingerprint.
Evasion: Involves adding random noise to the audio data or spoofing the output results. This is often handled by comprehensive stealth plugins or anti-detect browsers due to the complexity of manual patching.
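On the collection side, a common approach renders an inaudible oscillator through an OfflineAudioContext and reduces the output samples to a single number. Here’s a sketch of that idea; the exact parameter values are illustrative:
Code (Browser JS - AudioContext Probe Sketch):
async function audioSignature() {
  // Render 1 second of a 10 kHz triangle wave offline (nothing is played aloud)
  const ctx = new OfflineAudioContext(1, 44100, 44100);
  const osc = ctx.createOscillator();
  osc.type = 'triangle';
  osc.frequency.value = 10000;
  const compressor = ctx.createDynamicsCompressor(); // adds stack-specific nuance
  osc.connect(compressor);
  compressor.connect(ctx.destination);
  osc.start(0);
  const buffer = await ctx.startRendering();
  // Reduce a slice of samples to one value; tiny stack differences shift it
  return buffer
    .getChannelData(0)
    .slice(4500, 5000)
    .reduce((sum, sample) => sum + Math.abs(sample), 0);
}

audioSignature().then((sig) => console.log('audio signature:', sig));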
WebRTC Leaks
What: Web Real-Time Communication (WebRTC) is a set of APIs enabling peer-to-peer communication directly between browsers (e.g., for video calls). A side effect is that these APIs can expose the user’s local IP address (and sometimes the real public IP), even when a proxy or VPN is in use.
Why It Matters: A critical leak that can completely bypass IP masking attempts via proxies, revealing the scraper’s true origin IP.
Evasion: Disabling WebRTC entirely (though this itself can be a fingerprinting signal) or using browser extensions, specific browser configurations (like in Brave), or automation tool settings that control or prevent IP leakage through WebRTC.
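For Chromium-based automation, one common mitigation is constraining WebRTC candidate gathering with a launch flag, then verifying against a leak-test page. A sketch, assuming Chromium’s IP-handling-policy switch (double-check it against your browser version); the proxy URL is a placeholder:
Code (Playwright - WebRTC Leak Mitigation Sketch):
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({
    // Restrict WebRTC to the default public interface (Chromium policy flag)
    args: ['--force-webrtc-ip-handling-policy=default_public_interface_only'],
    proxy: { server: 'http://proxy.example.com:8000' }, // placeholder proxy
  });
  const page = await browser.newPage();
  await page.goto('https://browserleaks.com/webrtc'); // verify no local/real IPs leak
  await page.screenshot({ path: 'webrtc_leak_check.png' });
  await browser.close();
})();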
Navigator Properties
What: The window.navigator JavaScript object exposes a wealth of information about the browser environment. Key properties include navigator.platform (OS info), navigator.vendor (browser vendor), navigator.plugins and navigator.mimeTypes (installed plugins), navigator.deviceMemory, navigator.hardwareConcurrency (hardware specs), navigator.languages (preferred languages), and the notorious navigator.webdriver flag.
Why It Matters: Automated browsers controlled by tools like Selenium, Puppeteer, or Playwright often have tell-tale default values for these properties. For instance, navigator.webdriver is typically true in automated contexts but false or undefined in normal browsers. The plugins array might be empty in headless mode, unlike in a regular browser. Inconsistencies between these properties (e.g., platform showing ‘Linux’ while the User-Agent claims ‘Win32’) are strong indicators of spoofing or automation.
Evasion: Requires patching these properties using JavaScript injection (e.g., page.addInitScript in Playwright, page.evaluateOnNewDocument in Puppeteer) or relying on stealth plugins that handle these overrides automatically.
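To see why these defaults matter, here’s the kind of cheap consistency check a detection script might run in the page (an illustrative sketch, not PerimeterX’s actual logic):
Code (Browser JS - Detector-Style Consistency Check):
// A few inexpensive checks a detection script might perform
const report = {
  webdriver: navigator.webdriver === true,   // true under naive automation
  noPlugins: navigator.plugins.length === 0, // common in headless browsers
  uaClaimsWindows: /Windows/.test(navigator.userAgent),
  platformIsWindows: /Win/.test(navigator.platform),
};
// UA/platform disagreement is a strong spoofing signal
const inconsistent = report.uaClaimsWindows !== report.platformIsWindows;
if (report.webdriver || report.noPlugins || inconsistent) {
  console.log('Automation suspected:', report);
}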
IP Address & Network Info
What: The client’s public IP address reveals geolocation, ISP, and connection type (datacenter, residential, mobile). Furthermore, the characteristics of the network connection itself, particularly the Transport Layer Security (TLS) handshake used to establish a secure HTTPS connection, can be fingerprinted (known as TLS or JA3 fingerprinting). Different libraries and operating systems negotiate TLS differently.
Why It Matters: IPs originating from datacenters are highly suspicious and easily flagged by anti-bot systems. Excessive traffic from a single IP triggers rate limiting or blocks. A TLS fingerprint that matches a common HTTP library (like Python’s requests or Node.js’s https module) instead of a real browser is a dead giveaway for automation.
Evasion: Use high-quality residential or mobile proxies to make traffic appear to originate from real user devices. Rotate IPs frequently to distribute load and avoid rate limits. For TLS fingerprinting, use tools designed to mimic browser TLS handshakes (like curl-impersonate) or leverage browser automation tools (Puppeteer, Playwright) which inherently use the browser’s TLS stack.
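A minimal rotation pattern: launch each session through a different proxy and confirm the exit IP actually changed. The proxy endpoints and credentials below are placeholders for whatever your provider issues.
Code (Playwright - Proxy Rotation Sketch):
const { chromium } = require('playwright');

// Placeholder proxies - substitute your provider's residential/mobile endpoints
const proxies = [
  { server: 'http://res-proxy-1.example.com:8000', username: 'user', password: 'pass' },
  { server: 'http://res-proxy-2.example.com:8000', username: 'user', password: 'pass' },
];

(async () => {
  for (const proxy of proxies) {
    const browser = await chromium.launch({ proxy });
    const page = await browser.newPage();
    await page.goto('https://httpbin.org/ip'); // shows the exit IP the server sees
    console.log(await page.textContent('pre'));
    await browser.close();
  }
})();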
Screen Resolution, Timezone, Language, etc.
What: JavaScript can access various configuration details like screen dimensions (screen.width, screen.height), color depth (screen.colorDepth), system time zone (Intl.DateTimeFormat().resolvedOptions().timeZone), and preferred languages (navigator.language, navigator.languages).
Why It Matters: These contribute to the overall uniqueness of the fingerprint. More importantly, inconsistencies between these values and other signals are suspicious. For example, a timezone inconsistent with the geolocation derived from the IP address, or navigator.languages not matching the Accept-Language HTTP header, raises flags. Uncommon screen resolutions might also indicate a virtualized or headless environment.
Evasion: Set realistic values for viewport size, timezone, and language using browser launch options or JavaScript patching. Ensure these settings are consistent with the assumed profile (User-Agent, IP geolocation).
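In Playwright, these values can all be set coherently when the context is created. A sketch assuming a US-East residential proxy profile:
Code (Playwright - Consistent Profile Sketch):
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const context = await browser.newContext({
    viewport: { width: 1920, height: 1080 }, // a common desktop resolution
    locale: 'en-US',                         // must agree with Accept-Language
    timezoneId: 'America/New_York',          // must agree with the proxy's geolocation
  });
  const page = await context.newPage();
  await page.goto('https://browserleaks.com/javascript'); // eyeball the reported values
  await page.screenshot({ path: 'profile_consistency.png' });
  await browser.close();
})();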
The sheer number and diversity of these signals highlight a critical point: effective evasion requires consistency. Just changing the User-Agent isn’t enough. The navigator.platform, WebGL renderer, default fonts, typical HTTP headers, and even the TLS handshake should all align with the claimed browser and OS profile. Discrepancies across these different layers are what sophisticated detection systems excel at finding.
Furthermore, the effectiveness of fingerprinting hinges on entropy – the measure of how much variability or uniqueness a signal provides across a population. Complex vectors like Canvas rendering, WebGL details, and comprehensive font lists generally possess higher entropy than simpler signals like the basic User-Agent string. This explains why anti-bot systems invest heavily in analyzing these high-entropy signals, and consequently, why robust evasion strategies must address them effectively.
To help consolidate this information, here’s a summary of these common fingerprinting vectors:
| Vector | Data Collected | Why Useful for Detection | Common Evasion Approach |
|---|---|---|---|
| User-Agent & Headers | Browser/OS/Version, Language, Encoding, etc. | Default library UAs, missing/inconsistent headers are giveaways | Rotate realistic UAs, ensure header consistency with profile |
| Canvas Fingerprinting | Pixel data from rendering hidden 2D graphics/text | Sensitive to GPU/driver/OS/browser variations, high entropy | Add noise to pixel data, return fake toDataURL() result, use plugins |
| WebGL Fingerprinting | Details from rendering hidden 3D scenes (GPU model/vendor, drivers, params) | Deep hardware info, very high entropy | Difficult; use plugins/tools (e.g., playwright-with-fingerprints), careful patching |
| Font Fingerprinting | List of installed system/user fonts | Combination of fonts can be highly unique | Mask unique fonts, provide standard list consistent with OS via plugins/tools |
| AudioContext Fingerprint | Output from processing standardized audio signal | Captures subtle audio hardware/software stack differences | Add noise, spoof results, use plugins. Manual patching is complex |
| WebRTC Leaks | Local/Public IP address exposure via WebRTC APIs | Bypasses proxy/VPN masking, reveals true origin IP | Disable WebRTC (detectable), use browser settings/extensions/plugins to control leak |
| Navigator Properties | webdriver, platform, plugins, languages, vendor, hardware specs | Automation tools have defaults (webdriver=true, empty plugins); inconsistencies | Patch properties via JS injection (evaluateOnNewDocument), use stealth plugins |
| IP Address & Network | Geolocation, ISP, connection type (datacenter/res/mobile), TLS handshake | Datacenter IPs easily flagged, high traffic rates suspicious, library TLS signatures | Use rotating residential/mobile proxies, mimic browser TLS |
| Screen/Timezone/Lang | Screen resolution, color depth, timezone, language settings | Contribute to uniqueness, inconsistencies with other signals (IP, headers) are flags | Set realistic, consistent values via launch options or JS patching |
Enter the Boss Level: PerimeterX (HUMAN Security) 🛡️
Now that we understand the building blocks of browser fingerprinting, let’s talk about one of the major players putting these techniques into practice: PerimeterX, now operating under the HUMAN Security brand. You’ll find their defenses guarding the gates of many popular websites across e-commerce, travel, finance, and more. If you’re hitting persistent blocks that seem more sophisticated than simple IP bans, there’s a good chance you’re dealing with HUMAN.

HUMAN Security doesn’t rely on a single trick. It employs a multi-layered, sophisticated approach to bot detection, making it a formidable challenge for scrapers. Here’s how they typically operate:
Advanced Fingerprinting
HUMAN utilizes a wide array of the fingerprinting vectors we discussed, collecting hundreds of signals through its client-side JavaScript sensor (often identifiable by network requests to paths containing /_px or cookies like _px3). This includes:
- Detailed JavaScript fingerprinting (Canvas, WebGL, Audio, Fonts, Navigator properties, etc.)
- IP address analysis (reputation, type - datacenter vs. residential, geolocation consistency, traffic volume)
- HTTP header scrutiny (consistency, presence of expected headers, order)
- TLS handshake analysis
- Device features (rendering capabilities, window objects, plugins/extensions)
- Attacker-specific signatures and comparison against known bad actor profiles
- Their own cookie-based identifier (HUMAN ID) used alongside stateless fingerprinting
Behavioral Analysis (The Game Changer)
This is where HUMAN truly differentiates itself. It doesn’t just look at what the browser is; it meticulously analyzes how the browser behaves over time. Using machine learning models trained on massive datasets of real human and bot interactions (reportedly over 200 ML algorithms as of late 2022), HUMAN looks for anomalies and patterns indicative of automation. Key behavioral signals include:
- Mouse Movements: Trajectory, speed, acceleration/deceleration, click patterns, time between mousedown/mouseup, use of the mouse as a reading aid. Real human movement is often described as “chaotic” or “organic” compared to the more predictable or non-existent movements of bots.
- Keyboard Interactions: Typing speed, rhythm, cadence, intervals between keydown/keyup events. Humans exhibit specific patterns, like typing subsequent identical letters faster.
- Scrolling: Speed, patterns, correlation with reading speed.
- Touch Events: Tapping patterns, pressure (if available) on mobile devices.
- Navigation Patterns: Humans tend to browse in somewhat unpredictable ways, while bots often follow linear paths or access URLs directly.
- Interaction Timing & Cadence: Latency between actions, overall session duration (bot sessions are often very short or unnaturally long). Humans exhibit natural delays as they visually process information and react.
- Resource Loading: Whether the client loads images, CSS, and other resources like a normal browser.
Predictive Modeling & Dynamic Trust Score
HUMAN’s cloud-based detector processes these hundreds of fingerprinting and behavioral signals in real-time. It combines them to calculate a dynamic risk or “trust” score for each visitor session. This score isn’t static; it’s continuously updated based on ongoing interactions and behavior. Based on this score, HUMAN decides the appropriate action:
- Allow: If the score indicates a legitimate human user.
- Challenge: If the score is uncertain, present a CAPTCHA or a specific challenge like the “Press and Hold” button.
- Block: If the score strongly indicates a bot, deny access, often with a 403 Forbidden error.
HUMAN might also employ honeypots or serve deceptive content to further confirm bot activity if suspicion arises.

The proactive nature of HUMAN’s sensor, collecting live data, combined with its reliance on continuously updated ML models for real-time behavioral analysis, makes it significantly harder to bypass than static WAF rules or simpler fingerprinting checks. There’s no fixed set of rules to crack; the system learns and adapts.
Crucially, the heavy emphasis on behavioral analysis represents a higher barrier than fingerprinting alone. Even if a scraper manages to present a perfect, seemingly legitimate browser fingerprint (perhaps by using a real profile), unnatural interaction patterns—like clicking links instantly, typing at superhuman speed, or navigating pages in a perfectly linear sequence—can still betray its automated nature and cause the trust score to plummet during the session. Passing the initial fingerprint check is merely the first hurdle; surviving the ongoing behavioral scrutiny is the real challenge.
Fighting Back: Anti-Fingerprinting Techniques for Devs 🤺
Facing down systems like HUMAN Security can feel daunting, but developers aren’t powerless. By understanding the detection vectors and employing a combination of techniques, you can significantly improve your scraper’s chances of flying under the radar. We’ll focus on strategies applicable to popular automation tools like Playwright and Puppeteer, which offer the necessary control over the browser environment.

Laying the Foundation: Proxies and Headers
Before diving into complex JavaScript patching, get the basics right:
High-Quality Proxies: Essential for masking your scraper’s origin IP and distributing load.
- Type: Prioritize Residential or Mobile proxies. These IPs belong to real consumer devices and ISPs, making them far less suspicious than easily identifiable Datacenter IPs.
- Rotation: Use a large pool of proxies and rotate them frequently. For tasks requiring session persistence (like logins), use “sticky” sessions that maintain the same IP for a short duration, but still rotate IPs periodically across different sessions. Avoid free or public proxies – they are unreliable and quickly banned. Reputable providers are key.
Realistic HTTP Headers: Don’t let default headers give you away.
- User-Agent: Maintain a list of current, common User-Agent strings (e.g., latest Chrome, Firefox, Safari on various OSs) and rotate them.
- Consistency: This is critical. Ensure all other standard headers (Accept, Accept-Language, Accept-Encoding, Sec-Ch-Ua client hints, etc.) are present and match the browser profile implied by the selected User-Agent. Use tools like httpbin.org to verify your scraper’s headers against a real browser’s request.
Patching Leaks in Automation Tools (Playwright/Puppeteer)
Headless browsers controlled by automation tools often leak information through JavaScript properties. Patching these leaks is crucial.
The navigator.webdriver Flag: The classic giveaway. It’s true in automated browsers.
Fix: Use JavaScript injection to override the getter and make it return false.
// Patch navigator.webdriver to return false (inject before any page script runs,
// e.g. via evaluateOnNewDocument in Puppeteer or addInitScript in Playwright)
Object.defineProperty(navigator, 'webdriver', { get: () => false });
Spoofing Canvas Fingerprints: To counter rendering variations.
Fix 1 (Fake Data): Intercept toDataURL() calls matching known fingerprinting dimensions/types and return a static, pre-generated base64 string representing a common canvas result. Simple, but a widely reused fake value can itself become a known signature.
Fix 2 (Noise): Modify the canvas pixel data slightly (add random noise) before toDataURL() is called. Aims for plausible variation. More complex to implement correctly.
Code (Puppeteer - Fake Data Example):
// Inject on new document using evaluateOnNewDocument
await page.evaluateOnNewDocument(() => {
  const originalToDataURL = HTMLCanvasElement.prototype.toDataURL;
  HTMLCanvasElement.prototype.toDataURL = function (type, encoderOptions) {
    // Detect fingerprinting attempt (example dimensions from research)
    if (type === 'image/png' && this.width === 209 && this.height === 25) {
      console.log('Faking canvas fingerprint!');
      // Return a pre-determined fake base64 image string
      return 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII=';
    }
    // Otherwise, call the original function
    return originalToDataURL.apply(this, arguments);
  };
});
Note: Tools like puppeteer-with-fingerprints automate this.
Spoofing WebGL Fingerprints: Tackling GPU/driver details.
Fix: This is complex. Disabling WebGL is detectable. Manual patching requires overriding specific parameters (UNMASKED_VENDOR_WEBGL, UNMASKED_RENDERER_WEBGL), attributes, extensions, and shaders with values that are consistent with a real hardware profile. Random values will fail. Using specialized plugins/tools (playwright-with-fingerprints, Camoufox, Kameleo) that apply verified fingerprints is highly recommended.
Code (Conceptual JS - Manual Patching - Highly Simplified & Risky):
// Inject on new document - VERY simplified concept
await page.evaluateOnNewDocument(() => {
  try {
    const getParameter = WebGLRenderingContext.prototype.getParameter;
    WebGLRenderingContext.prototype.getParameter = function (parameter) {
      const ext = this.getExtension('WEBGL_debug_renderer_info');
      if (ext) {
        if (parameter === ext.UNMASKED_VENDOR_WEBGL) {
          return 'Intel Inc.'; // Spoofed vendor - MUST BE CONSISTENT
        }
        if (parameter === ext.UNMASKED_RENDERER_WEBGL) {
          // Spoofed renderer - MUST MATCH VENDOR & PROFILE
          return 'ANGLE (Intel Inc., Intel(R) Iris(TM) Plus Graphics OpenGL Engine, OpenGL 4.1)';
        }
      }
      return getParameter.apply(this, arguments);
    };
    // ... MANY other parameters and functions would need patching ...
  } catch (e) {
    console.error('WebGL spoofing failed:', e);
  }
});
Tackling Font & Audio Fingerprints: Addressing unique font sets and audio nuances.
Fix: Use plugins (puppeteer-extra-plugin-stealth, Camoufox) or anti-detect tools to manage fonts (presenting a standard OS list). For audio, add noise or use plugins (playwright-with-fingerprints, stealth plugins) to modify the AudioContext signature.
Patching Other Leaks (Permissions, Plugins, Languages, etc.): Closing remaining gaps.
Fix: Mock navigator.plugins and navigator.mimeTypes with realistic data. Patch Notification.permission based on HTTPS status. Recreate the window.chrome object if needed (for Chrome). Override navigator.languages to match Accept-Language header. Use browser launch arguments like ignoreDefaultArgs (Puppeteer/Playwright) or excludeSwitches (Selenium) to remove revealing flags. Ensure language consistency across headers, JS, and Intl objects. Stealth plugins aim to cover many of these.
Code (Puppeteer - Mocking plugins & languages):
// Inject on new document using evaluateOnNewDocument
await page.evaluateOnNewDocument(() => {
  // Mock navigator.plugins with realistic data
  Object.defineProperty(navigator, 'plugins', {
    get: () => [
      {
        name: 'Chrome PDF Viewer',
        filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai',
        description: '',
        mimeTypes: [{ type: 'application/pdf', suffixes: 'pdf', description: '' }],
      },
      // Add more common plugins if necessary for realism
    ],
  });
  // Mock navigator.language / navigator.languages to match the expected locale (e.g., 'en-US')
  Object.defineProperty(navigator, 'language', {
    get: () => 'en-US', // Should match Accept-Language header
  });
  Object.defineProperty(navigator, 'languages', {
    get: () => ['en-US', 'en'], // Should match Accept-Language header
  });
});
Leveraging Stealth Plugins & Tools
Manually patching every potential leak is tedious and error-prone. This is where stealth plugins and specialized tools come in, bundling multiple evasions.

Popular Options:
- puppeteer-extra-plugin-stealth: Widely used for Puppeteer. Applies numerous patches for navigator.webdriver, UA, WebGL, plugins, codecs, permissions, etc.
- playwright-stealth: A community plugin that aims to bring comparable bundled evasions to Playwright.
- playwright-with-fingerprints / puppeteer-with-fingerprints: Use the FingerprintSwitcher service to fetch and apply real browser fingerprints (Canvas, WebGL, Audio, Fonts, Navigator props, Screen, etc.) to Playwright/Puppeteer instances. Offers high fingerprint realism but has limitations (free tier Windows-only, doesn’t handle behavior).
- undetected-chromedriver: A patched version of ChromeDriver for Selenium designed to be less detectable.
- Patchright: A drop-in replacement patch for Playwright aiming for better undetectability.
- Camoufox: A stealth-focused Firefox build packaged with a Playwright-like Python API, featuring extensive fingerprint spoofing (including fonts, WebGL if enabled) and stealth patches.
How They Work
These tools typically use a combination of JavaScript injection to override properties, modification of browser launch arguments, and potentially request interception to present a more human-like profile.
Effectiveness & Limitations
Stealth plugins can significantly improve success rates against websites using basic or intermediate fingerprinting checks, often allowing passage through tests like bot.sannysoft.com or improving scores on CreepJS. However, they are often not sufficient on their own against advanced, multi-layered systems like PerimeterX/HUMAN, which incorporate sophisticated behavioral analysis or detect subtle inconsistencies missed by the plugins. Furthermore, plugins can become outdated as detection techniques evolve, or they might even introduce their own detectable artifacts. There is no silver bullet; even the best current plugins can be detected by sufficiently advanced systems.
Code Example (puppeteer-extra-plugin-stealth):
// npm install puppeteer-extra puppeteer-extra-plugin-stealth
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

// Add the stealth plugin - it applies multiple evasions automatically
puppeteer.use(StealthPlugin());

(async () => {
  // Launch Puppeteer as usual, but use the 'puppeteer-extra' instance
  const browser = await puppeteer.launch({ headless: 'new' }); // Recommended: use 'new' headless mode
  const page = await browser.newPage();
  console.log('Testing page with Stealth plugin enabled...');
  await page.goto('https://bot.sannysoft.com/'); // A common fingerprinting test site
  await page.screenshot({ path: 'sannysoft_stealth_test.png' });
  console.log('Screenshot saved to sannysoft_stealth_test.png');
  await browser.close();
})();
This example shows the simplicity of adding the plugin. It automatically applies patches listed in its evasion modules.
Code Example (playwright-with-fingerprints):
// npm install playwright playwright-with-fingerprints
// npx playwright install chromium (or other browsers)
const { plugin } = require('playwright-with-fingerprints');

// Instantiate the plugin service. Use an empty string '' for the free key.
plugin.setServiceKey('');

(async () => {
  try {
    // 1. Fetch a real browser fingerprint from the FingerprintSwitcher service
    console.log('Fetching a Chrome on Windows fingerprint...');
    const fingerprint = await plugin.fetch({ tags: ['Microsoft Windows', 'Chrome'] });
    // The 'fingerprint' variable now holds a string containing all necessary spoofing data.
    // You could save this string to a file and reuse it later with plugin.useFingerprint().

    // 2. Apply the fetched fingerprint before launching the browser
    plugin.useFingerprint(fingerprint);
    console.log('Fingerprint applied.');

    // 3. Launch the browser using the plugin's launch method
    const browser = await plugin.launch();
    const page = await browser.newPage();
    console.log('Testing page with fingerprint plugin...');
    await page.goto('https://browserleaks.com/canvas'); // Test canvas fingerprint
    const canvasSignature = await page.$eval('#crc', (el) => el.innerText);
    console.log(`Canvas Signature: ${canvasSignature}`); // Should differ on subsequent runs with new fingerprints
    await page.screenshot({ path: 'browserleaks_fingerprint_test.png' });
    await browser.close();
  } catch (error) {
    console.error('Error using playwright-with-fingerprints:', error);
  }
})();
This example demonstrates fetching and applying a real fingerprint profile. While powerful for mimicking static properties, remember its limitations regarding behavioral analysis and OS support in the free tier.
Another factor to consider is the headless vs. headful trade-off. Running automation tools in standard headless mode is faster, uses fewer resources, and scales better. However, headless browsers have inherent differences that make them easier to detect. Running in headful mode (even within a virtual display environment like Xvfb on Linux to hide the UI) often significantly improves detection scores on fingerprinting tests like CreepJS, sometimes achieving near-perfect scores where headless fails. This presents a choice: optimize for performance and scalability (headless, higher detection risk) or for stealth (headful, slower, more resource-intensive).
Here’s a comparative overview of some common stealth tools and plugins:
| Tool/Plugin Name | Base Library | Key Evasion Features | Effectiveness Notes | Limitations |
|---|---|---|---|---|
| puppeteer-extra-plugin-stealth | Puppeteer (via extra) | Bundles many evasions (webdriver, UA, WebGL, plugins, codecs, iframe, permissions, etc.) | Good against basic/intermediate FP tests. Can be detected by advanced systems/tests (e.g., Cloudflare, CreepJS) | May become outdated; not foolproof against sophisticated behavioral analysis |
| playwright-stealth | Playwright | Aims to provide similar bundled evasions for Playwright | Effectiveness varies; community maintenance might lag behind Puppeteer version | Less mature/documented than Puppeteer counterpart |
| playwright-with-fingerprints | Playwright | Applies real browser fingerprints (Canvas, WebGL, Audio, Fonts, Navigator, Screen etc.) via FingerprintSwitcher service | Excellent for realistic static fingerprint. Passes basic FP tests | Free tier Windows only. Doesn’t handle behavioral analysis. Insufficient alone against advanced anti-bots like HUMAN |
| puppeteer-with-fingerprints | Puppeteer | Applies real browser fingerprints (as above) for Puppeteer | Excellent for realistic static fingerprint. Passes basic FP tests | Free tier Windows only. Doesn’t handle behavioral analysis |
| undetected-chromedriver | Selenium (ChromeDriver) | Patches ChromeDriver to remove known automation tells | Improves headless score vs CreepJS but still detected. Headful mode better | Focuses on ChromeDriver specifics; behavioral analysis still a factor |
| Patchright | Playwright | Drop-in replacement patch for Playwright aiming for undetectability | Improves headless score vs CreepJS but still detected. Headful mode better | Effectiveness depends on patches applied; behavioral analysis remains |
| Camoufox | Playwright-like API | Custom Firefox build + Python API. Extensive spoofing (Navigator, Screen, Geo, Fonts, WebGL optional), stealth patches | Aims for high stealth, good scores vs CreepJS reported | Firefox-based. WebGL spoofing needs manual config (no rotation library). Requires separate browser build |
Putting It All Together: Strategies Against PerimeterX/HUMAN 🧩
So, how do you apply all this knowledge specifically against a sophisticated system like PerimeterX/HUMAN? It requires a multi-layered strategy addressing both its fingerprinting capabilities and its crucial behavioral analysis component.

Here’s a layered approach:
Build a Solid Foundation
- Proxies: Non-negotiable. Use high-quality, rotating residential or mobile proxies from a reputable provider. Distribute your requests across a large pool.
- Headers: Ensure realistic, consistent, and rotated HTTP headers, including User-Agent and all associated headers (Accept-Language, Sec-Ch-Ua, etc.), matching a common browser profile.
Forge a Realistic Fingerprint
- Stealth Plugins: Use a robust plugin like puppeteer-extra-plugin-stealth or playwright-with-fingerprints as your starting point to cover the most common leaks and apply realistic properties.
- Consistency is Key: Whether using plugins or manual patching, ensure all fingerprint elements (UA, platform, WebGL renderer, fonts, screen size, language, timezone) align coherently. Avoid random or contradictory values.
- Consider Advanced Tools: Explore options like playwright-with-fingerprints for applying verified real fingerprints, or look into commercial scraping APIs/browsers designed specifically to handle advanced fingerprinting.
Mimic Human Behavior (Crucial for HUMAN)
This is arguably the hardest part, but it’s essential for bypassing ML-based behavioral detection; a code sketch follows the list below.
- Timing: Introduce realistic, randomized delays between actions like page loads, clicks, and typing. Avoid fixed sleep() calls. Humans don’t act with perfect, uniform timing.
- Mouse Movements: Simulate plausible mouse movements. Even random movements across the page are better than none. Move the cursor towards elements before clicking.
- Scrolling: Implement natural scrolling behavior (e.g., scroll down the page gradually) instead of instantly jumping to elements.
- Interaction: Interact with page elements logically. Don’t just extract data; click buttons, navigate menus occasionally if it fits a human pattern.
- Navigation: Avoid hitting target data pages directly every time. Simulate a more natural browsing path by visiting intermediate pages (e.g., homepage -> category page -> product page).
- Resource Loading: Ensure your browser automation setup loads necessary resources like images and CSS, as real browsers do. Headless modes sometimes optimize by skipping these.
- Warm-up: Consider having scrapers perform some innocuous actions (like visiting the homepage or browsing non-critical sections) before attempting to access sensitive data or perform actions that might trigger heightened scrutiny.
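As promised, here’s a Puppeteer sketch of the ideas above: randomized pacing, off-center clicks with gradual cursor travel, and uneven scrolling. The helper names (humanishClick, humanishScroll) and timing ranges are illustrative, not tuned against any particular detector.
Code (Puppeteer - Humanized Interaction Sketch):
// Helpers expect a Puppeteer `page`; call them instead of bare page.click()
const pause = (min, max) =>
  new Promise((resolve) => setTimeout(resolve, min + Math.random() * (max - min)));

async function humanishClick(page, selector) {
  const element = await page.$(selector);
  const box = await element.boundingBox();
  // Aim slightly off-center, like a real cursor would land
  const x = box.x + box.width * (0.3 + Math.random() * 0.4);
  const y = box.y + box.height * (0.3 + Math.random() * 0.4);
  await page.mouse.move(x, y, { steps: 15 + Math.floor(Math.random() * 20) });
  await pause(80, 250); // brief hesitation before clicking
  await page.mouse.down();
  await pause(40, 120); // humans hold the button for a moment
  await page.mouse.up();
}

async function humanishScroll(page, totalPx) {
  let scrolled = 0;
  while (scrolled < totalPx) {
    const step = 80 + Math.floor(Math.random() * 220); // uneven increments
    await page.mouse.wheel({ deltaY: step });
    scrolled += step;
    await pause(150, 600); // pauses as if reading
  }
}

module.exports = { humanishClick, humanishScroll };
Swap these helpers in wherever your script would otherwise call page.click() or jump straight to a scroll offset; the goal is to avoid the instant, perfectly uniform actions that behavioral models flag.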
Scale with Rotation
To combat behavioral profiling over time, distribute your scraping tasks across many different IP addresses and many different browser fingerprint profiles. Avoid establishing a long-term, recognizable pattern associated with a single identity.
Adapt and Evolve
PerimeterX/HUMAN and other anti-bot systems are constantly updated. Monitor your scrapers for new blocking patterns, error messages, or challenges. Stay informed about new detection techniques and evasion tools by following security blogs and developer communities. Be prepared to adjust your strategies.
The strong emphasis HUMAN places on behavioral analysis leads to a significant realization: simply having an undetectable fingerprint is often not enough. A bot that looks perfectly human statically but acts robotically will likely still be caught. Passing the initial checks gets you in the door, but mimicking human interaction patterns is key to staying there.
This complexity—managing rotating proxies, maintaining consistent and realistic fingerprints across dozens of parameters, and simulating nuanced human behavior—is substantial. It helps explain the growing popularity of commercial web scraping APIs and services (like ZenRows, ScrapFly, Bright Data, Oxylabs, ScraperAPI). These platforms often abstract away the anti-bot bypass complexities, promising reliable access even to heavily protected sites by handling the proxy rotation, fingerprint generation, and sometimes even the CAPTCHA solving and behavioral aspects themselves. For developers needing data without becoming full-time anti-bot experts, these integrated solutions represent an increasingly attractive alternative to DIY scraping against top-tier defenses.
Conclusion: Staying Stealthy in 2025 (It’s a Marathon, Not a Sprint) 🏃💨
Navigating the world of web scraping in 2025 means confronting increasingly sophisticated defenses. Systems like PerimeterX/HUMAN Security have moved far beyond simple IP blocks, employing intricate browser fingerprinting (analyzing Canvas, WebGL, audio, fonts, navigator properties, and more) coupled with powerful machine learning-driven behavioral analysis.

Successfully bypassing these systems requires a multi-faceted approach:
- Foundation: Start with high-quality rotating residential/mobile proxies and meticulously crafted, consistent HTTP headers.
- Fingerprint Realism: Utilize robust stealth plugins (like puppeteer-extra-plugin-stealth or playwright-with-fingerprints) or carefully implement manual patches to present a coherent and common browser profile, paying close attention to consistency across all signals.
- Behavioral Mimicry: This is paramount against systems like HUMAN. Implement randomized delays, simulate mouse movements and scrolling, adopt natural navigation patterns, and ensure resource loading mirrors real browser behavior.
Remember, bot detection and evasion is a continuous cat-and-mouse game. Techniques evolve on both sides. What works today might be detected tomorrow. Continuous learning, experimentation, monitoring your scrapers’ success rates, and adapting your strategies are essential for long-term success.
While the challenge is significant, armed with the right knowledge and tools, developers can still navigate this complex landscape. Be persistent, test thoroughly, and always scrape responsibly and ethically.
Happy (stealthy) scraping! 🥷