How It Works

Learn how Wryn's intelligent scraping platform works under the hood.

Architecture Overview

Wryn combines multiple technologies to provide reliable, scalable web scraping:

REST API - Simple interface for making scrape requests
Browser Pool - Managed headless Chrome instances
Proxy Network - Residential and datacenter proxies worldwide
AI Extraction - Machine learning models for data extraction
Data Pipeline - Cleaning, validation, and formatting

Request Flow

1. API Request

Send an HTTP POST request with your API key in the x-api-key header and the scrape options as a JSON body:

curl -X POST https://api.wryn.io/v1/<end_point> \
  -H "x-api-key: wryn_live_1234567890abcdefghijklmnopqrstuvwxyz" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://example.com/product/",
  "action": "auto_listing",
  "engine": "stealth_mode",
  "timeout_ms": 45000,
  "retries": 2,
  "extract_main_content": true
}'
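The same request can be built from Python. The sketch below is a minimal illustration using only the standard library; the helper name build_scrape_request is hypothetical, and "<end_point>" is the same unresolved placeholder that appears in the curl example. It builds the request without sending it, so it runs without a live key.

```python
import json
import urllib.request

# Placeholder values copied from the curl example; substitute your own.
API_BASE = "https://api.wryn.io/v1"
API_KEY = "wryn_live_1234567890abcdefghijklmnopqrstuvwxyz"


def build_scrape_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request like the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint}",
        data=body,
        method="POST",
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
    )


payload = {
    "url": "https://example.com/product/",
    "action": "auto_listing",
    "engine": "stealth_mode",
    "timeout_ms": 45000,
    "retries": 2,
    "extract_main_content": True,
}

# "<end_point>" is left as the placeholder used in the curl example.
request = build_scrape_request("<end_point>", payload)

# To send it (requires a real endpoint and key):
# with urllib.request.urlopen(request, timeout=60) as response:
#     result = json.load(response)
```

The client-side timeout in the commented call is set above timeout_ms on the assumption that the server may take up to that long before answering.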

2. Smart Routing

Wryn analyzes the target website and selects optimal settings:

Website fingerprint - Domain, technology stack, anti-bot measures
Request priority - Urgent vs. batch processing
Resource allocation - Browser type, proxy location, concurrency

3. Browser Execution

A headless Chrome instance is allocated from our pool:

Unique fingerprint - Randomized browser profiles to avoid detection
Proxy selection - Residential proxy from target region
JavaScript rendering - Full page load, including AJAX requests
Screenshot capture - Optional visual verification

4. Page Interaction

The browser navigates and interacts with the page:

Wait for elements - Smart waiting for dynamic content
Handle popups - Automatic cookie consent, modals, ads
Scroll & pagination - Auto-scroll for infinite scroll pages
Form filling - Login, search, filters (when configured)

5. Data Extraction

AI-powered extraction finds and structures your data:

Field detection - Automatically locate requested fields
Schema inference - Understand data types and relationships
Multi-page extraction - Follow links for complete datasets
Validation - Ensure data quality and completeness

6. Response Delivery

Cleaned data is returned in your preferred format:

{
  "status": "success",
  "data": {
    "title": "Premium Wireless Headphones",
    "price": "$299.99",
    "description": "High-quality noise-canceling headphones..."
  },
  "metadata": {
    "scraped_at": "2025-12-06T10:30:00Z",
    "response_time": 2.4
  }
}
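On the client side, a response like the one above can be turned into typed values. This is a rough sketch that assumes the field names shown in the sample and a price formatted as "$" followed by a plain decimal; parse_scrape_response is a hypothetical helper, not part of any Wryn SDK.

```python
import json
from datetime import datetime
from decimal import Decimal

# The sample response from the documentation above.
raw = """
{
  "status": "success",
  "data": {
    "title": "Premium Wireless Headphones",
    "price": "$299.99",
    "description": "High-quality noise-canceling headphones..."
  },
  "metadata": {
    "scraped_at": "2025-12-06T10:30:00Z",
    "response_time": 2.4
  }
}
"""


def parse_scrape_response(text: str) -> dict:
    """Parse the JSON body and convert the price and timestamp to typed values."""
    doc = json.loads(text)
    if doc.get("status") != "success":
        raise ValueError(f"scrape did not succeed: {doc.get('status')!r}")
    data, meta = doc["data"], doc["metadata"]
    return {
        "title": data["title"],
        # Assumption: a leading "$" and no thousands separators, as in the sample.
        "price": Decimal(data["price"].lstrip("$")),
        # Older Python versions reject a trailing "Z", so normalize it first.
        "scraped_at": datetime.fromisoformat(meta["scraped_at"].replace("Z", "+00:00")),
    }


parsed = parse_scrape_response(raw)
```

Using Decimal rather than float avoids rounding surprises when prices are later summed or compared.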

Anti-Bot Bypass

Wryn uses multiple techniques to bypass anti-bot protection:

Residential Proxies

10M+ residential IPs across 195 countries
Automatic rotation on detection
Geographic targeting for regional content
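Rotation on detection can be pictured as moving to the next address in a pool whenever a request is flagged. The sketch below is only an illustration of that idea: ProxyRotator is a hypothetical class, the addresses come from the reserved documentation range 192.0.2.0/24, and Wryn's real selection logic is not described here.

```python
class ProxyRotator:
    """Round-robin over a fixed proxy pool, advancing only when a block is reported."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        if not self._proxies:
            raise ValueError("at least one proxy is required")
        self._index = 0

    @property
    def current(self) -> str:
        return self._proxies[self._index]

    def report_blocked(self) -> str:
        """Advance to the next proxy (wrapping around) and return it."""
        self._index = (self._index + 1) % len(self._proxies)
        return self.current


# Illustrative pool; these are documentation-only addresses.
rotator = ProxyRotator(["192.0.2.10:8000", "192.0.2.11:8000", "192.0.2.12:8000"])
first = rotator.current
second = rotator.report_blocked()
```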

Browser Fingerprinting

Randomized user agents, screen resolutions, languages
Canvas, WebGL, and audio fingerprint randomization
Realistic mouse movements and timing

CAPTCHA Solving

Automatic CAPTCHA detection
Integration with solving services
Smart retry strategies

Rate Limit Handling

Automatic request throttling
Distributed request timing
Session reuse for efficiency
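Request throttling is often implemented as a token bucket. The page does not say which algorithm Wryn uses, so treat the following as a generic sketch of the technique; the class name, rate, and capacity are illustrative assumptions.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the bucket can be tested without waiting
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: at most 2 requests per second to one target, with bursts of 2.
bucket = TokenBucket(rate=2.0, capacity=2)
allowed = [bucket.try_acquire() for _ in range(2)]
```

Keeping one bucket per target domain spreads requests over time instead of sending them in a single burst.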

JavaScript Rendering

Modern websites rely heavily on JavaScript. Wryn handles this automatically:

Single Page Applications (SPAs)

React, Vue, Angular applications
Wait for AJAX requests to complete
Handle dynamic routing

Infinite Scroll

Auto-scroll to load more content
Detect end of content
Capture all items
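One common way to detect the end of an infinite-scroll feed is to keep loading until several consecutive rounds add nothing new. The sketch below illustrates that heuristic with a fake in-memory feed; collect_until_exhausted, make_fake_feed, and the patience value are assumptions for the example, not Wryn internals.

```python
from typing import Callable, List


def collect_until_exhausted(
    load_more: Callable[[], List[str]],
    max_rounds: int = 50,
    patience: int = 2,
) -> List[str]:
    """Call load_more() until `patience` consecutive rounds yield no new items."""
    seen = {}  # dict keeps insertion order and removes duplicates
    idle_rounds = 0
    for _ in range(max_rounds):
        before = len(seen)
        for item in load_more():
            seen.setdefault(item, None)
        idle_rounds = idle_rounds + 1 if len(seen) == before else 0
        if idle_rounds >= patience:
            break
    return list(seen)


def make_fake_feed(total: int, page_size: int) -> Callable[[], List[str]]:
    """Simulate a page that reveals `page_size` more items on every scroll."""
    loaded = 0

    def load_more() -> List[str]:
        nonlocal loaded
        loaded = min(total, loaded + page_size)
        return [f"item-{i}" for i in range(loaded)]

    return load_more


items = collect_until_exhausted(make_fake_feed(total=7, page_size=3))
```

The max_rounds cap guards against feeds that never stop producing content.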

Dynamic Content

Observe DOM mutations
Wait for elements to appear
Handle lazy-loaded images
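Waiting for an element to appear usually comes down to polling a condition until it holds or a deadline passes. The helper below is a generic sketch of that pattern; wait_for and its default timeout and interval are hypothetical, and in a real browser session the condition would query the DOM.

```python
import time


def wait_for(condition, timeout=10.0, interval=0.25,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns a truthy value, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        result = condition()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)


# Example with a simulated clock so nothing actually sleeps.
now = [0.0]
polls = []


def element_appears_on_third_poll():
    polls.append(now[0])
    return "node" if len(polls) >= 3 else None


found = wait_for(
    element_appears_on_third_poll,
    timeout=1.0,
    interval=0.25,
    clock=lambda: now[0],
    sleep=lambda seconds: now.__setitem__(0, now[0] + seconds),
)
```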

Error Handling & Retries

Wryn automatically handles failures:

Automatic Retries

Network errors - 3 retries with exponential backoff
Rate limits - Intelligent throttling and retry
Blocked requests - Proxy rotation and retry
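Retrying with exponential backoff means doubling the wait after each failed attempt. The sketch below shows the pattern with 3 retries, matching the figure above; the base delay, the cap, the jitter, and the TransientError class are assumptions made for the example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying, such as network errors."""


def retry_with_backoff(operation, max_retries=3, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Run operation(); on TransientError wait base_delay * 2**attempt and retry."""
    attempt = 0
    while True:
        try:
            return operation()
        except TransientError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
            attempt += 1


# Example: an operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("connection reset")
    return "ok"


delays = []  # record delays instead of sleeping
outcome = retry_with_backoff(flaky_fetch, sleep=delays.append)
```

Only errors classified as temporary should go through this path; retrying a permanent failure such as an invalid URL just wastes time.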

Failure Classification

Temporary - Network issues, rate limits (auto-retry)
Permanent - Invalid URL, content not found (return error)
Partial - Some data extracted (return with warnings)
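The three classes above could be derived from an HTTP status and a count of extracted fields. The mapping below is a sketch under stated assumptions: the specific status codes and the rule that "nothing extracted" counts as permanent are guesses for illustration, since the page does not define them.

```python
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    TEMPORARY = "temporary"  # auto-retry
    PERMANENT = "permanent"  # return error
    PARTIAL = "partial"      # return data with warnings


def classify_outcome(http_status: Optional[int], extracted: int,
                     requested: int) -> Optional[FailureKind]:
    """Return the failure class, or None when every requested field was extracted."""
    # Assumption: no status at all means the connection failed.
    if http_status is None or http_status == 429 or http_status >= 500:
        return FailureKind.TEMPORARY
    if http_status >= 400:
        return FailureKind.PERMANENT
    if extracted == 0:
        # Assumption: a page that yields nothing is treated as "content not found".
        return FailureKind.PERMANENT
    if extracted < requested:
        return FailureKind.PARTIAL
    return None


kind = classify_outcome(http_status=200, extracted=2, requested=3)
```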

Webhook Notifications

Get notified of scrape completion or errors:

{
  "event": "scrape.completed",
  "scrape_id": "scr_abc123",
  "status": "success",
  "url": "https://example.com/product/123"
}
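A receiver for this payload might look like the sketch below. It relies only on the four fields shown in the sample; handle_webhook is a hypothetical function, and because no other event names are documented here, anything that is not a successful scrape.completed is simply flagged for follow-up.

```python
import json

REQUIRED_FIELDS = {"event", "scrape_id", "status", "url"}


def handle_webhook(body: bytes) -> str:
    """Validate a webhook body and return a short description of the next action."""
    event = json.loads(body)
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"webhook payload is missing fields: {sorted(missing)}")
    if event["event"] == "scrape.completed" and event["status"] == "success":
        return f"fetch results for {event['scrape_id']}"
    return (
        f"investigate {event['scrape_id']} "
        f"(event={event['event']}, status={event['status']})"
    )


sample = b"""{
  "event": "scrape.completed",
  "scrape_id": "scr_abc123",
  "status": "success",
  "url": "https://example.com/product/123"
}"""

action = handle_webhook(sample)
```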

Data Quality

Wryn validates, cleans, and enriches extracted data before returning it:

Validation

Type checking (numbers, dates, URLs)
Required field verification
Format normalization
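The checks listed above can be sketched with the standard library. The schema format, the field names, and the validate helper are assumptions for illustration; they show the kind of checks involved, not Wryn's validator.

```python
from datetime import datetime
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def is_number(value) -> bool:
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def is_iso_date(value) -> bool:
    try:
        # Normalize a trailing "Z" so older Python versions accept it.
        datetime.fromisoformat(str(value).replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def is_url(value) -> bool:
    parts = urlparse(str(value))
    return parts.scheme in ("http", "https") and bool(parts.netloc)


CHECKS = {"number": is_number, "date": is_iso_date, "url": is_url}


def validate(record: dict, schema: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors = []
    for field in required:
        if record.get(field) in (None, ""):
            errors.append(f"{field}: missing required field")
    for field, kind in schema.items():
        value = record.get(field)
        if value not in (None, "") and not CHECKS[kind](value):
            errors.append(f"{field}: expected {kind}, got {value!r}")
    return errors


record = {
    "title": "Premium Wireless Headphones",
    "price": "$299.99",
    "scraped_at": "2025-12-06T10:30:00Z",
    "link": "https://example.com/product/123",
}
schema = {"price": "number", "scraped_at": "date", "link": "url"}
errors = validate(record, schema, required=["title", "price", "sku"])
```

In this example the price fails the number check because of the "$" prefix, which is the sort of value that format normalization would need to handle before type checking.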

Cleaning

HTML tag removal
Whitespace normalization
Character encoding fixes
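Tag removal and whitespace normalization can be illustrated with Python's built-in HTML parser. This is a simplified sketch; clean_text is a hypothetical helper, and a production cleaner would also need to decide how to treat script and style content and block-level boundaries.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only text nodes, discarding tags; entities are decoded automatically."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_text(fragment: str) -> str:
    """Strip HTML tags, decode entities, and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    # Join without a separator so inline tags inside a word do not split it.
    text = "".join(parser.parts)
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text(
    "<p>High-quality  <b>noise-canceling</b>\n headphones &amp; case</p>"
)
```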

Enrichment

Currency conversion
Date parsing and timezone handling
Image URL resolution

Scalability

Wryn scales with your workload through request queueing, parallel processing, and caching:

Request Queueing

Priority queue for urgent requests
Batch processing for large jobs
Automatic load balancing
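A priority queue that serves urgent requests before batch work can be sketched with heapq. The two priority levels and the ScrapeQueue class are illustrative assumptions; the page does not describe how Wryn's queue is implemented.

```python
import heapq
import itertools


class ScrapeQueue:
    """Pop urgent jobs first; within one priority level, keep first-in, first-out order."""

    URGENT = 0
    BATCH = 1

    def __init__(self):
        self._heap = []
        # A running counter breaks ties so equal-priority jobs keep arrival order.
        self._counter = itertools.count()

    def push(self, url: str, priority: int = BATCH) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), url))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = ScrapeQueue()
queue.push("https://example.com/catalog?page=1")
queue.push("https://example.com/catalog?page=2")
queue.push("https://example.com/product/123", priority=ScrapeQueue.URGENT)
order = [queue.pop() for _ in range(len(queue))]
```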

Parallel Processing

Concurrent browser instances
Distributed across regions
Smart resource allocation

Caching

Intelligent caching for repeated requests
Configurable TTL
Cache invalidation options
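A cache with a configurable TTL and explicit invalidation can be sketched in a few lines. TTLCache below is a hypothetical in-memory example keyed by URL; how Wryn keys, stores, or expires cached results is not specified on this page.

```python
import time


class TTLCache:
    """Store values for `ttl_seconds`; expired or invalidated keys read as None."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    def set(self, key, value) -> None:
        self._entries[key] = (self.clock() + self.ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # drop the stale entry on read
            return None
        return value

    def invalidate(self, key) -> None:
        self._entries.pop(key, None)


# Example with a simulated clock.
now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.set("https://example.com/product/123", {"price": "$299.99"})
fresh = cache.get("https://example.com/product/123")
now[0] = 61.0
stale = cache.get("https://example.com/product/123")
```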

Security

Your data and credentials are protected:

Encryption

TLS 1.3 for all API requests
Encrypted storage for sensitive data
Secure credential management

Privacy

No data retention beyond processing
GDPR compliant
Credential isolation per account

Authentication

API key authentication
IP whitelisting (optional)
Request signing (enterprise)

Next Steps

Getting Started - Make your first scrape request
API Reference - Detailed API documentation
Guides - Best practices and advanced techniques
