How It Works
Learn how Wryn's intelligent scraping platform works under the hood.
Architecture Overview
Wryn combines multiple technologies to provide reliable, scalable web scraping:
- REST API - Simple interface for making scrape requests
- Browser Pool - Managed headless Chrome instances
- Proxy Network - Residential and datacenter proxies worldwide
- AI Extraction - Machine learning models for data extraction
- Data Pipeline - Cleaning, validation, and formatting
Request Flow
1. API Request
Send an HTTP POST request with your API key and a JSON body describing the scrape:
curl -X POST https://api.wryn.io/v1/<end_point> \
  -H "x-api-key: wryn_live_1234567890abcdefghijklmnopqrstuvwxyz" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/",
    "action": "auto_listing",
    "engine": "stealth_mode",
    "timeout_ms": 45000,
    "retries": 2,
    "extract_main_content": true
  }'
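The same request can be built from Python with only the standard library. This is a minimal sketch: the endpoint name is left as a parameter because the curl example elides it, and `build_scrape_request` and `send` are illustrative helper names, not part of any Wryn SDK.

```python
import json
import urllib.request

API_BASE = "https://api.wryn.io/v1"


def build_scrape_request(endpoint: str, api_key: str, target_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "url": target_url,
        "action": "auto_listing",
        "engine": "stealth_mode",
        "timeout_ms": 45000,
        "retries": 2,
        "extract_main_content": True,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

The client-side timeout (60 seconds here, an assumed value) is kept above `timeout_ms` so the server-side limit is reached first.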
2. Smart Routing
Wryn analyzes the target website and selects optimal settings:
- Website fingerprint - Domain, technology stack, anti-bot measures
- Request priority - Urgent vs. batch processing
- Resource allocation - Browser type, proxy location, concurrency
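The routing decision can be pictured as a function from a site profile and a priority to an execution plan. The sketch below is purely illustrative: the `SiteProfile` and `ExecutionPlan` types, the `"browser"` and `"http"` engine names, and the concurrency numbers are assumptions, not Wryn's actual routing rules. Only `"stealth_mode"` appears in the request example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteProfile:
    domain: str
    uses_javascript: bool
    has_anti_bot: bool
    region: str


@dataclass(frozen=True)
class ExecutionPlan:
    engine: str
    proxy_type: str
    proxy_region: str
    queue: str
    concurrency: int


def plan_request(profile: SiteProfile, urgent: bool) -> ExecutionPlan:
    """Pick engine, proxy, and concurrency from the site fingerprint (hypothetical rules)."""
    if profile.has_anti_bot:
        engine, proxy_type, concurrency = "stealth_mode", "residential", 2
    elif profile.uses_javascript:
        engine, proxy_type, concurrency = "browser", "datacenter", 8
    else:
        engine, proxy_type, concurrency = "http", "datacenter", 32
    queue = "urgent" if urgent else "batch"
    return ExecutionPlan(engine, proxy_type, profile.region, queue, concurrency)
```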
3. Browser Execution
A headless Chrome instance is allocated from our pool:
- Unique fingerprint - Randomized browser profiles to avoid detection
- Proxy selection - Residential proxy from target region
- JavaScript rendering - Full page load, including AJAX requests
- Screenshot capture - Optional visual verification
4. Page Interaction
The browser navigates and interacts with the page:
- Wait for elements - Smart waiting for dynamic content
- Handle popups - Automatic cookie consent, modals, ads
- Scroll & pagination - Auto-scroll for infinite-scroll pages
- Form filling - Login, search, filters (when configured)
5. Data Extraction
AI-powered extraction finds and structures your data:
- Field detection - Automatically locate requested fields
- Schema inference - Understand data types and relationships
- Multi-page extraction - Follow links for complete datasets
- Validation - Ensure data quality and completeness
6. Response Delivery
Cleaned data is returned in your preferred format. A JSON response looks like this:
{
  "status": "success",
  "data": {
    "title": "Premium Wireless Headphones",
    "price": "$299.99",
    "description": "High-quality noise-canceling headphones..."
  },
  "metadata": {
    "scraped_at": "2025-12-06T10:30:00Z",
    "response_time": 2.4
  }
}
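On the client side, a response of this shape can be parsed into typed values. The sketch below assumes the field names shown in the example; `ScrapeResult`, `parse_response`, and `price_to_decimal` are illustrative names, and the unit of `response_time` is not stated in the example, so it is kept as a plain float.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass
class ScrapeResult:
    status: str
    data: dict
    scraped_at: datetime
    response_time: float


def parse_response(body: str) -> ScrapeResult:
    """Parse the JSON response into a typed result."""
    doc = json.loads(body)
    meta = doc["metadata"]
    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    scraped_at = datetime.fromisoformat(meta["scraped_at"].replace("Z", "+00:00"))
    return ScrapeResult(doc["status"], doc["data"], scraped_at, float(meta["response_time"]))


def price_to_decimal(price: str) -> Decimal:
    """Convert a display price such as "$299.99" to a Decimal (assumes a "$" prefix)."""
    return Decimal(price.replace("$", "").replace(",", ""))
```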
Anti-Bot Bypass
Wryn uses multiple techniques to bypass anti-bot protection:
Residential Proxies
- 10M+ residential IPs across 195 countries
- Automatic rotation on detection
- Geographic targeting for regional content
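Rotation on detection can be sketched as a per-country pool that hands out the next proxy whenever a response looks blocked. This is an assumption-laden illustration: `ProxyRotator` and `looks_blocked` are hypothetical names, and treating HTTP 403 and 429 as block signals is a simplification of whatever detection Wryn actually uses.

```python
from itertools import cycle
from typing import Dict, List


class ProxyRotator:
    """Round-robin proxy selection per country (illustrative only)."""

    def __init__(self, proxies_by_country: Dict[str, List[str]]):
        for country, proxies in proxies_by_country.items():
            if not proxies:
                raise ValueError(f"no proxies configured for {country}")
        self._pools = {c: cycle(p) for c, p in proxies_by_country.items()}
        self._current: Dict[str, str] = {}

    def current(self, country: str) -> str:
        """Return the proxy in use for a country, picking the first one if needed."""
        if country not in self._current:
            self._current[country] = next(self._pools[country])
        return self._current[country]

    def rotate(self, country: str) -> str:
        """Switch to the next proxy in the country's pool."""
        self._current[country] = next(self._pools[country])
        return self._current[country]


def looks_blocked(status_code: int) -> bool:
    """Assumed block signals: forbidden or too many requests."""
    return status_code in (403, 429)
```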
Browser Fingerprinting
- Randomized user agents, screen resolutions, languages
- Canvas, WebGL, and audio fingerprint randomization
- Realistic mouse movements and timing
CAPTCHA Solving
- Automatic CAPTCHA detection
- Integration with solving services
- Smart retry strategies
Rate Limit Handling
- Automatic request throttling
- Distributed request timing
- Session reuse for efficiency
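One simple form of request throttling is a minimum interval between requests to the same host. The sketch below illustrates that idea only; `DomainThrottle` is a hypothetical name, and the clock and sleep functions are injectable so the behavior can be checked without real delays.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlsplit


class DomainThrottle:
    """Enforce a minimum interval between requests to the same host."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Dict[str, float] = {}

    def wait(self, url: str) -> float:
        """Block until the host may be hit again; return the delay applied."""
        host = urlsplit(url).hostname or ""
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot relative to when this request actually goes out.
        self._next_allowed[host] = max(now, ready_at) + self._min_interval_s
        return delay
```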
JavaScript Rendering
Modern websites rely heavily on JavaScript. Wryn handles this automatically:
Single Page Applications (SPAs)
- React, Vue, Angular applications
- Wait for AJAX requests to complete
- Handle dynamic routing
Infinite Scroll
- Auto-scroll to load more content
- Detect end of content
- Capture all items
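The scroll-until-done loop can be sketched independently of any browser driver. Here `scroll_and_read` is a hypothetical callable that scrolls once and returns the items currently visible; the loop stops when several consecutive scrolls add nothing new. The `patience` and `max_scrolls` defaults are assumed values.

```python
from typing import Callable, Dict, List


def collect_all_items(
    scroll_and_read: Callable[[], List[str]],
    max_scrolls: int = 50,
    patience: int = 2,
) -> List[str]:
    """Scroll until no new items appear for `patience` consecutive rounds."""
    seen: Dict[str, None] = {}  # dict keeps first-seen order and removes duplicates
    idle_rounds = 0
    for _ in range(max_scrolls):
        before = len(seen)
        for item in scroll_and_read():
            seen.setdefault(item, None)
        if len(seen) == before:
            idle_rounds += 1
            if idle_rounds >= patience:
                break  # treated as end of content
        else:
            idle_rounds = 0
    return list(seen)
```

Requiring more than one idle round guards against a slow network response being mistaken for the end of the list.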
Dynamic Content
- Observe DOM mutations
- Wait for elements to appear
- Handle lazy-loaded images
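Waiting for an element can be approximated by polling with a deadline. Note that this is a swapped-in technique: in a real browser, DOM mutations would typically be observed with a `MutationObserver` rather than polled. `wait_for` and `probe` are illustrative names, and the timeout and interval defaults are assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    probe: Callable[[], Optional[T]],
    timeout_s: float = 10.0,
    interval_s: float = 0.25,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `probe` until it returns a non-None value or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        value = probe()
        if value is not None:
            return value
        if clock() >= deadline:
            raise TimeoutError(f"element did not appear within {timeout_s} s")
        sleep(interval_s)
```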
Error Handling & Retries
Wryn automatically handles failures:
Automatic Retries
- Network errors - 3 retries with exponential backoff
- Rate limits - Intelligent throttling and retry
- Blocked requests - Proxy rotation and retry
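Exponential backoff means each retry waits longer than the previous one. The sketch below shows three retries with delays that double each time; the 1-second base delay and the factor of 2 are assumptions, since only the retry count is stated above. `TemporaryError` is a hypothetical exception type standing in for a network failure.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TemporaryError(Exception):
    """Stand-in for a retryable failure such as a network error."""


def backoff_delays(retries: int = 3, base_s: float = 1.0, factor: float = 2.0) -> List[float]:
    """Delay before each retry: base, base*factor, base*factor^2, ..."""
    return [base_s * factor ** attempt for attempt in range(retries)]


def call_with_retries(
    operation: Callable[[], T],
    retries: int = 3,
    base_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on TemporaryError up to `retries` times."""
    delays = backoff_delays(retries, base_s)
    attempt = 0
    while True:
        try:
            return operation()
        except TemporaryError:
            if attempt == retries:
                raise  # retries exhausted, surface the last error
            sleep(delays[attempt])
            attempt += 1
```

Production implementations usually add random jitter to the delays so that many clients do not retry in lockstep.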
Failure Classification
- Temporary - Network issues, rate limits (auto-retry)
- Permanent - Invalid URL, content not found (return error)
- Partial - Some data extracted (return with warnings)
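The three classes can be expressed as a small decision function. The mapping below from HTTP status codes and extracted-field counts to classes is an assumption made for illustration; `FailureKind` and `classify` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    TEMPORARY = "temporary"  # auto-retry
    PERMANENT = "permanent"  # return error
    PARTIAL = "partial"      # return data with warnings


def classify(status_code: int, extracted: int, requested: int) -> Optional[FailureKind]:
    """Return the failure class, or None when every requested field was extracted."""
    if status_code == 429 or status_code >= 500:
        return FailureKind.TEMPORARY  # rate limit or upstream trouble
    if status_code >= 400:
        return FailureKind.PERMANENT  # e.g. invalid URL, content not found
    if extracted < requested:
        # Some fields found: partial. None found: treat as content not found.
        return FailureKind.PARTIAL if extracted > 0 else FailureKind.PERMANENT
    return None
```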
Webhook Notifications
Get notified when a scrape completes or fails. A completion event looks like this:
{
  "event": "scrape.completed",
  "scrape_id": "scr_abc123",
  "status": "success",
  "url": "https://example.com/product/123"
}
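A webhook receiver typically validates the payload and then branches on the event. The sketch below handles only the fields shown in the example; `handle_webhook` is a hypothetical name, and the returned strings are placeholders for whatever your application does next.

```python
import json

REQUIRED_FIELDS = {"event", "scrape_id", "status", "url"}


def handle_webhook(raw_body: bytes) -> str:
    """Validate a webhook payload and decide what to do with it."""
    event = json.loads(raw_body)
    missing = REQUIRED_FIELDS - set(event)
    if missing:
        raise ValueError(f"webhook payload missing fields: {sorted(missing)}")
    if event["event"] == "scrape.completed" and event["status"] == "success":
        return f"fetch results for {event['scrape_id']}"
    # Anything else is routed to manual or automated follow-up.
    return f"investigate {event['scrape_id']} ({event['event']}, status={event['status']})"
```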
Data Quality
Wryn ensures high-quality data output:
Validation
- Type checking (numbers, dates, URLs)
- Required field verification
- Format normalization
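The validation checks above can be sketched as per-type predicates plus a required-field check. The schema format, the `validate` function, and the accepted formats (ISO 8601 dates, `http`/`https` URLs, prices with a `$` prefix) are assumptions for illustration.

```python
from datetime import datetime
from urllib.parse import urlsplit


def _is_number(value) -> bool:
    try:
        float(str(value).replace("$", "").replace(",", ""))
        return True
    except ValueError:
        return False


def _is_date(value) -> bool:
    try:
        datetime.fromisoformat(str(value).replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def _is_url(value) -> bool:
    parts = urlsplit(str(value))
    return parts.scheme in ("http", "https") and bool(parts.netloc)


CHECKS = {"number": _is_number, "date": _is_date, "url": _is_url}


def validate(record: dict, schema: dict, required: set) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = [f"missing required field: {name}" for name in sorted(required - set(record))]
    for name, kind in schema.items():
        if name in record and not CHECKS[kind](record[name]):
            errors.append(f"{name}: expected {kind}, got {record[name]!r}")
    return errors
```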
Cleaning
- HTML tag removal
- Whitespace normalization
- Character encoding fixes
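A minimal cleaning pass might strip tags, normalize Unicode, and collapse whitespace. The sketch below uses only the standard library; `clean_text` is an illustrative name, and the choice of NFKC normalization is an assumption.

```python
import re
import unicodedata
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and drop all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes entities such as &nbsp;
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_text(raw: str) -> str:
    """Remove HTML tags, normalize Unicode, and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    # Joining with a space keeps adjacent block elements apart; it can also
    # split a word that is broken up by inline tags.
    text = " ".join(parser.parts)
    text = unicodedata.normalize("NFKC", text)  # e.g. non-breaking space -> space
    return re.sub(r"\s+", " ", text).strip()
```

This simple extractor also keeps the contents of `script` and `style` elements, so a fuller version would skip those.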
Enrichment
- Currency conversion
- Date parsing and timezone handling
- Image URL resolution
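Each enrichment step maps to a small, well-defined transformation. The helpers below are a sketch with hypothetical names; the exchange rate is passed in by the caller, and treating timestamps without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone
from decimal import Decimal, ROUND_HALF_UP
from urllib.parse import urljoin


def resolve_image_url(page_url: str, src: str) -> str:
    """Turn a relative image path into an absolute URL."""
    return urljoin(page_url, src)


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


def convert_currency(amount: Decimal, rate: Decimal) -> Decimal:
    """Multiply by a caller-supplied rate and round to two decimal places."""
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```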
Scalability
Wryn scales with your request volume through queueing, parallel processing, and caching:
Request Queueing
- Priority queue for urgent requests
- Batch processing for large jobs
- Automatic load balancing
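A priority queue that serves urgent requests before batch requests, while keeping arrival order within each class, can be sketched with `heapq`. `RequestQueue` and its two priority levels are illustrative assumptions.

```python
import heapq
from itertools import count


class RequestQueue:
    """Urgent requests first; first-in, first-out within the same priority."""

    URGENT = 0
    BATCH = 1

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves insertion order

    def push(self, url: str, priority: int = BATCH) -> None:
        heapq.heappush(self._heap, (priority, next(self._arrival), url))

    def pop(self) -> str:
        """Remove and return the next URL to process."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```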
Parallel Processing
- Concurrent browser instances
- Distributed across regions
- Smart resource allocation
Caching
- Intelligent caching for repeated requests
- Configurable TTL
- Cache invalidation options
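A cache with a configurable TTL and explicit invalidation can be sketched in a few lines. `TTLCache` is a hypothetical name, and the clock is injectable so expiry can be checked without waiting.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire after `ttl_s` seconds."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl_s, value)

    def get(self, key: Hashable) -> Optional[Any]:
        """Return the cached value, or None if it is absent or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily evict expired entries
            return None
        return value

    def invalidate(self, key: Hashable) -> None:
        self._entries.pop(key, None)
```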
Security
Your data and credentials are protected:
Encryption
- TLS 1.3 for all API requests
- Encrypted storage for sensitive data
- Secure credential management
Privacy
- No data retention beyond processing
- GDPR compliant
- Credential isolation per account
Authentication
- API key authentication
- IP allowlisting (optional)
- Request signing (enterprise)
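Request signing generally means attaching a keyed hash of the request so the server can detect tampering. The scheme below (HMAC-SHA256 over a timestamp and the body) is a common pattern shown for illustration only; it is an assumption, not Wryn's documented signing format, and `sign_request` and `verify_signature` are hypothetical names.

```python
import hashlib
import hmac


def sign_request(secret: bytes, timestamp: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over "<timestamp>.<body>" (assumed format)."""
    message = timestamp.encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, timestamp: str, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_request(secret, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

Including the timestamp in the signed message lets the receiver reject old requests and limit replay attacks.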
Next Steps
- Getting Started - Make your first scrape request
- API Reference - Detailed API documentation
- Guides - Best practices and advanced techniques