A community-driven registry for the Claude Code ecosystem. Not affiliated with Anthropic.
Are you the author? Sign in to claim
Web scraping MCP server for AI agents — screenshots, content extraction, structured scraping
The #1 most requested utility for AI agents — professional web scraping, screenshots, and content extraction via MCP + REST API.
🌐 Clean Content Extraction — Extract readable text/markdown from any webpage (like Readability)
🎯 Structured Scraping — Extract specific data using CSS selectors
📸 Screenshots — Capture full-page or viewport screenshots with Playwright
🔗 Link Extraction — Get all links from a page with optional regex filtering
📋 Metadata Extraction — Extract title, description, Open Graph tags, favicon, etc
🔍 Google Search — Search Google and get results programmatically
Add to your MCP settings file (cline_mcp_settings.json or similar):
{
"mcpServers": {
"agent-scraper": {
"url": "https://agent-scraper-mcp.onrender.com/mcp"
}
}
}
Base URL: https://agent-scraper-mcp.onrender.com
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/scrape_url \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/article",
"format": "markdown"
}'
Response:
{
"success": true,
"url": "https://example.com/article",
"title": "Article Title",
"content": "# Article Title\n\nClean markdown content...",
"format": "markdown"
}
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/scrape_structured \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/product",
"selectors": {
"title": "h1.product-title",
"price": ".price",
"reviews": ".review-text"
}
}'
Response:
{
"success": true,
"url": "https://example.com/product",
"data": {
"title": "Product Name",
"price": "$29.99",
"reviews": ["Great product!", "Worth the money"]
}
}
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/screenshot_url \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"width": 1280,
"height": 720,
"full_page": false
}'
Response:
{
"success": true,
"url": "https://example.com",
"image": "iVBORw0KGgoAAAANSUhEUgAA...",
"width": 1280,
"height": 720,
"full_page": false
}
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/extract_links \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"filter": "https://example.com/blog/.*"
}'
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/extract_meta \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/search_google \
-H "Content-Type: application/json" \
-d '{
"query": "python web scraping",
"num_results": 10
}'
After free tier exhausted:
Payment via HTTP 402 with crypto wallet:
0x8E844a7De89d7CfBFe9B4453E65935A22F146aBBX-Payment header with payment proofscrape_urlExtract clean, readable content from any webpage (like Readability).
Parameters:
url (string, required): URL to scrapeformat (string, optional): Output format — text, markdown, or html (default: markdown)Returns: {success, url, title, content, format}
scrape_structuredExtract specific data using CSS selectors.
Parameters:
url (string, required): URL to scrapeselectors (object, required): Dict of name → CSS selectorReturns: {success, url, data}
Example selectors:
{
"title": "h1.post-title",
"author": ".author-name",
"price": "span.price",
"images": "img.product-image"
}
screenshot_urlCapture a screenshot of any webpage.
Parameters:
url (string, required): URL to screenshotwidth (int, optional): Viewport width (default: 1280)height (int, optional): Viewport height (default: 720)full_page (bool, optional): Capture full scrollable page (default: false)Returns: {success, url, image, width, height, full_page}
Image is base64-encoded PNG.
extract_linksExtract all links from a webpage.
Parameters:
url (string, required): URL to scrapefilter (string, optional): Regex pattern to filter URLsReturns: {success, url, links, count}
Links array contains {text, href} objects.
extract_metaExtract metadata from a webpage.
Parameters:
url (string, required): URL to scrapeReturns: {success, url, meta}
Meta object includes:
title: Page titledescription: Meta descriptioncanonical: Canonical URLfavicon: Favicon URLog: Open Graph tagstwitter: Twitter Card tagssearch_googleSearch Google and get results.
Parameters:
query (string, required): Search querynum_results (int, optional): Number of results (default: 10)Returns: {success, query, results, count}
Results array contains {title, url, snippet} objects.
# Clone repo
git clone https://github.com/aparajithn/agent-scraper-mcp.git
cd agent-scraper-mcp
# Install dependencies
pip install -e ".[dev]"
# Install Playwright browsers
playwright install chromium --with-deps
# Run server
uvicorn src.main:app --reload --port 8080
# Initialize
curl -X POST http://localhost:8080/mcp -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0.0"}}}'
# List tools
curl -X POST http://localhost:8080/mcp -d '{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}'
# Call scrape_url
curl -X POST http://localhost:8080/mcp -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"scrape_url","arguments":{"url":"https://example.com","format":"markdown"}}}'
# Build
docker build -t agent-scraper-mcp .
# Run
docker run -p 8080:8080 -e PUBLIC_HOST=localhost agent-scraper-mcp
Deployed on Render (free tier):
agent-scraper-mcpaparajithn/agent-scraper-mcpEnvironment variables:
PUBLIC_HOST: agent-scraper-mcp.onrender.comX402_WALLET_ADDRESS: 0x8E844a7De89d7CfBFe9B4453E65935A22F146aBBMIT License — see LICENSE for details.
Built for AI agents by AI engineers 🤖
Run Claude Code as an MCP server so any agent can delegate coding tasks to it
Browser automation using accessibility snapshots instead of screenshots
Secure MCP server for MySQL database interaction, queries, and schema management
English-first Korean equity intelligence MCP — DART filings, foreign-holder 5%-rule flows, activist filings, KRX news. F