📄 Production-ready MCP server for PDF processing - 5-10x faster with parallel processing and 94%+ test coverage
Production-ready PDF processing server for AI agents
PDF inspection • PDF search • Agent document map • Accessibility report • Visual evidence • Region crops • Configured OCR
PDF Reader MCP is a production-ready Model Context Protocol server that gives AI agents structured, local-first PDF processing. Agents can inspect a PDF before extraction, search text evidence with page and bbox provenance, render page-level visual evidence, crop bbox-grounded page regions, and run configured OCR to produce text layers for scanned pages. The read_pdf tool then extracts a full agent document map, accessibility report, text, Markdown, semantic citation chunks, images, tables, annotations, outlines, structure trees, form fields, attachment metadata, and agent-ready document elements.
The Problem:
// Traditional PDF processing
- Sequential page processing (slow)
- No natural content ordering
- Complex path handling
- Poor error isolation
The Solution:
// PDF Reader MCP
- Preflight PDF inspection for agent extraction planning 🔎
- MCP-native PDF search with snippets and bbox evidence 🔎
- Bounded page rendering for visual evidence and OCR routing 🖼️
- Bbox-grounded region crops for source evidence 🔍
- Configured local OCR provider for scanned-page text layers 🔡
- 5-10x faster parallel processing ⚡
- Full agent document map linking pages, elements, chunks, layout, safety, and geometry 🧭
- Semantic document AST for page/section/paragraph/list/table/image traversal 🌳
- PDF trust report for content safety, layout, table, and link-risk routing 🛡️
- Accessibility report for tagged-PDF coverage, headings, images, forms, links, and permissions ♿
- Structured element output for agent workflows 🧩
- Table quality diagnostics with inferred cell spans and continuation candidates 📊
- Markdown rendering for RAG and summarization 📝
- Citation-ready semantic/table/page chunks 🔗
- Layout diagnostics with reading-order confidence 📐
- Outlines, annotations, structure trees, forms, attachments, labels, and permission signals 🗂️
- Column-aware reading order 📐
- Flexible path support (absolute/relative) 🎯
- Per-page error resilience 🛡️
- CI-backed quality ✅
Result: Production-ready PDF processing that scales.
- read_pdf arguments for agent workflows
- inspect_pdf plans extraction, search_pdf finds text evidence, render_page returns visual evidence, extract_regions crops source evidence, analyze_regions enriches visual regions, ocr_pages runs configured OCR, read_pdf performs extraction

Real-world performance from production testing:
| Operation | Ops/sec | Performance | Use Case |
|---|---|---|---|
| Error handling | 12,933 | ⚡⚡⚡⚡⚡ | Validation & safety |
| Extract full text | 5,575 | ⚡⚡⚡⚡ | Document analysis |
| Extract page | 5,329 | ⚡⚡⚡⚡ | Single page ops |
| Multiple pages | 5,242 | ⚡⚡⚡⚡ | Batch processing |
| Metadata only | 4,912 | ⚡⚡⚡ | Quick inspection |

| Document | Sequential | Parallel | Speedup |
|---|---|---|---|
| 10-page PDF | ~2s | ~0.3s | 5-8x faster |
| 50-page PDF | ~10s | ~1s | 10x faster |
| 100+ pages | ~20s | ~2s | Linear scaling with CPU cores |
Benchmarks vary based on PDF complexity and system resources.
claude mcp add pdf-reader -- npx @sylphx/pdf-reader-mcp
Add to claude_desktop_config.json:
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"]
}
}
}
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json

code --add-mcp '{"name":"pdf-reader","command":"npx","args":["@sylphx/pdf-reader-mcp"]}'
npx @sylphx/pdf-reader-mcp

Add to your Windsurf MCP config:
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"]
}
}
}
Add to Cline's MCP settings:
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"]
}
}
}
Command: npx, Args: @sylphx/pdf-reader-mcp

Add the server in Settings → MCP Servers → Add Server with command npx and args @sylphx/pdf-reader-mcp. See Ontheia's compatible MCP servers for the full list.
npx -y @smithery/cli install @sylphx/pdf-reader-mcp --client claude
# Quick start - zero installation
npx @sylphx/pdf-reader-mcp
# Or install globally
npm install -g @sylphx/pdf-reader-mcp
Use inspect_pdf when an agent needs to decide how to process an unfamiliar
PDF. It samples a bounded number of pages, detects selectable-text versus
image-like pages, surfaces document signals, and recommends useful read_pdf
arguments without extracting image bytes.
{
"sources": [{
"path": "documents/report.pdf"
}],
"sample_pages": 5,
"include_metadata": true
}
Result:
- digital_text, scanned_or_image_only, or mixed_text_and_scan
- read_pdf arguments for citation chunks, safety findings, tables, or OCR triage

Use search_pdf when an agent needs to locate text evidence before deciding
whether to read a whole page, crop a region, or cite a result.
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-20"
}],
"query": "risk controls",
"whole_word": true,
"max_matches_per_source": 10
}
Response includes:
- profile: "pdf_search_results" and effective search options
- render_page or extract_regions
- max_pages default 100 and max_matches_per_source default 50

{
"sources": [{
"path": "documents/report.pdf"
}],
"include_full_text": true,
"include_metadata": true,
"include_page_count": true
}
Result:
{
"sources": [{
"path": "documents/manual.pdf",
"pages": "1-5,10,15-20"
}],
"include_full_text": true
}
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-3"
}],
"include_elements": true,
"include_metadata": true,
"include_page_count": true
}
Response includes:
- p1-text-1

Use include_document_map when an agent needs one navigable PDF structure
instead of separate page, element, chunk, layout, and safety outputs.
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-5"
}],
"include_document_map": true,
"include_full_text": false
}
Response includes:
Use include_document_ast when an agent needs a navigable semantic tree rather
than reconstructing document structure from flat text items.
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-5"
}],
"include_document_ast": true,
"include_full_text": false
}
Response includes:
- document_ast root with page, section, paragraph, list item, table, and image nodes
- element_ids, chunk_ids, bounding boxes, confidence, and semantic roles where available
- elements, chunks, or tables output unless those options are requested

Use include_text_layer when an agent needs deterministic line and word
references instead of only full text. It exposes page text, line records, word
records, page-level character ranges, best-effort bounding boxes, and
provenance from the same extracted text-content pass.
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-5"
}],
"include_text_layer": true,
"include_full_text": false
}
Response includes:
- text_layer object with one page record per selected page
- char_start/char_end, and line bounding boxes when available
- full_text or raw page_contents output

Use include_trust_report when an agent needs one local risk summary before
using extracted PDF content as instructions, evidence, or retrieval context.
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-5"
}],
"include_trust_report": true,
"include_full_text": false
}
Response includes:
Use include_accessibility_report when an agent needs a deterministic view of
tagged-PDF and accessibility-relevant structure before relying on the document
for navigation, form filling, summarization, or assisted reading workflows.
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-5"
}],
"include_accessibility_report": true,
"include_full_text": false
}
Response includes:
- copy_for_accessibility

Use render_page when an agent needs to inspect the original page image,
prepare OCR routing, or verify visual layout without stuffing base64 into JSON.
{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-2"
}],
"scale": 2,
"max_pages": 2
}
Response includes:
- include_image is true
- max_pages default 5, and max_pixels_per_page default 16MP

Use extract_regions when an agent has a table, figure, chart, formula, or
citation bounding box and needs a focused crop from the original page.
{
"sources": [{
"path": "documents/report.pdf",
"regions": [{
"id": "table-1",
"page": 1,
"bounding_box": { "left": 72, "bottom": 420, "right": 540, "top": 620 },
"padding": 8
}]
}],
"scale": 2,
"max_regions": 20
}
Response includes:
- include_image is true
- max_regions default 20 and max_pixels_per_page default 16MP

Use analyze_regions when an agent has a crop target for a table, chart,
formula, figure, or image and wants a normalized local-provider result linked
back to source pixels. The provider is configured by environment variables, not
by request arguments.
{
"sources": [{
"path": "documents/report.pdf",
"regions": [{
"id": "chart-1",
"page": 2,
"bounding_box": { "left": 72, "bottom": 240, "right": 540, "top": 520 },
"padding": 8
}]
}],
"scale": 2,
"max_regions": 10,
"languages": ["eng"]
}
Response includes:
- profile: "region_analysis" and the effective analysis options
- kind, description, text, Markdown, confidence, normalized table rows, formula fields, chart data points, warnings, and provenance when supplied by the provider
- source_crop_evidence_id, source bounding box, crop pixel bounds, and scale for every analyzed region
- max_regions default 20, max_pixels_per_page default 16MP, and timeout_ms default 60 seconds per region

Use ocr_pages after inspect_pdf flags scanned or sparse pages, or when an
agent needs a text layer from pages that have little selectable text. The
server renders bounded page images and passes each temporary PNG to the
configured local OCR command.
{
"sources": [{
"path": "documents/scanned-report.pdf",
"pages": "1-3"
}],
"scale": 2,
"max_pages": 3,
"languages": ["eng"]
}
Response includes:
- profile: "ocr_text_layer" and the effective OCR options
- source_render_evidence_id linking each OCR page back to the page render used as OCR input
- max_pages default 5, max_pixels_per_page default 16MP, and timeout_ms default 60 seconds per page

{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-5"
}],
"include_markdown": true,
"include_full_text": false
}
Response includes:
- include_tables is enabled

{
"sources": [{
"path": "documents/report.pdf",
"pages": "1-5"
}],
"include_chunks": true,
"include_semantic_hints": true,
"include_tables": true,
"include_full_text": false
}
Response includes:
- p1-chunk-1
- page, semantic, size, and table

{
"sources": [{
"path": "documents/spec.pdf",
"pages": "1-5"
}],
"include_outline": true,
"include_annotations": true,
"include_page_labels": true,
"include_permissions": true,
"include_structure_tree": true,
"include_form_fields": true,
"include_attachments": true
}
Response includes, when available:
// Windows - Both formats work!
{
"sources": [{
"path": "C:\\Users\\John\\Documents\\report.pdf"
}],
"include_full_text": true
}
// Unix/Mac
{
"sources": [{
"path": "/home/user/documents/contract.pdf"
}],
"include_full_text": true
}
No more "Absolute paths are not allowed" errors!
{
"sources": [{
"path": "presentation.pdf",
"pages": [1, 2, 3]
}],
"include_images": true,
"include_full_text": true
}
Response includes:
{
"sources": [
{ "path": "C:\\Reports\\Q1.pdf", "pages": "1-10" },
{ "path": "/home/user/Q2.pdf", "pages": "1-10" },
{ "url": "https://example.com/Q3.pdf" }
],
"include_full_text": true
}
⚡ All PDFs processed in parallel automatically!
read_pdf options

include_document_map returns a single agent-ready map that links pages,
structured elements, citation chunks, layout diagnostics, content safety
findings, routing signals, and page geometry. It is designed for agents that
need to navigate the original PDF evidence without manually stitching together
separate response fields.
The map is performance-bounded: it reuses the same extraction path, keeps image bytes out of JSON, and provides page-level routing signals such as low-confidence pages and pages that likely need OCR.
include_accessibility_report returns a deterministic report for tagged-PDF
coverage, page structure trees, heading roles, image alt-text verifiability,
form field labels, link labels, mark info, and copy_for_accessibility
permissions. It gives agents routing guidance without claiming PDF/UA
certification or forcing raw structure outputs into top-level JSON.
ocr_pages renders selected PDF pages and sends those temporary PNGs to a
local OCR command configured by environment variables. This keeps the default
TypeScript package private and dependency-bounded while giving teams a real
scanned PDF path when they already run Tesseract, PaddleOCR, a local HTTP shim,
or an internal OCR binary. MCP_PDF_OCR_PRESET=tesseract provides a built-in
Tesseract command template without bundling an OCR model.
The OCR provider is env-only, not request-controlled. Tool responses normalize provider output into page text, confidence, optional word boxes, language, render evidence IDs, and provenance. Image bytes are not embedded in the JSON response.
inspect_pdf adds a bounded planning tool for agent workflows. It samples
up to 20 pages per source, counts selectable text and image paint operations,
surfaces document-level signals, and returns a recommendation with the next
best read_pdf arguments.
Inspection is intentionally low overhead: it does not decode image bytes and it
does not perform OCR. When sampled pages look scanned or image-only, the tool
marks needs_ocr: true so agents do not mistake an image-based PDF for a text
extraction failure. It also reports safe optional-provider readiness for
ocr_pages and analyze_regions without exposing local command paths.
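
As a rough illustration of how an agent might act on an inspection result, the sketch below maps a profile and the needs_ocr flag to a follow-up tool. The profile values are the ones documented for inspect_pdf; the `chooseNextStep` helper, the `NextStep` shape, and the routing policy itself are hypothetical and not part of the server.

```typescript
// Profile values as documented for inspect_pdf.
type InspectProfile =
  | "digital_text"
  | "scanned_or_image_only"
  | "mixed_text_and_scan"
  | "low_text_or_form"
  | "unknown";

// Hypothetical result shape used only by this sketch.
interface NextStep {
  tool: "read_pdf" | "ocr_pages" | "render_page";
  reason: string;
}

// Hypothetical routing policy: OCR for scanned input, a visual check when the
// sample is ambiguous, and plain extraction otherwise.
function chooseNextStep(profile: InspectProfile, needsOcr: boolean): NextStep {
  if (needsOcr || profile === "scanned_or_image_only") {
    return { tool: "ocr_pages", reason: "sampled pages look scanned or image-only" };
  }
  if (profile === "mixed_text_and_scan" || profile === "unknown") {
    return { tool: "render_page", reason: "verify page appearance before extracting" };
  }
  return { tool: "read_pdf", reason: "selectable text is available" };
}
```

In practice the recommendation returned by inspect_pdf already carries suggested read_pdf arguments, so a policy like this would mainly decide whether to follow it directly or take an OCR or rendering detour first.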
include_layout_diagnostics adds deterministic page-level signals for layout
profile, reading-order model, confidence, column count, positioned item ratio,
and warnings. This helps agents decide when local extraction is safe for RAG and
when a page should be routed to a heavier parser, OCR/vision workflow, or human
review.
include_elements adds structured document elements to the JSON response while keeping the existing text, metadata, image, and table outputs backward compatible.
{
"sources": [{ "path": "report.pdf" }],
"include_elements": true,
"include_semantic_hints": true
}
Elements include stable IDs, page numbers, provenance, and best-effort bounding boxes where available. Image bytes stay out of the JSON summary so MCP clients can keep context payloads manageable.
include_semantic_hints adds deterministic heading/list/paragraph hints to text elements, with confidence and signals, without claiming a full semantic parser.
include_markdown adds page-aware Markdown for workflows that need clean text context without manually rebuilding sections from raw page text.
include_html adds an escaped HTML rendering for previews, export workflows, and downstream conversion.
The extraction pipeline also separates distant same-line text into independent segments before ordering, which improves multi-column PDFs without requiring any extra configuration.
include_chunks adds citation-ready chunks with stable IDs, strategy labels, element references, and best-effort bounding boxes for downstream retrieval and citation workflows. When include_semantic_hints is also enabled, chunks split on deterministic heading boundaries; table chunks are emitted when table extraction is requested.
include_outline, include_annotations, include_page_labels, include_page_geometry, include_permissions, include_structure_tree, include_form_fields, and include_attachments expose additional document signals without changing the default response shape.
include_safety_findings adds deterministic findings for common prompt-injection patterns, tiny text, and off-page text so agents can inspect risky document content before using it as instructions.
// ✅ Windows
{ "path": "C:\\Users\\John\\Documents\\report.pdf" }
{ "path": "C:/Users/John/Documents/report.pdf" }
// ✅ Unix/Mac
{ "path": "/home/john/documents/report.pdf" }
{ "path": "/Users/john/Documents/report.pdf" }
// ✅ Relative (still works)
{ "path": "documents/report.pdf" }
Other Improvements:
v1.2.0 - Content Ordering
v1.1.0 - Image Extraction & Performance
inspect_pdf Tool

Plan PDF extraction before running a heavier read. This is useful for agents that need to choose between metadata review, citation-ready extraction, mixed PDF handling, or OCR-capable workflows.

| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | Array | List of PDF sources to inspect | Required |
| `sample_pages` | number | Maximum pages to sample per source, capped at 20 | 5 |
| `include_metadata` | boolean | Include PDF metadata and info objects | true |

| Field | Description |
|---|---|
| `profile` | digital_text, scanned_or_image_only, mixed_text_and_scan, low_text_or_form, or unknown |
| `sampled_pages` | Pages used for the bounded inspection sample |
| `page_signals` | Text chars, text items, token estimate, image paint operations, and scan/low-text flags |
| `document_signals` | Outline, labels, permissions, forms, attachments, and structure-tree availability |
| `recommendation` | Suggested workflow, OCR need, reason, and ready-to-use read_pdf arguments |
| `provider_status` | Safe readiness metadata for optional ocr_pages and analyze_regions providers without command paths |
render_page Tool

Render selected pages as PNG visual evidence. This gives agents a page image they can inspect or route to OCR/vision workflows while keeping binary content out of the JSON summary.

| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | Array | List of PDF sources to render | Required |
| `scale` | number | Render scale relative to PDF points, from 0.25 to 4 | 2 |
| `max_pages` | number | Maximum pages to render per source, capped at 20 | 5 |
| `max_pixels_per_page` | number | Maximum rendered pixels per page, capped at 64MP | 16000000 |
| `include_image` | boolean | Return PNG pages as MCP image parts | true |
{
"sources": [{ "path": "report.pdf", "pages": "1-2" }],
"scale": 2,
"max_pages": 2
}
The first content part is JSON metadata with profile: "page_render_evidence".
Rendered PNG data is returned as subsequent MCP image parts and referenced by
image_content_index.
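
To get a feel for how scale interacts with max_pixels_per_page, the following sketch estimates the rendered pixel count for a page given in PDF points. The helper names are hypothetical, and the rounding and the "compare against the budget" step are assumptions; the source does not say how the server rounds or what it does when a page exceeds the budget.

```typescript
// Hypothetical helper: page sizes are in PDF points and scale multiplies each axis.
// Rounding up per axis is an assumption made for this estimate.
function renderedPixels(widthPt: number, heightPt: number, scale: number): number {
  return Math.ceil(widthPt * scale) * Math.ceil(heightPt * scale);
}

// Hypothetical check against a pixel budget such as the documented 16000000 default.
function fitsPixelBudget(
  widthPt: number,
  heightPt: number,
  scale: number,
  maxPixels = 16_000_000,
): boolean {
  return renderedPixels(widthPt, heightPt, scale) <= maxPixels;
}

// US Letter (612 x 792 pt) at the default scale of 2 renders to 1224 x 1584 px.
const letterAtScale2 = renderedPixels(612, 792, 2); // 1938816
// A0 (about 2384 x 3370 pt) at scale 2 is roughly 32 MP and exceeds the default budget.
const a0Fits = fitsPixelBudget(2384, 3370, 2); // false
```

Under these assumptions, ordinary office pages stay well inside the default budget even at scale 4, while large-format drawings may need a lower scale or a higher max_pixels_per_page.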
search_pdf Tool

Search extracted PDF text using bounded literal matching and return evidence that agents can cite or route into visual tools.

| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | Array | List of PDF sources to search | Required |
| `query` | string | Literal text query to search for | Required |
| `case_sensitive` | boolean | Use case-sensitive matching | false |
| `whole_word` | boolean | Match only whole words using ASCII word boundaries | false |
| `max_pages` | number | Maximum pages to search per source, capped at 1000 | 100 |
| `max_matches_per_source` | number | Maximum matches returned per source, capped at 500 | 50 |
| `context_chars` | number | Context characters around each match, capped at 1000 | 120 |
{
"sources": [{ "path": "report.pdf", "pages": "1-20" }],
"query": "risk controls",
"whole_word": true,
"max_matches_per_source": 10
}
The first content part is JSON metadata with profile: "pdf_search_results".
Matches include page number, matched text, snippet, match offsets, text-item
index, optional text-item bounding box, and provenance. Search uses literal
matching only; request payloads do not accept arbitrary regular expressions.
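
The matching behavior described above can be approximated with a short sketch. This is not the server's implementation: `literalSearch`, its option names, and the `Match` shape are hypothetical, and it only illustrates literal matching, ASCII word boundaries, a match cap, and context snippets.

```typescript
interface Match {
  start: number;
  end: number;
  snippet: string;
}

// ASCII letters, digits, and underscore count as word characters in this sketch.
const isAsciiWordChar = (ch: string | undefined): boolean =>
  ch !== undefined && /^[A-Za-z0-9_]$/.test(ch);

// Hypothetical literal search: no regular expressions are built from the query.
function literalSearch(
  text: string,
  query: string,
  opts: { caseSensitive?: boolean; wholeWord?: boolean; contextChars?: number; maxMatches?: number } = {},
): Match[] {
  const { caseSensitive = false, wholeWord = false, contextChars = 120, maxMatches = 50 } = opts;
  if (query.length === 0) return [];
  // Caveat: toLowerCase can change string length for a few non-ASCII characters,
  // which would shift offsets; a production version would need to handle that.
  const haystack = caseSensitive ? text : text.toLowerCase();
  const needle = caseSensitive ? query : query.toLowerCase();
  const matches: Match[] = [];
  let from = 0;
  while (matches.length < maxMatches) {
    const start = haystack.indexOf(needle, from);
    if (start === -1) break;
    const end = start + needle.length;
    const bounded = !isAsciiWordChar(text[start - 1]) && !isAsciiWordChar(text[end]);
    if (!wholeWord || bounded) {
      matches.push({
        start,
        end,
        snippet: text.slice(Math.max(0, start - contextChars), end + contextChars),
      });
    }
    from = start + 1;
  }
  return matches;
}
```

Because the query is never compiled into a pattern, characters such as `.` or `*` in a query are matched as themselves, which is consistent with the literal-only behavior described for search_pdf.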
extract_regions Tool

Crop selected PDF-coordinate page regions as PNG visual evidence. This is useful when an agent has bounding boxes from the document map, table detector, or downstream layout workflow and needs focused source evidence.

| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | Array | List of PDF sources with regions to crop | Required |
| `scale` | number | Render scale used before cropping, from 0.25 to 4 | 2 |
| `max_regions` | number | Maximum regions to crop per source, capped at 100 | 20 |
| `max_pixels_per_page` | number | Maximum rendered pixels per page before cropping, capped at 64MP | 16000000 |
| `include_image` | boolean | Return cropped regions as MCP image parts | true |
Each region uses PDF coordinates:
{
"id": "figure-1",
"page": 1,
"bounding_box": { "left": 72, "bottom": 420, "right": 540, "top": 620 },
"padding": 8
}
The first content part is JSON metadata with profile: "region_crop_evidence". Cropped PNG data is returned as subsequent MCP image
parts and referenced by image_content_index.
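
The relationship between a PDF-coordinate bounding box and a pixel crop can be sketched as follows. The sketch assumes PDF user space with a bottom-left origin, an image with a top-left origin, and padding expressed in PDF points; none of these details are confirmed by the source, and `toPixelRect` is a hypothetical helper.

```typescript
interface PdfBox {
  left: number;
  bottom: number;
  right: number;
  top: number;
}

interface PixelRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical conversion: pad the box, flip the Y axis, then apply the render scale.
// The right edge is not clamped because the page width is not an input here.
function toPixelRect(box: PdfBox, pageHeightPt: number, scale: number, padding = 0): PixelRect {
  const left = Math.max(0, box.left - padding);
  const bottom = Math.max(0, box.bottom - padding);
  const right = box.right + padding;
  const top = Math.min(pageHeightPt, box.top + padding);
  return {
    x: Math.floor(left * scale),
    y: Math.floor((pageHeightPt - top) * scale),
    width: Math.ceil((right - left) * scale),
    height: Math.ceil((top - bottom) * scale),
  };
}

// The example region on a 792 pt tall page at scale 2 with 8 pt of padding.
const crop = toPixelRect({ left: 72, bottom: 420, right: 540, top: 620 }, 792, 2, 8);
// crop is { x: 128, y: 328, width: 968, height: 432 }
```

The Y flip is the step that most often goes wrong when boxes from one tool are reused in another, so it can be worth checking a crop visually with render_page when coordinates come from an external layout workflow.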
analyze_regions Tool

Analyze selected PDF-coordinate page regions with a configured local provider. This is useful for visual table recognition, chart-to-data enrichment, formula recognition, figure descriptions, and image captions while keeping every result linked to a crop evidence ID.

| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | Array | List of PDF sources with regions to analyze | Required |
| `scale` | number | Render scale used before cropping and analysis, from 0.25 to 4 | 2 |
| `max_regions` | number | Maximum regions to analyze per source, capped at 100 | 20 |
| `max_pixels_per_page` | number | Maximum rendered pixels per page before cropping, capped at 64MP | 16000000 |
| `timeout_ms` | number | Timeout per analyzed region in milliseconds, capped at 300000 | 60000 |
| `max_output_chars` | number | Maximum provider output characters returned per region | 200000 |
| `languages` | string[] | Optional language tags passed to the configured provider | - |

| Variable | Description |
|---|---|
| `MCP_PDF_REGION_ANALYSIS_COMMAND` | Absolute or PATH-resolved command used for visual region analysis. Required to enable analyze_regions. |
| `MCP_PDF_REGION_ANALYSIS_ARGS_JSON` | Optional JSON string array of command arguments. Must include {input} and may also use {page}, {source}, {region_id}, {evidence_id}, {left}, {bottom}, {right}, {top}, {language}, and {languages} placeholders. Defaults to `["{input}"]`. |
Provider stdout may be plain text or JSON:
{
"kind": "table",
"description": "Quarterly revenue table",
"text": "Q1 revenue...",
"markdown": "| Quarter | Revenue |",
"confidence": 0.91,
"table": {
"rows": [["Quarter", "Revenue"], ["Q1", "$1.2M"]],
"confidence": 0.9
},
"formula": {
"latex": "E = mc^2",
"confidence": 0.82
},
"chart": {
"title": "Revenue by quarter",
"summary": "Revenue rises across the period.",
"data_points": [{ "label": "Q1", "value": 1.2 }],
"confidence": 0.78
},
"warnings": ["Low contrast axis labels"]
}
The first content part is JSON metadata with profile: "region_analysis".
Each analysis includes source_crop_evidence_id, source bounding box, crop
pixel bounds, scale, provider, provenance, and normalized fields supplied by
the local provider. The request cannot select an executable.
ocr_pages Tool

Run selected rendered pages through a configured local OCR provider and return a normalized OCR text layer. The provider is configured through environment variables so an MCP request cannot choose arbitrary commands.

| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | Array | List of PDF sources to OCR | Required |
| `scale` | number | Render scale used before OCR, from 0.25 to 4 | 2 |
| `max_pages` | number | Maximum pages to OCR per source, capped at 20 | 5 |
| `max_pixels_per_page` | number | Maximum rendered pixels per page before OCR, capped at 64MP | 16000000 |
| `timeout_ms` | number | Timeout per OCR page in milliseconds, capped at 300000 | 60000 |
| `max_output_chars` | number | Maximum OCR text characters returned per page | 200000 |
| `languages` | string[] | Optional OCR language tags passed to the configured provider | - |

| Variable | Description |
|---|---|
| `MCP_PDF_OCR_PRESET` | Optional built-in command template. Supported value: tesseract. |
| `MCP_PDF_OCR_COMMAND` | Absolute or PATH-resolved command used for OCR. Required unless MCP_PDF_OCR_PRESET is set. Overrides the preset command when both are set. |
| `MCP_PDF_OCR_ARGS_JSON` | Optional JSON string array of command arguments. Must include {input} and may also use {page}, {source}, {language}, {languages}, and {languages_tesseract} placeholders. Defaults to the preset template or `["{input}"]`. |
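
Placeholder expansion for an args template can be pictured with the sketch below. It is an illustration only: `expandArgs` is a hypothetical helper, the example template is made up rather than taken from the Tesseract preset, and leaving unknown placeholders untouched is an assumption about behavior the source does not specify.

```typescript
// Hypothetical helper: expand {name} placeholders in an args template.
function expandArgs(template: string[], values: Record<string, string>): string[] {
  // The documented rule: the template must mention {input} somewhere.
  if (!template.some((arg) => arg.includes("{input}"))) {
    throw new Error("args template must include {input}");
  }
  return template.map((arg) =>
    arg.replace(/\{([a-z_]+)\}/g, (whole, name: string) =>
      // Only own keys are substituted; unknown placeholders are left as written.
      Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : whole,
    ),
  );
}

// Made-up template and values, for illustration only.
const expanded = expandArgs(["{input}", "--lang={language}", "--page={page}"], {
  input: "/tmp/page-1.png",
  language: "eng",
  page: "1",
});
// expanded is ["/tmp/page-1.png", "--lang=eng", "--page=1"]
```

Keeping arguments as an array and substituting per element means values are never re-parsed by a shell, which fits the stated goal that a request cannot choose arbitrary commands.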
Provider stdout may be plain text or JSON:
{
"text": "Recognized text",
"confidence": 0.93,
"language": "eng",
"words": [{
"text": "Recognized",
"confidence": 0.95,
"bounding_box": { "left": 10, "bottom": 20, "right": 90, "top": 40 }
}]
}
The first content part is JSON metadata with profile: "ocr_text_layer".
OCR results reference the render evidence ID used to create each temporary page
image. The default package does not bundle an OCR model or call a cloud OCR
service.
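
Since provider stdout may be plain text or JSON, a normalization step along the following lines is one plausible way to handle both. The `parseProviderOutput` helper and the `OcrResult` shape are hypothetical, and the fallback rule (treat anything that is not a JSON object with a string `text` field as plain text) is an assumption rather than documented behavior.

```typescript
interface OcrResult {
  text: string;
  confidence?: number;
  language?: string;
}

// Hypothetical normalizer for provider stdout.
function parseProviderOutput(stdout: string, maxOutputChars = 200000): OcrResult {
  const trimmed = stdout.trim();
  if (trimmed.startsWith("{")) {
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (
        typeof parsed === "object" &&
        parsed !== null &&
        typeof (parsed as { text?: unknown }).text === "string"
      ) {
        const obj = parsed as { text: string; confidence?: unknown; language?: unknown };
        return {
          text: obj.text.slice(0, maxOutputChars),
          // Optional fields are copied only when they have the expected type.
          ...(typeof obj.confidence === "number" ? { confidence: obj.confidence } : {}),
          ...(typeof obj.language === "string" ? { language: obj.language } : {}),
        };
      }
    } catch {
      // Not valid JSON: fall through and treat the output as plain text.
    }
  }
  return { text: trimmed.slice(0, maxOutputChars) };
}
```

The `maxOutputChars` default mirrors the documented max_output_chars default; word boxes are omitted here to keep the sketch short.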
read_pdf Tool

The extraction tool that handles PDF content, structure, citations, images, tables, and document signals.

| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | Array | List of PDF sources to process | Required |
| `include_full_text` | boolean | Extract full text content | false |
| `include_metadata` | boolean | Extract PDF metadata | true |
| `include_page_count` | boolean | Include total page count | true |
| `include_images` | boolean | Extract embedded images | false |
| `include_tables` | boolean | Detect tables with rows, cell metadata, confidence, quality diagnostics, inferred spans, continuation candidates, and best-effort geometry | false |
| `include_document_map` | boolean | Include an agent document map that links pages, elements, chunks, layout diagnostics, safety findings, routing signals, and page geometry | false |
| `include_document_ast` | boolean | Include a semantic document AST with page, section, paragraph, list item, table, and image nodes linked to element/chunk evidence | false |
| `include_trust_report` | boolean | Include a consolidated trust report for content safety, layout uncertainty, sparse/scanned pages, table quality, and external links | false |
| `include_accessibility_report` | boolean | Include a deterministic accessibility report for tagged-PDF coverage, structure trees, headings, images, forms, links, and accessibility permissions | false |
| `include_elements` | boolean | Include structured document elements for agent workflows | false |
| `include_semantic_hints` | boolean | Include deterministic heading/list/paragraph hints on text elements | false |
| `include_markdown` | boolean | Include page-aware Markdown for RAG and summarization | false |
| `include_html` | boolean | Include escaped page-aware HTML for preview/export workflows | false |
| `include_chunks` | boolean | Include page, semantic, size, and table chunks with source references | false |
| `include_text_layer` | boolean | Include line and word records with page-level character ranges, best-effort bounding boxes, and provenance | false |
| `include_layout_diagnostics` | boolean | Include page layout profiles, reading-order confidence, column signals, and warnings | false |
| `include_outline` | boolean | Include PDF outline/bookmarks when available | false |
| `include_annotations` | boolean | Include safe annotation summaries for selected pages | false |
| `include_page_labels` | boolean | Include PDF page labels when available | false |
| `include_page_geometry` | boolean | Include page viewport geometry and PDF view boxes | false |
| `include_permissions` | boolean | Include permission labels and mark info when available | false |
| `include_structure_tree` | boolean | Include tagged PDF structure trees for selected pages when available | false |
| `include_form_fields` | boolean | Include PDF form field summaries when available | false |
| `include_attachments` | boolean | Include embedded attachment metadata without attachment bytes | false |
| `include_safety_findings` | boolean | Include deterministic content safety findings for agent workflows | false |
{
path?: string; // Local file path (absolute or relative)
url?: string; // HTTP/HTTPS URL to PDF
pages?: string | number[]; // Pages to extract: "1-5,10" or [1,2,3]
}
Metadata only (fast):
{
"sources": [{ "path": "large.pdf" }],
"include_metadata": true,
"include_page_count": true,
"include_full_text": false
}
From URL:
{
"sources": [{
"url": "https://arxiv.org/pdf/2301.00001.pdf"
}],
"include_full_text": true
}
Page ranges:
{
"sources": [{
"path": "manual.pdf",
"pages": "1-5,10-15,20" // Pages 1,2,3,4,5,10,11,12,13,14,15,20
}]
}
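
For readers who want to see how a page specification such as "1-5,10-15,20" expands, here is a minimal parsing sketch. `parsePages` is a hypothetical helper that mirrors the documented `string | number[]` input; it is not the server's parser, and it deliberately rejects anything beyond plain numbers and closed ranges.

```typescript
// Hypothetical parser for the documented pages formats: "1-5,10" or [1, 2, 3].
function parsePages(spec: string | number[]): number[] {
  if (Array.isArray(spec)) return [...spec];
  const pages: number[] = [];
  for (const part of spec.split(",")) {
    // Accept "N" or "N-M" with optional surrounding whitespace.
    const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(part);
    if (!match) throw new Error(`invalid page range: "${part}"`);
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    // Pages are 1-based, and a range must not run backwards.
    if (start < 1 || end < start) throw new Error(`invalid page range: "${part}"`);
    for (let page = start; page <= end; page++) pages.push(page);
  }
  return pages;
}

const expandedPages = parsePages("1-5,10-15,20");
// expandedPages is [1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15, 20]
```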
Structured elements:
{
"sources": [{ "path": "report.pdf", "pages": "1-3" }],
"include_elements": true,
"include_metadata": true
}
Elements are designed for agent workflows that need stable page references, provenance, and best-effort coordinates for citation-ready downstream processing.
Agent document map:
{
"sources": [{ "path": "report.pdf", "pages": "1-5" }],
"include_document_map": true,
"include_full_text": false
}
The document map is designed for agents that need one navigable structure for pages, elements, chunks, layout confidence, safety findings, routing signals, and page geometry without embedding image bytes in JSON.
Content is returned in natural reading order using Y-coordinates plus deterministic column segmentation:
Document Layout:
┌─────────────────────┐
│ [Title] Y:100 │
│ [Image] Y:150 │
│ [Text] Y:400 │
│ [Photo A] Y:500 │
│ [Photo B] Y:550 │
└─────────────────────┘
Response Order:
[
{ type: "text", text: "Title..." },
{ type: "image", data: "..." },
{ type: "text", text: "..." },
{ type: "image", data: "..." },
{ type: "image", data: "..." }
]
Benefits:
Enable extraction:
{
"sources": [{ "path": "manual.pdf" }],
"include_images": true
}
Response format:
{
"images": [{
"page": 1,
"index": 0,
"width": 1920,
"height": 1080,
"format": "rgb",
"data": "base64-encoded-png..."
}]
}
Supported formats: RGB, RGBA, Grayscale

Auto-detected: JPEG, PNG, and other embedded formats
Absolute paths (v1.3.0+) - Direct file access:
{ "path": "C:\\Users\\John\\file.pdf" }
{ "path": "/home/user/file.pdf" }
Relative paths - Workspace files:
{ "path": "docs/report.pdf" }
{ "path": "./2024/Q1.pdf" }
Configure working directory:
{
"mcpServers": {
"pdf-reader-mcp": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"],
"cwd": "/path/to/documents"
}
}
}
Strategy 1: Page ranges
{ "sources": [{ "path": "big.pdf", "pages": "1-20" }] }
Strategy 2: Progressive loading
// Step 1: Get page count
{ "sources": [{ "path": "big.pdf" }], "include_full_text": false }
// Step 2: Extract sections
{ "sources": [{ "path": "big.pdf", "pages": "50-75" }] }
Strategy 3: Parallel batching
{
"sources": [
{ "path": "big.pdf", "pages": "1-50" },
{ "path": "big.pdf", "pages": "51-100" }
]
}
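
The batching strategy can be generated programmatically once the page count is known. The sketch below builds a `sources` array of consecutive page ranges; `batchSources` and the `BatchSource` type are hypothetical names, and the batch size is a tuning choice left to the caller.

```typescript
interface BatchSource {
  path: string;
  pages: string;
}

// Hypothetical helper: split pages 1..pageCount into consecutive ranges of batchSize.
function batchSources(path: string, pageCount: number, batchSize: number): BatchSource[] {
  if (!Number.isInteger(pageCount) || !Number.isInteger(batchSize) || pageCount < 1 || batchSize < 1) {
    throw new Error("pageCount and batchSize must be positive integers");
  }
  const sources: BatchSource[] = [];
  for (let start = 1; start <= pageCount; start += batchSize) {
    const end = Math.min(start + batchSize - 1, pageCount);
    sources.push({ path, pages: start === end ? `${start}` : `${start}-${end}` });
  }
  return sources;
}

const request = { sources: batchSources("big.pdf", 100, 50) };
// request.sources is [{ path: "big.pdf", pages: "1-50" }, { path: "big.pdf", pages: "51-100" }]
```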
By default the server can read any local file the host process can access and fetch any HTTP(S) URL. When running outside a sandbox you should restrict it to a specific working set.
Use --allow-dir (repeatable) or the MCP_PDF_ALLOWED_DIRS env var (: or , separated). Once set, all path sources must resolve inside one of the allowed directories — relative paths, absolute paths, and .. traversal are all checked after resolution.
# CLI flags
npx @sylphx/pdf-reader-mcp --allow-dir=/srv/pdfs --allow-dir=/data/reports
# Environment
MCP_PDF_ALLOWED_DIRS="/srv/pdfs:/data/reports" npx @sylphx/pdf-reader-mcp
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp", "--allow-dir=/srv/pdfs"]
}
}
}
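
The "check after resolution" rule for allowed directories can be sketched with Node's `path` module. This is a minimal illustration, not the server's code: `isPathAllowed` is a hypothetical helper, and it does not resolve symlinks, which a real check would likely need to consider.

```typescript
import * as path from "node:path";

// Hypothetical check: resolve the candidate first, then require the result
// to be inside (or equal to) one of the allowed directories.
function isPathAllowed(candidate: string, allowedDirs: string[], cwd: string): boolean {
  const resolved = path.resolve(cwd, candidate);
  return allowedDirs.some((dir) => {
    const rel = path.relative(path.resolve(dir), resolved);
    // An empty result means the same directory; a leading ".." segment or an
    // absolute result (different drive on Windows) means outside the root.
    const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    return !escapes;
  });
}
```

Comparing with `path.relative` rather than a string prefix avoids treating a sibling such as /srv/pdfs-private as if it were inside /srv/pdfs.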
# Block all URL sources
npx @sylphx/pdf-reader-mcp --no-http
MCP_PDF_ALLOW_HTTP=false npx @sylphx/pdf-reader-mcp
# Allowlist hosts (everything else rejected)
npx @sylphx/pdf-reader-mcp --allow-host=cdn.example.com --allow-host=files.internal
MCP_PDF_ALLOWED_HOSTS="cdn.example.com,files.internal" npx @sylphx/pdf-reader-mcp
| Setting | CLI flag | Environment variable | Default |
|---|---|---|---|
| Filesystem allowlist | --allow-dir=<path> (repeatable) | MCP_PDF_ALLOWED_DIRS (: or , separated) | unrestricted |
| Disable HTTP | --no-http | MCP_PDF_ALLOW_HTTP=false | enabled |
| HTTP host allowlist | --allow-host=<host> (repeatable) | MCP_PDF_ALLOWED_HOSTS (, separated) | any host |
Denied requests fail fast with an Access denied error before any disk read or network call.
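
A comparable check for URL sources might look like the sketch below. `isUrlAllowed` is a hypothetical helper, and exact hostname matching is an assumption; the source does not say whether subdomains or wildcards are supported.

```typescript
// Hypothetical check combining the HTTP switch and the host allowlist.
function isUrlAllowed(rawUrl: string, allowHttp: boolean, allowedHosts: string[]): boolean {
  if (!allowHttp) return false;
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // Not a parseable URL.
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  // An empty allowlist corresponds to the documented default of any host.
  if (allowedHosts.length === 0) return true;
  return allowedHosts.some((host) => host.toLowerCase() === url.hostname.toLowerCase());
}
```

Matching on the parsed hostname rather than on a substring of the URL keeps a host such as cdn.example.com.evil.io from passing as cdn.example.com.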
Solution: Upgrade to v1.3.0+
npm update @sylphx/pdf-reader-mcp
Restart your MCP client completely.
Causes:
Solutions:
Use absolute path:
{ "path": "C:\\Full\\Path\\file.pdf" }
Or configure cwd:
{
"pdf-reader-mcp": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"],
"cwd": "/path/to/docs"
}
}
Solution:
npm cache clean --force
rm -rf node_modules package-lock.json
npm install @sylphx/pdf-reader-mcp@latest
Restart MCP client completely.
By default, PDF Reader MCP uses stdio transport for local use. You can also run it as an HTTP server for remote access from multiple machines.
# Run as HTTP server on port 8080
MCP_TRANSPORT=http npx @sylphx/pdf-reader-mcp
| Variable | Default | Description |
|---|---|---|
| `MCP_TRANSPORT` | stdio | Transport type: stdio or http |
| `MCP_HTTP_PORT` | 8080 | HTTP server port |
| `MCP_HTTP_HOST` | 0.0.0.0 | HTTP server hostname |
| `MCP_API_KEY` | - | Optional API key for authentication |
| `MCP_PDF_OCR_PRESET` | - | Optional OCR preset. Supported value: tesseract |
| `MCP_PDF_OCR_COMMAND` | - | Optional local OCR command used by ocr_pages |
| `MCP_PDF_OCR_ARGS_JSON` | `["{input}"]` | Optional JSON string array of OCR command arguments. Must include {input}. |
| `MCP_PDF_REGION_ANALYSIS_COMMAND` | - | Optional local visual-region analysis command used by analyze_regions |
| `MCP_PDF_REGION_ANALYSIS_ARGS_JSON` | `["{input}"]` | Optional JSON string array of region analysis command arguments. Must include {input}. |
FROM oven/bun:1
WORKDIR /app
RUN bun add @sylphx/pdf-reader-mcp
ENV MCP_TRANSPORT=http
ENV MCP_HTTP_PORT=8080
EXPOSE 8080
CMD ["bun", "node_modules/@sylphx/pdf-reader-mcp/dist/index.js"]
{
"servers": {
"pdf-reader": {
"type": "http",
"url": "https://your-server.com/mcp",
"headers": {
"X-API-Key": "your-api-key"
}
}
}
}
| Endpoint | Method | Description |
|---|---|---|
| `/mcp` | POST | JSON-RPC endpoint |
| `/mcp/health` | GET | Health check |
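
For orientation, the payload a client would POST to the JSON-RPC endpoint for a tool call has roughly the shape below. The `tools/call` method and its `name`/`arguments` params come from the MCP specification, and the X-API-Key header matches the client configuration shown above; the helper names are hypothetical. A real MCP session also begins with an `initialize` exchange, which this sketch leaves out.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build the JSON-RPC body for an MCP tool call.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: request headers, adding the API key only when one is configured.
function buildHeaders(apiKey?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["X-API-Key"] = apiKey;
  return headers;
}

const body = buildToolCall(1, "read_pdf", {
  sources: [{ path: "large.pdf" }],
  include_metadata: true,
  include_page_count: true,
  include_full_text: false,
});
```

In most setups an MCP client library handles this framing, so building requests by hand is mainly useful for debugging a remote deployment.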
| Component | Technology |
|---|---|
| Runtime | Node.js 22+ ESM |
| PDF Engine | PDF.js (Mozilla) |
| Validation | Vex + JSON Schema |
| Protocol | MCP SDK |
| Language | TypeScript (strict) |
| Testing | Bun test suite |
| Quality | Biome (50x faster) |
| CI/CD | GitHub Actions |
- any types, strict mode

Prerequisites:
- bun@1.3.1)

Setup:
git clone https://github.com/SylphxAI/pdf-reader-mcp.git
cd pdf-reader-mcp
bun install && bun run build
Scripts:
bun run build # Build with bunup
bun test # Run the test suite
bun run test:cov # Run coverage
bun run check # Lint + format
bun run check:fix # Auto-fix
bun run benchmark # Reproducible local performance benchmark
Quality:
Quick Start:
- git checkout -b feature/awesome
- bun test
- bun run check:fix

Commit Format:
feat(images): add WebP support
fix(paths): handle UNC paths
docs(readme): update examples
See CONTRIBUTING.md
✅ Completed
🚀 Next
Vote at Discussions
Featured on:
Local-first • Agent-ready • Battle-tested
Show Your Support: ⭐ Star • 👀 Watch • 🐛 Report bugs • 💡 Suggest features • 🔀 Contribute
CI-backed quality • Structured extraction • Production ready
MIT © Sylphx
Built with:
Special thanks to the open source community ❤️
This project uses the following @sylphx packages: