Server data from the Official MCP Registry
MCP server for extracting text, images, tables, links, annotations, and metadata from PDF files.
This PDF reader MCP server is well-structured with proper input validation, appropriate permission scoping, and no malicious patterns. Authentication is not required (as expected for a local PDF processing tool), and all file operations are restricted to PDF files with validated paths. Minor code quality observations around exception handling do not materially impact security. Supply chain analysis found 3 known vulnerabilities in dependencies (0 critical, 3 high severity). Package verification found 1 issue.
6 files analyzed · 9 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-xvvln-pdf-reader-mcp": {
      "args": [
        "pdf-insight-mcp"
      ],
      "command": "uvx"
    }
  }
}
From the project's GitHub README.
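An MCP server for reading and analyzing PDF files. It provides clients that support MCP (Model Context Protocol) with PDF text, page images, tables, links, annotations, outlines, metadata, and basic text statistics.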
A PDF-focused MCP server for extracting text, rendered pages, tables, links, annotations, outlines, metadata, and text statistics from PDF files.
Names: pdf-reader-mcp (project), io.github.Xvvln/pdf-reader-mcp (MCP Registry), pdf-insight-mcp (PyPI package)
pdf-reader-mcp and pdf-insight-mcp
pdf-reader-mcp is the project name. The PyPI package is published as pdf-insight-mcp because the pdf-reader-mcp package name is not available on PyPI.
| Tool | What it does |
|---|---|
| get_pdf_info | Read document metadata, page count, file size, and encryption status. |
| read_pdf_as_text | Extract text from selected pages with page and character limits. |
| read_pdf_as_images | Render selected pages as base64-encoded images. |
| get_pdf_outline | Read bookmarks and outline entries. |
| search_pdf_text | Search text and return per-match page context. |
| extract_pdf_tables | Extract structured tables when PyMuPDF can detect them. |
| extract_pdf_images | Extract embedded PDF images. |
| get_pdf_page_info | Inspect one page's size, text, images, links, and rotation. |
| extract_pdf_links | Extract external URLs and internal page jumps. |
| get_pdf_annotations | Read comments, highlights, and annotation metadata. |
| get_pdf_text_stats | Compute text, line, paragraph, and scan-likelihood stats. |
| compare_pdf_pages | Compare text similarity between two pages. |
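MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message so the shape is visible; it does not talk to the server. The argument names `file_path` and `query` are hypothetical placeholders, since the README does not list each tool's parameters; the real names come from the tool's input schema.

```python
import json
from itertools import count

# JSON-RPC request ids only need to be unique within one session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# `file_path` and `query` are assumed argument names, not confirmed by the README.
request = build_tool_call(
    "search_pdf_text",
    {"file_path": "/Users/me/Documents/report.pdf", "query": "baseline characteristics"},
)
print(request)
```

In practice an MCP client library sends this message for you; building it by hand is mainly useful for debugging.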
Install uv if you do not already have it:
curl -LsSf https://astral.sh/uv/install.sh | sh
Run the server directly from PyPI:
uvx pdf-insight-mcp
Or install it first, then run the pdf-reader-mcp command:
python -m pip install pdf-insight-mcp
pdf-reader-mcp
Use the published PyPI package:
{
  "mcpServers": {
    "pdf-reader": {
      "command": "uvx",
      "args": ["pdf-insight-mcp"]
    }
  }
}
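If the configuration file already lists other servers, the entry can be merged in rather than pasted by hand. This is a minimal sketch, assuming the client stores its configuration as a JSON file with a top-level `mcpServers` object; the helper name `add_pdf_reader` and the file path are hypothetical and depend on your client.

```python
import json
from pathlib import Path


def add_pdf_reader(config_path: Path, server_key: str = "pdf-reader") -> dict:
    """Add the uvx-based server entry to an MCP config file, keeping other servers."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the "mcpServers" object if the file does not have one yet.
    servers = config.setdefault("mcpServers", {})
    servers[server_key] = {"command": "uvx", "args": ["pdf-insight-mcp"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_pdf_reader(Path("/path/to/your/mcp-config.json"))` rewrites that file in place, so back it up first if it was edited by hand.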
Use a local checkout for development:
{
  "mcpServers": {
    "pdf-reader": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/pdf-reader-mcp",
        "run",
        "pdf-reader-mcp"
      ]
    }
  }
}
Replace /absolute/path/to/pdf-reader-mcp with the absolute path to this repository on your machine.
Ask your MCP client to call tools with an absolute PDF path. Example requests:
Read /Users/me/Documents/report.pdf as text.
Search /Users/me/Documents/report.pdf for "baseline characteristics".
Render pages 1-3 of /Users/me/Documents/report.pdf as images.
Extract links and annotations from /Users/me/Documents/review.pdf.
For large PDFs, prefer small page ranges first. For scanned or layout-sensitive PDFs, use read_pdf_as_images with a small pages range and moderate dpi.
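One way to follow this advice is to split a long document into small page batches and request them one at a time. The sketch below is only an illustration: `page_batches` is a hypothetical helper, it assumes 1-based inclusive page numbers (as in the "pages 1-3" example), and its default batch size of 20 matches the page count above which read_pdf_as_images rejects a request.

```python
def page_batches(first_page: int, last_page: int, batch_size: int = 20) -> list[tuple[int, int]]:
    """Split an inclusive, 1-based page range into (start, end) batches."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        (start, min(start + batch_size - 1, last_page))
        for start in range(first_page, last_page + 1, batch_size)
    ]


# A 45-page document becomes three requests of at most 20 pages each.
for start, end in page_batches(1, 45):
    print(f"Render pages {start}-{end}")
```

A smaller batch size may still be needed for image rendering at high dpi, because the total image payload is capped separately.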
read_pdf_as_text defaults to at most 50 pages and 200000 returned characters.
read_pdf_as_images rejects requests above 20 pages.
read_pdf_as_images defaults to an overall image payload cap of about 20 MB.
extract_pdf_images returns at most 20 embedded images but reports the actual detected total.
Install dependencies:
uv sync --extra dev
Run tests:
uv run pytest -q
Build the package:
uv build
uvx twine check dist/*
Run the local server:
uv run pdf-reader-mcp
Releases are published through GitHub Actions.
Before the first release, configure PyPI Trusted Publishing with:
PyPI project name: pdf-insight-mcp
Owner: Xvvln
Repository name: pdf-reader-mcp
Workflow filename: publish.yml
Environment name: leave empty
Then release by bumping versions in pyproject.toml and server.json, committing the change, and pushing a version tag:
git tag vX.Y.Z
git push origin main --tags
The Publish workflow runs tests, builds the Python package, publishes to PyPI, authenticates to the MCP Registry with GitHub OIDC, and publishes server.json.
MIT