Server data from the Official MCP Registry
One MCP, many parsers. Routes between markitdown, Docling, and LlamaParse. Plus an interpret tool…
Valid MCP server (0 strong, 3 medium validity signals). No known CVEs in dependencies. Imported from the Official MCP Registry. Trust signals: trusted author (3/3 approved).
13 files analyzed · No issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Set these up before or after installing:
Environment variable: LLAMAPARSE_API_KEY
From the project's GitHub README.
One MCP, many parsers. Default markitdown (free, fast, MIT). Escalate to Docling (table-heavy, scanned PDFs) or LlamaParse (cloud, BYOK) when markitdown's quality isn't enough. Plus an interpret tool that pipes parsed markdown into Claude for "summarize / extract X" so you stop juggling parsers and anthropic skills.
Open Claude Code, paste:
/plugin marketplace add adelaidasofia/parse-mcp
/plugin install parse-mcp@parse-mcp
Manual install (pre-plugin-marketplace). See SETUP.md for full details.
pip3 install --break-system-packages -r requirements.txt
pip3 install --break-system-packages 'markitdown[pdf,docx,pptx,xlsx]'
Then register the server in your client's .mcp.json:
{
  "mcpServers": {
    "parse": {
      "command": "python3",
      "args": ["/absolute/path/to/parse-mcp/server.py"]
    }
  }
}
| Tool | What it does |
|---|---|
| parse(source, backend?, hints?) | File path or http(s) URL to markdown. Router picks backend, falls back on empty/error. Returns markdown plus a chain of every backend attempted. |
| parse_url(url, backend?) | Shortcut for HTTP(S) inputs. Same return shape as parse. |
| parse_to_vault(source, vault_folder?, backend?, overwrite?) | Parse + write the result as a markdown note in the vault. Default folder: <VAULT_ROOT>/📥 Inbox/Converted/. Frontmatter records source, format, backend, latency, bytes_in. Replaces the standalone markitdown_to_vault.py shell script. |
| interpret(source, instruction, backend?, model?, max_tokens?) | Parse first, then ask Claude over the parsed markdown. Cache hits reuse parsed text for free input tokens. |
| list_backends() | Which backends are installed + which are missing. Diagnostic. |
| benchmark(source) | Run every available backend on the same input. Compare latency + output side by side. |
| chunk_text(text, doc_type?, target_tokens?, max_tokens?, min_tokens?) | Chunk parsed markdown into retrieval-ready pieces using a doc-type-aware chunker. doc_type="auto" (default) runs structural detection and picks one of paper / book / manual / qa / resume / table / default. Each chunker honors document shape (e.g., paper keeps the abstract whole; manual never merges across numbered sections; qa pairs each question with its answer). Returns chunks + the resolved doc_type. See chunkers/ package. |
| detect_doc_type(text) | Diagnostic. Run structural heuristics over markdown and return the doc_type that chunk_text would pick. |
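The parse row above describes a router that tries a backend, falls back when it errors or returns nothing, and reports every attempt. A minimal sketch of that pattern follows. It is not the project's router: the `route` function, the `ParseResult` shape, and the stub backends are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical type: a backend is any callable mapping a source to markdown.
Backend = Callable[[str], str]


@dataclass
class ParseResult:
    markdown: str
    backend: Optional[str]                      # backend that produced the text, if any
    chain: List[dict] = field(default_factory=list)  # one entry per attempt


def route(source: str, backends: Dict[str, Backend], order: List[str],
          forced: Optional[str] = None) -> ParseResult:
    """Try backends in order, falling back on an exception or empty output.

    When `forced` is given, only that backend is tried (no fallback).
    """
    candidates = [forced] if forced else order
    chain: List[dict] = []
    for name in candidates:
        try:
            text = backends[name](source)
        except Exception as exc:  # a failed backend is recorded, not fatal
            chain.append({"backend": name, "ok": False, "error": str(exc)})
            continue
        if text.strip():
            chain.append({"backend": name, "ok": True, "error": None})
            return ParseResult(text, name, chain)
        chain.append({"backend": name, "ok": False, "error": "empty output"})
    return ParseResult("", None, chain)


# Stub backends standing in for the real parsers (assumptions, not real APIs).
def fake_markitdown(source: str) -> str:
    return ""  # simulates a scanned PDF with no text layer


def fake_docling(source: str) -> str:
    return "# Title\n\nBody"


stubs = {"markitdown": fake_markitdown, "docling": fake_docling}
result = route("scan.pdf", stubs, order=["markitdown", "docling"])
forced = route("scan.pdf", stubs, order=["markitdown", "docling"], forced="markitdown")
```

Here `result.chain` holds two entries (the empty markitdown attempt, then the successful docling one), while `forced.chain` holds only the single failed attempt, which is what makes a forced backend useful as a diagnostic.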
Docling (pip install docling). Best for complex tables (97.9% on benchmark) + scanned PDFs. Downloads model weights on first run.
LlamaParse (pip install llama-cloud-services + LLAMA_CLOUD_API_KEY). Cloud, cleanest output on visually-complex PDFs.
parse(source) with no backend arg: router picks based on file format, falls back if backend errors or returns empty.
parse(source, backend="docling"): force a specific backend, no fallback. Diagnostic mode.
The routing table above used to be a guess. tests/eval/ turns it into data: a
synthetic fixture corpus (16 documents across digital PDF, scanned/image-only
PDF, table-heavy, multi-column, and raster image classes) with derived
ground-truth markdown, scored against each backend's output on three
OmniDocBench / PubTabNet metrics — text edit distance, table TEDS
(tree-edit-distance similarity), and reading-order. All scores are quality in
[0, 1], higher is better.
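To make the "quality in [0, 1], higher is better" convention concrete, here is a sketch of one common way to turn text edit distance into such a score: Levenshtein distance divided by the longer string's length, subtracted from 1. The exact normalization used in tests/eval/ is not shown here, so treat `levenshtein` and `text_quality` as illustrative names and the formula as an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def text_quality(predicted: str, truth: str) -> float:
    """Map edit distance to a similarity in [0, 1]; 1.0 means identical."""
    longest = max(len(predicted), len(truth))
    if longest == 0:
        return 1.0  # two empty strings are trivially identical
    return 1.0 - levenshtein(predicted, truth) / longest
```

Under this normalization a backend that returns an empty string for a non-empty ground truth scores exactly 0.0, which is how a parser without OCR ends up at zero on image-only inputs.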
Headline result (full table: tests/eval/parse_fidelity_matrix.md):
| doc-class | markitdown (text) | docling (text) |
|---|---|---|
| digital_pdf | 0.95 | 0.97 |
| table_heavy | 0.93 | 0.89 |
| scanned_pdf | 0.00 | 0.87 |
| image | 0.00 | 0.90 |
| multicolumn | 0.35 | 1.00 |
markitdown is great on clean digital text and digital tables (free, fast, deterministic) but has no OCR, so it scores zero on scanned PDFs and images, and it interleaves multi-column layouts. On the text scores above, docling leads in every class except table_heavy (0.89 vs 0.93) thanks to OCR + layout analysis, at the cost of model-weight downloads. That is the evidence behind the format-preference chain (escalate image/scanned/multi-column to docling first).
Run it:
pip install docling # the escalation backend under test
python tests/eval/generate_fixtures.py # rebuild the corpus (needs fpdf2 + Pillow)
make eval # -> parse_fidelity_matrix.{md,json}
The matrix records its provenance (backend + python versions + a fixture-set
hash), so a stale result is visible — regenerate with make eval whenever a
parse backend is upgraded or retuned. It also reports median latency per
backend (the cost axis): the highest-fidelity backend (docling) is far slower
than the default, so the router escalates to it rather than defaulting to it.
The scorer's metric tests are pure-Python and backend-free, so pytest tests/
gates them in CI with only the base (markitdown) install — a routing regression
that breaks the "markitdown has no OCR" assumption fails the build.
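The provenance idea above (versions plus a fixture-set hash, so a stale matrix is detectable) can be sketched as follows. This is an assumption about how such a record could be built, not the project's code: `fixture_set_hash`, `provenance`, and `is_stale` are hypothetical helpers, and SHA-256 over sorted file paths and contents is one reasonable choice among several.

```python
import hashlib
import platform
from pathlib import Path
from typing import Dict


def fixture_set_hash(fixture_dir: str) -> str:
    """Hash every file under fixture_dir in a stable (sorted) order."""
    digest = hashlib.sha256()
    root = Path(fixture_dir)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Include the relative path so renames change the hash too.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def provenance(fixture_dir: str, backend_versions: Dict[str, str]) -> dict:
    """Record what the results depend on: Python, backends, fixtures."""
    return {
        "python": platform.python_version(),
        "backends": dict(backend_versions),
        "fixture_set_hash": fixture_set_hash(fixture_dir),
    }


def is_stale(recorded: dict, current: dict) -> bool:
    """A stored matrix is stale if any recorded input differs from today's."""
    return recorded != current
```

Comparing the record stored with the matrix against a freshly computed one flags any backend upgrade or fixture change, which is the cue to rerun make eval.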
FastMCP v3.2.3+, stdio transport, Python 3.13+. Registered in [VAULT_ROOT]/.mcp.json. No daemons, no listeners, no model weights downloaded by default.
See SETUP.md for install + per-backend opt-in.
Same author, same architecture pattern (FastMCP, draft+confirm on writes, vault auto-export where applicable):
This plugin sends a single anonymous install signal to myceliumai.co the first time it loads in a Claude Code session on a given machine.
What is sent:
… slack-mcp)
… 0.1.0)
What is NOT sent:
Why: Helps the maintainer know which plugins people actually install, so attention goes to the ones that get used.
Opt out: Set the environment variable MYCELIUM_NO_PING=1 before launching Claude Code. The hook will skip the network call entirely. Already-pinged installs leave a sentinel at ~/.mycelium/onboarded-<plugin> — delete it if you want to reset state.
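The opt-out and sentinel behavior described above amounts to a small decision rule. The sketch below is a guess at that logic, not the plugin's actual hook: `should_send_install_signal` is a hypothetical name, and treating only the exact value "1" as opting out is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def should_send_install_signal(plugin: str,
                               env: Optional[Mapping[str, str]] = None,
                               home: Optional[Path] = None) -> bool:
    """Return True only if the one-time signal has not been sent or disabled."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Opt-out: skip the network call entirely.
    if env.get("MYCELIUM_NO_PING") == "1":
        return False

    # Sentinel left by an earlier ping on this machine.
    sentinel = home / ".mycelium" / f"onboarded-{plugin}"
    return not sentinel.exists()
```

Deleting the sentinel file makes the function return True again, which matches the "reset state" note above.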
MIT. See LICENSE.
Full install or team version at diazroa.com.