Server data from the Official MCP Registry
URL to LLM-ready markdown — plus per-page category, page_structure, and query-driven highlights.
A well-structured MCP server for web content extraction with solid security practices. Authentication is properly required via API key (OCTEN_API_KEY environment variable), input validation is comprehensive, and the codebase is clean with no malicious patterns or dangerous operations. Minor code quality observations around error handling and logging do not significantly impact the overall security posture. Supply chain analysis found 2 known vulnerabilities in dependencies (0 critical, 2 high severity). Package verification found 1 issue.
4 files analyzed · 7 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Set these up before or after installing:
Environment variable: OCTEN_API_KEY
Environment variable: OCTEN_API_URL
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-octen-team-octen-mcp": {
      "env": {
        "OCTEN_API_KEY": "your-octen-api-key-here",
        "OCTEN_API_URL": "your-octen-api-url-here"
      },
      "args": [
        "-y",
        "octen-mcp"
      ],
      "command": "npx"
    }
  }
}

From the project's GitHub README.
MCP server for Octen Extract — turn any URL into clean, LLM-ready markdown. Plug into Claude / Cursor / VS Code / Windsurf and let the model pull the live web.
Most extract tools (Firecrawl, Exa web_fetch, Tavily extract) hand you the page body. Octen gives you more, per page, in one call:
- highlights: pass a query and get the most relevant snippets, ranked, instead of the whole page (cheaper context, better signal)
- category: topical classification {primary, secondary}
- page_structure: page typology {primary, secondary} (article / product / listing / index / …)

You need an Octen API key; get one at octen.ai.
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "octen": {
      "command": "npx",
      "args": ["-y", "octen-mcp"],
      "env": {
        "OCTEN_API_KEY": "your-key-here"
      }
    }
  }
}
Quit and reopen Claude Desktop. Ask "fetch octen.ai and summarize" — Claude routes to the extract tool automatically.
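If you would rather script the edit than hand-edit the file, the following is a minimal sketch of merging the entry above into an existing config without overwriting other servers. The helper names `add_octen_server` and `update_config_file` are hypothetical and not part of octen-mcp; the entry itself mirrors the JSON shown above.

```python
import json
from pathlib import Path


def add_octen_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with an "octen" entry added.

    Hypothetical helper: other entries under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["octen"] = {
        "command": "npx",
        "args": ["-y", "octen-mcp"],
        "env": {"OCTEN_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the JSON config at `path` (if it exists), add the entry, write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_octen_server(config, api_key)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Back up the config file before running something like this; the sketch assumes the file, if present, already contains valid JSON.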
Add to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "octen": {
      "command": "npx",
      "args": ["-y", "octen-mcp"],
      "env": { "OCTEN_API_KEY": "your-key-here" }
    }
  }
}
For VS Code, the install button in the README prompts you for the API key on click, so no manual editing is needed. Alternatively, add this to .vscode/mcp.json in your workspace:
{
  "servers": {
    "octen": {
      "command": "npx",
      "args": ["-y", "octen-mcp"],
      "env": { "OCTEN_API_KEY": "your-key-here" }
    }
  }
}
One line, no JSON editing:
claude mcp add --scope user octen \
-e OCTEN_API_KEY=your-key-here \
-- npx -y octen-mcp
--scope user makes the server available from any directory. Verify with claude mcp list, which should show octen: ✓ Connected.
Any other MCP-compatible client works with the same npx -y octen-mcp command and the OCTEN_API_KEY environment variable.
extract

| Param | Type | Default | Description |
|---|---|---|---|
| urls | string[] | required | 1–20 URLs per call. Bare hosts like octen.ai are auto-prefixed with https://. |
| query | string | none | Intent-focused keywords. When set, results contain highlights instead of full_content. Max 500 chars. |
| max_age_seconds | int | 86400 | Cache TTL in seconds (min 300). Lower this for time-sensitive pages (news, prices). |
| format | markdown \| text | markdown | Output content format. |
| timeout | int | 30 | Per-URL extraction timeout, 1–60 seconds. |
| include_images | bool | false | Include image URLs found on each page. |
| include_videos | bool | false | Include video URLs found on each page. |
| include_audio | bool | false | Include audio URLs found on each page. |
| include_favicon | bool | false | Include each page's favicon URL. |
Full API reference: docs.octen.ai/api-reference/extract.
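The limits in the parameter table can be checked on the client side before a call is made. The sketch below is one way to do that, not the server's own validation: `normalize_url` and `build_extract_args` are hypothetical helpers, and the Python argument name `fmt` stands in for the `format` parameter. Only the limits and defaults come from the table.

```python
from typing import Optional


def normalize_url(url: str) -> str:
    """Prefix bare hosts such as "octen.ai" with https://, as the table describes."""
    url = url.strip()
    if not url:
        raise ValueError("empty URL")
    return url if "://" in url else f"https://{url}"


def build_extract_args(
    urls: list,
    query: Optional[str] = None,
    max_age_seconds: int = 86400,
    fmt: str = "markdown",
    timeout: int = 30,
) -> dict:
    """Build an argument dict for the extract tool, enforcing the documented limits."""
    if not 1 <= len(urls) <= 20:
        raise ValueError("urls must contain 1-20 entries")
    if query is not None and len(query) > 500:
        raise ValueError("query is limited to 500 characters")
    if max_age_seconds < 300:
        raise ValueError("max_age_seconds has a minimum of 300")
    if fmt not in ("markdown", "text"):
        raise ValueError("format must be 'markdown' or 'text'")
    if not 1 <= timeout <= 60:
        raise ValueError("timeout must be between 1 and 60 seconds")

    args = {
        "urls": [normalize_url(u) for u in urls],
        "max_age_seconds": max_age_seconds,
        "format": fmt,
        "timeout": timeout,
    }
    if query:
        # Per the table, setting query switches results to highlights.
        args["query"] = query
    return args
```

For a news or pricing page, a call such as `build_extract_args(["octen.ai"], query="pricing", max_age_seconds=300)` requests the shortest cache TTL the table allows.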
- Fetch octen.ai and summarize the main product features.
- Compare the positioning of firecrawl.dev and octen.ai.
- What does the Hacker News front page say right now? Pull the top 5 story titles.
- Search 'pricing' across firecrawl.dev; return only the relevant highlights. (triggers query → highlights)

Real web pages fail in messy ways. Octen surfaces structured signals so your LLM agent can decide what to do, instead of guessing from an empty markdown blob.
| Scenario | Example URL | Octen response | Why it's useful |
|---|---|---|---|
| Hard 404 | https://httpbin.org/status/404 | status: failed, error_message: "Target returned HTTP 404" | Agent knows the URL is dead — no need to retry. |
| Server error (5xx) | https://httpbin.org/status/500 | status: failed, error_message: "Target server error (HTTP 500)" | Distinguishes server-side outage from client-side dead page — can be safely retried later. |
| DNS failure / dead domain | https://nonexistent-zzz-fake-xyz.invalid | status: failed, error_message: "Failed to resolve domain" | Distinguishes "domain doesn't exist" from "page doesn't exist" — different remediation. |
| Login wall / no main content | https://github.com/login | status: success, title: "Build software better, together", page_structure: "No Main Content", full_content: 602 bytes | ✨ Even when the request succeeds and there's a title, page_structure flags pages with no real body. Agents can branch on this instead of feeding the LLM a useless login splash. |
The last row is the Octen-specific win: most extract tools would return status: success + a short body for that login wall and your agent has no signal it's garbage. Octen's page_structure classifier tells you upfront.
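An agent can branch on these signals with a few lines of code. The sketch below is illustrative only: the `next_action` function and its return labels are hypothetical, the result is assumed to be a plain dict with the field names shown in the table, and `page_structure` is handled both as a bare string (as in the table) and as a {primary, secondary} object. Matching on `error_message` text is fragile and is used here only because the table shows no other discriminator.

```python
def next_action(result: dict) -> str:
    """Map one extract result to a coarse action: use, skip, retry_later, give_up."""
    if result.get("status") == "failed":
        message = result.get("error_message", "").lower()
        # A 5xx from the target may be transient, so retrying later is reasonable.
        if "server error" in message:
            return "retry_later"
        # Hard 404s and DNS failures are unlikely to recover on retry.
        return "give_up"

    structure = result.get("page_structure")
    if isinstance(structure, dict):
        # Assumed object form: {"primary": ..., "secondary": ...}
        structure = structure.get("primary")
    if structure == "No Main Content":
        # The request succeeded but there is no real body (e.g. a login wall).
        return "skip"
    return "use"
```

Applied to the four rows above, this yields give_up, retry_later, give_up, and skip respectively.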
| Variable | Required | Default | Notes |
|---|---|---|---|
| OCTEN_API_KEY | ✅ | none | Get one at octen.ai |
| OCTEN_API_URL | optional | https://api.octen.ai | Override for staging or self-hosted |
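The table above amounts to a small resolution rule: the key is mandatory and the URL falls back to a default. A minimal sketch of that rule follows; `load_octen_settings` is a hypothetical helper and not how octen-mcp itself reads its environment. Only the variable names and the default URL come from the table.

```python
import os
from typing import Mapping, Optional

DEFAULT_API_URL = "https://api.octen.ai"  # default listed in the table


def load_octen_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the API key and base URL from environment-style variables."""
    env = os.environ if env is None else env
    api_key = env.get("OCTEN_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OCTEN_API_KEY is required")
    # An unset or blank OCTEN_API_URL falls back to the default.
    api_url = env.get("OCTEN_API_URL", "").strip() or DEFAULT_API_URL
    return {"api_key": api_key, "api_url": api_url.rstrip("/")}
```

Passing a dict instead of relying on `os.environ` makes the rule easy to exercise in tests, for example with a staging URL.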
git clone https://github.com/Octen-Team/octen-mcp.git
cd octen-mcp
npm install
npm run build
OCTEN_API_KEY=<key> npm run inspect # opens MCP Inspector
If your client also has a built-in web-fetch tool, drop a hint in Claude Desktop's Customize / Project Instructions:
When the user asks to fetch or extract content from a URL, prefer the extract tool from the octen MCP server. Use query whenever the user is looking for something specific on the page (returns ranked highlights, not the whole body).
With the hint in place, a single tool call can classify three mixed URLs (article / homepage / discussion) in one shot.

MIT © Octen