Server data from the Official MCP Registry
Keyless multi-engine web search, fetch, and clean-Markdown read for AI agents. No API keys.
Keyless multi-engine web search, fetch, and clean-Markdown read for AI agents. No API keys.
Valid MCP server (1 strong, 2 medium validity signals). No known CVEs in dependencies. Package registry verified. Imported from the Official MCP Registry.
5 files analyzed · 1 issue found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Set these up before or after installing:
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-hec-ovi-web-search": {
"args": [
"websearch-skill"
],
"command": "uvx"
}
}
}From the project's GitHub README.
Open-source multi-engine web search and content extraction for AI agents, built as isolated layers connected only by versioned JSON Schema contracts.
Status: early. This is
v0.1.0, the first public release (2026-06-22). The keyless search, the clean-Markdown reader, and the five agent tools work today and are covered by the test suite and a live in-harness check. The hard anti-bot tiers, the opt-in proxy egress, and local rerank are not built yet (see Roadmap). It is new, so pin a version and try it in a sandbox before wiring it into anything sensitive.
A self-hosted web search and page reader for AI agents, with no API keys and no paywall. Point it at a query and it fans out across many search engines, fuses and dedups the results, then fetches and extracts pages into clean Markdown. Everything runs locally; queries do not go to a vendor.
Three commands are the whole surface:
web_search finds pages: ranked, deduplicated results across engines, each with a reusable handle.web_fetch reads a page: clean Markdown, fenced as untrusted, paginated so a long page never overflows context.web_open pages back through a document you already fetched, from cache, without hitting the network again.Two extra keyless tools cover what general web search does not:
arxiv searches arXiv papers and returns structured metadata (authors, abstract, categories, abstract and PDF links).github searches GitHub repositories and returns typed fields you can sort on (stars, language, topics).Search works the moment you install it. The default engine is the keyless ddgs metasearch library, which by itself spans Google, Brave, DuckDuckGo, Yandex, Yahoo, Startpage, Mojeek, and Wikipedia. Each query is served by whichever of those respond fastest. No API key, no account, no service to run, and no engine flags: the agent surface (web-search) is plug-and-play. Picking a subset of underlying engines (or adding Bing by name) is a power-user knob on the lower-level search command via --ddgs-backends google,brave,mojeek, not on web-search.
For broader and more reliable search you can run your own SearXNG (hundreds of engines, your own server, still no keys), and the router fuses it with ddgs and de-correlates the engines they share. A one-command Docker setup is in docker/searxng/; see Self-hosting SearXNG below. Public SearXNG instances are deliberately not a default: most disable the JSON API and rate-limit automated clients, so depending on them would break on a fresh install.
It is MIT-licensed and open source, and the keyless default is just the floor. You can stack a self-hosted SearXNG, keyed engines (Brave, Exa, Tavily), or a paid egress adapter on top, all behind the same contracts, none of them required.
The scope is a Pareto trade rather than a clean sweep. Among 2026 agentic-search APIs the top tier is statistically tied on result quality, so cost and latency are the real differentiators; the scorecard below is how I measure that, axis by axis. Where this helps is cost (about zero at the software layer), privacy and self-hosting, multi-engine recall plus dedup, clean extraction, and the unprotected majority of the web. Hard anti-bot at scale is out of scope: it stays a swappable paid egress adapter, because residential proxies and captcha solving have no reliable free equivalent.
| Layer | What it does | Status |
|---|---|---|
| Layer 1: Search | Multi-engine router (keyless ddgs across many engines, optional self-hosted SearXNG), canonicalize, dedup, de-correlated RRF fusion | Built |
| Layer 2A: Fetch + Extract | Tiered fetch (httpx, curl_cffi impersonation), Trafilatura extraction to Markdown + metadata | Built |
| Layer 2B: Format + Store | Paginated Markdown + JSON sidecar, progressive-disclosure index/resolver, MinHash dedup, ephemeral SQLite-FTS5 store | Built |
| Layer 3: Agent I/O | Consolidated web_search/web_fetch/web_open, untrusted-content fence, optional MCP stdio server, SKILL.md | Built |
| Extra tools | Keyless arxiv (paper search) and github (repo search), standalone over the same Envelope | Built |
Contracts are frozen as JSON Schema 2020-12: envelope@1.0.0, search@1.0.0, fetch@1.1.0, extract@1.0.0, format@1.0.0, store@1.0.0, agent-io@1.0.0, arxiv@1.0.0, github@1.0.0. Every response is wrapped in one Envelope { contract_version, ok, data, error, meta }.
A thin router fans a normalized request out to per-engine adapters behind an EngineAdapter port, canonicalizes URLs, dedups with provenance merge, and fuses results with provenance-aware weighted Reciprocal Rank Fusion (RRF, k=60). The keyless default is ddgs, a metasearch library that is itself multi-engine (Google, Brave, DuckDuckGo, Yandex, Yahoo, Startpage, Mojeek, Wikipedia, with Bing and others selectable). On this lower-level search command, --ddgs-backends forces a subset and --engines picks which adapters (ddgs, searxng) run; the agent-facing web-search command takes none of those and just uses the keyless default. Point WEBSEARCH_SEARXNG_URL at a self-hosted SearXNG to add it as a second, broader engine, and the router fuses both. Keyed engines (Brave, Exa, Tavily, and others) are a planned drop-in behind the same port.
ddgs and SearXNG do the same job (query many engines and merge) in different forms: ddgs is a keyless Python library that runs in-process, while SearXNG is a separate server you host that covers far more engines. They are interchangeable adapters behind this port; run either, or both fused. SearXNG only overlaps with this search layer, it does not fetch or extract pages, so it never replaces the rest of the tool.
The load-bearing decision is de-correlation. SearXNG and ddgs both lean on the same upstream crawlers (Google, Bing), so a naive union lets a consensus bonus amplify the same crawler agreeing with itself, and the fused ranking can end up worse than a single well-tuned engine. The fix: a result's sources are grouped by correlation_group, each group contributes one RRF term at its best rank, and the consensus bonus scales only with the number of distinct groups. SearXNG and ddgs agreeing counts as one independent vote; SearXNG and a neural index agreeing counts as two. The router records the de-correlation in a warning.
Every result carries full per-engine provenance (which engine returned it, at what rank), so the ranking is auditable.
Two decoupled sub-ports, each independently swappable.
Fetch escalates by tier and escalates only when it detects an anti-bot block. Tier 0 is plain httpx; on a detected challenge it escalates to curl_cffi (browser TLS/JA3 impersonation). It does not escalate on a 404 or on a terminal block (rate limit, auth required, legal/geo block), since a stealthier client from the same egress will not help. Block detection reads header markers first, then gated body markers for Cloudflare, DataDome, PerimeterX, Akamai, and Imperva (Imperva can return a block with HTTP 200, so its body markers are scanned on any status). Browser and stealth tiers (Crawl4AI, nodriver) are named in the contract enum but stay opt-in, not in the base install.
Extract defaults to Trafilatura, a heuristic extractor. Per the May 2026 WCXB benchmark, heuristic extractors beat neural extractors on both quality and cost (neural runs tens to hundreds of times more expensive), and Trafilatura lands at about 0.79 F1 on CPU in roughly 100ms; article pages saturate around 0.93 F1. The adapter parses the raw HTML once with lxml to recover the raw schema.org JSON-LD blocks and og:type (Trafilatura folds JSON-LD into metadata and never exposes the raw blocks), runs Trafilatura for the Markdown body and plain text plus metadata, then computes:
quality_score (0..1) from runtime signals (text density, word-count saturation, paragraph count, inverse link density, JSON-LD presence, clean title) with hard vetoes for soft-404s and shells; below about 0.80 a page is a fallback candidate.page_type resolved from JSON-LD @type, then og:type, then URL shape.Neural extract engines (crawl4ai, jina_readerlm, and others) are named in the contract enum but stay opt-in. The default dependency closure is permissive: Apache-2.0 (trafilatura) plus MIT/BSD/MPL deps.
There is no output-length cap anywhere. content_markdown is never truncated; --max-bytes is a transport guard only, not a content cap.
Two decoupled sub-ports that turn results into something an agent can actually read, and keep the full pages around for follow-up.
Format takes vendor-neutral results (the union of what Layer 1 and Layer 2A produce) and renders one layout-stable Markdown document plus a parallel JSON sidecar holding the same data. Results are ordered by descending relevance and paginated, because lost-in-the-middle is still real in 2026 even on long-context models. Near-duplicate dedup runs first: byte-exact (normalized SHA-256), then a pure-Python MinHash over word 4-gram shingles (128 permutations, Jaccard 0.9) clustered with union-find. The best-scored page in a cluster is kept and the rest are recorded as dropped_duplicates, so the fold is auditable. The 0.9 threshold is deliberately conservative: it folds genuinely near-identical pages (mirrors, syndication), not merely topically similar ones, so two pages that just read alike will not collapse at the default. Lower jaccard_threshold for boilerplate-heavy corpora. Dedup is pure Python by design (no datasketch), so the default install stays dependency-light. Progressive disclosure chooses how much to inline:
auto (default) inlines full bodies when the page fits a token budget, otherwise switches to an index (a preview plus a stable id).index always shows a preview and a resolve hint; full always inlines.The optional anthropic_search_result_blocks view maps 1:1 onto Anthropic search_result content blocks (source as a bare string, at least one non-empty text block, citations all-or-nothing). It is off by default and is a derived view, not the canonical shape, so installing into a non-Anthropic harness costs nothing.
There is no output-length cap here either. The sidecar carries the full body verbatim in every mode, and the store keeps the full Markdown. body_char_budget only offloads the rendered Markdown view to the resolver (with a hint), and --no-truncate turns even that off. Nothing is summarized or discarded.
Store is an ephemeral page index behind a PageIndex port (add / search / get / resolve_index). There is no database for the per-query result set, which is tens of rows of plain Python; the store is used only as the progressive-disclosure index over fetched pages. The default adapter is SQLite FTS5 over an in-memory connection: it ships in the Python stdlib with BM25 and needs no third-party package. FTS5 is not compiled into every SQLite build, so the adapter probes for it at runtime and falls back to a pure-Python BM25 index that returns the identical shapes. Adds are idempotent on url plus content hash, an arbitrary query is escaped so FTS5 operators never raise a syntax error, and persistence is just passing a file path. A vector or Rust backend (sqlite-vec, tantivy) plugs in behind the same port, opt-in.
One consolidated surface over Layers 1, 2A, and 2B: web_search (find), web_fetch (read a URL), and web_open (page through an already-fetched document). Each returns the same Envelope. The identical core is exposed three ways: the websearch web-search / web-fetch / web-open CLI, a FastMCP stdio server (websearch mcp, bundled in the base install), and a portable SKILL.md written to the Agent Skills standard (name plus description, so it loads in Claude Code, Codex, OpenCode, and others).
The cross-layer key is a human-readable handle (site~shorthash, for example en.wikipedia.org~3a1f9c2b5e6f), never an opaque UUID. web_fetch indexes the full page into the Layer 2B store and returns one token-budget page; web_open pages through the rest from that store, by handle, without re-fetching. The split is lossless: pagination is progressive disclosure, not a cap, and the whole body stays reachable page by page. The lower-level search / fetch / open commands remain as the per-layer surfaces for debugging and composition.
Fetched page text is untrusted, so web_fetch and web_open wrap each page in a fence (see Security). On the MCP face the page also rides the tool-result channel, which models are trained to treat with skepticism.
Two standalone tools cover sources that general web search handles poorly, both keyless and over the same Envelope:
arxiv searches arXiv via the official Atom API and returns structured papers (title, authors, abstract, categories, abstract and PDF links). It supports field-targeted search (--field title|author|abstract) and sorting by date or relevance, uses GET so it benefits from arXiv's cache, and backs off on the 2026 rate limiting.github searches GitHub repositories via the unauthenticated REST API and returns typed fields (full name, stars, forks, language, topics, updated date). Unauthenticated search is about 10 requests per minute; on a rate limit it returns a clean rate_limited error instead of hammering. Code search needs a token and is intentionally left out of the keyless path.uv run websearch arxiv "diffusion models for protein design" --max-results 5 --sort-by submittedDate
uv run websearch github "vector database" --language Rust --sort stars --per-page 10
Reddit and X (Twitter) have no keyless, terms-clean search path in 2026 (Reddit's anonymous JSON endpoints return 403 as of May 2026; X needs a paid API or a logged-in account), so there is deliberately no dedicated tool for them. Search the open web with a site filter instead:
uv run websearch web-search "rust async" --site reddit.com
uv run websearch web-search "frontier model release" --site x.com
Pick the route that matches how you use it. Everything is keyless and needs internet; the only hard requirement is uv. Full per-harness instructions (Claude Code, Codex, OpenCode, Cursor, Hermes, OpenClaw, the MCP registry, and PyPI publishing) are in docs/INSTALL.md.
As a CLI tool, no install (uvx builds an ephemeral env and runs it):
# straight from git today (no PyPI needed):
uvx --from git+https://github.com/hec-ovi/websearch-skill websearch web-search "your query"
# once published to PyPI, the short form:
uvx websearch-skill web-search "your query"
As an agent skill across 40+ agents via the skills CLI (it installs the skills/web-search/ directory into each detected agent):
npx skills add hec-ovi/websearch-skill # all detected agents
npx skills add hec-ovi/websearch-skill -a claude-code -a codex -s web-search
As a Claude Code plugin (bundles the skill and the MCP server in one install):
/plugin marketplace add hec-ovi/websearch-skill
/plugin install web-search@websearch-skill
As an MCP server (FastMCP stdio, bundled in the base install). Point any MCP client at:
{ "mcpServers": { "web-search": { "command": "uvx", "args": ["websearch-skill", "mcp"] } } }
It is also published in the MCP Registry as io.github.hec-ovi/web-search, so registry-aware clients can discover and install it by name.
From source (development, uv-native):
git clone https://github.com/hec-ovi/websearch-skill
cd websearch-skill
uv sync
uv run websearch web-search "your query"
Search works with no setup: the keyless ddgs metasearch is the default. The agent-facing web-search (Layer 3) needs no engine flags. The lower-level search (Layer 1) is for debugging and power use, and is the only command that takes --engines, --ddgs-backends, and --no-ddgs:
# Layer 1: search (keyless, multi-engine via ddgs)
uv run websearch search "rust ownership" --json
# force specific keyless engines (search command only, not web-search)
uv run websearch search "rust ownership" --ddgs-backends google,brave,mojeek
# add a self-hosted SearXNG (see docker/searxng/) as a second, broader engine
export WEBSEARCH_SEARXNG_URL=http://localhost:8080
uv run websearch search "rust ownership" --engines searxng,ddgs
# Layer 2A: fetch + extract one page to clean Markdown
uv run websearch fetch "https://en.wikipedia.org/wiki/Rust_(programming_language)" --json
# Layer 2B: open several pages into one paginated, deduped, LLM-ready document,
# and full-text search the passages across them
uv run websearch open \
"https://en.wikipedia.org/wiki/Rust_(programming_language)" \
"https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html" \
--search "ownership borrow checker"
# Layer 3: the consolidated, fenced, handle-keyed agent face
uv run websearch web-search "rust ownership" --json
uv run websearch web-fetch "https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html" \
--page-size-tokens 4000 --persist-path /tmp/idx.sqlite
# page through the rest of a fetched doc by its handle, from cache (no refetch)
uv run websearch web-open "doc.rust-lang.org~<hash>" --page 2 --persist-path /tmp/idx.sqlite
# extra keyless tools: arXiv papers and GitHub repos
uv run websearch arxiv "mixture of experts scaling laws" --max-results 5
uv run websearch github "llm agent framework" --language Python --sort stars
# or run as an MCP server (FastMCP stdio; bundled, no extra needed)
uv run websearch mcp
Every command prints a compact human view by default, or the raw JSON Envelope with --json (exit 0 on success, 1 on a request-level error). For the fetch command, --output-format {markdown,text,json} selects the body representation the human view prints (text emits the plain-text rendering), and --quiet prints only the extracted body, for piping. For the open command, --mode {auto,index,full} controls progressive disclosure, --no-truncate inlines every full body, --search QUERY runs a BM25 passage search over the opened pages, and --anthropic-blocks adds the Anthropic search_result view to the sidecar. The agent-facing web-search takes --max-results, --detail, --freshness, --site, --language, --country, --safesearch, --offset, and --searxng-url; the engine-selection flags (--engines, --ddgs-backends, --no-ddgs) live only on the lower-level search command, alongside --searxng-url (or WEBSEARCH_SEARXNG_URL). See uv run websearch <command> --help for the full flag list.
A fetch --json response looks like:
{
"contract_version": "1.0.0",
"ok": true,
"data": {
"source": { "final_url": "https://...", "status": 200, "fetched_via": "http", "blocked": false },
"result": {
"title": "Rust (programming language)",
"page_type": "article",
"quality_score": 0.91,
"word_count": 8123,
"content_markdown": "# Rust ...",
"metadata": { "og_type": "article" }
}
},
"error": null,
"meta": { "layer": "extract", "backend": "http", "elapsed_ms": 412 }
}
As a library:
from websearch.layer1_search import build_router, SearchRequest
router = build_router(searxng_url="http://localhost:8080", enable_ddgs=True)
envelope = router.search(SearchRequest(query="rust ownership", count=10))
for r in envelope.data["results"]:
print(r["fused_score"], r["url"], [s["engine"] for s in r["sources"]])
Layer 2B (format + store) the same way. run returns an Envelope; its data is the JSON FormatPayload, so markdown and the lossless sidecar are plain dict access:
from websearch.layer2_format import (
FormatRequest, ResultInput, build_format_pipeline,
build_page_index, StoreConfig, PageInput, SearchPageRequest,
)
env = build_format_pipeline().run(
FormatRequest(
query="rust ownership",
results=[ResultInput(url="https://x.test/a", title="A", score=0.9, body_markdown="# A ...")],
)
)
document = env.data["markdown"] # the layout-stable document
sidecar = env.data["sidecar"] # identical data; full bodies verbatim, never capped
# an ephemeral page index for passage search and resolve-by-id
index = build_page_index(StoreConfig()) # in-memory SQLite FTS5 (BM25)
index.add([PageInput(url="https://x.test/a", markdown="# A ...")])
hits = index.search(SearchPageRequest(query="borrow checker"))
You never need this; the keyless ddgs engines work out of the box. Run your own SearXNG when you want the broadest, most reliable search: hundreds of engines, your own server, no public rate limits, and still no API keys.
docker compose -f docker/searxng/docker-compose.yml up -d
export WEBSEARCH_SEARXNG_URL=http://localhost:8080
uv run websearch web-search "your query" # now fuses SearXNG + ddgs
The config in docker/searxng/ is one container with the JSON API enabled and the bot limiter off (it is private and only your tool queries it), so no Valkey/Redis is needed. That folder's README has the details and a checklist for before you expose it beyond localhost.
A fetch tool an agent can point anywhere is an SSRF and prompt-injection surface, so:
169.254.169.254 cloud-metadata endpoint), reserved, and multicast addresses. Redirects are followed manually with the same check on each hop, so a public URL cannot redirect into the internal network. Override per request with --allow-private-hosts for deliberate internal fetches.web_fetch and web_open wrap each page in a fence built from the 2026 primary-source guidance: a per-instance 128-bit random nonce in the open and close markers (so injected text cannot forge the close), a data-only directive, and neutralization of any copy of the marker inside the body, with optional datamarking (--datamark) for higher resistance. This reduces, but does not eliminate, indirect prompt injection: it prevents the boundary breakout, not persuasion. The real guarantees are channel separation (the MCP face delivers content through the tool-result channel), least privilege, and cutting exfiltration paths. The lower-level fetch command still returns the clean body unmodified, so piping and composition stay clean.Each layer is a folder with a port (a capability-named interface) and one or more adapters behind it, connected only by versioned JSON Schema 2020-12 contracts. Port fields are capability-named (snippet, fused_score, sources); a backend's native shape is mapped onto the port inside that backend's adapter. The default deployment runs in-process for speed, and because layers are coupled only through their contracts, a layer can later move to a subprocess or a local service without its neighbors changing. Additive contract changes are MINOR (consumers ignore unknown fields); a removal, rename, or type change is MAJOR, and consumer-driven contract tests fail any producer change that breaks a recorded fixture. Full design in docs/ARCHITECTURE.md.
Rather than a single claim that it is better, here is how I measure it, axis by axis, including where it is only at parity or behind:
| Axis | This tool | Notes |
|---|---|---|
| Retrieval quality | At top-tier parity on the common case | 2026 leaders are statistically tied on quality |
| Freshness | On-demand recency filter | per-engine, best-effort |
| Extraction recall / noise | Competitive (Trafilatura ~0.79 F1, articles ~0.93) | heuristic beats neural on quality and cost |
| Anti-bot success | About 70 to 90% without paid proxies | near-cloud with a plugged-in residential adapter |
| End-to-end latency | The unprotected common case (where leaders actually differ) | multi-engine + fuse adds some cost |
| Cost | About zero at the software layer | infra documented separately |
| Citation accuracy | Source-anchored, deduped results | no fabricated URLs |
Honest scope: top agentic-search APIs are tied on quality, and hard anti-bot at scale is irreducibly paid (even Firecrawl scores about 34% on independently tested protected sites). This project concedes the protected long tail to a swappable paid egress adapter and does not claim to beat everything. The egress adapter, when added, is scoped to search geo-targeting and rate-limit rotation, not anti-bot for page fetches (commercial VPNs use datacenter IPs that anti-bot systems flag).
A same-query, same-moment head-to-head against the web search built into Claude Code (a hosted, paid baseline) is written up in docs/BENCHMARK.md, with the exact commands so you can rerun it. The short version: on finding relevant, fresh pages the two are comparable; this tool adds clean on-demand extraction, multi-engine fusion, the arxiv and github tools, and runs free and locally, while the hosted search is frictionless and writes a summary in one call. It wins on cost, privacy, and control; on raw retrieval the gap is small.
Built: harness packaging ships the SKILL.md plus the bundled tool via npx skills add, a Claude Code plugin and marketplace, the MCP registry (server.json), and PyPI/uvx, with per-harness MCP registration documented in docs/INSTALL.md.
Planned, not built yet:
EngineAdapter port; an optional neural index.uv sync # install deps (including the dev group)
uv run pytest # 423 tests
uv run ruff check .
CI runs ruff and pytest on Python 3.11, 3.12, and 3.13 via uv. The contract tests validate real output against the frozen JSON Schemas, so a change that breaks a contract shape fails CI. Build one isolated layer at a time, against its versioned contract; adding or swapping an engine or an extractor touches only its adapter module.
MIT. See LICENSE. Optional anti-bot tiers that depend on AGPL components (for example nodriver) stay as out-of-band adapters you install separately, not bundled into the MIT core.
Be the first to review this server!
by Toleno · Developer Tools
Toleno Network MCP Server — Manage your Toleno mining account with Claude AI using natural language.
by mcp-marketplace · Developer Tools
Create, build, and publish Python MCP servers to PyPI — conversationally.
by Microsoft · Content & Media
Convert files (PDF, Word, Excel, images, audio) to Markdown for LLM consumption