Production-ready RAG + MCP demo: eval-in-CI merge gate, Langfuse traces, structure-aware chunking.
A public Retrieval-Augmented Generation pipeline exposed as an MCP server. Sample content from Veterans Affairs education manuals.
The repo implements evaluation, observability, and structure-aware ingestion. Cost/latency tuning, tenant-level access control, and other production concerns are discussed in the article linked below.
📖 Full writeup on Medium: Enterprise Internal Knowledge Base RAG MCP: POC-to-Production
RAG demos tend to focus on the quality of the retrieval pipeline, without recognizing that production RAG fails on the next ten steps: prompt or model changes that pass code review but tank answer quality, cost and latency drift that cannot be traced to specific queries, cross-tenant leakage that only surfaces in audit. This repo shows what catching them looks like in practice.
The corpus is public (VA Education manuals — 238 documents, 9,000+ chunks) so anyone can clone, run, and adapt the pipeline.
```bash
git clone https://github.com/kimsb2429/internal-knowledge-base
cd internal-knowledge-base

# 1. Start Postgres + pgvector
docker compose up -d

# 2. Python env + dependencies
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 3. Restore corpus fixture (~2 min — 238 docs + 9k chunks pre-embedded)
docker exec -i ikb_pgvector pg_restore -U ikb -d ikb < evals/fixture_v1.dump

# 4. Smoke-test the MCP server
python scripts/test_mcp_server.py   # 7/7 tests pass

# 5. Start the MCP server (stdio transport)
python scripts/mcp_server.py
```
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "ikb": {
      "command": "python",
      "args": ["/absolute/path/to/internal-knowledge-base/scripts/mcp_server.py"]
    }
  }
}
```
Then ask Claude things like "What RPO handles GI Bill claims in Texas?" — the MCP server returns ranked chunks with citations.
Ingestion (one-time per corpus):

```mermaid
graph LR
  A[KnowVA crawler<br/>HTML + PDF] --> B[Source-specific<br/>preprocessor]
  B --> C[Structure-aware<br/>chunker]
  C --> D[mxbai-embed-large<br/>local, 1024-dim]
  D --> E[(pgvector)]
  F[Anthropic Contextual<br/>Retrieval] -.-> E
  E -.-> F
  style E fill:#e1f5fe
```
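The structure-aware chunker splits on document structure rather than fixed character windows, so headings stay attached to their bodies and tables are not cut mid-row. A minimal sketch of the idea, splitting markdown-style text on heading boundaries — illustrative only, not the repo's HTML-aware implementation:

```python
from typing import List

def chunk_by_headings(text: str, max_chars: int = 1200) -> List[str]:
    """Split text at heading boundaries, keeping each heading attached
    to its body so every chunk stays self-describing."""
    sections: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        # A new heading starts a new section.
        if line.lstrip().startswith("#") and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())

    # Fall back to paragraph splits only for oversized sections, so
    # structure is preserved wherever it fits the size budget.
    chunks: List[str] = []
    for sec in sections:
        if len(sec) <= max_chars:
            chunks.append(sec)
            continue
        buf = ""
        for para in sec.split("\n\n"):
            if buf and len(buf) + len(para) + 2 > max_chars:
                chunks.append(buf.strip())
                buf = ""
            buf += para + "\n\n"
        if buf.strip():
            chunks.append(buf.strip())
    return chunks
```

The repo's version additionally preserves table `colspan`/`rowspan` when splitting HTML, which this sketch does not attempt.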
Query (per MCP tool call):

```mermaid
graph LR
  A[Claude Desktop<br/>MCP client] --> B[FastMCP server]
  B --> C[pgvector top-K]
  C --> D[Reranker<br/>mxbai or FlashRank]
  D --> E[Claude Sonnet<br/>generation]
  E --> A
  E --> F[Langfuse trace]
  style F fill:#fff9c4
```
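Stripped of pgvector and the model calls, the first stage of that query path is just a similarity sort over stored embeddings. A toy stand-in (vectors are fake; function names are illustrative, not the repo's):

```python
import math
from typing import Dict, List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: List[float],
          chunks: Dict[str, List[float]],
          k: int = 5) -> List[Tuple[str, float]]:
    """Rank chunk ids by cosine similarity to the query vector,
    the same selection pgvector performs with an ORDER BY on the
    cosine-distance operator plus LIMIT k."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in chunks.items()]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]
```

The reranker then reorders this candidate list with a cross-encoder (mxbai or FlashRank) before generation sees it.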
Stack:

- `content_tsv` GIN index for hybrid-ready queries
- MCP Resources (`document://{source_id}`) and Prompts (`cite_from_chunks`)

Full 110-question golden set, contextualized chunks + reranker:
| Metric | Score |
|---|---|
| Faithfulness | 0.95 |
| Answer Relevance | 0.91 |
| Context Precision | 0.61 |
| Context Recall | 0.52 |
| Context Relevance | 0.56 |
🔗 Live Langfuse trace (public, no login).
Notable result: Anthropic's Contextual Retrieval pattern produced modest lift on top of reranking (+4.8pp AnsRel, +4.1pp CtxPrec) at this scale — well short of the +35% recall their published numbers suggested. Reported as found; juiced numbers would defeat the point.
Every PR runs the golden set in fast mode (FlashRank reranker, ~3-4 min wall clock, $0.30 in Sonnet calls) against a fixture DB. PRs that drop top1/topk/keyword_recall by more than 5pp, or raise idk_rate by more than 10pp, are blocked.
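The gate logic reduces to diffing two metric dicts against per-metric thresholds. A hedged sketch of that shape — thresholds and metric names taken from the description above, not from the actual `check_regression.py`:

```python
from typing import Dict, List

# Metrics where a drop of more than 5pp blocks the PR.
DROP_GATED = {"top1": 0.05, "topk": 0.05, "keyword_recall": 0.05}
# Metrics where a rise of more than 10pp blocks the PR.
RISE_GATED = {"idk_rate": 0.10}

def gate(baseline: Dict[str, float], current: Dict[str, float]) -> List[str]:
    """Return a list of violations; an empty list means the PR may merge."""
    violations = []
    for metric, limit in DROP_GATED.items():
        delta = current[metric] - baseline[metric]
        if delta < -limit:
            violations.append(f"{metric} dropped {-delta:.1%} (limit {limit:.0%})")
    for metric, limit in RISE_GATED.items():
        delta = current[metric] - baseline[metric]
        if delta > limit:
            violations.append(f"{metric} rose {delta:.1%} (limit {limit:.0%})")
    return violations
```

In CI, a non-empty violation list fails the job, which is what blocks the merge.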
Forever-artifact: PR #5 — a deliberate failing-then-passing PR. Red CI catches a 20pp top1 regression; green CI confirms the fix. The Actions tab is the proof.
Workflow: .github/workflows/eval-gate.yml.
A few production-shape items are seams, not implementations:
- auth_context parameter present on every MCP tool — typed, currently unused (labels the SSO/ACL seam)
- content_tsv GIN index is live; BM25 + RRF fusion at query time stays a post-launch addition

The writeup linked above covers these topics.
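The auth_context seam costs nothing today but fixes the tool signatures so ACL enforcement can land later without breaking MCP clients. A sketch of the pattern — the dataclass fields and tool name are assumptions for illustration, not the repo's types:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AuthContext:
    # Hypothetical fields an SSO integration would populate.
    user_id: str
    groups: List[str] = field(default_factory=list)

def search_chunks(query: str,
                  k: int = 5,
                  auth_context: Optional[AuthContext] = None) -> List[str]:
    """Tool entry point. auth_context is accepted and typed now but
    ignored until ACL filtering lands, so callers can start passing it
    without a signature change later."""
    # TODO: filter results by auth_context.groups against chunk ACLs.
    return [f"chunk-{i} for {query!r}" for i in range(k)]
```

Because the parameter is optional with a `None` default, existing callers keep working while the seam stays visible in every tool schema.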
```
docs/                       Research, evidence base, deep-dives
data/                       Crawled corpus + golden query set
scripts/
  crawl_knowva.py           eGain v11 API crawler
  enrich_metadata.py        Headings, ACL, authority tier, content_category
  knowva_preprocess.py      Source-specific HTML normalization
  chunk_documents.py        Structure-aware splitter (preserves table colspan/rowspan)
  embed_and_store.py        mxbai-embed-large → pgvector
  contextualize_chunks.py   Anthropic Batches API for Contextual Retrieval
  rerank.py                 mxbai-rerank + FlashRank
  retrieve.py / generate.py RAG path
  mcp_server.py             FastMCP exposure
  run_eval.py / score_eval.py / check_regression.py  Eval harness + CI gate
evals/                      Fixture DB dump + baseline JSON
.github/workflows/          eval-gate.yml — merge-gate workflow
```
Each script is idempotent and resume-safe.
```bash
python scripts/crawl_knowva.py          # Crawl raw HTML (skip if data/knowva_manuals/articles/ exists)
python scripts/enrich_metadata.py       # Add headings, ACL, authority tier
python scripts/knowva_preprocess.py     # Normalize HTML quirks
python scripts/chunk_documents.py       # Structure-aware split
python scripts/embed_and_store.py       # mxbai → pgvector
python scripts/contextualize_chunks.py  # Anthropic Batches API (~$12, optional but recommended)
```
Then run `python scripts/run_eval.py --fast` to verify the eval baseline reproduces.
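The idempotent, resume-safe behavior the scripts share is mostly a skip-if-done guard around each unit of work, plus an atomic rename so an interrupted run never leaves half-written output. A minimal version of the pattern (paths and the work itself are hypothetical stand-ins):

```python
from pathlib import Path

def process_all(doc_ids, out_dir: str = "data/processed") -> int:
    """Process each document once; reruns skip completed work, so an
    interrupted run can simply be restarted. Returns the count of
    documents processed this run."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    done = 0
    for doc_id in doc_ids:
        target = out / f"{doc_id}.json"
        if target.exists():           # already processed on a prior run
            continue
        tmp = target.with_suffix(".tmp")
        tmp.write_text("{}")          # stand-in for the real work
        tmp.rename(target)            # atomic publish: no partial files
        done += 1
    return done
```

Running the function twice over the same ids does the work exactly once; the second pass is a no-op.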
- docs/2026-04-11-engineering-rag-evidence-and-howtos.md — engineering analysis, evidence base, Zero-to-MCP plan
- docs/2026-04-12-rag-pipeline-buy-vs-build.md — buy-vs-build map per pipeline stage
- docs/deep-dive/2026-04-16-docs-vs-code-rag-adjudication.md — when unified RAG stops working

MIT — see LICENSE.