Production-ready RAG + MCP demo: eval-in-CI merge gate, Langfuse traces, structure-aware chunking.
A public Retrieval-Augmented Generation pipeline exposed as an MCP server. Sample content from Veterans Affairs education manuals.
The repo implements evaluation, observability, and structure-aware ingestion. Cost/latency tuning, tenant-level access control, and other production concerns are discussed in the article linked below.
📖 Full writeup on Medium: Enterprise Internal Knowledge Base RAG MCP: POC-to-Production
RAG demos tend to focus on the quality of the retrieval pipeline, without recognizing that production RAG fails on the next ten steps: prompt or model changes that pass code review but tank answer quality, cost and latency drift that cannot be traced to specific queries, cross-tenant leakage that only surfaces in audit. This repo shows what catching them looks like in practice.
The corpus is public (VA Education manuals — 238 documents, 9,000+ chunks) so anyone can clone, run, and adapt the pipeline.
```bash
git clone https://github.com/kimsb2429/internal-knowledge-base
cd internal-knowledge-base

# 1. Start Postgres + pgvector
docker compose up -d

# 2. Python env + dependencies
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 3. Restore corpus fixture (~2 min — 238 docs + 9k chunks pre-embedded)
docker exec -i ikb_pgvector pg_restore -U ikb -d ikb < evals/fixture_v1.dump

# 4. Smoke-test the MCP server
python scripts/test_mcp_server.py   # 7/7 tests pass

# 5. Start the MCP server (stdio transport)
python scripts/mcp_server.py
```
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "ikb": {
      "command": "python",
      "args": ["/absolute/path/to/internal-knowledge-base/scripts/mcp_server.py"]
    }
  }
}
```
Then ask Claude things like "What RPO handles GI Bill claims in Texas?" — the MCP server returns ranked chunks with citations.
Ingestion (one-time per corpus):

```mermaid
graph LR
  A[KnowVA crawler<br/>HTML + PDF] --> B[Source-specific<br/>preprocessor]
  B --> C[Structure-aware<br/>chunker]
  C --> D[mxbai-embed-large<br/>local, 1024-dim]
  D --> E[(pgvector)]
  F[Anthropic Contextual<br/>Retrieval] -.-> E
  E -.-> F
  style E fill:#e1f5fe
```
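The structure-aware chunker splits on document structure rather than fixed character windows, so headings stay attached to their bodies and tables are not cut mid-row. A minimal sketch of the idea, splitting markdown-style text on heading boundaries — illustrative only, not the repo's HTML-aware implementation:

```python
from typing import List

def chunk_by_headings(text: str, max_chars: int = 1200) -> List[str]:
    """Split text at heading boundaries, keeping each heading attached
    to its body so every chunk stays self-describing."""
    sections: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        # A new heading starts a new section.
        if line.lstrip().startswith("#") and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())

    # Fall back to paragraph splits only for oversized sections, so
    # structure is preserved wherever it fits the size budget.
    chunks: List[str] = []
    for sec in sections:
        if len(sec) <= max_chars:
            chunks.append(sec)
            continue
        buf = ""
        for para in sec.split("\n\n"):
            if buf and len(buf) + len(para) + 2 > max_chars:
                chunks.append(buf.strip())
                buf = ""
            buf += para + "\n\n"
        if buf.strip():
            chunks.append(buf.strip())
    return chunks
```

The repo's version additionally preserves table `colspan`/`rowspan` when splitting HTML, which this sketch does not attempt.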
Query (per MCP tool call):

```mermaid
graph LR
  A[Claude Desktop<br/>MCP client] --> B[FastMCP server]
  B --> C[pgvector top-K]
  C --> D[Reranker<br/>mxbai or FlashRank]
  D --> E[Claude Sonnet<br/>generation]
  E --> A
  E --> F[Langfuse trace]
  style F fill:#fff9c4
```
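Stripped of pgvector and the model calls, the first stage of that query path is just a similarity sort over stored embeddings. A toy stand-in (vectors are fake; function names are illustrative, not the repo's):

```python
import math
from typing import Dict, List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: List[float],
          chunks: Dict[str, List[float]],
          k: int = 5) -> List[Tuple[str, float]]:
    """Rank chunk ids by cosine similarity to the query vector,
    the same selection pgvector performs with an ORDER BY on the
    cosine-distance operator plus LIMIT k."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in chunks.items()]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]
```

The reranker then reorders this candidate list with a cross-encoder (mxbai or FlashRank) before generation sees it.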
Stack:

- `content_tsv` GIN index for hybrid-ready queries
- MCP Resources (`document://{source_id}`) and Prompts (`cite_from_chunks`)

Full 110-question golden set, contextualized chunks + reranker:
| Metric | Score |
|---|---|
| Faithfulness | 0.95 |
| Answer Relevance | 0.91 |
| Context Precision | 0.61 |
| Context Recall | 0.52 |
| Context Relevance | 0.56 |
🔗 Live Langfuse trace (public, no login).
Notable result: Anthropic's Contextual Retrieval pattern produced modest lift on top of reranking (+4.8pp AnsRel, +4.1pp CtxPrec) at this scale — well short of the +35% recall their published numbers suggested. Reported as found; juiced numbers would defeat the point.
Every PR runs the golden set in fast mode (FlashRank reranker, ~3-4 min wall clock, $0.30 in Sonnet calls) against a fixture DB. PRs that drop top1/topk/keyword_recall by more than 5pp, or raise idk_rate by more than 10pp, are blocked.
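The gate logic reduces to diffing two metric dicts against per-metric thresholds. A hedged sketch of that shape — thresholds and metric names taken from the description above, not from the actual `check_regression.py`:

```python
from typing import Dict, List

# Metrics where a drop of more than 5pp blocks the PR.
DROP_GATED = {"top1": 0.05, "topk": 0.05, "keyword_recall": 0.05}
# Metrics where a rise of more than 10pp blocks the PR.
RISE_GATED = {"idk_rate": 0.10}

def gate(baseline: Dict[str, float], current: Dict[str, float]) -> List[str]:
    """Return a list of violations; an empty list means the PR may merge."""
    violations = []
    for metric, limit in DROP_GATED.items():
        delta = current[metric] - baseline[metric]
        if delta < -limit:
            violations.append(f"{metric} dropped {-delta:.1%} (limit {limit:.0%})")
    for metric, limit in RISE_GATED.items():
        delta = current[metric] - baseline[metric]
        if delta > limit:
            violations.append(f"{metric} rose {delta:.1%} (limit {limit:.0%})")
    return violations
```

In CI, a non-empty violation list fails the job, which is what blocks the merge.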
Forever-artifact: PR #5 — a deliberate failing-then-passing PR. Red CI catches a 20pp top1 regression; green CI confirms the fix. The Actions tab is the proof.
Workflow: .github/workflows/eval-gate.yml.
A few production-shape items are seams, not implementations:
- auth_context parameter present on every MCP tool — typed, currently unused (labels the SSO/ACL seam)
- content_tsv GIN index is live; BM25 + RRF fusion at query time stays a post-launch addition

The writeup linked above covers these topics.
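The auth_context seam costs nothing today but fixes the tool signatures so ACL enforcement can land later without breaking MCP clients. A sketch of the pattern — the dataclass fields and tool name are assumptions for illustration, not the repo's types:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AuthContext:
    # Hypothetical fields an SSO integration would populate.
    user_id: str
    groups: List[str] = field(default_factory=list)

def search_chunks(query: str,
                  k: int = 5,
                  auth_context: Optional[AuthContext] = None) -> List[str]:
    """Tool entry point. auth_context is accepted and typed now but
    ignored until ACL filtering lands, so callers can start passing it
    without a signature change later."""
    # TODO: filter results by auth_context.groups against chunk ACLs.
    return [f"chunk-{i} for {query!r}" for i in range(k)]
```

Because the parameter is optional with a `None` default, existing callers keep working while the seam stays visible in every tool schema.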
```
docs/                       Research, evidence base, deep-dives
data/                       Crawled corpus + golden query set
scripts/
  crawl_knowva.py           eGain v11 API crawler
  enrich_metadata.py        Headings, ACL, authority tier, content_category
  knowva_preprocess.py      Source-specific HTML normalization
  chunk_documents.py        Structure-aware splitter (preserves table colspan/rowspan)
  embed_and_store.py        mxbai-embed-large → pgvector
  contextualize_chunks.py   Anthropic Batches API for Contextual Retrieval
  rerank.py                 mxbai-rerank + FlashRank
  retrieve.py / generate.py RAG path
  mcp_server.py             FastMCP exposure
  run_eval.py / score_eval.py / check_regression.py  Eval harness + CI gate
evals/                      Fixture DB dump + baseline JSON
.github/workflows/          eval-gate.yml — merge-gate workflow
```
Each script is idempotent and resume-safe.
```bash
python scripts/crawl_knowva.py          # Crawl raw HTML (skip if data/knowva_manuals/articles/ exists)
python scripts/enrich_metadata.py       # Add headings, ACL, authority tier
python scripts/knowva_preprocess.py     # Normalize HTML quirks
python scripts/chunk_documents.py       # Structure-aware split
python scripts/embed_and_store.py       # mxbai → pgvector
python scripts/contextualize_chunks.py  # Anthropic Batches API (~$12, optional but recommended)
```
Then run `python scripts/run_eval.py --fast` to verify the eval baseline reproduces.
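The idempotent, resume-safe behavior the scripts share is mostly a skip-if-done guard around each unit of work, plus an atomic rename so an interrupted run never leaves half-written output. A minimal version of the pattern (paths and the work itself are hypothetical stand-ins):

```python
from pathlib import Path

def process_all(doc_ids, out_dir: str = "data/processed") -> int:
    """Process each document once; reruns skip completed work, so an
    interrupted run can simply be restarted. Returns the count of
    documents processed this run."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    done = 0
    for doc_id in doc_ids:
        target = out / f"{doc_id}.json"
        if target.exists():           # already processed on a prior run
            continue
        tmp = target.with_suffix(".tmp")
        tmp.write_text("{}")          # stand-in for the real work
        tmp.rename(target)            # atomic publish: no partial files
        done += 1
    return done
```

Running the function twice over the same ids does the work exactly once; the second pass is a no-op.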
- docs/2026-04-11-engineering-rag-evidence-and-howtos.md — engineering analysis, evidence base, Zero-to-MCP plan
- docs/2026-04-12-rag-pipeline-buy-vs-build.md — buy-vs-build map per pipeline stage
- docs/deep-dive/2026-04-16-docs-vs-code-rag-adjudication.md — when unified RAG stops working

MIT — see LICENSE.