Server data from the Official MCP Registry
Hardened Scholar MCP for deep academic research (Scopus, OpenAlex, Unpaywall) with PDF vision.
Valid MCP server (2 strong, 3 medium validity signals). 4 known CVEs in dependencies (0 critical, 3 high severity). Imported from the Official MCP Registry. 1 finding downgraded by scanner intelligence.
5 files analyzed · 5 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Set these up before or after installing:
Environment variable: SCOPUS_API_KEY
Environment variable: CONTACT_EMAIL
Environment variable: SCOPUS_INST_TOKEN
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-mlintangmz2765-scholar": {
"env": {
"CONTACT_EMAIL": "your-contact-email-here",
"SCOPUS_API_KEY": "your-scopus-api-key-here",
"SCOPUS_INST_TOKEN": "your-scopus-inst-token-here"
},
"args": [
"scholar-academic-mcp"
],
"command": "uvx"
}
}
}
From the project's GitHub README.
A Model Context Protocol (MCP) server providing structured access to scientific literature databases. It serves as a unified interface for Scopus, OpenAlex, and Unpaywall, enabling AI agents to perform systematic paper discovery, author disambiguation, citation lineage tracking, and multimodal content extraction.
Unified Literature Search
Author Identification & Metrics
Citation Lineage Tracking
Structured & Multimodal Extraction
Topic Mapping & Field Analysis
Access Management & Fallbacks
graph TD
A[LLM Agent] -->|MCP Protocol| B(Scholar MCP Server)
B --> C{Database Router}
C -->|Primary| D[Scopus API]
C -->|Fallback| E[OpenAlex API]
C -->|DOI Resolver| F[Unpaywall API]
C -->|Citations| P[CrossRef API]
D --> G{Access Check}
E --> G
F --> G
P --> G
G -->|Open Access| H[PDF Buffer Download]
G -->|Closed Access| I[Human-in-the-Loop Prompt]
H --> J[PyMuPDF Text Extractor]
H --> K[PyMuPDF Vision Renderer]
J --> L[Return Context to LLM]
K --> L
I --> L
B --> M{Author Router}
M -->|Profile| N[OpenAlex Authors API]
M -->|Metrics| O[Scopus Author API]
N --> L
O --> L
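The primary/fallback edges of the Database Router in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `route_search`, `UpstreamError`, and the two stand-in backends are hypothetical names introduced here.

```python
from typing import Callable

# A backend is modelled as any callable taking a query and returning records.
Search = Callable[[str], list]


class UpstreamError(Exception):
    """Hypothetical error a backend client raises (e.g. on HTTP 401)."""


def route_search(query: str, primary: Search, fallback: Search) -> tuple:
    """Try the primary backend first; use the fallback if it fails.

    Returns (label, results) so the caller can report which source answered.
    """
    try:
        return "primary", primary(query)
    except UpstreamError:
        return "fallback", fallback(query)


# Stand-in backends for illustration only (no network access).
def failing_scopus(query: str) -> list:
    raise UpstreamError("HTTP 401: key lacks the required view")


def fake_openalex(query: str) -> list:
    return [{"title": f"result for {query}"}]
```

Calling `route_search("transformers", failing_scopus, fake_openalex)` falls through to the second backend, which mirrors the Scopus to OpenAlex fallback edge in the diagram.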
The fastest way to use the server is directly via PyPI:
pip install scholar-academic-mcp
# Clone the repository
git clone https://github.com/mlintangmz2765/Scholar-MCP.git
cd Scholar-MCP
# Setup virtual environment
python -m venv venv
.\venv\Scripts\activate # Windows
source venv/bin/activate # Unix
# Install in editable mode
pip install -e .
| Variable | Required | Description |
|---|---|---|
| SCOPUS_API_KEY | Yes | Elsevier API key for Scopus search and author retrieval. |
| SCOPUS_INST_TOKEN | No | Institutional token for full abstract access via Scopus. |
| CONTACT_EMAIL | Yes | Email for OpenAlex/Unpaywall polite-pool API routing. |
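A quick pre-flight check for the variables in the table can catch a missing key before the client starts the server. The sketch below is an assumption about how one might do this; `missing_required` is a hypothetical helper, not something the package is known to ship.

```python
import os

# Names taken from the table above; the required/optional split follows it.
REQUIRED = ("SCOPUS_API_KEY", "CONTACT_EMAIL")
OPTIONAL = ("SCOPUS_INST_TOKEN",)


def missing_required(env=None) -> list:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_required()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```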
Add the following to your configuration file (e.g., claude_desktop_config.json):
{
"mcpServers": {
"scholar-academic-mcp": {
"command": "scholar-academic-mcp",
"env": {
"SCOPUS_API_KEY": "your_scopus_api_key",
"SCOPUS_INST_TOKEN": "your_optional_inst_token",
"CONTACT_EMAIL": "your_email@domain.com"
}
}
}
}
Once configured, your AI agent can perform complex research workflows. Below are representative examples of tool inputs and structured outputs.
Prompt: "Find recent papers about 'Transformer architectures' published after 2022 using Scopus."
Tool Call: search_papers_tool(query="TITLE-ABS-KEY(Transformer architectures) AND PUBYEAR > 2022", limit=3)
Output:
Found 3 papers via Scopus:
- [SCOPUS_ID:85184...] Attention is All You Need? A Survey of Transformer Variants
Authors: Smith, J., Doe, A.
Date: 2024-01-15 | DOI: 10.1016/j.artint.2023.104012
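The query string in the tool call above uses Scopus Boolean syntax. As a small sketch, a helper like the hypothetical `build_scopus_query` below could assemble that string from plain inputs; only the `TITLE-ABS-KEY(...)` and `PUBYEAR >` forms shown in the example are used.

```python
from typing import Optional


def build_scopus_query(terms: str, after_year: Optional[int] = None) -> str:
    """Build a Scopus query that matches title, abstract, and keywords."""
    query = f"TITLE-ABS-KEY({terms})"
    if after_year is not None:
        # PUBYEAR > N restricts results to publications after year N.
        query += f" AND PUBYEAR > {after_year}"
    return query


print(build_scopus_query("Transformer architectures", after_year=2022))
```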
Prompt: "I need to see the diagram for the neural network architecture on page 3 of this URL."
Tool Call: get_full_text_visual_tool(url="https://arxiv.org/pdf/1706.03762.pdf", max_pages=3)
Output:
[Text] "Successfully rendered 3 pages visually..."
[Image] (PNG data of page 1)
[Image] (PNG data of page 2)
[Image] (PNG data of page 3 - containing the architecture diagram)
Prompt: "Help me understand the subfields and domains related to 'Generative AI'."
Tool Call: search_topics_tool(query="Generative AI")
Output:
Found 1 topics for 'Generative AI':
- Artificial Intelligence
Hierarchy: Computer Science → Artificial Intelligence → Machine Learning
Works: 12,450 | Citations: 450,210
Description: A field of computer science that focuses on creating systems capable of generating...
The server registers 18 tools across 7 categories:
| Tool | Signature | Description |
|---|---|---|
| search_papers_tool | (query, limit=5, use_scopus=True, sort_by="relevance") | Search papers via Scopus (Boolean syntax) or OpenAlex. Sort by cited_by_count or publication_year. |
| get_paper_details_tool | (paper_id) | Fetch full metadata and abstract by Scopus ID, DOI, or OpenAlex ID (with automatic routing). |
| search_titles_unpaywall_tool | (query, is_oa=None) | Search Unpaywall's database directly by title. Set is_oa=True for strictly OA results. |
| get_related_works_tool | (paper_id, limit=10) | Find related/similar papers using OpenAlex's bibliographic coupling. |
| Tool | Signature | Description |
|---|---|---|
| autocomplete_authors_tool | (name, limit=5) | Rapidly disambiguate author names and resolve OpenAlex Author IDs. |
| search_authors_tool | (name, institution=None, limit=5) | Detailed bibliometric profiles: H-index, i10-index, ORCID, and research concepts. |
| search_author_by_orcid_tool | (orcid) | Look up an author directly by ORCID (raw or URL format). |
| retrieve_author_works_tool | (author_id, limit=15) | Chronologically sorted publications for a given OpenAlex author. |
| get_author_profile_scopus_tool | (author_id) | Fetch precise Scopus-sourced h-index, citation counts, and affiliation. |
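Since the ORCID lookup accepts either a raw iD or a URL, some normalization step is implied. The following is a sketch of one way to do it; `normalize_orcid` is a hypothetical helper and the server's actual parsing may differ.

```python
import re

# Accepts "0000-0002-1825-0097" or "https://orcid.org/0000-0002-1825-0097".
# The final character of an ORCID iD may be the checksum letter X.
_ORCID = re.compile(
    r"^(?:https?://orcid\.org/)?(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$",
    re.IGNORECASE,
)


def normalize_orcid(value: str) -> str:
    """Return the bare 16-character ORCID iD in canonical upper case."""
    match = _ORCID.match(value.strip())
    if match is None:
        raise ValueError(f"not an ORCID iD: {value!r}")
    return match.group(1).upper()
```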
| Tool | Signature | Description |
|---|---|---|
| get_citations_tool | (paper_id, direction="references") | Retrieve forward citations or backward references via OpenAlex. |
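For orientation, the two directions map onto two OpenAlex work filters: `cites:<id>` lists works that cite a paper, and `cited_by:<id>` lists the works it references. The sketch below only builds the request URL. The helper name is hypothetical, and the `"citations"` direction value is an assumption, since the table documents only the default `"references"`.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed mapping from direction to OpenAlex filter name.
_FILTERS = {"references": "cited_by", "citations": "cites"}


def openalex_citation_url(work_id: str, direction: str = "references",
                          email: Optional[str] = None, per_page: int = 25) -> str:
    """Build an OpenAlex /works URL for one citation direction."""
    if direction not in _FILTERS:
        raise ValueError(f"unknown direction: {direction!r}")
    params = {"filter": f"{_FILTERS[direction]}:{work_id}", "per-page": per_page}
    if email:
        # A mailto parameter places the request in OpenAlex's polite pool.
        params["mailto"] = email
    return "https://api.openalex.org/works?" + urlencode(params)
```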
| Tool | Signature | Description |
|---|---|---|
| get_full_text_tool | (url, start_page=None, end_page=None) | Extract text from an OA PDF or HTML page. Supports page range selection. |
| get_full_text_visual_tool | (url, max_pages=3) | Render PDF pages as images for Vision-capable LLMs. |
| fetch_pdf_text_unpaywall_tool | (doi) | All-in-one: resolve DOI via Unpaywall → download PDF → extract text. |
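Page range selection typically means turning optional start and end values into a clamped index range. The sketch below assumes 1-based, inclusive page numbers, which the table does not state; `resolve_page_range` is a hypothetical helper.

```python
from typing import Optional


def resolve_page_range(total_pages: int, start_page: Optional[int] = None,
                       end_page: Optional[int] = None) -> range:
    """Convert 1-based inclusive bounds into a zero-based range of page indexes.

    Missing bounds default to the first and last page; out-of-range
    values are clamped; an empty range is returned if nothing remains.
    """
    first = 1 if start_page is None else max(1, start_page)
    last = total_pages if end_page is None else min(total_pages, end_page)
    if first > last:
        return range(0)
    return range(first - 1, last)
```

For a 10-page document, `resolve_page_range(10, 2, 4)` yields the indexes 1, 2, and 3, which is the shape a zero-based PDF library loop would expect.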
| Tool | Signature | Description |
|---|---|---|
| get_bibtex_tool | (doi) | Generate a BibTeX entry for LaTeX via CrossRef content negotiation. |
| format_citation_tool | (doi, style="apa") | Format citation in APA, IEEE, Chicago, Harvard, Vancouver, MLA, or Turabian. |
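DOI content negotiation works by requesting `https://doi.org/<doi>` with an `Accept` header that names the desired format. The sketch below builds such requests without sending them. The helper names are hypothetical, and whether the server maps style names such as "harvard" onto specific CSL style identifiers is not documented here, so `style` is passed through unchanged.

```python
from urllib.parse import quote
from urllib.request import Request


def _doi_url(doi: str) -> str:
    # Keep the slash between prefix and suffix; escape anything unusual.
    return "https://doi.org/" + quote(doi, safe="/")


def bibtex_request(doi: str) -> Request:
    """Request a BibTeX rendering of the DOI's metadata."""
    return Request(_doi_url(doi), headers={"Accept": "application/x-bibtex"})


def citation_request(doi: str, style: str = "apa") -> Request:
    """Request a formatted citation; `style` is a CSL style identifier."""
    accept = f"text/x-bibliography; style={style}"
    return Request(_doi_url(doi), headers={"Accept": accept})
```

Passing either object to `urllib.request.urlopen` would perform the lookup; that step is left out so the example stays offline.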
| Tool | Signature | Description |
|---|---|---|
| get_unpaywall_link_tool | (doi) | Resolve a DOI to all available OA locations via Unpaywall. |
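As a rough sketch of what resolving OA locations involves: the Unpaywall v2 endpoint takes the DOI in the path and a contact email as a query parameter, and its JSON response carries an `oa_locations` list whose entries may include `url_for_pdf`. The helper names and the sample record below are illustrative only.

```python
from urllib.parse import quote, urlencode


def unpaywall_url(doi: str, email: str) -> str:
    """Build the Unpaywall v2 lookup URL for a DOI."""
    return (f"https://api.unpaywall.org/v2/{quote(doi, safe='/')}"
            f"?{urlencode({'email': email})}")


def pdf_links(record: dict) -> list:
    """Collect distinct direct-PDF URLs from an Unpaywall response dict."""
    links = []
    for location in record.get("oa_locations") or []:
        url = location.get("url_for_pdf")
        if url and url not in links:
            links.append(url)
    return links


# Hand-written sample shaped like an Unpaywall response (not real data).
sample = {"oa_locations": [
    {"url_for_pdf": "https://example.org/a.pdf"},
    {"url_for_pdf": None},
    {"url_for_pdf": "https://example.org/a.pdf"},
]}
```

An empty list from `pdf_links` corresponds to the "Empty Unpaywall results" case in the troubleshooting table.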
| Tool | Signature | Description |
|---|---|---|
| search_topics_tool | (query, limit=10) | Browse research topics/concepts. Returns fields, domains, and publication volume. |
| batch_lookup_tool | (dois: list[str]) | Batch-fetch metadata for multiple DOIs in a single call (max 50). |
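Because the batch lookup is capped at 50 DOIs per call, a caller with a longer list needs to split it first. The sketch below shows one way to do that; `chunk_dois` is a hypothetical client-side helper, and lower-casing for de-duplication relies on DOIs being case-insensitive.

```python
from typing import Iterable, Iterator

MAX_BATCH = 50  # per-call limit stated in the table above


def chunk_dois(dois: Iterable[str], size: int = MAX_BATCH) -> Iterator[list]:
    """Yield de-duplicated DOIs in order, in batches of at most `size`."""
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    unique = list(dict.fromkeys(d.strip().lower() for d in dois if d.strip()))
    for start in range(0, len(unique), size):
        yield unique[start:start + size]
```

Each yielded list can then be passed as the `dois` argument of one `batch_lookup_tool` call.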
Scholar MCP is engineered for precision and fault tolerance in high-stakes research environments, utilizing several layers of protection to ensure data integrity:
Strict Data Contracts (Pydantic)
Fault-Tolerant Networking (Tenacity)
tenacity for transient HTTP errors (429, 5xx).
Resource Safety & Concurrency
asyncio.gather with localized exception handling to prevent session-wide failures.
System Observability
stderr) logging provides execution visibility during the tool lifecycle without interfering with the MCP JSON-RPC protocol.
Automated Verification
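To make the Fault-Tolerant Networking and Resource Safety points above more concrete, here is a standard-library sketch of both ideas: retrying only transient statuses (429, 5xx) with exponential backoff, and gathering concurrent calls so one failure does not abort the rest. It is a hand-rolled stand-in, not tenacity's API and not the project's code; every name in it is hypothetical.

```python
import asyncio


class HTTPStatusError(Exception):
    """Hypothetical error carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_transient(status: int) -> bool:
    """429 and 5xx are worth retrying; other statuses are not."""
    return status == 429 or 500 <= status <= 599


async def with_retry(call, attempts: int = 3, base_delay: float = 0.5):
    """Await call(), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await call()
        except HTTPStatusError as exc:
            if not is_transient(exc.status) or attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)


async def gather_isolated(calls) -> list:
    """Run calls concurrently; a failure becomes an error string in place."""
    results = await asyncio.gather(*(c() for c in calls), return_exceptions=True)
    return [f"error: {r}" if isinstance(r, Exception) else r for r in results]
```

With `return_exceptions=True`, `asyncio.gather` returns exceptions as values in result order, so each failure can be handled per item instead of cancelling the whole batch.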
Scholar-MCP/
├── .github/workflows/ # GitHub Actions (CI & Releases)
├── scripts/ # Automation & Validation scripts
├── tests/ # Pytest suite (respx mocked)
├── server.py # FastMCP tool entry point
├── api.py # API Clients (Scopus, OpenAlex, Unpaywall, CrossRef)
├── extractor.py # PDF/HTML Extraction & Rendering
├── models.py # Pydantic Data Validation
├── server.json # MCP Registry Manifest
├── pyproject.toml # Python packaging configuration
├── requirements.txt # Dependencies
├── VERSION # Version tracking (v1.0.0)
├── LICENSE # MIT License
├── README.md # Documentation
├── .env.example # Template for API keys
└── .gitignore # Git exclusion rules
| Symptom | Cause | Resolution |
|---|---|---|
| HTTP 401 from Scopus | Standard API keys lack META_ABS view access. | Set SCOPUS_INST_TOKEN or use OpenAlex as fallback. |
| HTTP 403 on PDF download | Publisher anti-bot protection (Cloudflare, DataDome). | Provide the PDF manually to the LLM. |
| Empty Unpaywall results | Paper is behind a strict paywall with no OA copies. | Request the PDF from the author via ResearchGate or institutional access. |
| SCOPUS_API_KEY is not set | Missing environment variable. | Ensure .env is configured or pass via MCP client env block. |
Create a feature branch (git checkout -b feature/my-feature).
Commit your changes (git commit -m 'feat: add new capability').
Push to the branch (git push origin feature/my-feature).
Please ensure all code follows PEP 8 conventions.
MIT License. See LICENSE for details.
Disclaimer: Automated querying of publisher APIs must comply with the respective Terms of Service of Elsevier, OpenAlex, and Unpaywall. Do not distribute API keys. Adhere to all applicable rate limits.
mcp-name: io.github.mlintangmz2765/scholar