Server data from the Official MCP Registry
MCP server for NVIDIA NIM - 50+ LLMs, multimodal, image gen, embeddings, reranking
This is a well-structured, production-ready MCP server for NVIDIA NIM models with strong security practices. Authentication is properly required via NVIDIA_API_KEY, input validation is comprehensive with Zod, and permissions align with the server's purpose (network calls to NVIDIA APIs, environment variable access). Minor code quality observations do not materially impact security. Supply chain analysis found 8 known vulnerabilities in dependencies (0 critical, 7 high severity). Package verification found 1 issue.
3 files analyzed · 13 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Set these up before or after installing:
Environment variable: NVIDIA_API_KEY
Environment variable: NVIDIA_NIM_BASE_URL
Environment variable: NVIDIA_AI_FOUNDATION_URL
Environment variable: DEFAULT_MODEL
Environment variable: ENABLE_IMAGE_GENERATION
Environment variable: ENABLE_VISION
Environment variable: ENABLE_MULTIMODAL
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-david-eve-za-nvidia-nim-mcp": {
      "env": {
        "DEFAULT_MODEL": "your-default-model-here",
        "ENABLE_VISION": "your-enable-vision-here",
        "NVIDIA_API_KEY": "your-nvidia-api-key-here",
        "ENABLE_MULTIMODAL": "your-enable-multimodal-here",
        "NVIDIA_NIM_BASE_URL": "your-nvidia-nim-base-url-here",
        "ENABLE_IMAGE_GENERATION": "your-enable-image-generation-here",
        "NVIDIA_AI_FOUNDATION_URL": "your-nvidia-ai-foundation-url-here"
      },
      "args": [
        "-y",
        "nvidia-nim-mcp"
      ],
      "command": "npx"
    }
  }
}
From the project's GitHub README.
A production-ready Model Context Protocol (MCP) server for consuming NVIDIA NIM (NVIDIA Inference Microservices) models. Supports 50+ LLMs, multimodal models, image generation, embeddings, reranking, function calling, vision, and code-specialized models with rich metadata for intelligent agent selection.
NVIDIA_API_KEY is required (keys look like nvapi-...); all other variables have sensible defaults.
# Install globally
npm install -g nvidia-nim-mcp
# Run directly
nvidia-nim-mcp
# Initialize your project
npm init -y
# Install locally
npm install nvidia-nim-mcp
# Run with npx
npx nvidia-nim-mcp
# Clone / download the project
cd nvidia-nim-mcp
# Install dependencies
npm install
# Build TypeScript
npm run build
# Pull from Docker Hub (when published)
docker pull nvidia-nim-mcp
# Or build locally
docker build -t nvidia-nim-mcp .
Copy .env.example to .env and fill in your API key:
cp .env.example .env
Only NVIDIA_API_KEY is required; all other variables have production-ready defaults:
| Variable | Required | Default | Description |
|---|---|---|---|
| NVIDIA_API_KEY | Yes | - | Your NVIDIA NGC API key |
| NVIDIA_NIM_BASE_URL | No | https://integrate.api.nvidia.com/v1 | Base URL for NIM API |
| DEFAULT_MODEL | No | black-forest-labs/flux.1-dev | Default model (best image generation) |
| MAX_REQUESTS_PER_MINUTE | No | 40 | Rate limit cap (NVIDIA API limit) |
| MAX_TOKENS_PER_REQUEST | No | 4096 | Hard cap on tokens per request |
| REQUEST_TIMEOUT_MS | No | 120000 | Request timeout (ms) |
| MAX_RETRIES | No | 3 | Max retry attempts on failure |
| RETRY_DELAY_MS | No | 1000 | Base delay between retries (ms) |
| LOG_LEVEL | No | info | One of error, warn, info, debug |
| ENABLE_IMAGE_GENERATION | No | true | Enable image generation tools |
| ENABLE_VISION | No | true | Enable vision/multimodal tools |
| ENABLE_MULTIMODAL | No | true | Enable multimodal task tools |
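To make the defaults in the table concrete, here is a minimal sketch of how such variables could be read and validated. It is not the project's actual loader (which, per the registry summary, validates input with Zod); `ServerConfig`, `loadConfig`, `intOr`, and `boolOr` are hypothetical names, and the parsing rules (for example, treating only the string "true" as enabled) are assumptions.

```typescript
// Hypothetical loader; ServerConfig and loadConfig are illustrative names,
// not taken from the project's source.
type LogLevel = "error" | "warn" | "info" | "debug";

interface ServerConfig {
  apiKey: string;
  baseUrl: string;
  defaultModel: string;
  maxRequestsPerMinute: number;
  maxTokensPerRequest: number;
  requestTimeoutMs: number;
  maxRetries: number;
  retryDelayMs: number;
  logLevel: LogLevel;
  enableImageGeneration: boolean;
  enableVision: boolean;
  enableMultimodal: boolean;
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer, falling back to the documented default.
function intOr(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
}

// Assumption: only the literal string "true" (any case) enables a feature.
function boolOr(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value === "") return fallback;
  return value.toLowerCase() === "true";
}

function loadConfig(env: Env): ServerConfig {
  const apiKey = env["NVIDIA_API_KEY"];
  if (!apiKey) {
    throw new Error("NVIDIA_API_KEY is required");
  }
  const levels: readonly LogLevel[] = ["error", "warn", "info", "debug"];
  const logLevel = levels.find((l) => l === env["LOG_LEVEL"]) ?? "info";
  return {
    apiKey,
    baseUrl: env["NVIDIA_NIM_BASE_URL"] || "https://integrate.api.nvidia.com/v1",
    defaultModel: env["DEFAULT_MODEL"] || "black-forest-labs/flux.1-dev",
    maxRequestsPerMinute: intOr(env["MAX_REQUESTS_PER_MINUTE"], 40),
    maxTokensPerRequest: intOr(env["MAX_TOKENS_PER_REQUEST"], 4096),
    requestTimeoutMs: intOr(env["REQUEST_TIMEOUT_MS"], 120000),
    maxRetries: intOr(env["MAX_RETRIES"], 3),
    retryDelayMs: intOr(env["RETRY_DELAY_MS"], 1000),
    logLevel,
    enableImageGeneration: boolOr(env["ENABLE_IMAGE_GENERATION"], true),
    enableVision: boolOr(env["ENABLE_VISION"], true),
    enableMultimodal: boolOr(env["ENABLE_MULTIMODAL"], true),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.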
# Run the server
nvidia-nim-mcp
# With custom environment variables
NVIDIA_API_KEY=nvapi-your-key LOG_LEVEL=debug nvidia-nim-mcp
# Run with npx
npx nvidia-nim-mcp
# Or add to package.json scripts
# "scripts": { "start": "nvidia-nim-mcp" }
npm start
# Development mode with auto-reload
npm run dev
# Production mode (compiled)
npm run build && npm start
# Run with environment variables
docker run --rm \
-e NVIDIA_API_KEY=nvapi-your-key \
-e LOG_LEVEL=info \
nvidia-nim-mcp
# Run in background (detached); no ports are mapped by this command
docker run -d \
--name nvidia-nim-mcp \
-e NVIDIA_API_KEY=nvapi-your-key \
nvidia-nim-mcp
# Make executable (if not already)
chmod +x dist/index.js
# Run directly
./dist/index.js
# With environment variables
NVIDIA_API_KEY=nvapi-your-key ./dist/index.js
{
"mcpServers": {
"nvidia-nim": {
"command": "nvidia-nim-mcp",
"env": {
"NVIDIA_API_KEY": "nvapi-your-key-here",
"LOG_LEVEL": "info"
}
}
}
}
{
"mcpServers": {
"nvidia-nim": {
"command": "npx",
"args": ["nvidia-nim-mcp"],
"env": {
"NVIDIA_API_KEY": "nvapi-your-key-here",
"LOG_LEVEL": "info"
}
}
}
}
{
"mcpServers": {
"nvidia-nim": {
"command": "node",
"args": ["/absolute/path/to/nvidia-nim-mcp/dist/index.js"],
"env": {
"NVIDIA_API_KEY": "nvapi-your-key-here",
"LOG_LEVEL": "info"
}
}
}
}
chat_completion
Multi-turn conversation with any NIM LLM.
{
"model": "nvidia/nemotron-3-ultra-550b-a55b",
"messages": [
{ "role": "user", "content": "Explain quantum computing" }
],
"temperature": 0.3,
"max_tokens": 4096
}
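For context, an MCP client delivers tool arguments like the ones above inside a JSON-RPC 2.0 `tools/call` request. The sketch below builds that envelope; `buildToolCall` and `ToolCallRequest` are hypothetical helper names, and in practice an MCP client library would construct and send this message for you.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The chat_completion example from this README, wrapped as a request.
const request = buildToolCall(1, "chat_completion", {
  model: "nvidia/nemotron-3-ultra-550b-a55b",
  messages: [{ role: "user", content: "Explain quantum computing" }],
  temperature: 0.3,
  max_tokens: 4096,
});

console.log(JSON.stringify(request, null, 2));
```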
text_generation
Single-prompt text generation (simplified interface).
{
"prompt": "Write a haiku about machine learning",
"temperature": 0.5,
"max_tokens": 512
}
create_embeddings
Convert text(s) to vector embeddings for RAG/search.
{
"model": "nvidia/nv-embed-v1",
"input": ["NVIDIA makes GPUs", "AI runs on GPUs"],
"truncate": "END"
}
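Once embeddings come back, search usually reduces to comparing vectors. The following sketch computes cosine similarity between two vectors; the sample vectors are made up for illustration, and the exact response shape of `create_embeddings` is not shown in this README, so extracting the vectors from the tool result is left out.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up 3-dimensional vectors standing in for real embeddings.
const similarity = cosineSimilarity([0.2, 0.7, 0.1], [0.25, 0.6, 0.2]);
console.log(similarity.toFixed(3));
```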
rerank_passages
Rerank passages by relevance to a query.
{
"query": "What is CUDA?",
"passages": ["CUDA is a GPU programming platform", "NIM serves AI models"],
"top_k": 3
}
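Conceptually, reranking assigns each passage a relevance score and keeps the best `top_k`. The sketch below shows only that final selection step; the scores are invented, `topK` and `ScoredPassage` are hypothetical names, and the real scoring is done by the reranking model behind the tool.

```typescript
interface ScoredPassage {
  index: number; // position in the original passages array
  text: string;
  score: number;
}

// Sort passages by descending score and keep at most k of them.
function topK(passages: string[], scores: number[], k: number): ScoredPassage[] {
  if (passages.length !== scores.length) {
    throw new Error("Each passage needs exactly one score");
  }
  return passages
    .map((text, index) => ({ index, text, score: scores[index] ?? 0 }))
    .sort((x, y) => y.score - x.score)
    .slice(0, Math.max(0, k));
}

// Invented scores for the two passages in the example above.
const ranked = topK(
  ["CUDA is a GPU programming platform", "NIM serves AI models"],
  [0.92, 0.15],
  3,
);
console.log(ranked.map((p) => `${p.score} ${p.text}`).join("\n"));
```

Asking for more results than there are passages, as in the example (`top_k` of 3 with two passages), simply returns every passage in ranked order in this sketch.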
function_calling
Use NIM models with tool/function calling.
{
"model": "z-ai/glm-5.1",
"messages": [{ "role": "user", "content": "What's the weather in Paris?" }],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather",
"parameters": {
"type": "object",
"properties": { "city": { "type": "string" } },
"required": ["city"]
}
}
}]
}
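When the model decides to call a tool, the caller has to run the matching function locally. The sketch below assumes an OpenAI-style tool call (a function name plus a JSON string of arguments); this README does not show what `function_calling` actually returns, so the `ToolCall` shape, the `handlers` registry, and the stub `get_weather` implementation are all assumptions.

```typescript
// Assumed OpenAI-style tool call; not confirmed by this README.
interface ToolCall {
  function: { name: string; arguments: string };
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical local implementations, keyed by tool name.
const handlers: Record<string, ToolHandler> = {
  // Stub: a real handler would query a weather service.
  get_weather: (args) => `Weather lookup requested for ${String(args["city"])}`,
};

// Look up the handler, parse the JSON arguments, and invoke it.
function dispatchToolCall(call: ToolCall): string {
  const handler = handlers[call.function.name];
  if (!handler) {
    throw new Error(`Unknown tool: ${call.function.name}`);
  }
  const parsed: unknown = JSON.parse(call.function.arguments);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Tool arguments must be a JSON object");
  }
  return handler(parsed as Record<string, unknown>);
}

const result = dispatchToolCall({
  function: { name: "get_weather", arguments: '{"city":"Paris"}' },
});
console.log(result);
```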
generate_image
Generate images from text prompts using FLUX.1, SDXL, SD3, DiffusionGemma.
{
"model": "black-forest-labs/flux.1-dev",
"prompt": "A photorealistic mountain landscape at sunset, 8K",
"width": 1024,
"height": 1024,
"steps": 30,
"cfg_scale": 3.5,
"sampler": "euler_a",
"scheduler": "simple"
}
analyze_image
Analyze and describe images using vision/multimodal models.
{
"model": "moonshotai/kimi-k2.6",
"image_url": "https://example.com/image.jpg",
"prompt": "Describe this image in detail",
"detail": "high"
}
multimodal_task
Perform multimodal tasks combining text and images.
{
"model": "minimaxai/minimax-m3",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "Analyze this chart" },
{ "type": "image_url", "image_url": { "url": "https://example.com/chart.png" } }
]
}
],
"max_tokens": 2048
}
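The mixed `content` array above is easy to get wrong by hand, so a small typed builder can help. This is a sketch only: `userMessageWithImages`, `ContentPart`, and `UserMessage` are hypothetical names that mirror the example payload, not types exported by the project.

```typescript
// Content parts as they appear in the multimodal_task example.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Build one user message: the text part first, then one part per image URL.
function userMessageWithImages(text: string, imageUrls: string[]): UserMessage {
  const imageParts: ContentPart[] = imageUrls.map((url) => ({
    type: "image_url",
    image_url: { url },
  }));
  return { role: "user", content: [{ type: "text", text }, ...imageParts] };
}

const message = userMessageWithImages("Analyze this chart", [
  "https://example.com/chart.png",
]);
console.log(JSON.stringify({ model: "minimaxai/minimax-m3", messages: [message], max_tokens: 2048 }));
```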
list_models
List available models with rich metadata and advanced filtering.
{
"category": "code",
"commercial_use": true,
"supports_reasoning": true,
"tags": ["coding", "agentic"],
"include_details": true
}
Filter Options:
- category: language, embedding, reranking, vision, code, multimodal, image_generation, all
- commercial_use: Filter by commercial license
- supports_reasoning: Filter by reasoning capability
- supports_vision: Filter by vision capability
- supports_function_calling: Filter by function calling
- supports_multimodal: Filter by multimodal input
- min_context_length: Minimum context window (tokens)
- tags: Filter by use case tags
- hardware: Filter by GPU type (Hopper, Blackwell, Ampere)
- include_details: Include full metadata (benchmarks, image specs, etc.)
get_model_info
Get complete metadata for a specific model.
{ "model_id": "nvidia/nemotron-3-ultra-550b-a55b" }
Returns: licensing, hardware requirements, benchmarks, image gen specs, reasoning modes, tags, supported languages, etc.
compare_models
Compare 2-5 models side-by-side across all decision factors.
{
"model_ids": [
"nvidia/nemotron-3-ultra-550b-a55b",
"deepseek-ai/deepseek-v4-pro",
"moonshotai/kimi-k2.6",
"z-ai/glm-5.1"
]
}
Returns: Structured comparison table with licensing, hardware, benchmarks, capabilities, tags, image generation specs, etc.
| Model | Parameters | Context | License | Commercial | Best For |
|---|---|---|---|---|---|
nvidia/nemotron-3-ultra-550b-a55b | 550B (55B active) | 131K | OpenMDW-1.1 | โ | Frontier reasoning, coding, agentic, 1M context, multilingual |
nvidia/nemotron-3-ultra-550b-a55b-instruct | 550B | 131K | OpenMDW-1.1 | โ | Instruction-tuned variant |
minimaxai/minimax-m3 | 428B (22B active) | 1M | Non-Commercial | โ | Multimodal, video (30min), 8hr coding, agentic |
moonshotai/kimi-k2.6 | 1T (32B active) | 256K | Modified MIT | โ | Long-horizon coding, 300 agents, vision, agentic |
deepseek-ai/deepseek-v4-pro | 1.6T (49B active) | 1M | MIT | โ | Advanced coding, math, reasoning, 3 reasoning modes |
z-ai/glm-5.1 | 754B (DSA) | 131K | MIT | โ | Software engineering, agentic, SWE-Bench 58.4% |
qwen/qwen3.5-397b-a17b | 397B (MoE) | 131K | Research | โ | Large-scale multilingual, multimodal |
mistralai/mistral-large-3-675b-instruct-2512 | 675B | 131K | Research | โ | Frontier reasoning, multimodal |
openai/gpt-oss-120b | 120B | 131K | Apache 2.0 | โ | Open-weight, research, fine-tuning |
google/diffusiongemma-26b-a4b-it | 25.2B (3.8B active) | 256K | Apache 2.0 | โ | Diffusion text gen, 35+ langs, fast, multimodal |
| Model | Parameters | Context | License | Commercial |
|---|---|---|---|---|
z-ai/glm-5.1 | 754B | 131K | MIT | โ |
z-ai/glm5 | - | 128K | Z.ai | โ |
qwen/qwen2.5-coder-32b-instruct | 32B | 131K | Research | โ |
| Model | Parameters | Context | Vision | Video | License | Commercial |
|---|---|---|---|---|---|---|
meta/llama-3.2-90b-vision-instruct | 90B | 128K | โ | โ | Llama 3.2 | โ |
meta/llama-3.2-11b-vision-instruct | 11B | 128K | โ | โ | Llama 3.2 | โ |
nvidia/neva-22b | 22B | 4K | โ | โ | NVIDIA | โ |
microsoft/phi-3.5-vision-instruct | - | 128K | โ | โ | MIT | โ |
minimaxai/minimax-m3 | 428B | 1M | โ | โ (30min) | Non-Commercial | โ |
moonshotai/kimi-k2.6 | 1T | 256K | โ | โ | Modified MIT | โ |
| Model | Architecture | Resolutions | Aspect Ratios | Max Images | ControlNet | License | Commercial |
|---|---|---|---|---|---|---|---|
black-forest-labs/flux.1-dev | Diffusion Transformer | 1024², 1152×896, 1344×768, 21:9 | 1:1, 16:9, 9:16, 4:3, 3:4, 21:9 | 1 | Canny, Depth | Apache 2.0* | โ* |
black-forest-labs/flux.1-kontext-dev | Diffusion Transformer | Same | Same | 1 | - | Apache 2.0* | โ* |
nvidia/stable-diffusion-xl | UNet + Attention | 1024², 1152×896, 1216×832 | 1:1, 16:9, 9:16, 4:3, 3:4 | 4 | - | SDXL 1.0 | โ ** |
stabilityai/sd-3-medium | SD3 | Same | Same | 2 | - | Stability AI | โ ** |
nvidia/sdxl-turbo | ADD | 512², 1024² | 1:1 | 4 | - | SDXL 1.0 | โ ** |
*Non-commercial default; commercial via contact
**Requires Stability AI membership
| Model | Type | Context | Dimensions | License | Commercial |
|---|---|---|---|---|---|
nvidia/nv-embedqa-e5-v5 | Embedding | 512 | - | NVIDIA | โ |
nvidia/nv-embed-v1 | Embedding | 4096 | - | NVIDIA | โ |
baai/bge-m3 | Embedding | 8192 | - | MIT | โ |
nvidia/nv-rerankqa-mistral-4b-v3 | Reranking | 4096 | - | NVIDIA | โ |
The project includes a comprehensive test suite:
# Run all tests
npm test
# Run tests with coverage report
npm test -- --coverage
# Run tests in watch mode
npm test -- --watch
# Run specific test file
npm test src/handlers.test.ts
Current Test Status: All tests passing (96 tests)
# Install dependencies
npm install
# Compile TypeScript to JavaScript
npm run build
# Clean build artifacts
npm run clean
# Development mode with auto-reload
npm run dev
# Run linter
npm run lint
# Run tests
npm test
# Run both linting and tests
npm run check
Contributions are welcome!
- git checkout -b feature/your-feature-name
- npm run check to verify code quality and tests
- git push origin feature/your-feature-name
- npm run dev for continuous development
- npm test to verify your changes
- npm run build to compile the project
- npm run lint to check code quality
# Build the project
npm run build
# Create NPM package (.tgz)
npm pack
# Build Docker image
docker build -t nvidia-nim-mcp .
# All checks (lint, test, build)
npm run check && npm run build
MIT