Server data from the Official MCP Registry
Benchmark AI models on real prompts. Find cheaper, faster alternatives across 340+ models.
Benchmark AI models on real prompts. Find cheaper, faster alternatives across 340+ models.
This is a well-structured MCP server for the LLMTest benchmarking service. Authentication is properly handled via environment variables, code is clean with no malicious patterns or dangerous operations, and permissions align with the server's purpose (API calls to a hosted service). Minor code quality observations around broad error handling and SSE parsing robustness do not significantly impact security. Supply chain analysis found 2 known vulnerabilities in dependencies (0 critical, 2 high severity). Package verification found 1 issue.
4 files analyzed · 7 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Set these up before or after installing:
Environment variable: LLMTEST_API_KEY
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-tjacquesson-llmtest-mcp": {
"env": {
"LLMTEST_API_KEY": "your-llmtest-api-key-here"
},
"args": [
"-y",
"llmtest-mcp"
],
"command": "npx"
}
}
}From the project's GitHub README.
MCP server that benchmarks AI models on your actual prompts and finds cheaper, faster alternatives. Works with Claude Code, Cursor, Windsurf, and any MCP-compatible tool.
Sign up at llmtest.io and grab your API key from the dashboard.
Claude Code:
claude mcp add llmtest -- npx llmtest-mcp
Then set your key:
export LLMTEST_API_KEY=llmt_your_key_here
Cursor / Windsurf / Other MCP clients:
Add to your MCP config file:
{
"mcpServers": {
"llmtest": {
"command": "npx",
"args": ["llmtest-mcp"],
"env": {
"LLMTEST_API_KEY": "llmt_your_key_here"
}
}
}
}
Just ask in natural language:
LLMTest is a proxy that sits between your app and AI providers. Point your app at https://llmtest.io/v1 instead of calling OpenAI/Anthropic directly, and LLMTest tracks your usage, benchmarks alternatives, and suggests cost savings.
This MCP server gives your AI assistant access to LLMTest's tools so it can manage everything for you.
| Tool | Description |
|---|---|
status | Show proxy status and activity summary |
list_flows | List all AI flows with cost and latency stats |
get_suggestions | Get pending model-switch recommendations |
update_suggestion | Accept or dismiss a suggestion |
run_benchmark | Benchmark a flow against challenger models |
optimize_prompt | Rewrite a flow's prompt and find a cheaper model that still works |
seed_samples | Add test prompts for pre-launch benchmarking |
list_samples | Show stored test samples per flow |
list_new_models | Show new and trending models |
get_account | Check credit balance and usage |
get_autopilot_status | Check whether autopilot is on and whether the account is eligible |
enable_autopilot | Turn on weekly auto-optimization with safety gates + drift-based auto-revert |
disable_autopilot | Turn off autopilot (existing optimizations stay active) |
list_active_optimizations | List auto-accepted optimizations still inside their 24h revert window |
revert_optimization | Roll an auto-accepted optimization back to the previous prompt |
Autopilot automatically optimizes your flows on a weekly cadence. Changes that pass every safety gate go live with a 24-hour revert window. Drift detection keeps checking after that and rolls back if quality slips.
To enable from your IDE: ask your AI assistant something like "enable LLMTest autopilot". It will call enable_autopilot. Use get_autopilot_status to confirm prerequisites.
Prerequisites (checked per flow each cycle):
Safety gates (all must pass for auto-accept): 95% CI lower bound > 50% win rate, multi-judge agreement ≥ 80%, ≥ 20% total savings, no length-bias warning, golden-set regression check.
Revert: 24h window after auto-accept. After that, only drift detection can roll back.
Pre-launch (no traffic yet):
seed_samplesrun_benchmark to compare modelsget_suggestions with cheaper alternativesPost-launch (with real traffic):
https://llmtest.io/v1| Variable | Required | Description |
|---|---|---|
LLMTEST_API_KEY | Yes | Your API key from llmtest.io/dashboard |
LLMTEST_BASE_URL | No | Custom API URL (defaults to https://llmtest.io) |
MIT
Be the first to review this server!
by Modelcontextprotocol · Developer Tools
Web content fetching and conversion for efficient LLM usage
by Modelcontextprotocol · Developer Tools
Read, search, and manipulate Git repositories programmatically
by Toleno · Developer Tools
Toleno Network MCP Server — Manage your Toleno mining account with Claude AI using natural language.