Server data from the Official MCP Registry
Give brains to your small models. MCP server enforcing step-by-step Chain-of-Thought.
CotForce-MCP is a well-architected MCP server for enforcing Chain-of-Thought reasoning in LLMs. Code quality is solid, with proper input validation, structured logging, and error handling. The server appropriately uses environment variables for credentials and supports both MCP sampling and direct HTTP modes. Minor concerns include broad exception handling and room for improvement in error-context propagation, but these do not present material security risks. Permissions align with the server's stated purpose as a developer tool for LLM reasoning. Supply chain analysis found 3 known vulnerabilities in dependencies (0 critical, 2 high severity). Package verification found 1 issue.
4 files analyzed · 8 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Set these up before or after installing:
Environment variable: MODE
Environment variable: API_BASE_URL
Environment variable: MODEL
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-islobodan-cotforce-mcp": {
"env": {
"MODE": "your-mode-here",
"MODEL": "your-model-here",
"API_BASE_URL": "your-api-base-url-here"
},
"args": [
"-y",
"@slbdn/cotforce-mcp"
],
"command": "npx"
}
}
}From the project's GitHub README.
"Give brains to your small models."
CotForce enforces step-by-step Chain-of-Thought, turning 4B parameter models into methodical reasoners.
A 4-billion-parameter Gemma cannot solve SEND + MORE = MONEY. It's a classic cryptarithmetic puzzle — 8 unique digits, 5 columns, 4 carry values. A bare 4B model guesses randomly. It hallucinates digits. It loses track of carries after column 2.
The same model, with CotForce:
Step 1: Analyze the leftmost column. S+M+C3 = MO. Max sum is 19998. ∴ M=1.
Step 2: S+1+C3 = 10+O. With M=1 and carry, O must be 0.
Step 3: D+E = Y+10C1 → C1=1. Now R+C1=9 → C1=0→R=9 (used), C1=1→R=8.
...
Step 11: All digits assigned. 9567 + 1085 = 10652. Verified.
11 structured reasoning steps. Zero hallucinations. Correct answer.
CotForce doesn't make small models smarter. It forces them to think before they speak — which is often all they need.
CotForce uses the MCP sampling protocol (sampling/createMessage) to call LLMs. If your client supports it (Claude Desktop, Cursor), nothing extra is needed.
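For reference, a sampling request in this protocol has roughly the shape below. Only the method name and general structure come from the MCP specification; the field values are illustrative assumptions, not CotForce's exact payload.

// Rough shape of an MCP sampling/createMessage request. Values are
// illustrative assumptions, not CotForce's exact payload.
const samplingRequest = {
  method: "sampling/createMessage",
  params: {
    messages: [
      { role: "user", content: { type: "text", text: "Solve step by step: ..." } }
    ],
    maxTokens: 4096,   // CotForce computes this budget dynamically
    temperature: 0.1   // BASE_TEMP, raised by TEMP_INCREMENT on retries
  }
};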
If not — or if you're using a local model like Gemma via LMStudio — switch to direct HTTP mode:
{
  "mcpServers": {
    "cotforce": {
      "command": "node",
      "args": ["node_modules/@slbdn/cotforce-mcp/index.js"],
      "env": {
        "MODE": "direct",
        "API_BASE_URL": "http://localhost:1234/v1",
        "MODEL": "gemma-4-e4b-it-mlx"
      }
    }
  }
}
That's it. The same 4B Gemma that couldn't solve SEND+MORE=MONEY above — now with CotForce, working locally through LMStudio.
- Enforces structured {reasoning, result} output via strict system prompts and few-shot examples.
- Pluggable parsers via the CotParser interface. Select parsers via the COT_PARSERS env var.
- Heuristic recovery of labeled output (<reasoning>, Reasoning:).
- Token counting with cl100k_base encoding, with fallback to a character heuristic. Tweak via REASONING_OVERHEAD.
- Set the MODEL environment variable to hint a specific model; leave unset for the host default.
- Set API_KEY to use direct mode.
- Configurable logging (LOG_LEVEL) and truncation detection (TRUNCATION_THRESHOLD).
- An optional resultSchema parameter validates the result field against a type map; mismatches trigger a retry.

npm install @slbdn/cotforce-mcp
# or
git clone https://github.com/islobodan/cotforce-mcp
cd cotforce-mcp
npm install
npm run build
Requires Node.js ≥ 18.
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "cotforce": {
      "command": "npx",
      "args": ["-y", "@slbdn/cotforce-mcp"],
      "env": {
        "MODEL": "claude-3-5-sonnet"
      }
    }
  }
}
No clone, no build. npx -y pulls and runs directly from npm.
The server is configured via environment variables (all optional):
| Variable | Default | Description |
|---|---|---|
| MODEL | (not set) | Model name hint (e.g. claude-3-5-sonnet, gpt-4o). If empty, no hint sent – MCP host decides. |
| MAX_RETRIES | 2 | Number of retry attempts before returning raw output. |
| BASE_TEMP | 0.1 | Initial sampling temperature. |
| TEMP_INCREMENT | 0.2 | Temperature added per retry attempt. |
| TIMEOUT | 60000 / 120000 | Sampling timeout in ms (60s). Direct HTTP mode uses a longer default (120s) since local models are slower. |
| CACHE_TTL | 3600000 | Result cache TTL in ms (default 1 hour). Set to 0 to disable. |
| CACHE_MAX_ENTRIES | 100 | Maximum cached results before evicting the oldest. |
| COT_PARSERS | (all) | Comma-separated parser names to use (e.g. direct-json,fenced-block). Skips the others. |
| TRUNCATION_THRESHOLD | 0.95 | Ratio of output/budget that triggers truncation detection. Attempts truncated-JSON recovery first, then retries with a 1.5× budget. |
| REASONING_OVERHEAD | 800 | Fixed token overhead added to the budget formula. Increase for verbose models. |
| FALLBACK_MODELS | (not set) | Comma-separated list of fallback models (e.g. gpt-4o,claude-3-5-sonnet). Cycled through on failure. |
| MODE | auto | auto, sampling, or direct. auto uses direct HTTP when API_KEY is set and the client lacks sampling support. |
| API_KEY | (not set) | LLM API key for direct HTTP mode. Optional for local endpoints (LMStudio, Ollama); required for remote providers (OpenAI, Anthropic, etc.). |
| API_BASE_URL | https://api.openai.com | Base URL for direct HTTP mode. Change for LMStudio (http://localhost:1234/v1) or other providers. |
| LOG_LEVEL | INFO | One of DEBUG, INFO, WARN, ERROR. |
MODEL=gpt-4o MAX_RETRIES=3 BASE_TEMP=0.2 TEMP_INCREMENT=0.15 LOG_LEVEL=DEBUG npx @slbdn/cotforce-mcp
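The retry knobs interact as sketched below; each retry raises the sampling temperature to vary a previously unparseable output. The function name is illustrative, not the package's API.

// Illustrative sketch, not the package's internals: each retry
// bumps the temperature by TEMP_INCREMENT over BASE_TEMP.
function temperatureForAttempt(attempt: number, baseTemp = 0.1, increment = 0.2): number {
  return baseTemp + attempt * increment; // attempt 0 → 0.1, 1 → 0.3, 2 → 0.5
}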
Add to your MCP client configuration. A .mcp.json file is included in the package for auto-discovery by clients like Cursor, VS Code, and Windsurf. Copy the relevant config below to your client's settings:
With MCP sampling (Claude Desktop):
{
  "mcpServers": {
    "cotforce": {
      "command": "node",
      "args": ["/path/to/cotforce-mcp/index.js"],
      "env": {
        "MODEL": "claude-3-5-sonnet",
        "MAX_RETRIES": "2"
      }
    }
  }
}
With direct LLM HTTP (LMStudio, OpenAI, Ollama):
{
  "mcpServers": {
    "cotforce": {
      "command": "node",
      "args": ["/path/to/cotforce-mcp/index.js"],
      "env": {
        "MODE": "direct",
        "API_BASE_URL": "http://localhost:1234/v1",
        "MODEL": "local-model",
        "MAX_RETRIES": "2"
      }
    }
  }
}
Note: API_KEY is optional for local endpoints like LMStudio or Ollama. It is required for remote providers like OpenAI or Anthropic.

The root index.js is a launcher that delegates to dist/index.js. It guards against missing builds with a helpful error message.
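A plausible sketch of such a guard, assuming an ESM entry point; the actual index.js may differ:

// Hypothetical launcher sketch (the real index.js may differ):
// check for the compiled output and fail with guidance.
import { existsSync } from "node:fs";

const dist = new URL("./dist/index.js", import.meta.url);
if (!existsSync(dist)) {
  console.error("dist/index.js not found. Run `npm run build` first.");
  process.exit(1);
}
await import(dist.href);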
What you see: finish_reason: "length" in the LLM response. The reasoning cuts off before the result field.
Why: The token budget is too tight. Complex reasoning (like SEND+MORE=MONEY) can need 3000+ output tokens; the default minimum budget is 4096, but the model-level cap varies by model and can be lower.
Fix: Increase the budget overhead:
REASONING_OVERHEAD=1600 # default is 800, raise for verbose models
Or skip token-heavy parser layers to save budget for reasoning:
COT_PARSERS=direct-json,fenced-block # skip heuristic and brace-balanced
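Both fixes work on the budget formula described later in this README (overhead + inputTokens × 4, clamped between 4096 and 8192). A minimal sketch of that arithmetic, with an illustrative function name:

// Sketch of the budget math from this README's formula:
// budget = REASONING_OVERHEAD + inputTokens * 4, clamped to [4096, 8192].
// The function name is illustrative, not the package's API.
function computeBudget(inputTokens: number, overhead = 800): number {
  return Math.min(8192, Math.max(4096, overhead + inputTokens * 4));
}

computeBudget(500);        // max(4096, 2800) = 4096
computeBudget(2000, 1600); // min(8192, 9600) = 8192, more room to reason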
What you see: MCP error -32001: Request timed out before the solution appears.
Why: Complex CoT reasoning takes time — 60-90 seconds for local models like Gemma. This error can come from two places:

- CotForce's own sampling timeout, controlled by the TIMEOUT env var.
- Your MCP client's request timeout.

Fix — check both sides:
Increase CotForce's timeout:
TIMEOUT=180000 # 3 minutes
Check your MCP client's timeout setting:
LM Studio — add "timeout" to mcp.json (milliseconds):
{
  "mcpServers": {
    "cotforce": {
      "command": "node",
      "args": ["index.js"],
      "env": {
        "TIMEOUT": "180000"
      },
      "timeout": 300000
    }
  }
}
Claude Desktop — the tool call timeout is not directly configurable. A workaround is to increase CotForce's TIMEOUT to complete within the client's window, or use a faster model.
Cursor / VS Code — check the MCP extension or .vscode/mcp.json for a timeout or requestTimeout setting.
{
  "name": "solve_problem",
  "arguments": {
    "prompt": "What is 7 * 8 + 2?"
  }
}
{
  "name": "solve_problem",
  "arguments": {
    "prompt": "List the prime numbers between 10 and 20",
    "resultSchema": {
      "primes": "object",
      "count": "number"
    }
  }
}
If the result field doesn't match the schema, the server retries with a correction hint.
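Conceptually, such a type-map check can be as simple as comparing each schema entry against typeof on the corresponding result field. A sketch under that assumption, not the package's actual validator:

// Hypothetical sketch of a resultSchema type-map check: each key maps
// to an expected typeof string, and any mismatch triggers a retry with
// a correction hint. Not the package's actual code.
type ResultSchema = Record<string, string>;

function matchesSchema(result: Record<string, unknown>, schema: ResultSchema): boolean {
  return Object.entries(schema).every(([key, expected]) => typeof result[key] === expected);
}

matchesSchema({ primes: [11, 13, 17, 19], count: 4 },
              { primes: "object", count: "number" }); // true (arrays are typeof "object")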
See EXAMPLES.md for 16 diverse examples.
{
  "content": [{
    "type": "text",
    "text": "🤖 Agentic CoT Result:\n\n**Reasoning:** Step 1: Multiply 7 * 8 = 56. Step 2: Add 2 to get 58.\n\n**Answer:** 58\n\n📊 Token Usage: 42 in / 150 out / 4096 budget"
  }]
}
If parsing fails after all retries, the server returns the raw LLM output with a warning.
The parser is a priority-sorted pipeline of plugins. Five built-in parsers run in order:
| Priority | Name | What it does |
|---|---|---|
| 10 | direct-json | Parses whole output as JSON (strips ```json fences) |
| 20 | fenced-block | Extracts JSON from markdown code blocks |
| 30 | heuristic | Looks for <reasoning>/<result> XML tags or Reasoning:/Result: labels |
| 40 | brace-balanced | Finds first balanced {} in arbitrary text |
| 50 | truncated-recovery | Salvages reasoning from truncated JSON (hit token limit) |
Filter parsers via COT_PARSERS env var:
COT_PARSERS=direct-json,fenced-block node index.js
Write a custom parser:
import { CotParser, AgenticCotSchema } from "@slbdn/cotforce-mcp";
class YamlParser implements CotParser {
  name = "yaml";
  priority = 35; // runs after heuristic (30), before brace-balanced (40)

  parse(raw: string): { reasoning: string; result: unknown } | null {
    // Custom YAML parsing logic here
    return null; // return null if this output isn't YAML
  }
}
Then register it programmatically:
import { defaultParserPipeline, ParserPipeline } from "@slbdn/cotforce-mcp";
const pipeline = defaultParserPipeline();
pipeline.addParser(new YamlParser());
const result = pipeline.parse(rawText);
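Conceptually, the pipeline then tries each registered parser in ascending priority order and stops at the first non-null parse. A sketch of that dispatch loop, mirroring the behavior described above rather than the package's source:

import { CotParser } from "@slbdn/cotforce-mcp";

// Sketch of priority-ordered dispatch, not the package's source:
// try parsers from lowest to highest priority, stop at first success.
function runPipeline(parsers: CotParser[], raw: string) {
  const sorted = [...parsers].sort((a, b) => a.priority - b.priority);
  for (const parser of sorted) {
    const parsed = parser.parse(raw);
    if (parsed !== null) return { parser: parser.name, ...parsed };
  }
  return null; // every layer failed; the server falls back to raw output
}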
The server exposes one tool, solve_problem, which takes { prompt: string } — the problem to solve.

CotForce supports two modes for calling the LLM:

MCP Sampling (default with compatible clients):

- Calls the host via sampling/createMessage

Direct HTTP (for clients without sampling support):

- Calls /v1/chat/completions directly
- Chosen by MODE=auto when API_KEY is set and the client lacks sampling
- Forced with MODE=direct

Both modes use the same system prompt with few-shot examples and strict schema constraints.
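A sketch of that selection logic; the names are illustrative assumptions, not the package's internals:

type Mode = "auto" | "sampling" | "direct";

// Illustrative sketch of MODE resolution, not the package's internals.
function resolveMode(mode: Mode, apiKeySet: boolean, clientSupportsSampling: boolean): "sampling" | "direct" {
  if (mode !== "auto") return mode;
  // auto: use direct HTTP when an API key is configured and the
  // connected client cannot service sampling/createMessage.
  return apiKeySet && !clientSupportsSampling ? "direct" : "sampling";
}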
cotforce-mcp/
├── src/
│ ├── index.ts # MCP server, tool handlers, routing logic
│ └── lib/
│ ├── parser.ts # Parser pipeline: CotParser interface + 5 plugin parsers + Zod schemas
│ ├── tokens.ts # tiktoken integration + budget computation
│ ├── prompts.ts # Model-specific system prompts
│ ├── metrics.ts # In-memory request/performance counters
│ └── llm.ts # Direct HTTP LLM client (OpenAI-compatible)
├── tests/
│ ├── cache.test.ts # 10 unit tests for result caching
│ ├── parser.test.ts # 47 unit tests for parser layers
│ ├── tokens.test.ts # 23 unit tests for token budgeting
│ ├── schema.test.ts # 8 unit tests for result schema validation
│ ├── metrics.test.ts # 9 unit tests for metrics tracking
│ ├── prompts.test.ts # 12 unit tests for model-specific prompts
│ ├── llm.test.ts # 6 tests for direct mode detection
│ ├── retry.test.ts # 4 integration tests for retry loop
│ ├── progress.test.ts # 5 unit tests for progress notifications
│ └── server.test.ts # 9 integration tests via @slbdn/mcp-tester
├── index.js # Root launcher (delegates to dist/)
├── dist/ # Compiled TypeScript output
└── package.json
- Strict system prompts force the model to return reasoning and result. Model-specific variants tuned for Claude, GPT-4, Gemini, Grok.
- Pluggable parsing via the COT_PARSERS env var and the CotParser interface.
- Model fallback (FALLBACK_MODELS) when the primary model refuses.
- Token budgeting: estimateTokens() (lightweight heuristic) for budget math and countTokens() (tiktoken) for exact counts. Sets maxTokens dynamically (between 4096 and 8192) via the formula overhead + inputTokens × 4. Detects truncation via finish_reason: "length" and attempts JSON recovery before retrying.

git clone https://github.com/islobodan/cotforce-mcp
cd cotforce-mcp
npm install
npm run build # compile TypeScript to dist/
npm run dev # tsc --watch
npm run typecheck # type-check src/ and tests/
| Script | Purpose |
|---|---|
| npm run build | Compile TypeScript (src/ → dist/) |
| npm run dev | Watch-mode compilation |
| npm run typecheck | TypeScript type-checking for source and tests |
| npm test | Run full Jest test suite (133 tests) |
| npm run test:smoke | Quick smoke test via the mcp-tester CLI |
| npm run test:tools | List available tools via the mcp-tester CLI |
The test suite uses Jest with ts-jest (ESM) and @slbdn/mcp-tester for MCP server integration testing:
- Parsers (tests/parser.test.ts) — 47 unit tests covering all 5 parser plugins, edge cases, and AgenticCotSchema validation.
- Tokens (tests/tokens.test.ts) — 16 unit tests for tiktoken integration, budget computation, and REASONING_OVERHEAD tuning.
- Schema (tests/schema.test.ts) — 8 unit tests for user-supplied resultSchema validation.
- Metrics (tests/metrics.test.ts) — 9 unit tests for request counters, latency tracking, and token-usage averages.
- Prompts (tests/prompts.test.ts) — 10 unit tests for model-specific prompt selection.
- LLM (tests/llm.test.ts) — 3 unit tests for direct HTTP mode detection.
- Server (tests/server.test.ts) — 11 integration tests for tool discovery, argument validation, server lifecycle, and concurrent calls.

Custom Jest matchers are available via @slbdn/mcp-tester:
expect(tools).toHaveTool("solve_problem");
expect(tools).toHaveToolWithSchema("solve_problem");
expect(result).toReturnTextContaining("Reasoning:");
MIT © Slobodan Ivkovic
If you find CotForce-MCP useful, consider starring the repo and sharing your feedback!