Is Martin Loop free to use?

Yes, Martin Loop is free to use.

How do I install Martin Loop?

Martin Loop is a local plugin. Install it from the npm package @martinloop/mcp, add the generated configuration snippet to your AI app's MCP config file, and then restart your AI app.

Is Martin Loop safe to use?

Yes. Martin Loop passed MCP Marketplace's automated security scan with a score of 9.7/10 (low risk). Every server on MCP Marketplace is security-scanned before it's listed; see the full security report on this page for the findings and permissions.

What AI apps work with Martin Loop?

Martin Loop uses the Model Context Protocol (MCP) and works with any MCP-compatible AI app, including Claude, ChatGPT / Codex, Gemini, Copilot, Cursor, and more.

Martin Loop MCP Server

by Keesan12

Developer Tools · Low Risk · 9.7 · MCP Registry · Local

Free

Server data from the Official MCP Registry

Governed MCP server for AI coding agents with budgets, verifier gates, and inspectable runs.

Security Report

9.7 · Low Risk

Valid MCP server (2 strong, 3 medium validity signals). No known CVEs in dependencies. ⚠️ Package registry links to a different repository than scanned source. Imported from the Official MCP Registry. 1 finding(s) downgraded by scanner intelligence.

11 files analyzed · 1 issue found

Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.

Permissions Required

This plugin requests these system permissions. Most are normal for its category.

file_system

Check that this permission is expected for this type of plugin.

How to Install

Add this to your MCP configuration file:

{
  "mcpServers": {
    "io-github-keesan12-martin-loop": {
      "args": [
        "-y",
        "@martinloop/mcp"
      ],
      "command": "npx"
    }
  }
}
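
If your AI app already has an MCP configuration file, the entry above needs to be merged into the existing mcpServers map rather than pasted over it. The following is a minimal TypeScript sketch of that merge, assuming the file has already been parsed into a plain object. The function name addMartinLoopServer, the McpConfig type, and the "other-server" entry are illustrative and not part of Martin Loop.

```typescript
// Hypothetical types for a parsed MCP config file; only the fields used here.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Server key and launch command copied from the snippet above.
const SERVER_KEY = "io-github-keesan12-martin-loop";

// Returns a new config with the Martin Loop entry added; other servers are left untouched.
function addMartinLoopServer(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      [SERVER_KEY]: { command: "npx", args: ["-y", "@martinloop/mcp"] },
    },
  };
}

// "other-server" stands in for whatever entries your config already contains.
const existing: McpConfig = {
  mcpServers: { "other-server": { command: "node", args: ["server.js"] } },
};
const merged = addMartinLoopServer(existing);
console.log(JSON.stringify(merged, null, 2));
```

The sketch returns a new object instead of mutating the input, so the original config can still be written back unchanged if the merge is abandoned.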

Documentation

View on GitHub

From the project's GitHub README.

MartinLoop

MartinLoop gives AI coding agents budgets, stop conditions, rollback rules, and receipts.

Built from thousands of agent runs where the problem was not intelligence -- it was uncontrolled execution.

Get started: npx -y martin-loop@latest start
Try the demo: npx -y martin-loop@latest demo

MartinLoop is part of the NVIDIA Inception program.

Why MartinLoop

AI coding agents are useful, but unbounded retry loops are expensive.

A task that looked like a small fix can become dozens of attempts, a blown token budget, and a diff nobody trusts. MartinLoop gives every run an explicit contract: objective, verifier, budget, scope, receipts, and a clear stop condition.

Use it when AI coding work needs to stay bounded, inspectable, and safe to review before it becomes expensive or destructive.

Why Teams Adopt MartinLoop

It turns agent behavior into inspectable run receipts you can actually review.
It enforces hard stop conditions before runaway retries spend more money.
It adds rollback-aware rules so failed attempts do not silently leave unsafe changes behind.
It helps teams compare outcomes across agents under one governed flow.

Teams use MartinLoop when they need governed agent execution that can be reviewed and trusted.

2-Minute Install Path

npx -y martin-loop@latest start
npx -y martin-loop@latest demo
cd martin-loop-demo
npm install
npx -y martin-loop@latest run "Summarize the demo workspace and prove tests still pass" --verify "npm test" --budget-usd 2 --max-iterations 1

Quick Start

Try MartinLoop in a disposable demo workspace:

npx -y martin-loop@latest start
npx -y martin-loop@latest demo
npx -y martin-loop@latest --version
cd martin-loop-demo
npm install
npx -y martin-loop@latest run "Summarize the demo workspace and prove tests still pass" --verify "npm test" --budget-usd 2 --max-iterations 1
npx -y martin-loop@latest dossier --latest
npx -y martin-loop@latest share --latest

Optional global install:

npm install -g martin-loop
martin-loop --version

If this flow is useful, open an issue with feedback so we can keep improving the public experience.

The start command prints the first-run guided path. The run command automatically checks doctor, session-start, and preflight, then executes once the environment is ready. Use --proof only when you intentionally want an explicit no-spend lane.

Inspect-first flow:

npx -y martin-loop@latest doctor
npx -y martin-loop@latest session-start
npx -y martin-loop@latest preflight "Summarize the demo workspace and prove tests still pass" --verify "npm test"

share --latest writes run-receipt.json and run-receipt.md into the selected run directory under share/. Proof-card images are opt-in with --with-proof-card or --proof-card-format.

Release notes for the current root package: MartinLoop 0.4.1.

Visual Proof

MartinLoop turns an AI coding run into an inspectable execution record: budget used, verifier result, changed files, rollback evidence, and final receipt.

Ungoverned agents can retry until cost and scope drift. MartinLoop adds budget caps, verifier gates, and audit evidence so the run has a clear stop condition.

Proof Receipts

Proof receipts are local share bundles for governed AI coding runs. They show the task, spend, budget, verifier result, receipt integrity, and any evidence boundary that should not be rounded into confidence.

This real governed run spent $0.51 against a $3.00 budget. The verifier passed and the receipt integrity was signed, but the proof stayed at EVIDENCE_BOUNDARY because rollback evidence was not recorded.
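
The receipt described above can be read as a small decision: a passing verifier and a signed receipt are not enough when rollback evidence is missing. The sketch below illustrates that reading under stated assumptions. Only the EVIDENCE_BOUNDARY label and the dollar amounts come from the text; the field names and the two other labels are hypothetical placeholders, not MartinLoop's receipt schema.

```typescript
// Hypothetical receipt summary; field names are illustrative only.
interface ReceiptSummary {
  spentUsd: number;
  budgetUsd: number;
  verifierPassed: boolean;
  integritySigned: boolean;
  rollbackEvidenceRecorded: boolean;
}

// "EVIDENCE_BOUNDARY" appears in the text above; the other two labels are placeholders.
type ProofState =
  | "EVIDENCE_BOUNDARY"
  | "HYPOTHETICAL_FULL_PROOF"
  | "HYPOTHETICAL_NOT_PROVEN";

function proofState(r: ReceiptSummary): ProofState {
  if (!r.verifierPassed || !r.integritySigned) return "HYPOTHETICAL_NOT_PROVEN";
  // Passing and signed, but without rollback evidence: do not round up to full confidence.
  if (!r.rollbackEvidenceRecorded) return "EVIDENCE_BOUNDARY";
  return "HYPOTHETICAL_FULL_PROOF";
}

// Share of the budget consumed, as a percentage rounded to one decimal place.
function budgetUsedPercent(r: ReceiptSummary): number {
  return Math.round((r.spentUsd / r.budgetUsd) * 1000) / 10;
}

// Values taken from the run described above: $0.51 spent against a $3.00 budget.
const exampleReceipt: ReceiptSummary = {
  spentUsd: 0.51,
  budgetUsd: 3.0,
  verifierPassed: true,
  integritySigned: true,
  rollbackEvidenceRecorded: false,
};
console.log(proofState(exampleReceipt), budgetUsedPercent(exampleReceipt));
```

For the run above, this yields EVIDENCE_BOUNDARY with 17% of the budget used.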

Generate your own receipt after a governed run:

npx -y martin-loop@latest run "Summarize the demo workspace and prove tests still pass" --proof --verify "npm test"
npx -y martin-loop@latest runs verify --latest
npx -y martin-loop@latest share --latest

Example receipt files: Markdown and JSON.

Run This Audit Yourself

Use this lane from a clean temp directory to verify the public CLI flow exactly as shipped:

npx -y martin-loop@0.4.1 --version
npx -y martin-loop@0.4.1 start
npx -y martin-loop@0.4.1 demo
cd martin-loop-demo
npm install
npx -y martin-loop@0.4.1 run "Summarize the demo workspace and prove tests still pass" --verify "npm test" --budget-usd 2 --max-iterations 1 --json
npx -y martin-loop@0.4.1 dossier --latest --json
npx -y martin-loop@0.4.1 share --latest --json

To avoid a stale install, give npx an explicit version: pin the package (martin-loop@0.4.1) for reproducible runs, or use martin-loop@latest to track the newest release. Plain npx martin-loop can resolve a stale local cache on some machines.

Default share bundle outputs:

share/run-receipt.json
share/run-receipt.md

Optional proof-card outputs:

share/proof-card-r<revision>-<hash>.svg
share/proof-card-r<revision>-<hash>.png

See It In Action

The point is not that every governed run is always cheaper. The point is that every run becomes inspectable and enforceable: budget policy, verifier result, stop reason, and evidence are explicit.

For a deterministic public repro lane, use the benchmark workspace and compare governed execution to unbounded retry behavior:

npx martin-loop bench --suite under-3-challenge
npx martin-loop bench --suite ralphy-engineering-50

Ralph-Style Loops

A Ralph-style loop is the failure mode where an AI coding agent keeps trying without knowing when continuing is unsafe, uneconomical, or unlikely to succeed.

MartinLoop keeps the useful part of the loop, then adds brakes:

stop before budget overspend
classify unsafe or invalid actions before execution
write an audit record for every attempt
preserve rollback and verifier evidence for review
reduce runaway context growth with compact run summaries
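
The brakes listed above can be pictured as a bounded retry loop that checks the budget before each attempt, runs a verifier after it, and records every attempt. This is a minimal sketch of that idea, not MartinLoop's implementation. All type and function names are hypothetical, and the agent and verifier are passed in as plain functions to keep the example self-contained.

```typescript
// Hypothetical shapes for illustration only.
interface AttemptResult { costUsd: number; }
interface AuditRecord { attempt: number; costUsd: number; verifierPassed: boolean; }
type StopReason = "verifier_passed" | "budget_exhausted" | "max_iterations";
interface LoopOutcome { stopReason: StopReason; spentUsd: number; audit: AuditRecord[]; }

function governedLoop(
  attempt: (n: number) => AttemptResult,
  verify: () => boolean,
  limits: { maxUsd: number; maxIterations: number; estimatedCostPerAttemptUsd: number },
): LoopOutcome {
  const audit: AuditRecord[] = [];
  let spentUsd = 0;
  for (let n = 1; n <= limits.maxIterations; n++) {
    // Brake: stop before starting an attempt that is expected to overspend.
    if (spentUsd + limits.estimatedCostPerAttemptUsd > limits.maxUsd) {
      return { stopReason: "budget_exhausted", spentUsd, audit };
    }
    const result = attempt(n);
    spentUsd += result.costUsd;
    const verifierPassed = verify();
    // Every attempt leaves an audit record, whether or not it passed.
    audit.push({ attempt: n, costUsd: result.costUsd, verifierPassed });
    if (verifierPassed) return { stopReason: "verifier_passed", spentUsd, audit };
  }
  return { stopReason: "max_iterations", spentUsd, audit };
}

// A verifier that never passes: the loop stops on budget, not on exhaustion of retries.
const outcome = governedLoop(
  () => ({ costUsd: 0.75 }),
  () => false,
  { maxUsd: 2, maxIterations: 5, estimatedCostPerAttemptUsd: 0.75 },
);
console.log(outcome.stopReason, outcome.spentUsd, outcome.audit.length);
```

In the usage example the third attempt would push spend to $2.25, so the loop stops after two recorded attempts even though five iterations were allowed.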

Failure Taxonomy (13 Runtime Classes)

Public governed runs use one canonical taxonomy: the 13 runtime FailureClass values from @martin/contracts.

See the canonical table: Failure Taxonomy (13 Runtime Classes).

What It Does

Budget caps stop the next attempt before a configured USD, token, or iteration limit is exceeded.
Verifier gates require a real check, such as npm test, before a run can count as complete.
Policy checks block unsafe verifier commands, risky path changes, and secret-like task inputs before execution.
Failure classification uses canonical runtime classes for triage and reporting. See Failure Taxonomy (13 Runtime Classes).
Run receipts capture stop reason, verifier evidence, budget posture, integrity state, and the next safe action.
martin-loop share --latest turns the latest governed run into a local share bundle with a redacted JSON receipt and Markdown recap. Proof-card images are generated only when explicitly requested.
MCP integration gives hosts one write-capable execution entrypoint plus richer planning, inspection, and review helpers.

How It Works

Layer	Purpose
Task contract	Objective, verifier plan, repo root, allowed paths, denied paths, acceptance criteria, workspace, project, and budget.
Policy and budget	Defaults come from `martin.config.yaml`; CLI flags can override them. Budget preflight blocks attempts that would exceed policy.
Agent adapters	Claude CLI, Codex CLI, Gemini CLI, direct-provider, and verifier-only adapters normalize execution results.
Safety and verification	Scope checks, verifier command checks, prompt integrity, and grounding decide whether work can continue.
Persistence	JSONL run records, evidence summaries, and repo-backed artifacts make every run inspectable later. Each loop record is locally signed (HMAC, per-runs-root key) and `dossier`/`runs get`/`runs verify`/`challenge`/`badge` report an `integrity` verdict (`verified` / `tamper_detected` / `unsigned`) so post-hoc edits to a record are detectable, not just inspectable.
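
The integrity verdicts in the Persistence row can be illustrated with a keyed hash over each record. The sketch below uses Node's built-in crypto module. The record shape, the function names, and the choice of SHA-256 are assumptions made for illustration; the text above only states that records are signed with an HMAC and a per-runs-root key.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verdict names are taken from the table above.
type Integrity = "verified" | "tamper_detected" | "unsigned";

// Hypothetical record shape: a serialized payload plus an optional hex signature.
interface SignedRecord { payload: string; signature?: string; }

// SHA-256 is an assumption; the source does not name the hash function.
function sign(payload: string, key: string): string {
  return createHmac("sha256", key).update(payload).digest("hex");
}

function checkIntegrity(record: SignedRecord, key: string): Integrity {
  if (record.signature === undefined) return "unsigned";
  const expected = Buffer.from(sign(record.payload, key), "hex");
  const actual = Buffer.from(record.signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return "tamper_detected";
  }
  return "verified";
}

const key = "example-runs-root-key"; // placeholder; a real key would be generated and stored locally
const payload = JSON.stringify({ loopId: "demo", status: "completed" });
const record: SignedRecord = { payload, signature: sign(payload, key) };
const edited: SignedRecord = { ...record, payload: payload.replace("completed", "failed") };

console.log(checkIntegrity(record, key));
console.log(checkIntegrity(edited, key));
console.log(checkIntegrity({ payload }, key));
```

Because the signature depends on both the payload and the key, editing a stored record after the fact changes the expected signature and surfaces as tamper_detected rather than passing silently.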

Trust Boundaries

Cost and token outputs always include provenance (actual, estimated, or unavailable).
For Codex specifically, MartinLoop reports authoritative usage only when the host exposes it; otherwise MartinLoop labels usage as estimated and avoids presenting it as settled accounting.
Receipt integrity must be verified before a run is treated as trustworthy evidence for external review.
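
One way to keep provenance attached to every figure is to carry it in the type, so that a cost can never be displayed without its label. This is a sketch under that assumption. The three provenance values come from the list above; the UsageFigure shape and describeCost function are hypothetical.

```typescript
// Provenance values as listed above.
type Provenance = "actual" | "estimated" | "unavailable";

// Hypothetical shape: a cost figure that always travels with its provenance.
interface UsageFigure { provenance: Provenance; usd?: number; }

function describeCost(u: UsageFigure): string {
  if (u.provenance === "unavailable" || u.usd === undefined) return "cost unavailable";
  const amount = `$${u.usd.toFixed(2)}`;
  return u.provenance === "actual"
    ? `${amount} (actual)`
    : `${amount} (estimated, not settled accounting)`;
}

console.log(describeCost({ provenance: "actual", usd: 0.51 }));
console.log(describeCost({ provenance: "estimated", usd: 0.51 }));
console.log(describeCost({ provenance: "unavailable" }));
```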

CLI

martin-loop doctor
martin-loop demo
martin-loop session-start [--host <claude|codex|gemini|generic>]
martin-loop phase status|contract|session-start|preflight|run [--execute]
martin-loop preflight <objective> [options]
martin-loop run <objective> [options]
martin-loop bench --suite <suiteId>
martin-loop triage
martin-loop dossier (--latest | --loop-id <id> | --file <path>)
martin-loop runs list|get|attempt|verify ...
martin-loop mcp print-config --host <codex|claude|gemini|generic>
martin-loop mcp install --host <codex|claude|gemini|generic>
martin-loop challenge [--loop-id <id> | --file <path> | --latest]
martin-loop share (--loop-id <id> | --file <path> | --latest) [--out-dir <path>]
martin-loop badge [--format svg|json] [--runs-dir <path>]

Common options:

--budget <n>            Hard cost cap in USD
--budget-usd <n>        Alias for --budget
--soft-limit-usd <n>    Soft budget threshold in USD
--verify <cmd>          Verifier command after each attempt
--proof                 Explicitly opt into a no-spend proof adapter lane
--max-iterations <n>    Maximum number of attempts
--max-tokens <n>        Maximum token budget
--engine <name>         Adapter to use: claude, codex, gemini, or openai
--cwd <path>            Repo root for the run
--allow-path <glob>     Restrict writes to this path pattern; repeatable
--deny-path <glob>      Block this path pattern; repeatable
--runs-dir <path>       Override the local Martin runs root
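
To make the interplay of --allow-path and --deny-path concrete, here is a simplified sketch of a scope check. It is not MartinLoop's matcher: the glob handling supports only * (within one path segment) and ** (across segments), and the rule that deny patterns take precedence over allow patterns is an assumption.

```typescript
// Simplified glob support: "**" matches across "/", "*" matches within one segment.
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        pattern += ".*";
        i++; // consume the second "*"
      } else {
        pattern += "[^/]*";
      }
    } else {
      // Escape characters that are special in regular expressions.
      pattern += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${pattern}$`);
}

// Assumption: deny wins over allow, and an empty allow list permits anything not denied.
function isWriteAllowed(path: string, allow: string[], deny: string[]): boolean {
  if (deny.some((g) => globToRegExp(g).test(path))) return false;
  return allow.length === 0 || allow.some((g) => globToRegExp(g).test(path));
}

const allow = ["src/**"];
const deny = ["src/secrets/**"];
console.log(isWriteAllowed("src/auth/login.ts", allow, deny));
console.log(isWriteAllowed("src/secrets/key.pem", allow, deny));
console.log(isWriteAllowed("package.json", allow, deny));
```

With these patterns the first path is writable, the second is blocked by the deny pattern, and the third is blocked because it falls outside the allow list.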

Examples below use npx martin-loop so they work without a global install. If you install martin-loop globally, the martin alias works too.

Use martin-loop share --latest after dossier when you want a redacted bundle you can hand to another person without sending raw run-store files.

More detail: CLI reference and configuration reference.

Benchmarks

MartinLoop ships a public deterministic benchmark workspace in benchmarks/ plus the installed-package bench command.

From an installed package:

npx martin-loop bench --suite under-3-challenge
npx martin-loop bench --suite ralphy-engineering-50

From a clean public clone:

pnpm install --frozen-lockfile
pnpm bench:build
pnpm bench:eval
pnpm bench:report:ralphy

Equivalent workspace-filter commands:

pnpm --filter @martin/benchmarks build
pnpm --filter @martin/benchmarks test
pnpm --filter @martin/benchmarks eval
pnpm --filter @martin/benchmarks report:ralphy

The installed-package command reads the shipped public fixtures. The repo-clone workflow runs the public benchmark workspace directly.

MCP

Run the standalone MCP package directly:

npx -y @martinloop/mcp

Add it to common hosts:

codex mcp add martin-loop -- npx -y @martinloop/mcp
claude mcp add --transport stdio --scope user martin-loop -- npx -y @martinloop/mcp
claude mcp add --transport stdio --scope user martin-loop -- cmd /c npx -y @martinloop/mcp

Generate host config from the root CLI:

npx martin-loop mcp print-config --host codex --transport stdio --profile minimal
npx martin-loop mcp print-config --host claude --transport stdio --profile diagnostic
npx martin-loop mcp print-config --host gemini --transport stdio --profile full-local
npx martin-loop mcp print-config --host generic --transport stdio --profile github-review

The root martin-loop package and the standalone @martinloop/mcp package move on separate version lines. The current root package line here is 0.4.1; the current standalone MCP source line is 0.3.7, and the live npm baseline remains 0.3.6 until that standalone release is cut.

The public MCP release train labels are:

0.1.4 operator foundation
0.2.0 cockpit expansion
0.2.5 public MCP package line
0.2.7 usability and review release
0.3.0 host adoption and onboarding release
0.3.1 review and handoff release

The standalone MCP registry/server identifier is io.github.Keesan12/martin-loop.

More detail: MCP setup, MCP tool reference, and MCP compatibility.

SDK

npm install martin-loop

import { MartinLoop, createClaudeCliAdapter } from "martin-loop";

const loop = new MartinLoop({
  adapter: createClaudeCliAdapter({ workingDirectory: process.cwd() }),
  defaults: {
    workspaceId: "my-workspace",
    projectId: "my-project",
    budget: {
      maxUsd: 3,
      softLimitUsd: 2.25,
      maxIterations: 3,
      maxTokens: 20_000,
    },
  },
});

const result = await loop.run({
  task: {
    title: "Fix auth regression",
    objective: "Fix the failing auth regression tests",
    verificationPlan: ["pnpm test"],
    repoRoot: process.cwd(),
  },
});

console.log(result.decision.status);

The root SDK also exports createCodexCliAdapter, createGeminiCliAdapter, createDirectProviderAdapter, createOpenAiCompatibleAdapter, and createVerifierOnlyAdapter.
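
Before handing a budget to the SDK, it can help to sanity-check the values yourself. The sketch below mirrors the field names of the budget object in the example above. The validateBudget helper is hypothetical, and whether the SDK performs equivalent checks on its own is not stated here.

```typescript
// Field names mirror the `budget` object in the SDK example above.
interface Budget {
  maxUsd: number;
  softLimitUsd: number;
  maxIterations: number;
  maxTokens: number;
}

// Hypothetical helper: returns a list of problems, empty when the budget looks sane.
function validateBudget(b: Budget): string[] {
  const problems: string[] = [];
  if (!(b.maxUsd > 0)) problems.push("maxUsd must be greater than 0");
  if (b.softLimitUsd > b.maxUsd) problems.push("softLimitUsd must not exceed maxUsd");
  if (!Number.isInteger(b.maxIterations) || b.maxIterations < 1) {
    problems.push("maxIterations must be a positive integer");
  }
  if (!Number.isInteger(b.maxTokens) || b.maxTokens < 1) {
    problems.push("maxTokens must be a positive integer");
  }
  return problems;
}

// Same values as the SDK example: the soft limit sits at 75% of the hard cap.
const sdkExampleBudget: Budget = { maxUsd: 3, softLimitUsd: 2.25, maxIterations: 3, maxTokens: 20_000 };
console.log(validateBudget(sdkExampleBudget));
console.log(validateBudget({ ...sdkExampleBudget, softLimitUsd: 5 }));
```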

More detail: SDK reference and package map.

Development

Requirements:

Node.js 20+
pnpm 10.x

git clone https://github.com/Keesan12/martin-loop.git
cd martin-loop
pnpm install --frozen-lockfile
pnpm lint
pnpm test
pnpm build
pnpm public:copy-scan
pnpm public:git-surface
pnpm oss:validate
pnpm public:smoke
pnpm release:matrix:local

Standalone MCP validation:

pnpm --filter @martinloop/mcp lint
pnpm --filter @martinloop/mcp test
pnpm --filter @martinloop/mcp build
pnpm --filter @martinloop/mcp smoke:pack
pnpm --filter @martinloop/mcp smoke:published:pack
pnpm --filter @martinloop/mcp verify:release

Contributing

Issues, bug reports, workflow feedback, and focused pull requests are welcome. Public-facing docs should stay concise, user-centered, and accurate.

git checkout -b feat/your-feature
pnpm lint
pnpm test
git commit -m "feat: describe what you built"
git push -u origin feat/your-feature

License

Apache-2.0. See LICENSE.
