Record and replay AI agent execution for debugging
This agent session recording and replay MCP server is well-designed for its intended purpose with no critical security issues. The code uses proper input validation via Zod, stores data locally in SQLite, and has no hardcoded credentials or malicious patterns. Minor code quality issues around error handling and input validation in edge cases prevent a higher score, but permissions are appropriate for a developer tool. Supply chain analysis found 4 known vulnerabilities in dependencies (0 critical, 3 high severity). Package verification found 1 issue.
4 files analyzed · 9 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-mdfifty50-boop-agent-replay": {
      "args": [
        "-y",
        "agent-replay-mcp"
      ],
      "command": "npx"
    }
  }
}
From the project's GitHub README.
MCP server for agent session recording and replay — debug non-deterministic agent behavior with session comparison and divergence detection.
Record every action an agent takes, replay sessions step by step, diff two runs to find behavioral regressions, and pinpoint exactly where an agent diverged from expected output.
Run the server directly with npx:
npx agent-replay-mcp
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "agent-replay": {
      "command": "npx",
      "args": ["agent-replay-mcp"]
    }
  }
}
To run from source instead:
git clone https://github.com/mdfifty50-boop/agent-replay-mcp.git
cd agent-replay-mcp
npm install
node src/index.js
record_session: Start recording all actions for an agent session.
| Param | Type | Default | Description |
|---|---|---|---|
| agent_id | string | required | Unique agent identifier |
| metadata | object | {} | Optional metadata (task, model, environment) |
Returns a session_id for use with other tools.
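For readers calling the tool without a client library, the request can be sketched as an MCP tools/call message over JSON-RPC 2.0. This is a minimal sketch: the build_tool_call helper, the request id, and the sample agent_id and metadata values are illustrative assumptions; only the tool name and parameter names come from the table above.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    1,
    "record_session",
    {
        "agent_id": "support-bot-1",  # hypothetical agent identifier
        "metadata": {"task": "triage", "model": "example-model"},  # optional
    },
)
print(json.dumps(request, indent=2))
```

The same helper would work for the other tools by swapping the tool name and arguments.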
stop_recording: Stop recording and return a session summary.
| Param | Type | Description |
|---|---|---|
| session_id | string | Session ID from record_session |
Returns: action count, total duration, action type breakdown.
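A summary of this shape could be computed roughly as follows. This is a sketch, not the server's code: the result field names are assumptions, and the total duration is assumed here to be the sum of per-action durations, whereas the server may measure wall-clock time instead.

```python
from collections import Counter


def summarize(actions):
    """Compute action count, total duration, and per-type counts."""
    return {
        "action_count": len(actions),
        # Assumption: total duration is the sum of the logged durations.
        "total_duration_ms": sum(a.get("duration_ms", 0) for a in actions),
        "action_types": dict(Counter(a["action_type"] for a in actions)),
    }


actions = [
    {"action_type": "tool_call", "duration_ms": 120},
    {"action_type": "llm_response", "duration_ms": 850},
    {"action_type": "tool_call", "duration_ms": 95},
]
summary = summarize(actions)
print(summary)
```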
log_action: Log a single action during a recording session.
| Param | Type | Default | Description |
|---|---|---|---|
| session_id | string | required | Active session ID |
| action_type | string | required | Type (tool_call, llm_response, decision, error) |
| input | any | required | Input to the action |
| output | any | required | Output from the action |
| reasoning | string | "" | Agent reasoning for this step |
| duration_ms | number | 0 | Action duration in milliseconds |
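One way to fill these parameters is to wrap each agent step in a small timing helper. The sketch below is an assumption about how a caller might do this: timed_action and add are hypothetical names, and the helper only builds the log_action arguments rather than sending them.

```python
import time


def timed_action(session_id, action_type, func, *args, reasoning=""):
    """Run func(*args), time it, and return (result, log_action arguments)."""
    start = time.perf_counter()
    result = func(*args)
    duration_ms = round((time.perf_counter() - start) * 1000)
    log_args = {
        "session_id": session_id,
        "action_type": action_type,
        "input": list(args),
        "output": result,
        "reasoning": reasoning,
        "duration_ms": duration_ms,
    }
    return result, log_args


def add(a, b):
    """Stand-in for a real tool the agent would call."""
    return a + b


result, log_args = timed_action(
    "session-123", "tool_call", add, 2, 3, reasoning="Sum the two inputs"
)
print(result, log_args)
```

Keeping the timing in one wrapper means every logged step carries a duration measured the same way.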
replay_session: Replay a recorded session step by step with full action detail.
| Param | Type | Description |
|---|---|---|
| session_id | string | Session ID to replay |
Returns: complete action sequence with timing, reasoning, inputs, and outputs.
compare_sessions: Produce a behavioral diff between two sessions. Actions are aligned by step index and the differences are highlighted.
| Param | Type | Description |
|---|---|---|
| session_id_1 | string | First session |
| session_id_2 | string | Second session |
Returns: similarity ratio, identical/divergent step counts, first divergence step, and per-step diffs.
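The index-aligned diff can be illustrated with a short sketch. It is not the server's implementation: the result field names, the zero-based step numbering, and the definition of similarity as identical steps divided by the longer session's length are all assumptions made for illustration.

```python
def compare_sessions(actions_1, actions_2):
    """Diff two action lists aligned by step index."""
    total = max(len(actions_1), len(actions_2))
    diffs = []
    for step in range(total):
        # A step missing from the shorter session counts as divergent.
        a = actions_1[step] if step < len(actions_1) else None
        b = actions_2[step] if step < len(actions_2) else None
        if a != b:
            diffs.append({"step": step, "session_1": a, "session_2": b})
    identical = total - len(diffs)
    return {
        "similarity": identical / total if total else 1.0,
        "identical_steps": identical,
        "divergent_steps": len(diffs),
        "first_divergence_step": diffs[0]["step"] if diffs else None,
        "diffs": diffs,
    }


run_a = [
    {"action_type": "tool_call", "output": "search results"},
    {"action_type": "decision", "output": "use result 1"},
    {"action_type": "llm_response", "output": "Answer A"},
]
run_b = [
    {"action_type": "tool_call", "output": "search results"},
    {"action_type": "decision", "output": "use result 2"},
    {"action_type": "llm_response", "output": "Answer B"},
    {"action_type": "error", "output": "timeout"},
]
report = compare_sessions(run_a, run_b)
print(report["similarity"], report["first_divergence_step"])
```

Aligning by index is simple but sensitive to inserted steps: one extra action early in a run shifts every later comparison.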
find_divergence_point: Find where an agent first deviated from expected output.
| Param | Type | Description |
|---|---|---|
| session_id | string | Session to analyze |
| expected_output | any | Expected final output, or array of per-step expected outputs |
If expected_output is an array, it is compared step by step against each action's output. If it is a single value, the tool finds the last step whose output matches and flags the next step as the divergence point.
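The two modes described above can be sketched as follows. This is an illustrative reading of that description, not the server's code: the zero-based step index, the choice to flag step 0 when nothing matches, and returning None when no divergence is found are assumptions.

```python
def find_divergence_point(outputs, expected_output):
    """Return the index of the first diverging step, or None if none is found."""
    if isinstance(expected_output, list):
        # Per-step mode: compare each recorded output with its expectation.
        for step, expected in enumerate(expected_output):
            if step >= len(outputs) or outputs[step] != expected:
                return step
        return None
    # Single-value mode: locate the last step whose output matches.
    last_match = None
    for step, output in enumerate(outputs):
        if output == expected_output:
            last_match = step
    if last_match is None:
        # Assumption: nothing matched, so flag the first step if there is one.
        return 0 if outputs else None
    next_step = last_match + 1
    return next_step if next_step < len(outputs) else None


outputs = ["plan", "fetch", "draft", "final"]
per_step = find_divergence_point(outputs, ["plan", "fetch", "summary"])
single = find_divergence_point(outputs, "fetch")
no_divergence = find_divergence_point(outputs, "final")
print(per_step, single, no_divergence)
```

Note that this sketch cannot express a single expected value that is itself a list, since any list is treated as per-step expectations.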
export_session: Export a session for sharing and offline analysis.
| Param | Type | Default | Description |
|---|---|---|---|
| session_id | string | required | Session to export |
| format | string | "json" | "json" or "markdown" |
Markdown format produces a readable transcript with step headers, reasoning, and code blocks.
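A transcript of that kind might be rendered along these lines. The exact layout the server emits is not documented here, so the header text, labels, and JSON code blocks below are assumptions chosen only to show the idea.

```python
import json

FENCE = "`" * 3  # built at runtime so this example nests cleanly in Markdown


def export_markdown(session_id, actions):
    """Render a session as a Markdown transcript, one section per step."""
    lines = [f"# Session {session_id}", ""]
    for step, action in enumerate(actions, start=1):
        lines.append(f"## Step {step}: {action['action_type']}")
        if action.get("reasoning"):
            lines.append(f"Reasoning: {action['reasoning']}")
        for label in ("input", "output"):
            lines.append(f"{label.capitalize()}:")
            lines.append(FENCE + "json")
            lines.append(json.dumps(action.get(label), indent=2))
            lines.append(FENCE)
        lines.append("")
    return "\n".join(lines)


transcript = export_markdown(
    "session-123",
    [
        {
            "action_type": "tool_call",
            "input": {"q": "weather"},
            "output": {"temp_c": 21},
            "reasoning": "Need current conditions",
        }
    ],
)
print(transcript)
```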
| URI | Description |
|---|---|
| agent-replay://sessions | All recorded sessions with status and action counts |
1. record_session: start recording at agent launch
2. For each agent action:
   - log_action: capture input, output, reasoning, timing
3. stop_recording: finalize the session
4. Debug:
   - replay_session: review what happened step by step
   - compare_sessions: diff today's run vs yesterday's
   - find_divergence_point: pinpoint where it went wrong
5. Share:
   - export_session: JSON for tooling, markdown for humans
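The first three steps of this workflow pair naturally with a context manager, so a recording is always finalized even if the agent raises. The sketch below assumes a hypothetical call_tool(name, arguments) function standing in for whatever MCP client is in use, and assumes the record_session result exposes a session_id key; fake_call_tool is a stub used only to make the example runnable.

```python
from contextlib import contextmanager


@contextmanager
def recorded_session(call_tool, agent_id, metadata=None):
    """Start a recording on entry and always stop it on exit."""
    started = call_tool(
        "record_session", {"agent_id": agent_id, "metadata": metadata or {}}
    )
    session_id = started["session_id"]  # assumed response shape
    try:
        yield session_id
    finally:
        call_tool("stop_recording", {"session_id": session_id})


calls = []  # records every tool invocation made through the stub


def fake_call_tool(name, arguments):
    """Stand-in for a real MCP client call; returns canned responses."""
    calls.append(name)
    return {"session_id": "session-123"} if name == "record_session" else {}


with recorded_session(fake_call_tool, "support-bot-1") as session_id:
    fake_call_tool(
        "log_action",
        {
            "session_id": session_id,
            "action_type": "decision",
            "input": "user asks for refund",
            "output": "escalate",
        },
    )

print(calls)
```

With a real client, the debugging and export tools would then be called with the same session_id after the block exits.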
Run the test suite:
npm test
License: MIT