MCP Security Crisis: Two Open-Source Frameworks Solving the Agent Security Problem
9.93% of MCP servers have description-code inconsistencies. Leading models suffer ~100% attack success under tool description poisoning. Here are two frameworks that actually solve this.
The Problem
The Model Context Protocol (MCP) has become the standard interface for connecting LLMs to external tools. As of mid-2026, the ecosystem encompasses over 2,200 public MCP servers. But the security landscape is dire:
- 9.93% of MCP servers have description-code inconsistencies — the tool description says one thing, the code does another (Shi et al., 2026)
- ~100% attack success rate under tool description poisoning on leading models (Liu et al., 2026)
- 53.7% security drop on 27B models under multi-agent attacks — larger models are MORE vulnerable, not less (McAllister et al., 2026)
The repo AIM-Intelligence/awesome-mcp-security documents these threats well. This post presents two open-source frameworks that solve them.
Solution 1: MCP Core Defense
A 7-phase security proxy for MCP agent systems.
MCP Core Defense sits between the agent and ALL MCP servers. Every tool call passes through 7 sequential verification phases:
| Phase | Name | Vulnerability It Solves |
|---|---|---|
| 1 | Policy Engine | Permission Boundary Problems |
| 2 | Schema Validator | Tool Name Conflicts |
| 3 | DCI Checker | Description-Code Inconsistencies (the 9.93%) |
| 4 | TDP Detector | Tool Description Manipulation, Indirect Prompt Injection |
| 5 | Mutual TLS | OAuth Token Theft |
| 6 | Sandbox | Installer Risks, Supply Chain |
| 7 | SDK Adapter | Transparent integration — zero code changes |
The architecture is defense-in-depth: each phase catches what the previous one might miss. Phase 3 (DCI Checker) directly addresses the 9.93% inconsistency rate found by Shi et al. Phase 4 (TDP Detector) catches the tool description poisoning that Liu et al. showed has ~100% success rate.
Stats: 127+ tests. Python 3.10/3.11/3.12. AGPL-3.0. Production-ready.
Solution 2: Agent Fixer Stage
Lightweight output verification for multi-agent AI workflows.
Agent Fixer Stage is based on a key finding from McAllister et al. (2026): a lightweight "Fixer" stage at the end of a multi-agent workflow collapses attack success from 53.7% to 0.6%.
It sits between the last agent and the user, verifying output before delivery using 4 layers:
| Layer | Name | What It Catches |
|---|---|---|
| 0 | Normalization | Unicode attacks, zero-width chars, Cyrillic homoglyphs, leetspeak |
| 1 | Pattern Matching | 30+ weighted patterns across 3 passes (normal, leetspeak, cross-line) |
| 2 | Embeddings | TF-IDF + cosine similarity against 33 malicious examples |
| 3 | LLM Judge | Ambiguous cases only (<5% of real usage) |
Three actions: pass (output is clean), clean (remove malicious content, deliver), reject (block entirely, alert user).
Stats: 42+ tests. CI/CD-ready. Exit codes: 0=pass, 1=clean, 2=rejected.
How They Complement Each Other
| MCP Core Defense | Agent Fixer Stage | |
|---|---|---|
| Layer | Tool call (in transit) | Output (at rest) |
| When | Before execution | Before delivery |
| Threat | Server-side attacks | Agent-side corruption |
| Model | Proxy | Filter |
Use both for defense-in-depth: MCP Core Defense stops poisoned tools from executing, Agent Fixer Stage catches anything that slips through.
Research Backing
- Shi et al. (2026) — 9.93% description-code inconsistency rate across MCP servers
- Liu et al. (2026) — ~100% poisoning success rate on leading models
- McAllister et al. (2026) — Fixer stage collapses attack success from 53.7% to 0.6%
- arXiv:2503.23278 — MCP threat landscape analysis
- arXiv:2504.03767 — MCP safety audit
Links
- MCP Core Defense: https://github.com/amurlaniakea/mcp-core-defense
- Agent Fixer Stage: https://github.com/amurlaniakea/agent-fixer-stage
- Awesome MCP Security repo: https://github.com/AIM-Intelligence/awesome-mcp-security
Both projects are open-source, tested, and production-ready. Feedback and contributions welcome.
License: AGPL-3.0-or-later
Top comments (0)