MCP Security Crisis: Two Open-Source Frameworks Solving the Agent Security Problem

#security #ai #opensource #agents

MCP Security Crisis: Two Open-Source Frameworks Solving the Agent Security Problem

9.93% of MCP servers have description-code inconsistencies. Leading models suffer ~100% attack success under tool description poisoning. Here are two frameworks that actually solve this.

The Problem

The Model Context Protocol (MCP) has become the standard interface for connecting LLMs to external tools. As of mid-2026, the ecosystem encompasses over 2,200 public MCP servers. But the security landscape is dire:

9.93% of MCP servers have description-code inconsistencies — the tool description says one thing, the code does another (Shi et al., 2026)
~100% attack success rate under tool description poisoning on leading models (Liu et al., 2026)
53.7% security drop on 27B models under multi-agent attacks — larger models are MORE vulnerable, not less (McAllister et al., 2026)

The repo AIM-Intelligence/awesome-mcp-security documents these threats well. This post presents two open-source frameworks that solve them.

Solution 1: MCP Core Defense

A 7-phase security proxy for MCP agent systems.

MCP Core Defense sits between the agent and ALL MCP servers. Every tool call passes through 7 sequential verification phases:

Phase	Name	Vulnerability It Solves
1	Policy Engine	Permission Boundary Problems
2	Schema Validator	Tool Name Conflicts
3	DCI Checker	Description-Code Inconsistencies (the 9.93%)
4	TDP Detector	Tool Description Manipulation, Indirect Prompt Injection
5	Mutual TLS	OAuth Token Theft
6	Sandbox	Installer Risks, Supply Chain
7	SDK Adapter	Transparent integration — zero code changes

The architecture is defense-in-depth: each phase catches what the previous one might miss. Phase 3 (DCI Checker) directly addresses the 9.93% inconsistency rate found by Shi et al. Phase 4 (TDP Detector) catches the tool description poisoning that Liu et al. showed has ~100% success rate.

Stats: 127+ tests. Python 3.10/3.11/3.12. AGPL-3.0. Production-ready.

Solution 2: Agent Fixer Stage

Lightweight output verification for multi-agent AI workflows.

Agent Fixer Stage is based on a key finding from McAllister et al. (2026): a lightweight "Fixer" stage at the end of a multi-agent workflow collapses attack success from 53.7% to 0.6%.

It sits between the last agent and the user, verifying output before delivery using 4 layers:

Layer	Name	What It Catches
0	Normalization	Unicode attacks, zero-width chars, Cyrillic homoglyphs, leetspeak
1	Pattern Matching	30+ weighted patterns across 3 passes (normal, leetspeak, cross-line)
2	Embeddings	TF-IDF + cosine similarity against 33 malicious examples
3	LLM Judge	Ambiguous cases only (<5% of real usage)

Three actions: pass (output is clean), clean (remove malicious content, deliver), reject (block entirely, alert user).

Stats: 42+ tests. CI/CD-ready. Exit codes: 0=pass, 1=clean, 2=rejected.

How They Complement Each Other

	MCP Core Defense	Agent Fixer Stage
Layer	Tool call (in transit)	Output (at rest)
When	Before execution	Before delivery
Threat	Server-side attacks	Agent-side corruption
Model	Proxy	Filter

Use both for defense-in-depth: MCP Core Defense stops poisoned tools from executing, Agent Fixer Stage catches anything that slips through.

Research Backing

Shi et al. (2026) — 9.93% description-code inconsistency rate across MCP servers
Liu et al. (2026) — ~100% poisoning success rate on leading models
McAllister et al. (2026) — Fixer stage collapses attack success from 53.7% to 0.6%
arXiv:2503.23278 — MCP threat landscape analysis
arXiv:2504.03767 — MCP safety audit

DEV Community

MCP Security Crisis: Two Open-Source Frameworks Solving the Agent Security Problem

MCP Security Crisis: Two Open-Source Frameworks Solving the Agent Security Problem

The Problem

Solution 1: MCP Core Defense

Solution 2: Agent Fixer Stage

How They Complement Each Other

Research Backing

Links

Top comments (0)