DEV Community

Fenix


MCP Security Crisis: Two Open-Source Frameworks Solving the Agent Security Problem


9.93% of MCP servers have description-code inconsistencies. Leading models suffer ~100% attack success under tool description poisoning. Here are two frameworks that actually solve this.

The Problem

The Model Context Protocol (MCP) has become the standard interface for connecting LLMs to external tools. As of mid-2026, the ecosystem encompasses over 2,200 public MCP servers. But the security landscape is dire:

  • 9.93% of MCP servers have description-code inconsistencies — the tool description says one thing, the code does another (Shi et al., 2026)
  • ~100% attack success rate under tool description poisoning on leading models (Liu et al., 2026)
  • 53.7% security drop on 27B models under multi-agent attacks — larger models are MORE vulnerable, not less (McAllister et al., 2026)

The repo AIM-Intelligence/awesome-mcp-security documents these threats well. This post presents two open-source frameworks that solve them.


Solution 1: MCP Core Defense

A 7-phase security proxy for MCP agent systems.

MCP Core Defense sits between the agent and ALL MCP servers. Every tool call passes through 7 sequential verification phases:

| Phase | Name | Vulnerability It Solves |
|-------|------|-------------------------|
| 1 | Policy Engine | Permission Boundary Problems |
| 2 | Schema Validator | Tool Name Conflicts |
| 3 | DCI Checker | Description-Code Inconsistencies (the 9.93%) |
| 4 | TDP Detector | Tool Description Manipulation, Indirect Prompt Injection |
| 5 | Mutual TLS | OAuth Token Theft |
| 6 | Sandbox | Installer Risks, Supply Chain |
| 7 | SDK Adapter | Transparent integration (zero code changes) |
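To make the "sequential verification" idea concrete, here is a minimal sketch of a fail-closed phase chain. It is not the project's actual code: `ToolCall`, `run_phases`, the two example phases, and the allowlist are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ToolCall:
    server: str
    tool: str
    arguments: dict = field(default_factory=dict)


# A phase returns None to allow the call, or a string explaining the rejection.
Phase = Callable[[ToolCall], Optional[str]]


def policy_engine(call: ToolCall) -> Optional[str]:
    # Hypothetical allowlist; a real policy would be loaded from configuration.
    allowed = {("files", "read_file"), ("search", "query")}
    if (call.server, call.tool) not in allowed:
        return f"{call.server}/{call.tool} is not permitted by policy"
    return None


def schema_validator(call: ToolCall) -> Optional[str]:
    # Hypothetical check: argument names must be plain identifiers.
    bad = [key for key in call.arguments if not str(key).isidentifier()]
    return f"invalid argument names: {bad}" if bad else None


def run_phases(call: ToolCall, phases: list[tuple[str, Phase]]) -> tuple[bool, str]:
    """Run phases in order and stop at the first rejection (fail closed)."""
    for name, phase in phases:
        reason = phase(call)
        if reason is not None:
            return False, f"{name}: {reason}"
    return True, "all phases passed"


PHASES = [("policy", policy_engine), ("schema", schema_validator)]
```

Calling `run_phases(ToolCall("files", "read_file", {"path": "a.txt"}), PHASES)` lets the call through, while a tool outside the allowlist is rejected at the first phase and never reaches the later ones.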

The architecture is defense-in-depth: each phase catches what the previous one might miss. Phase 3 (DCI Checker) directly addresses the 9.93% inconsistency rate found by Shi et al. Phase 4 (TDP Detector) targets the tool description poisoning that Liu et al. showed has a ~100% success rate.
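This post does not describe how the DCI Checker works internally. As a rough illustration of what a description-code inconsistency check could look like, the sketch below compares the capabilities a tool declares with the modules its Python source imports. The `MODULE_CAPABILITIES` mapping and both function names are assumptions made for this example, and import scanning is only a coarse heuristic.

```python
import ast

# Hypothetical mapping from imported modules to the capability they imply.
MODULE_CAPABILITIES = {
    "subprocess": "process",
    "socket": "network",
    "urllib": "network",
    "requests": "network",
    "shutil": "filesystem",
}


def imported_modules(source: str) -> set[str]:
    """Collect top-level module names imported anywhere in the source."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules


def undeclared_capabilities(source: str, declared: set[str]) -> set[str]:
    """Return capabilities the code appears to use but the description omits."""
    used = {
        MODULE_CAPABILITIES[name]
        for name in imported_modules(source)
        if name in MODULE_CAPABILITIES
    }
    return used - declared
```

A tool described as network-only whose source also imports `subprocess` would come back with `{"process"}`, which a proxy could treat as grounds to block or flag the tool.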

Stats: 127+ tests. Python 3.10/3.11/3.12. AGPL-3.0. Production-ready.


Solution 2: Agent Fixer Stage

Lightweight output verification for multi-agent AI workflows.

Agent Fixer Stage is based on a key finding from McAllister et al. (2026): a lightweight "Fixer" stage at the end of a multi-agent workflow collapses attack success from 53.7% to 0.6%.

It sits between the last agent and the user, verifying output before delivery using 4 layers:

| Layer | Name | What It Catches |
|-------|------|-----------------|
| 0 | Normalization | Unicode attacks, zero-width chars, Cyrillic homoglyphs, leetspeak |
| 1 | Pattern Matching | 30+ weighted patterns across 3 passes (normal, leetspeak, cross-line) |
| 2 | Embeddings | TF-IDF + cosine similarity against 33 malicious examples |
| 3 | LLM Judge | Ambiguous cases only (<5% of real usage) |
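A Layer 0 style normalization step might look roughly like the sketch below. The character tables are deliberately tiny and purely illustrative, and the `normalize` function and its `fold_leetspeak` flag are assumptions for this example, not the project's API.

```python
import unicodedata

# Zero-width and invisible characters that can be used to split trigger words.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Small illustrative maps; a real deployment would need far larger tables.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0445": "x",  # Cyrillic ha
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
})
LEETSPEAK = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize(text: str, fold_leetspeak: bool = False) -> str:
    """Canonicalize text before any pattern matching runs on it."""
    text = unicodedata.normalize("NFKC", text)  # fold compatibility forms
    text = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    text = text.lower().translate(HOMOGLYPHS)
    if fold_leetspeak:
        # Kept optional because digits are legitimate in ordinary output.
        text = text.translate(LEETSPEAK)
    return text
```

Keeping leetspeak folding behind a flag fits the idea of a separate leetspeak pass in Layer 1: the normal pass sees digits untouched, and only the second pass pays the false-positive cost of rewriting them.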

Three actions: pass (output is clean), clean (remove malicious content, deliver), reject (block entirely, alert user).
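How a Layer 2 similarity score turns into one of these three actions is not spelled out here. The following is a standard-library sketch of TF-IDF plus cosine similarity against a list of known-malicious examples, with a threshold-based decision. The thresholds (`clean_at`, `reject_at`) and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(corpus: list[str]) -> dict[str, float]:
    """Smoothed inverse document frequency over the malicious examples."""
    n = len(corpus)
    df = Counter(term for doc in corpus for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(text: str, idf: dict[str, float]) -> dict[str, float]:
    # Terms outside the example vocabulary are ignored.
    counts = Counter(term for term in tokenize(text) if term in idf)
    return {term: tf * idf[term] for term, tf in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_similarity(text: str, examples: list[str], idf: dict[str, float]) -> float:
    """Highest cosine similarity between the text and any malicious example."""
    vec = tfidf(text, idf)
    return max((cosine(vec, tfidf(ex, idf)) for ex in examples), default=0.0)


def decide(score: float, clean_at: float = 0.5, reject_at: float = 0.8) -> str:
    # Hypothetical thresholds; real values would need tuning on labeled data.
    if score >= reject_at:
        return "reject"
    if score >= clean_at:
        return "clean"
    return "pass"
```

Because out-of-vocabulary terms are dropped, a benign sentence that shares a single word with an example can still score well above zero, which is one reason a sketch like this would hand mid-range scores to a later judge rather than act on them directly.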

Stats: 42+ tests. CI/CD-ready. Exit codes: 0=pass, 1=clean, 2=rejected.
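Note that exit code 1 (clean) is nonzero, so a CI step that simply runs the tool will usually fail on cleaned output as well as on rejected output. If that is not the desired behavior, a small wrapper can interpret the codes explicitly. The sketch below assumes only the three exit codes listed above; `should_block_pipeline` and `allow_cleaned` are hypothetical names.

```python
# Exit codes as documented above: 0=pass, 1=clean, 2=rejected.
EXIT_PASS, EXIT_CLEAN, EXIT_REJECTED = 0, 1, 2


def should_block_pipeline(exit_code: int, allow_cleaned: bool = True) -> bool:
    """Decide whether a CI job should stop, given the verifier's exit code."""
    if exit_code == EXIT_PASS:
        return False
    if exit_code == EXIT_CLEAN:
        # Cleaned output was delivered with malicious content removed.
        return not allow_cleaned
    # Rejected, or any unexpected code: fail closed.
    return True
```

Treating unknown exit codes as blocking keeps the gate conservative if the verifier crashes or is missing from the runner.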


How They Complement Each Other

|  | MCP Core Defense | Agent Fixer Stage |
|--|------------------|-------------------|
| Layer | Tool call (in transit) | Output (at rest) |
| When | Before execution | Before delivery |
| Threat | Server-side attacks | Agent-side corruption |
| Model | Proxy | Filter |

Use both for defense-in-depth: MCP Core Defense stops poisoned tools from executing, and Agent Fixer Stage checks the final output for whatever slips through.


Research Backing

  • Shi et al. (2026) — 9.93% description-code inconsistency rate across MCP servers
  • Liu et al. (2026) — ~100% poisoning success rate on leading models
  • McAllister et al. (2026) — Fixer stage collapses attack success from 53.7% to 0.6%
  • arXiv:2503.23278 — MCP threat landscape analysis
  • arXiv:2504.03767 — MCP safety audit



Both projects are open-source, tested, and production-ready. Feedback and contributions welcome.

License: AGPL-3.0-or-later
