# Why Most "Production-Ready" MCP Servers Actually Aren't
Math Enemy · 2026-06-19 · via DEV Community

Disclosure: I'm the author of SUPER-MCP, an open-source MCP server. The criteria in this article are derived from a threat model, not from SUPER-MCP's feature set. Apply this checklist to SUPER-MCP itself and you'll find it passes most items but not all: plugin isolation is still child-process only (category 2 in the taxonomy later in this article, tracked as a release-blocking open item), and task record encryption is a documented gap.


The MCP ecosystem has a labeling problem.

Search GitHub today and you'll find dozens of MCP server boilerplates proudly stamped "production-ready." Some have clean READMEs and real star counts. A few ship with Docker configs and JWT support. The official Model Context Protocol reference servers, maintained by Anthropic's own steering group, make no such claim. Their repository README states that these servers are intended as educational examples, not production-ready solutions, and that developers should evaluate their own security requirements. The community repositories claiming otherwise didn't get that memo.

"Production-ready" has inflated to near-meaninglessness. In the MCP context specifically, the gap between what that label implies and what it actually delivers can expose your users to real harm. This article focuses on the security dimension of that gap — operational, performance, and reliability concerns are equally important, but they're separate topics.

Here's how to evaluate an MCP server's true production posture in under ten minutes. Most signals are visible in documentation, startup behavior, and README structure; a few require brief source review, as noted.


## Why This Matters More for MCP Than for Typical APIs

The threat model starts with scale.

MCP is not a niche experiment. As of early 2026, the protocol has crossed 97 million monthly SDK downloads and earned over 81,000 GitHub stars, with every major AI vendor — Anthropic, OpenAI, Google, Microsoft, and AWS — shipping support. In December 2025, Anthropic donated MCP to the Linux Foundation's Agentic AI Foundation, cementing it as a vendor-neutral standard under formal governance. The forthcoming July 2026 release candidate — the largest protocol revision since launch — adds first-class Tasks support, a stateless HTTP core, and authorization hardened closer to OAuth 2.0 and OpenID Connect. This is not a protocol on the way out. It has broader cross-vendor adoption than any alternative in the ecosystem.

That context matters for what follows. The security gaps described in this article are not arguments against MCP. They are arguments for taking production hardening seriously in proportion to MCP's scale and trajectory.

A typical API server failing in production means downtime or errors. An MCP server failing in production means something different: an AI agent with tool access behaving in ways neither you nor your users intended.

The threat model is genuinely different — and 2025–2026 research has now confirmed it at scale.

In March 2026, researchers at NYIT published systematic threat modeling of MCP implementations (arXiv:2603.22489), applying STRIDE and DREAD frameworks across seven major MCP clients. Their finding: tool poisoning — embedding malicious instructions inside tool metadata — is the most prevalent and impactful client-side vulnerability, with most clients failing due to insufficient static validation.

A separate benchmark sharpens that picture in an uncomfortable direction. The MCPTox study (arXiv:2508.14925) tested 45 live MCP servers against real LLMs and found that more capable models can be more susceptible to tool-level attacks, not less. Larger models and reasoning-enabled configurations showed higher attack success rates across multiple tested conditions — superior instruction-following makes models more compliant with malicious metadata, not more resistant to it. The implication for enterprise deployments: hardening the server matters more than assuming the model will catch what slips through.

Then in April 2026, OX Security disclosed an architectural flaw baked into Anthropic's official MCP SDKs across Python, TypeScript, Java, and Rust that enables remote code execution by passing user-controlled configuration values directly to shell execution without sanitization. Ten high- and critical-severity CVEs. Over 150 million package downloads in scope. Four distinct exploit paths. Affected environments included VS Code, Cursor, Windsurf, Claude Code, and Gemini-CLI. Anthropic formally declined to patch the root cause at the protocol level — doing so would require changing the SDK's design philosophy of being unopinionated about execution. Every downstream framework that trusted the reference implementation inherited the flaw.

The same research surfaced a separate finding at the distribution layer: nine of eleven MCP marketplaces surveyed accepted a proof-of-concept malicious package without any validation gate. This is a governance gap distinct from the SDK architectural flaw itself — not a code vulnerability, but an ecosystem-wide absence of validation tooling at the point where packages enter circulation.

Those findings are no longer confined to academic and vendor research. In May 2026, the NSA's Artificial Intelligence Security Center published a Cybersecurity Information Sheet on MCP security (CSI U/OO/6030316-26) — the first formal U.S. government guidance addressing MCP deployment risks. It explicitly recommends sandboxing tool execution, treating all tool outputs as untrusted and filtering them before passing downstream, and validating parameters against defined schemas. OWASP has published an MCP Top 10 project cataloging the highest-risk vulnerability categories in MCP deployments — including tool poisoning, insufficient input validation, and inadequate output filtering.

Some argue the protocol-level responsibility sits appropriately with implementers — the same way SQL injection defense is a developer responsibility, not a database engine responsibility. The counter-argument, and the one this article adopts, is that protocol-level defaults matter at supply chain scale. When the reference implementation carries an architectural flaw and 150 million downloads inherit it silently, the argument that implementers should have caught it stops being satisfying.

The servers marketing themselves as "production-ready" were largely silent on all of these vectors. They added JWT support and called it a day.

Production readiness for MCP isn't about feature count. It's about what happens when things go wrong.


## The Five Signals That Actually Matter

Given that threat model — tool poisoning exploiting AI compliance, architectural SDK vulnerabilities propagating silently through supply chains, a government cybersecurity agency now publishing MCP-specific guidance — five signals emerge that actually distinguish production-ready implementations from those that merely claim to be. Each is, at root, a different expression of the same underlying question: does this server make its limitations visible, or does it hide them?

### 1. Does it fail closed, or does it warn and continue?

This is the single most informative signal, and you can find it in seconds by skimming environment variable documentation or startup behavior.

A server that fails closed refuses to start when its own security invariants aren't met. A server that warns and continues implicitly says: "this security property is optional."

Concrete things to look for:

  • Does NODE_ENV=production with a dev-mode auth configuration cause a hard startup failure, or a warning?
  • If rate limiting isn't configured in production, does the server refuse to start, or does it just run without it?
  • If the plugin sandbox isn't real, does it block untrusted plugins or silently run them with a log line?
  • Does setting a security feature to a known weak value (a default password, a placeholder secret) trigger rejection rather than acceptance?

The difference matters enormously at 2 AM when someone misconfigures a deployment. Fail-closed servers are self-defending. Warn-and-continue servers expect perfect operators.
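As a rough illustration of the fail-closed behavior described above, here is a minimal Python sketch of a startup gate. `NODE_ENV` comes from the discussion above; the other variable names (`AUTH_MODE`, `JWT_SECRET`, `RATE_LIMIT_PER_MINUTE`) and the list of weak secrets are assumptions made up for this example, not settings of any particular server.

```python
from typing import List, Mapping


class StartupConfigError(RuntimeError):
    """Raised when a production security invariant is not met."""


# Hypothetical placeholder values that should never be accepted in production.
WEAK_SECRETS = {"", "changeme", "secret", "password", "dev-secret"}


def collect_violations(env: Mapping[str, str]) -> List[str]:
    """Return every unmet production invariant (always empty outside production)."""
    if env.get("NODE_ENV") != "production":
        return []
    violations = []
    # AUTH_MODE, JWT_SECRET and RATE_LIMIT_PER_MINUTE are assumed names.
    if env.get("AUTH_MODE", "dev") == "dev":
        violations.append("AUTH_MODE must not be 'dev' in production")
    if env.get("JWT_SECRET", "").strip().lower() in WEAK_SECRETS:
        violations.append("JWT_SECRET is missing or a known placeholder")
    rate = env.get("RATE_LIMIT_PER_MINUTE", "")
    if not rate.isdecimal() or int(rate) == 0:
        violations.append("RATE_LIMIT_PER_MINUTE must be a positive integer")
    return violations


def assert_startup_invariants(env: Mapping[str, str]) -> None:
    """Fail closed: raise instead of logging a warning and continuing."""
    violations = collect_violations(env)
    if violations:
        raise StartupConfigError("refusing to start: " + "; ".join(violations))
```

A server built this way would call something like `assert_startup_invariants(os.environ)` before binding to a port, so a misconfigured deployment never begins serving requests. The warn-and-continue variant would log the same list and proceed anyway.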

### 2. Does it know what it doesn't do?

This sounds counterintuitive — shouldn't good software just work? But in a domain with evolving security standards, the most trustworthy signal is explicit honesty about scope.

Look for a section explicitly titled something like "non-goals," "known limitations," or "explicit non-claims." Not a generic disclaimer — a specific enumeration. Does it claim to have a plugin sandbox? Is that claim qualified? Does it claim crypto-erasure capability? Does it specify whether that's real KMS-backed per-tenant key destruction or just encryption-at-rest with a single global key?

If a server claims "enterprise-grade security" without specifying what that means and what it explicitly excludes, treat that as a red flag rather than a selling point. In the post-April-2026 disclosure landscape, the distinction between "we encrypt your data" and "we implement per-tenant cryptographic erasure with KMS-backed key destruction" is not subtle — it's the difference between a security posture and a security story.

### 3. What does its auth story actually cover?

JWT support is the floor, not the standard. The question is what the JWT enforcement actually does.

Check whether the server enforces:

  • Resource indicators (RFC 8707): Does the server verify that a token was issued for this specific resource, not just any resource from the same IdP? RFC 8707 is an OAuth extension rather than a universal default, but its absence in a production MCP deployment is a meaningful gap: a token stolen from one service can be replayed against another. Treat it as required for production, and flag its absence accordingly.
  • Issuer and audience as hard production requirements: Not configured-if-you-want-to, but enforced as startup gates when NODE_ENV=production.
  • Tenant isolation in request context: Does every tool call carry tenant/user/client/scope as first-class context, or does identity stop at the authentication middleware and get discarded before execution?
  • Scope enforcement per tool: Can individual tools declare required scopes enforced before handler execution — not just at the route level?

A server with "JWT support" that skips resource indicators, loses tenant context through the execution pipeline, and does per-route but not per-tool scope checks has authentication theater — it looks secure but doesn't actually constrain what an authenticated caller can do once they're in.
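The checks above can be sketched in a few lines of Python. This is a simplified illustration, not a complete OAuth implementation: it assumes the token's signature and expiry were already verified by a JWT library, that the authorization server reflects the RFC 8707 resource indicator in the `aud` claim, and that the tenant lives in a custom `tenant_id` claim. `RequestContext`, `Tool`, `build_context`, and `call_tool` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet, Mapping


class AuthorizationError(PermissionError):
    """Raised when a verified token still must not be allowed to proceed."""


@dataclass(frozen=True)
class RequestContext:
    """Identity that travels with every tool call, not just the auth middleware."""
    tenant_id: str
    user_id: str
    client_id: str
    scopes: FrozenSet[str]


def build_context(claims: Mapping[str, Any], expected_issuer: str,
                  expected_resource: str) -> RequestContext:
    """Turn already-verified JWT claims into a first-class request context."""
    if claims.get("iss") != expected_issuer:
        raise AuthorizationError("unexpected issuer")
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    # Reject tokens minted for a different resource at the same IdP.
    if expected_resource not in audiences:
        raise AuthorizationError("token was not issued for this resource")
    for key in ("tenant_id", "sub", "client_id"):
        if not claims.get(key):
            raise AuthorizationError("missing claim: " + key)
    return RequestContext(
        tenant_id=claims["tenant_id"],
        user_id=claims["sub"],
        client_id=claims["client_id"],
        scopes=frozenset(str(claims.get("scope", "")).split()),
    )


@dataclass(frozen=True)
class Tool:
    name: str
    required_scopes: FrozenSet[str]
    handler: Callable[[RequestContext, Mapping[str, Any]], Any]


def call_tool(tool: Tool, ctx: RequestContext, args: Mapping[str, Any]) -> Any:
    """Enforce per-tool scopes before the handler runs, then pass ctx through."""
    missing = tool.required_scopes - ctx.scopes
    if missing:
        raise AuthorizationError(
            tool.name + " requires scopes: " + ", ".join(sorted(missing)))
    return tool.handler(ctx, args)
```

The design point is that the handler receives `ctx` as an argument, so tenant and scope information cannot be silently dropped between the authentication middleware and execution.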

### 4. What does the output pipeline look like?

This is the most underrated production concern in the MCP ecosystem right now, and the one most directly implicated by the tool poisoning and MCPTox research. The MCPTox finding has a specific implication here: if more capable models are more compliant with malicious instructions embedded in tool metadata — not less — then the server-side output pipeline becomes the most reliable defense layer you control, operating independently of model behavior. The model won't catch what you didn't strip before it arrived.

When an AI agent calls your MCP tools, the results are handed back to an LLM. Those results may contain credentials, PII, or injected instructions. If your output pipeline doesn't intercept and sanitize them before they reach the model, you have an undefended attack surface, one that the research literature has confirmed is actively exploited.

Ask whether the server scans tool outputs for: private key blocks, API key patterns, payment card numbers, SSNs, prompt injection markers, and sensitive field names in structured content. Does it do this recursively for nested structured outputs? Does it have depth and cycle guards to prevent pathological input from causing the firewall itself to fail?
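To make the questions above concrete, here is a minimal Python sketch of a recursive output scrubber with depth and cycle guards. The patterns are a small, illustrative subset (payment card detection, for instance, would also need a checksum test), and the names `scrub`, `PATTERNS`, `SENSITIVE_KEYS`, and `MAX_DEPTH` are assumptions for this example rather than any server's actual API.

```python
import re
from typing import Any, Optional, Set

MAX_DEPTH = 32          # assumed limit; deeper content is redacted, not walked
REDACTED = "[REDACTED]"

# Illustrative patterns only; a real deployment would maintain a larger set.
PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
               re.DOTALL),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),        # AWS access key ID shape
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # US SSN shape
    re.compile(r"(?i)ignore (all )?(previous|prior) instructions"),
]
SENSITIVE_KEYS = {"password", "secret", "api_key", "token"}


def scrub(value: Any, depth: int = 0, seen: Optional[Set[int]] = None) -> Any:
    """Return a copy of value with secrets, PII shapes and injection markers redacted."""
    seen = set() if seen is None else seen
    if depth > MAX_DEPTH:
        return REDACTED                          # depth guard: fail closed
    if isinstance(value, str):
        for pattern in PATTERNS:
            value = pattern.sub(REDACTED, value)
        return value
    if isinstance(value, (dict, list)):
        if id(value) in seen:
            return REDACTED                      # cycle guard
        seen.add(id(value))
        try:
            if isinstance(value, dict):
                return {
                    key: REDACTED if str(key).lower() in SENSITIVE_KEYS
                    else scrub(item, depth + 1, seen)
                    for key, item in value.items()
                }
            return [scrub(item, depth + 1, seen) for item in value]
        finally:
            seen.discard(id(value))              # track the current path only
    return value                                 # numbers, booleans, None
```

Tracking container identities only along the current path means a value that is shared in two places is scanned twice, while a structure that contains itself is cut off instead of recursing until the scanner crashes.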

One attack surface that often goes unexamined alongside tool output: error messages. A server that returns raw database connection strings, filesystem paths, or internal service names in error responses is leaking infrastructure topology through a channel that tool output scanning doesn't cover. Ask whether error text is sanitized before it reaches the MCP client — separately from the tool output path.
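One way to handle the error path, sketched in Python under stated assumptions: full details go to the server log under a correlation ID, and the client sees only a generic message unless the exception type is explicitly marked as safe to show. `ToolInputError` and `to_client_error` are hypothetical names, and the returned dictionary shape is illustrative, not a format defined by MCP.

```python
import logging
import uuid
from typing import Dict

logger = logging.getLogger("mcp.errors")


class ToolInputError(Exception):
    """Hypothetical error type whose message is written to be client-safe."""


def to_client_error(exc: Exception) -> Dict[str, str]:
    """Map an exception to a client-safe payload; details stay in server logs."""
    error_id = uuid.uuid4().hex
    # The full exception, including any connection strings or paths, is logged
    # server-side only and can be looked up later by error_id.
    logger.error("tool failure %s", error_id, exc_info=exc)
    if isinstance(exc, ToolInputError):
        message = str(exc)
    else:
        message = "Internal error"
    return {"error_id": error_id, "message": message}
```

An allowlist of safe exception types is a deliberate choice here: pattern-scrubbing arbitrary error text tends to miss formats nobody anticipated, whereas a generic message cannot leak topology.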

Absence of an output firewall doesn't make a server insecure by itself. But its presence — and the specificity of what it covers — tells you whether the author has thought about the AI-specific threat model or just the HTTP-API threat model.

### 5. Can you query its debt at runtime?

Most servers make you read documentation to understand their limitations. A more mature approach exposes those limitations through the protocol itself.

If an MCP server ships a tool that returns a structured, versioned report of its own known security gaps and design debts — including status (open, resolved, monitoring), severity, and a description of current behavior — that's a meaningful architectural commitment. It means the team treats debt as a first-class runtime concern, not an appendix in a README that goes stale.

This matters in production because your monitoring, your on-call runbook, and your security audit can all consume the same debt report your development tooling uses. When a gap is found and a CVE is filed, there's already a canonical place to track remediation status — one the server itself can report.
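A debt report of this kind could look roughly like the following Python sketch. The field names mirror the ones mentioned above (status, severity, current behavior); `DebtItem`, `DEBT_REGISTRY`, `debt_report`, the example entries, and their severity values are all illustrative assumptions, and wiring the function up as an actual MCP tool is left out.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List

ALLOWED_STATUS = {"open", "resolved", "monitoring"}


@dataclass(frozen=True)
class DebtItem:
    id: str
    status: str
    severity: str
    current_behavior: str

    def __post_init__(self) -> None:
        # Reject unknown statuses so the report stays machine-readable.
        if self.status not in ALLOWED_STATUS:
            raise ValueError("unknown status: " + self.status)


# Illustrative entries only; severities are invented for the example.
DEBT_REGISTRY: List[DebtItem] = [
    DebtItem("plugin-os-isolation", "open", "high",
             "Plugins run in a child process; there is no OS-level sandbox."),
    DebtItem("task-record-encryption", "open", "medium",
             "Task records are not covered by per-tenant encryption."),
]


def debt_report(items: List[DebtItem], schema_version: str = "1") -> Dict[str, Any]:
    """Build the structured, versioned payload a debt-report tool could return."""
    return {
        "schema_version": schema_version,
        "open_count": sum(1 for item in items if item.status == "open"),
        "items": [asdict(item) for item in items],
    }


if __name__ == "__main__":
    print(json.dumps(debt_report(DEBT_REGISTRY), indent=2))
```

Because the payload is plain JSON with a schema version, monitoring and audit tooling can alert on `open_count` or on specific item IDs without parsing a README.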


## The Plugin Isolation Problem

If the server supports third-party plugins or tool extensions, ask one specific question: what is the actual isolation boundary?

Most implementations fall into one of three categories:

  1. No isolation: plugins run in-process, with full access to server memory, environment variables, and I/O.
  2. Child process: plugins run as separate processes. Better than nothing, but the boundary is soft. The OX Security research demonstrated exactly why application-layer sanitization fails here: allowlists get bypassed. An allowlist that checks only the command name still permits interpreters such as Node.js and Python, whose own arguments can be used to run arbitrary OS-level commands, so command-level restrictions are circumvented entirely.
  3. Real sandbox: container, Wasmtime/WASM, or microVM boundary. Actual OS-level isolation. At the time of writing, no open-source MCP server boilerplate ships category 3 as an internal plugin-loading architecture. (This is architecturally distinct from external sandboxed execution environments like Microsandbox, which provide microVM isolation for code that an agent runs — a different boundary than plugins that an MCP server loads.)

The problem isn't category 2 itself. Category 2 done responsibly is an honest posture: it requires explicit SHA-256 hash pinning per plugin, fails closed in production unless a named waiver flag is deliberately set, and labels the runner as "best-effort hardening, not a true sandbox" throughout the documentation. The problem is category 2 that markets itself as category 3. A server that labels its child-process runner as what it is, and refuses to load non-built-in plugins in production unless a named waiver flag is set, is more trustworthy than one that says "secure plugin execution" without qualification, even though the isolation boundary is soft in both cases.
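The hash pinning and waiver gate described above might be sketched like this in Python. It is an assumption-laden illustration: `verify_plugin`, `PluginRejected`, and the `waiver` parameter are hypothetical names, and the sketch only decides whether a plugin may be loaded; it does not implement the child-process runner itself.

```python
import hashlib
from typing import Mapping


class PluginRejected(RuntimeError):
    """Raised when a plugin must not be loaded."""


def verify_plugin(name: str, code: bytes, pinned: Mapping[str, str],
                  production: bool, waiver: bool = False) -> None:
    """Raise unless the plugin is pinned, unmodified, and allowed in this mode."""
    # In production the child-process runner is only best-effort hardening,
    # so loading requires a deliberate, named waiver.
    if production and not waiver:
        raise PluginRejected(
            name + ": child-process runner is not a true sandbox; "
            "set the waiver explicitly to load plugins in production")
    expected = pinned.get(name)
    if expected is None:
        raise PluginRejected(name + ": no pinned SHA-256 hash")
    actual = hashlib.sha256(code).hexdigest()
    if actual != expected.lower():
        raise PluginRejected(name + ": SHA-256 mismatch, refusing to load")
```

Pinning catches a plugin whose bytes changed after review; it says nothing about what the reviewed code does once it runs, which is why the waiver message names the limitation instead of hiding it.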

Category 3 is not an application-layer problem. It requires container infrastructure, Wasmtime integration, or microVM support at the deployment level. A server that is honest about this, tracks the gap as a release-blocking item, and fails closed until real infrastructure is provided is practicing the right discipline. A server that pretends otherwise is making a promise it cannot keep.


## The Pattern Debt Concept

The plugin isolation question, and the five signals before it, are all implementations of a single, deeper principle worth naming explicitly.

Every production system has known gaps between what it currently implements and what its threat model ideally requires. Those gaps exist for legitimate reasons: some require external infrastructure (KMS for per-tenant crypto-erasure, container runtime for real plugin isolation), some require upstream stabilization (the MCP Tasks SDK public API surface is still evolving), some are on the roadmap but not yet built.

The question is not whether gaps exist. They always do. The question is whether those gaps are visible or hidden.

The April 2026 OX Security disclosure is a case study in what happens when architectural debt propagates through a supply chain silently: 150 million downloads inheriting a flaw that no individual downstream project could see or defend against, because the root cause was upstream and undocumented. Hidden debt gets discovered at the worst time — during an incident, a security audit, or a CVE disclosure.

Visible debt can be planned for, monitored, and communicated honestly to teams building on top of your infrastructure. A server that exposes its pattern debt as a queryable tool — tracking each item with a status, severity, and current-behavior description — gives operators the same information the development team has. That's not a feature. That's a philosophy, and it's one the broader ecosystem hasn't adopted yet.


## A Practical Checklist

When evaluating an MCP server for real production use, the following criteria map directly to the five signals and the plugin isolation question above. Most are answerable from documentation and startup behavior alone. Items 6, 7, and 8 are the exceptions: output scanning depth, error sanitization paths, and plugin isolation boundaries are rarely described in enough detail to evaluate without a brief look at the source.

  1. Does it refuse to start with insecure production configuration (auth mode, rate limiting, host allowlist)?
  2. Does it explicitly enumerate what it does not do?
  3. Does it expose its own limitations through the protocol, not just documentation?
  4. Does resource indicator enforcement (RFC 8707) exist, and is it required in production?
  5. Is tenant/user/scope context propagated as first-class data to every tool call, not just the auth boundary?
  6. Does it scan tool outputs before returning them to the LLM, including recursive structured content?
  7. Does it sanitize error messages before returning them to the client, separately from tool output scanning?
  8. What is the actual isolation boundary for plugins: in-process, child-process, or container/WASM?
  9. If crypto-erasure is claimed, what data classes are explicitly covered and what are excluded? Partial coverage is the norm even in careful implementations: per-tenant KMS-backed erasure for vault blobs while task records remain in plaintext implies a materially different risk posture than a top-level encryption claim suggests.

"Production-ready" used to mean something. It meant a team had thought carefully about the failure modes that matter in real deployments and had made deliberate choices about each one.

In the MCP ecosystem right now, it mostly means "I have JWT and Docker."

The 2025–2026 disclosure landscape has made the cost of that gap concrete — and in each case, what made the damage possible was the same thing: hidden debt. The OX Security SDK flaw propagated through 150 million downloads because no downstream project could see the root cause; it was upstream and undocumented. Nine of eleven MCP marketplaces accepted malicious packages because no validation gate existed to surface the gap. More capable models proved more susceptible to tool poisoning because the compliance vector was never modeled as a threat. In every instance, hidden debt was discovered at the worst possible time.

The servers worth trusting are not the ones with the longest feature lists. They're the ones that make their debt visible — through documentation, startup behavior, and ideally the protocol itself — so that operators understand where their edges are before an incident reveals them.


The analysis in this post draws on publicly available MCP ecosystem research: arXiv:2603.22489 (Huang et al., March 2026), arXiv:2508.14925 (MCPTox benchmark), OX Security's April 2026 SDK vulnerability disclosure, NSA AISC Cybersecurity Information Sheet U/OO/6030316-26 (May 2026), OWASP MCP Top 10 project, and direct evaluation of open-source MCP server implementations.