Debugging MCP Protocol Issues
This guide provides a systematic approach to troubleshooting MCP (Model Context Protocol) issues, based on real debugging experiences with handler registration, tool execution, and protocol routing problems.
- Overview
- Common Issues
- Debugging Methodology
- Handler Registration Problems
- Tool Execution Timeouts
- Protocol Message Routing
- Testing Strategies
- Solutions and Patterns
Overview
Debugging MCP servers can be challenging because communication is asynchronous and several layers are involved. This guide documents strategies that have worked for identifying and resolving common issues.
Common Issues
- Tool execution hanging after 15+ seconds
- "Tool not found" errors despite proper registration
- Handler functions never being called
- Protocol handshake succeeding but tools failing
- TaskGroup asyncio errors
Debugging Methodology
Real example from June 6, 2025:
23:00 - Tool execution hanging
23:10 - Database operations suspected
23:15 - Handler registration errors found
23:20 - Protocol communication verified working
23:25 - MCP library validation attempted
23:30 - Root cause: message routing issue
Create a debugging matrix to isolate issues:
import sys
from typing import Any, List

import mcp.types as types
from mcp.server import Server

# Test 1: Minimal MCP Server
# Purpose: Verify basic MCP functionality
class MinimalServer(Server):
    @server.list_tools()
    async def handle_list_tools(self) -> List[types.Tool]:
        return [
            types.Tool(
                name="test_tool",
                description="Simple test tool",
                inputSchema={
                    "type": "object",
                    "properties": {
                        "message": {"type": "string"}
                    }
                }
            )
        ]

    @server.call_tool()
    async def handle_call_tool(self, name: str, arguments: dict) -> Any:
        print(f"TOOL CALLED: {name}", file=sys.stderr, flush=True)
        if name == "test_tool":
            return {"result": f"Received: {arguments.get('message')}"}

# Test 2: Simplified Memory Server
# Purpose: Test without complex initialization
class SimplifiedMemoryServer(Server):
    def __init__(self):
        super().__init__("simplified-memory")
        self.storage = None  # No ChromaDB initialization

    @server.list_tools()
    async def handle_list_tools(self) -> List[types.Tool]:
        # Return tools without complex setup
        return self.get_tool_definitions()

# Test 3: Full Memory Service
# Purpose: Test complete implementation
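One way to keep the matrix organized is to record each configuration's outcome in a small table. The sketch below is a hypothetical harness: `MatrixEntry` and `run_matrix` are illustrative names, not part of the MCP SDK, and each check is assumed to be an async callable that raises on failure.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional, Tuple


@dataclass
class MatrixEntry:
    """One row of the debugging matrix (hypothetical helper type)."""
    name: str
    passed: bool
    error: Optional[str] = None


async def run_matrix(
    checks: List[Tuple[str, Callable[[], Awaitable[object]]]],
    timeout_seconds: float = 15.0,
) -> List[MatrixEntry]:
    """Run each (name, async callable) pair in order and record pass/fail."""
    results = []
    for name, check in checks:
        try:
            # A hang is reported as a failure instead of blocking the run.
            await asyncio.wait_for(check(), timeout=timeout_seconds)
            results.append(MatrixEntry(name, True))
        except asyncio.TimeoutError:
            results.append(
                MatrixEntry(name, False, f"timed out after {timeout_seconds}s")
            )
        except Exception as exc:
            results.append(MatrixEntry(name, False, f"{type(exc).__name__}: {exc}"))
    return results
```

If the minimal configuration passes and the full one times out, the problem is more likely in your initialization code than in the protocol layer.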
Add comprehensive logging at key points:
import sys
import logging

# Configure detailed logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.StreamHandler(sys.stderr)
    ]
)

class DebugMemoryServer(Server):
    async def initialize(self, params):
        print("=== SERVER INITIALIZATION STARTED ===", file=sys.stderr, flush=True)
        result = await super().initialize(params)
        print("=== SERVER INITIALIZATION COMPLETE ===", file=sys.stderr, flush=True)
        return result

    @server.call_tool()
    async def handle_call_tool(self, name: str, arguments: dict) -> Any:
        print(f"=== TOOL CALL INTERCEPTED: {name} ===", file=sys.stderr, flush=True)
        print(f"Arguments: {arguments}", file=sys.stderr, flush=True)
        try:
            result = await self._execute_tool(name, arguments)
            print("=== TOOL EXECUTION SUCCESS ===", file=sys.stderr, flush=True)
            return result
        except Exception as e:
            print(f"=== TOOL EXECUTION FAILED: {e} ===", file=sys.stderr, flush=True)
            raise
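The entry and exit logging above can be factored into a reusable decorator so every handler gets the same trace. This is a minimal sketch; `log_calls` is a hypothetical helper name. It writes to stderr because, with a stdio transport, stdout carries the protocol messages.

```python
import functools
import sys
import time


def log_calls(func):
    """Log entry, exit, duration, and failures of an async handler to stderr."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        print(f"=== ENTER {func.__name__} ===", file=sys.stderr, flush=True)
        try:
            result = await func(*args, **kwargs)
        except Exception as exc:
            elapsed = time.perf_counter() - start
            print(
                f"=== FAIL {func.__name__} after {elapsed:.3f}s: {exc!r} ===",
                file=sys.stderr,
                flush=True,
            )
            raise  # Re-raise so callers still see the original error
        elapsed = time.perf_counter() - start
        print(
            f"=== EXIT {func.__name__} after {elapsed:.3f}s ===",
            file=sys.stderr,
            flush=True,
        )
        return result

    return wrapper
```

An ENTER line with no matching EXIT or FAIL line shows which handler is hanging.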
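Handler Registration Problems
A TypeError occurs when code tries to verify handler registration by calling len() on capabilities.tools, which is a single ToolsCapability object rather than a list of tools.
Problem Code:
# WRONG - ToolsCapability is not a list
capabilities = server.get_capabilities()
print(f"Registered {len(capabilities.tools)} tools")  # ERROR!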
Solution:
# CORRECT - Just log the capability object
from mcp.server import NotificationOptions

capabilities = server.get_capabilities(
    notification_options=NotificationOptions(),
    experimental_capabilities={}
)
print(f"Server capabilities: {capabilities}")
The get_capabilities() method requires specific arguments:
# WRONG
capabilities = server.get_capabilities()

# CORRECT
# NotificationOptions is exported by mcp.server, not mcp.types
from mcp.server import NotificationOptions
from mcp.server.models import InitializationOptions

capabilities = server.get_capabilities(
    notification_options=NotificationOptions(),
    experimental_capabilities={
        "hardware_info": {
            "architecture": "x86_64",
            "memory_gb": 16,
            "cpu_count": 8
        }
    }
)
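To check which capabilities a server advertises without calling len() on them, you can test each field for presence. The sketch below assumes the capability fields are optional objects (either present or None); `summarize_capabilities` is a hypothetical helper, and the stand-in object in the usage note only mimics that shape.

```python
from types import SimpleNamespace


def summarize_capabilities(
    capabilities,
    names=("tools", "resources", "prompts", "logging"),
):
    """Map each capability name to True if the field is present and not None.

    Only checks presence; it never assumes a capability is a collection.
    """
    return {name: getattr(capabilities, name, None) is not None for name in names}


# Usage with a stand-in object (an assumption for illustration only):
fake_capabilities = SimpleNamespace(tools=object(), resources=None)
summary = summarize_capabilities(fake_capabilities)
```

To count the registered tools, call the list-tools handler and take len() of its result instead.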
Tool Execution Timeouts
Heavy initialization during server startup can delay responses long enough that tool calls time out.
Problem Pattern:
class MemoryServer(Server):
    def __init__(self):
        super().__init__("memory")
        # Heavy initialization causing hanging
        self.storage = ChromaMemoryStorage()  # This might download models!
Solution: Lazy Initialization
class MemoryServer(Server):
    def __init__(self):
        super().__init__("memory")
        self.storage = None
        self._storage_initialized = False

    def _ensure_storage_initialized(self):
        """Initialize storage only when needed"""
        if not self._storage_initialized:
            print("Initializing ChromaDB storage...", file=sys.stderr, flush=True)
            self.storage = ChromaMemoryStorage()
            self._storage_initialized = True

    @server.call_tool()
    async def handle_call_tool(self, name: str, arguments: dict) -> Any:
        # Initialize only when actually needed
        if name != "dashboard_check_health":  # Health check doesn't need storage
            self._ensure_storage_initialized()
        return await self._execute_tool(name, arguments)
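The synchronous `_ensure_storage_initialized()` above still blocks the event loop on first use. One possible variation, sketched here under the assumption that the storage constructor is a blocking callable, runs the factory in a worker thread and uses a lock so concurrent tool calls trigger only one initialization. `LazyResource` is a hypothetical name, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio


class LazyResource:
    """Create an expensive object once, on first use, off the event loop."""

    def __init__(self, factory):
        self._factory = factory  # Blocking callable, e.g. a storage constructor
        self._value = None
        self._ready = False
        self._lock = None  # Created lazily, inside the running event loop

    async def get(self):
        if self._ready:
            return self._value
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:
            # Re-check: another caller may have finished while we waited.
            if not self._ready:
                self._value = await asyncio.to_thread(self._factory)
                self._ready = True
        return self._value
```

A handler would then call `storage = await self._storage.get()` before touching the database, while a health-check tool skips it entirely.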
Database health checks during startup can cause hanging:
# PROBLEM: This runs during server initialization
async def initialize(self, params):
    await super().initialize(params)
    await validate_database_health()  # Can hang here!

# SOLUTION: Skip validation during startup
async def initialize(self, params):
    await super().initialize(params)
    print("Skipping database validation during startup", file=sys.stderr, flush=True)
    # Validate later when actually using the database
Protocol Message Routing
Even with proper registration, tool calls might not reach your handlers.
Debugging Steps:
- Verify Registration:
@server.list_tools()
async def handle_list_tools(self) -> List[types.Tool]:
    tools = self._get_tool_definitions()
    print(f"Returning {len(tools)} tools", file=sys.stderr, flush=True)
    for tool in tools:
        print(f"  - {tool.name}", file=sys.stderr, flush=True)
    return tools
- Test Protocol Communication:
// test_protocol.js
const { spawn } = require('child_process');

async function testProtocol() {
  const server = spawn('python', ['server.py']);

  // Send initialization
  const initRequest = {
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion: "2024-11-05",
      capabilities: {},
      // clientInfo is required by the MCP initialize request
      clientInfo: { name: "test-client", version: "1.0.0" }
    }
  };
  // Each message is one line of JSON terminated by a newline
  server.stdin.write(JSON.stringify(initRequest) + '\n');

  // Listen for response
  server.stdout.on('data', (data) => {
    console.log('Server response:', data.toString());
  });
}

testProtocol();
- Check Message Format:
import json

# Add raw message logging
async def handle_raw_message(self, message: dict):
    print(f"RAW MESSAGE: {json.dumps(message)}", file=sys.stderr, flush=True)
    return await super().handle_raw_message(message)
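When checking message format, it helps to validate the raw line before it reaches the server. The following sketch uses only the standard library; `encode_message` and `check_request` are hypothetical helpers. It assumes newline-delimited JSON-RPC 2.0 over stdio and checks only the envelope fields, not MCP-specific parameters.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes embedded newlines, so the output stays on one line.
    line = json.dumps(message, separators=(",", ":"))
    return (line + "\n").encode("utf-8")


def check_request(raw: bytes) -> list:
    """Return a list of problems found in a raw request line (empty if none)."""
    try:
        message = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(message, dict):
        return ["top-level value must be an object"]

    problems = []
    if message.get("jsonrpc") != "2.0":
        problems.append('"jsonrpc" must be "2.0"')
    if not isinstance(message.get("method"), str):
        problems.append('"method" must be a string')
    if "id" in message:
        request_id = message["id"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(request_id, bool) or not isinstance(request_id, (str, int)):
            problems.append('"id" must be a string or integer')
    return problems
```

Running `check_request` on the exact bytes your client sends separates malformed requests from server-side routing problems.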
Testing Strategies
Layer 1: Direct Python Test
# test_direct.py
import asyncio
from server import MemoryServer

async def test_direct():
    server = MemoryServer()

    # Test tool listing
    tools = await server.handle_list_tools()
    print(f"Found {len(tools)} tools")

    # Test tool execution
    result = await server.handle_call_tool(
        "dashboard_check_health",
        {}
    )
    print(f"Result: {result}")

asyncio.run(test_direct())
Layer 2: MCP Protocol Test
# test_mcp_protocol.py
import asyncio
import json
from mcp.server.stdio import stdio_server
from server import MemoryServer

async def test_with_protocol():
    async with stdio_server() as (read_stream, write_stream):
        # Create server
        server = MemoryServer()

        # Initialize
        init_msg = {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {"protocolVersion": "2024-11-05"}
        }

        # Process message
        response = await server.handle_message(init_msg)
        print(f"Init response: {response}")

asyncio.run(test_with_protocol())
Layer 3: Full Integration Test
#!/bin/bash
# test_integration.sh
echo "Starting MCP server..."
python server.py &
SERVER_PID=$!
sleep 2

echo "Running client test..."
node test_client.js

echo "Killing server..."
kill "$SERVER_PID"
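The shell script can leave the server running if it is interrupted before the `kill` line. A Python variant, sketched below with the hypothetical helper `managed_server`, guarantees cleanup through a context manager. The command in the usage note is a placeholder process, not the real server.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_server(command, shutdown_timeout=5.0):
    """Start a server subprocess and always stop it when the block exits."""
    # stderr is inherited so the server's debug output stays visible.
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        yield proc
    finally:
        if proc.poll() is None:
            proc.terminate()  # Ask politely first
            try:
                proc.wait(timeout=shutdown_timeout)
            except subprocess.TimeoutExpired:
                proc.kill()  # Force stop if it ignores terminate
                proc.wait()
        for stream in (proc.stdin, proc.stdout):
            if stream is not None:
                stream.close()


# Usage with a placeholder process standing in for `python server.py`:
with managed_server([sys.executable, "-c", "import time; time.sleep(60)"]) as server:
    still_running = server.poll() is None
```

The server is stopped even when the client test inside the `with` block raises.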
Start simple and add complexity:
- Minimal tool - Just returns a string
- Database read - Reads but doesn't write
- Full operation - Complete functionality
# Progressive test tools
tools = [
    # Level 1: No dependencies
    {
        "name": "echo_test",
        "handler": lambda args: {"echo": args.get("message")}
    },
    # Level 2: Read-only database
    {
        "name": "count_test",
        "handler": lambda args: {"count": storage.count()}
    },
    # Level 3: Full functionality
    {
        "name": "store_test",
        "handler": lambda args: storage.store(args["content"])
    }
]
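The levels are most useful when run in order, stopping at the first failure, because that failure marks the layer where the problem starts. A minimal sketch of such a runner follows; `run_progressive` is a hypothetical helper that takes zero-argument callables.

```python
def run_progressive(levels):
    """Run (name, callable) checks in order and stop at the first failure.

    Returns (passed_names, failure), where failure is None if every level
    passed, or a (name, error_description) tuple for the first failing level.
    """
    passed = []
    for name, check in levels:
        try:
            check()
        except Exception as exc:
            return passed, (name, f"{type(exc).__name__}: {exc}")
        passed.append(name)
    return passed, None
```

If `echo_test` passes and `count_test` fails, the protocol path works and the database layer is the place to look.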
Solutions and Patterns
Lazy resource loading:
class LazyServer(Server):
    def __init__(self):
        super().__init__("lazy-server")
        self._resources = {}

    def _get_resource(self, name: str):
        # Create each resource on first access and cache it
        if name not in self._resources:
            if name == "storage":
                self._resources[name] = ChromaMemoryStorage()
            elif name == "embedder":
                self._resources[name] = EmbeddingModel()
        return self._resources[name]

    @property
    def storage(self):
        return self._get_resource("storage")
Timeout handling:
import asyncio
import sys

async def with_timeout(coro, timeout_seconds=30):
    try:
        return await asyncio.wait_for(coro, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        print(f"Operation timed out after {timeout_seconds}s", file=sys.stderr)
        raise
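A variation on the timeout wrapper returns a structured result instead of re-raising, so a tool handler can report the timeout to the client rather than failing silently. This is a sketch, not the guide's own method: `call_with_deadline` and the result dictionary shape are assumptions for illustration.

```python
import asyncio
import sys


async def call_with_deadline(coro, timeout_seconds=30.0):
    """Await coro with a deadline and report the outcome as a dict."""
    try:
        # wait_for cancels the wrapped coroutine when the deadline passes.
        result = await asyncio.wait_for(coro, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        print(
            f"Operation timed out after {timeout_seconds}s",
            file=sys.stderr,
            flush=True,
        )
        return {"ok": False, "error": f"timed out after {timeout_seconds}s"}
    return {"ok": True, "result": result}
```

Choose a deadline shorter than the client's own timeout so the server's error message arrives before the client gives up.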
Health check bypass:
from datetime import datetime

@server.call_tool()
async def handle_call_tool(self, name: str, arguments: dict) -> Any:
    # Health check bypasses all initialization
    if name == "health_check":
        return {
            "status": "healthy",
            "server_running": True,
            "timestamp": datetime.now().isoformat()
        }

    # Other tools initialize resources
    self._ensure_initialized()
    return await self._route_tool(name, arguments)
When debugging MCP issues:
- Server starts without errors
- Initialization completes successfully
- Tools are listed in handle_list_tools
- Debug logs show tool interception
- No heavy operations in init
- Database operations are deferred
- Error handling includes logging
- Test with minimal example first
- Check protocol version compatibility
- Verify message format compliance
Debugging MCP protocol issues requires:
- Systematic approach - Test each layer independently
- Comprehensive logging - Log at every decision point
- Progressive testing - Start simple, add complexity
- Lazy initialization - Defer heavy operations
- Timeout awareness - Handle long operations gracefully
The key is isolating whether the issue is in your implementation, the MCP framework, or the communication protocol. By following this guide's strategies, you can efficiently identify and resolve MCP protocol issues.