Testing MCP Servers
Unit tests, in-memory integration tests, and interactive debugging with MCP Inspector
Summary
Three testing levels catch different problems: unit tests (individual tool handlers), in-memory integration tests (full protocol stack), and MCP Inspector (interactive debugging). The createServer() factory pattern is essential because it enables testing without network I/O, ports, or process management. Tests connect to the server over an in-memory transport and exercise it in-process.
- Level 1: Unit tests for tool registration and handlers
- Level 2: In-memory integration tests via InMemoryTransport
- Level 3: Interactive MCP Inspector for debugging
- createServer() factory enables testing without network
- Mock external dependencies; use in-memory databases
- Test invalid inputs, edge cases, and error paths
Testing an MCP server has three distinct levels: unit tests for individual tool handlers, in-memory integration tests that exercise the full protocol stack, and interactive inspection via the MCP Inspector. Each level catches different classes of problems — none of them is sufficient on its own.
Why the createServer() Factory Matters for Testing
The createServer() factory pattern described in Server Architecture exists primarily to enable testing. A server that starts a network listener in its constructor cannot be tested without ports, process management, and timing concerns. A factory function that returns a configured McpServer can be connected to an InMemoryTransport in a test and exercised in-process, with no listener to start or stop.
// src/server.ts — the pattern that enables testing
export function createServer(config: ServerConfig): McpServer {
const server = new McpServer({ name: "billing-mcp", version: "1.0.0" });
registerCreateInvoice(server, config);
registerListInvoices(server, config);
registerVoidInvoice(server, config);
return server;
}
Every test calls createServer() with a test configuration: an in-memory database URL, a mock API key, reduced limits. No subprocesses, no ports, no cleanup between tests beyond what the test framework provides.
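One way to keep those test configurations consistent is a small helper that merges per-test overrides over safe defaults. The sketch below is illustrative only: `makeTestConfig` and `TEST_DEFAULTS` are hypothetical names, and the `ServerConfig` shape is assumed from the three fields used in this guide's examples.

```typescript
// Assumed config shape: only the fields that appear in this guide's examples.
// A real ServerConfig may carry more.
interface ServerConfig {
  stripeApiKey: string;
  databaseUrl: string;
  maxInvoicesPerPage?: number;
}

// Safe defaults for tests: fake API key, in-memory database, small page size.
const TEST_DEFAULTS: ServerConfig = {
  stripeApiKey: "sk_test_key",
  databaseUrl: ":memory:",
  maxInvoicesPerPage: 10,
};

// Hypothetical helper: per-test overrides win over the defaults,
// and the defaults object itself is never mutated.
function makeTestConfig(overrides: Partial<ServerConfig> = {}): ServerConfig {
  return { ...TEST_DEFAULTS, ...overrides };
}

// Example: a test that needs a tiny page size to exercise pagination.
const pagingConfig = makeTestConfig({ maxInvoicesPerPage: 2 });
```

A test would then call `createServer(pagingConfig)`, so the override it depends on is visible at the call site and everything else stays at the shared defaults.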
Level 1: Unit Tests
Unit tests cover individual tool registration functions. They verify that the schema validates correctly, that the handler returns expected output for valid inputs, and that it returns isError: true for invalid domain conditions.
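To make the last of those checks concrete, here is a minimal sketch of the kind of result a handler might return for a domain error. The local types and the `customerNotFound` helper are hypothetical, written only to mirror the shape the tests in this section assert on (an `isError` flag plus a text message that names a recovery tool).

```typescript
// Minimal local types mirroring the tool-result shape used in the tests.
interface TextContent {
  type: "text";
  text: string;
}
interface ToolResult {
  content: TextContent[];
  isError?: boolean;
}

// Hypothetical helper: build a domain-error result that says what went wrong
// and points the caller at a tool that can help it recover.
function customerNotFound(customerId: string): ToolResult {
  return {
    isError: true,
    content: [
      {
        type: "text",
        text:
          `Customer ${customerId} not found. ` +
          `Call billing_list_customers to look up a valid customer_id.`,
      },
    ],
  };
}

const notFoundResult = customerNotFound("00000000-0000-0000-0000-000000000001");
```

Returning the error as a normal result, rather than throwing, lets a unit test assert on both the flag and the wording of the message.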
// src/tools/create-invoice.test.ts
import { describe, it, expect, beforeEach, vi } from "vitest";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { registerCreateInvoice } from "./create-invoice.js";
import * as db from "../db.js";
vi.mock("../db.js");
describe("billing_create_invoice", () => {
let server: McpServer;
beforeEach(() => {
server = new McpServer({ name: "test", version: "0.0.0" });
registerCreateInvoice(server, {
stripeApiKey: "sk_test_key",
databaseUrl: ":memory:",
});
});
it("returns isError when customer does not exist", async () => {
vi.mocked(db.customers.findById).mockResolvedValue(null);
const result = await callTool(server, "billing_create_invoice", {
customer_id: "00000000-0000-0000-0000-000000000001",
amount_cents: 5000,
currency: "usd",
due_date: "2025-12-31",
});
expect(result.isError).toBe(true);
expect(result.content[0].text).toContain("not found");
expect(result.content[0].text).toContain("billing_list_customers");
});
it("rejects invalid amount_cents", async () => {
const result = await callTool(server, "billing_create_invoice", {
customer_id: "00000000-0000-0000-0000-000000000001",
amount_cents: -100, // negative
currency: "usd",
due_date: "2025-12-31",
});
expect(result.isError).toBe(true);
});
});
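// Helper: exercise the tool handler without a transport.
// Caveat: _registeredTools is a private McpServer field, so its shape is not
// guaranteed. This helper accepts either a Map or a plain object, and either
// a `handler` or a `callback` property. A direct call may also bypass the
// input-schema validation the SDK applies to protocol-level calls, so schema
// checks are best confirmed through client.callTool() as well.
async function callTool(
server: McpServer,
toolName: string,
params: Record<string, unknown>
) {
const registry = (server as any)._registeredTools;
const tool = registry instanceof Map ? registry.get(toolName) : registry?.[toolName];
if (!tool) throw new Error(`Tool not registered: ${toolName}`);
const handler = tool.handler ?? tool.callback;
return handler(params, {});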
}
Level 2: In-Memory Integration Tests
Integration tests exercise the full MCP protocol stack using InMemoryTransport.createLinkedPair(). The client and server communicate via the MCP JSON-RPC protocol over an in-memory channel — the same code paths that run in production, without any network.
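For orientation, the messages that cross that in-memory channel are ordinary JSON-RPC 2.0 envelopes. The sketch below shows the shape of a tools/call request and of a domain-error response. `buildToolCall` is a hypothetical helper written only to make the wire format visible; in real tests the SDK client builds these messages for you.

```typescript
// JSON-RPC 2.0 envelope for an MCP tools/call request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper, for illustration only.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const callRequest = buildToolCall(1, "billing_create_invoice", {
  customer_id: "00000000-0000-0000-0000-000000000099",
  amount_cents: 5000,
  currency: "usd",
  due_date: "2025-12-31",
});

// A domain error comes back as a normal JSON-RPC result with isError set,
// not as a JSON-RPC error object. The id echoes the request id.
const callResponse = {
  jsonrpc: "2.0",
  id: callRequest.id,
  result: {
    isError: true,
    content: [{ type: "text", text: "Customer not found" }],
  },
};
```

This is why the integration tests assert on `result.isError` rather than expecting `client.callTool()` to reject for domain failures.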
// src/server.integration.test.ts
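import { describe, it, expect, beforeEach, afterEach } from "vitest";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { InMemoryTransport } from "@modelcontextprotocol/sdk/inMemory.js";
import { createServer } from "./server.js";
import * as db from "./db.js"; // used below to seed test data
// Assumed fixture ID for the customer seeded in the happy-path test.
const TEST_CUSTOMER_ID = "00000000-0000-0000-0000-000000000001";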
describe("billing-mcp integration", () => {
let client: Client;
let cleanup: () => Promise<void>;
beforeEach(async () => {
const server = createServer({
stripeApiKey: "sk_test_key",
databaseUrl: ":memory:",
maxInvoicesPerPage: 10,
});
const [clientTransport, serverTransport] =
InMemoryTransport.createLinkedPair();
client = new Client(
{ name: "test-client", version: "1.0.0" },
{ capabilities: {} }
);
await server.connect(serverTransport);
await client.connect(clientTransport);
cleanup = async () => {
await client.close();
};
});
afterEach(async () => {
await cleanup();
});
it("lists all registered tools", async () => {
const { tools } = await client.listTools();
const toolNames = tools.map((t) => t.name);
expect(toolNames).toContain("billing_create_invoice");
expect(toolNames).toContain("billing_list_invoices");
expect(toolNames).toContain("billing_void_invoice");
});
it("returns tool descriptions", async () => {
const { tools } = await client.listTools();
const createTool = tools.find((t) => t.name === "billing_create_invoice");
expect(createTool?.description).toBeTruthy();
expect(createTool?.description?.length).toBeGreaterThan(50);
});
it("validates required fields on billing_create_invoice", async () => {
const result = await client.callTool({
name: "billing_create_invoice",
arguments: {
// Missing required fields: amount_cents, currency, due_date
customer_id: "00000000-0000-0000-0000-000000000001",
},
});
expect(result.isError).toBe(true);
});
it("returns isError for non-existent customer", async () => {
const result = await client.callTool({
name: "billing_create_invoice",
arguments: {
customer_id: "00000000-0000-0000-0000-000000000099",
amount_cents: 5000,
currency: "usd",
due_date: "2025-12-31",
},
});
expect(result.isError).toBe(true);
const message = (result.content as Array<{ text: string }>)[0]?.text;
expect(message).toMatch(/not found/i);
});
it("creates an invoice successfully", async () => {
// Seed a customer first
await db.customers.create({ id: TEST_CUSTOMER_ID, ... });
const result = await client.callTool({
name: "billing_create_invoice",
arguments: {
customer_id: TEST_CUSTOMER_ID,
amount_cents: 5000,
currency: "usd",
due_date: "2025-12-31",
},
});
expect(result.isError).toBeFalsy();
const data = JSON.parse(
(result.content as Array<{ text: string }>)[0].text
);
expect(data.invoice_id).toMatch(
/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/
);
expect(data.status).toBe("draft");
});
it("lists resources", async () => {
const { resources } = await client.listResources();
const uris = resources.map((r) => r.uri);
expect(uris).toContain("billing://schemas/invoice");
});
it("reads the invoice schema resource", async () => {
const result = await client.readResource({
uri: "billing://schemas/invoice",
});
expect(result.contents).toHaveLength(1);
const schema = JSON.parse(result.contents[0].text as string);
expect(schema.$schema).toBeTruthy();
expect(schema.properties).toHaveProperty("id");
});
});
The InMemoryTransport.createLinkedPair() call returns two transport objects that are wired together. Anything written to one appears on the other. The client API is still promise-based, so tests await each call, but no sockets, ports, or real I/O are involved: messages are handed over in memory, and tests run at the full speed of the JavaScript runtime.
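The idea of a linked pair can be illustrated with a toy version. This is not the SDK's implementation; `ToyTransport` is a hypothetical class that only shows how two endpoints can hand messages to each other in memory, including messages sent before a listener is attached.

```typescript
type Message = Record<string, unknown>;

// Toy stand-in for an in-memory transport, for illustration only.
class ToyTransport {
  onmessage?: (message: Message) => void;
  private peer?: ToyTransport;
  private queue: Message[] = [];

  // Create two endpoints wired to each other.
  static createLinkedPair(): [ToyTransport, ToyTransport] {
    const a = new ToyTransport();
    const b = new ToyTransport();
    a.peer = b;
    b.peer = a;
    return [a, b];
  }

  // Deliver any messages that arrived before a listener was attached.
  start(): void {
    for (const message of this.queue.splice(0)) this.onmessage?.(message);
  }

  // Hand the message straight to the peer: no sockets, no serialization.
  send(message: Message): void {
    if (!this.peer) throw new Error("Not connected");
    if (this.peer.onmessage) this.peer.onmessage(message);
    else this.peer.queue.push(message);
  }
}

const [clientSide, serverSide] = ToyTransport.createLinkedPair();
const received: Message[] = [];
serverSide.onmessage = (message) => {
  received.push(message);
};
clientSide.send({ jsonrpc: "2.0", id: 1, method: "tools/list" });
// received now holds the tools/list request sent from the client side.
```

Because delivery is a direct function call, there is nothing to time out or retry, which is what keeps integration tests built on this pattern fast and deterministic.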
Testing Approach Matrix
| Scenario | Level | Tool |
|---|---|---|
| Schema validation rejects bad input | Unit | Direct handler call |
| isError returned for domain errors | Unit | Direct handler call |
| All tools are registered | Integration | client.listTools() |
| Tool descriptions are non-empty | Integration | client.listTools() |
| Full happy path creates correct output | Integration | client.callTool() |
| Resources are readable | Integration | client.readResource() |
| Prompts return expected messages | Integration | client.getPrompt() |
| Human-friendly interactive testing | Interactive | MCP Inspector |
| Authentication is enforced | Integration | Custom auth middleware test |
MCP Inspector
The MCP Inspector is an interactive browser-based tool for exploring and testing MCP servers. It connects to a running server and lets you browse tools, call them with custom arguments, inspect responses, and debug protocol messages.
Launch against a local stdio server:
npx @modelcontextprotocol/inspector node dist/cli.js
Launch against an HTTP server:
npx @modelcontextprotocol/inspector --url http://localhost:3000/mcp
With debug logging enabled:
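MCP_INSPECTOR_LOG_LEVEL=debug npx @modelcontextprotocol/inspector node dist/cli.js
Recent Inspector releases serve the UI on port 6274 and the proxy on port 6277 by default (older releases used 5173 and 3000). If those ports are in use, override them with the CLIENT_PORT and SERVER_PORT environment variables:
CLIENT_PORT=8080 SERVER_PORT=9000 npx @modelcontextprotocol/inspector node dist/cli.js
Inspector features: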
- Browse the full tool list with names, descriptions, and input schemas
- Fill in tool arguments using a generated form UI
- Call tools and inspect the raw MCP response, including isError state
- Browse resources and read their content
- List and preview prompt templates
- View the raw JSON-RPC message log for debugging protocol issues
- Test server capabilities advertised during initialization
Use the Inspector for exploratory testing, verifying that descriptions read well, and debugging tool failures before writing regression tests. It is not a substitute for automated tests — errors you find in the Inspector should become test cases.
The Inspector connects through the official MCP SDK client, so it speaks the same protocol as Claude Desktop and other MCP clients. If something works in the Inspector but fails in your actual agent client, the issue is most likely in the client configuration or authentication layer rather than in the server implementation.
Vitest Configuration
// vitest.config.ts
import { defineConfig } from "vitest/config";
export default defineConfig({
test: {
globals: true,
environment: "node",
coverage: {
provider: "v8",
reporter: ["text", "lcov"],
include: ["src/tools/**", "src/resources/**", "src/prompts/**"],
thresholds: {
lines: 80,
functions: 80,
branches: 70,
},
},
},
});
Coverage thresholds on the tools/, resources/, and prompts/ directories enforce that the core server logic is tested. The server factory (server.ts) and CLI entry point (cli.ts) are integration-tested via the in-memory tests rather than unit-tested directly.
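As a rough illustration of what enforcing a threshold means, the sketch below compares measured percentages against the minimums from the configuration above. It is not Vitest's implementation; `failingMetrics` is a hypothetical function and the measured numbers are made up for the example.

```typescript
type Metric = "lines" | "functions" | "branches";
type Percentages = Record<Metric, number>;

// Minimums copied from the coverage.thresholds block above.
const configuredThresholds: Percentages = { lines: 80, functions: 80, branches: 70 };

// Return the metrics whose measured coverage is below the configured minimum.
function failingMetrics(measured: Percentages, minimums: Percentages): Metric[] {
  return (Object.keys(minimums) as Metric[]).filter(
    (metric) => measured[metric] < minimums[metric]
  );
}

// Made-up coverage numbers, for illustration only.
const failing = failingMetrics(
  { lines: 84.2, functions: 79.5, branches: 71 },
  configuredThresholds
);
// failing is ["functions"]: one metric under its minimum is enough to fail the run.
```

A single metric below its minimum fails the coverage run, so a new tool handler without tests tends to show up as a failed build rather than as a slowly declining number.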