Building Custom MCP Servers: A Developer's Guide to Production-Grade AI Agent Tools

The Model Context Protocol (MCP) has become the de facto standard for connecting AI agents to external tools and APIs. Now governed under the Linux Foundation and adopted by OpenAI, Anthropic, Microsoft, and Vercel, MCP is the USB-C port of the AI ecosystem: one protocol that lets any LLM application talk to any tool server.

But there’s a gap between reading the spec and building something that works reliably in production. I’ve spent the last few months building MCP servers for production agent workflows, and this guide captures the patterns that actually matter. If you’ve read the “6 Agent Gateway Platforms” roundups, you know which MCP servers to consume. This is the guide for when you need to build one yourself.

What We’re Building

By the end of this guide, you’ll have built a production-ready MCP server that:

Exposes typed tools with JSON Schema validation
Uses the Streamable HTTP transport (the recommended standard as of 2026)
Handles errors gracefully with structured responses
Includes proper authentication for sensitive operations
Is testable with the MCP Inspector

Let’s start with the foundation.

Architecture: The Three MCP Building Blocks

Before writing code, understand what your server can expose. MCP defines three primitives:

Feature	What It Does	Who Controls It
Tools	Functions the AI model calls (write, compute, act)	Model decides when to invoke
Resources	Read-only data (files, DB schemas, API docs)	Application retrieves and provides
Prompts	Pre-built templates for common workflows	User triggers explicitly

For a tool server, the most common production pattern, you’ll focus on tools. Resources and prompts are optional but useful for providing context and guiding the model’s behavior.
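
To make the table concrete, here is a minimal sketch of how the three primitives might be modeled as TypeScript types. The field names follow the MCP specification as I understand it (name, description, and inputSchema for tools; uri and mimeType for resources; arguments for prompts). The `kind` discriminator, the `controlledBy` helper, and the example values are hypothetical and exist only for illustration; these are not the SDK's own types.

```typescript
// Simplified shapes of the three MCP primitives (illustrative, not the SDK's types).
// The "kind" field is an assumption added here so the union can be discriminated.
interface ToolDef {
  kind: "tool";
  name: string;
  description: string;
  inputSchema: { type: "object"; properties: Record<string, unknown>; required?: string[] };
}

interface ResourceDef {
  kind: "resource";
  uri: string;
  name: string;
  mimeType?: string;
}

interface PromptDef {
  kind: "prompt";
  name: string;
  description?: string;
  arguments?: { name: string; required?: boolean }[];
}

type Primitive = ToolDef | ResourceDef | PromptDef;

// Who decides when each primitive is used, mirroring the table above.
function controlledBy(p: Primitive): "model" | "application" | "user" {
  switch (p.kind) {
    case "tool":
      return "model";
    case "resource":
      return "application";
    case "prompt":
      return "user";
  }
}

// Hypothetical examples of each primitive.
const examples: Primitive[] = [
  { kind: "tool", name: "review_pull_request", description: "Review a PR", inputSchema: { type: "object", properties: {} } },
  { kind: "resource", uri: "file:///docs/api.md", name: "API docs", mimeType: "text/markdown" },
  { kind: "prompt", name: "summarize_changes", arguments: [{ name: "prNumber", required: true }] },
];

const controllers = examples.map(controlledBy);
```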

Setting Up a TypeScript MCP Server

The official TypeScript SDK is the most widely adopted way to build MCP servers. It’s what Claude Desktop, Cursor, and Windsurf use internally.

// server.ts
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { CallToolRequestSchema, ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";
import { z } from "zod";

// Define a tool with Zod validation
const CodeReviewInput = z.object({
  repoPath: z.string().min(1, "Repository path is required"),
  prNumber: z.number().int().positive("PR number must be positive"),
  strictness: z.enum(["basic", "standard", "deep"]).default("standard"),
});

type CodeReviewInput = z.infer<typeof CodeReviewInput>;

// Server instance
const server = new Server(
  { name: "code-review-mcp", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

// Tool registration
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "review_pull_request",
      description: "Perform a code review on a pull request...",
      inputSchema: { /* ... JSON Schema definition ... */ },
    },
  ],
}));

// Tool dispatch. performReview is assumed to be implemented elsewhere in this project.
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === "review_pull_request") {
    try {
      // Validate inside the try block so bad arguments come back as a
      // structured tool error (isError: true) rather than a protocol-level error.
      const args = CodeReviewInput.parse(request.params.arguments);
      const review = await performReview(args.repoPath, args.prNumber, args.strictness);
      return { content: [{ type: "text", text: JSON.stringify(review, null, 2) }] };
    } catch (error) {
      const message = error instanceof Error ? error.message : "Unknown error";
      return { content: [{ type: "text", text: `Review failed: ${message}` }], isError: true };
    }
  }
  throw new Error(`Unknown tool: ${request.params.name}`);
});

// Start with stdio transport
const transport = new StdioServerTransport();
await server.connect(transport);

This is the skeleton. Every MCP server follows this pattern: declare capabilities, define tool schemas, implement handlers, connect a transport.
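
As the number of tools grows, a chain of if statements in the call handler gets unwieldy. One way to keep the same pattern manageable is a small registry that maps tool names to handlers. The sketch below is dependency-free and does not use the SDK; `ToolResult`, `registry`, and `callTool` are hypothetical names, and the stubbed handler stands in for real review logic.

```typescript
// Result shape modeled on the tool responses above: content blocks plus an optional isError flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<ToolResult>;

// Hypothetical registry: tool name -> handler.
const registry = new Map<string, ToolHandler>();

registry.set("review_pull_request", async (args) => {
  const repoPath = args.repoPath;
  if (typeof repoPath !== "string" || repoPath.length === 0) {
    throw new Error("Repository path is required");
  }
  // Stub: a real handler would run the review here.
  return { content: [{ type: "text", text: `Reviewed ${repoPath}` }] };
});

// Look up the handler and convert thrown errors into structured tool errors.
async function callTool(name: string, args: Record<string, unknown>): Promise<ToolResult> {
  const handler = registry.get(name);
  if (!handler) {
    return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
  }
  try {
    return await handler(args);
  } catch (error) {
    const message = error instanceof Error ? error.message : "Unknown error";
    return { content: [{ type: "text", text: `${name} failed: ${message}` }], isError: true };
  }
}
```

Unlike the skeleton, this sketch reports an unknown tool as a structured error instead of throwing. Either choice can work; returning a result keeps every failure in one shape that the agent can read.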

Writing Tools That Agents Actually Use Well

The biggest mistake I see in MCP server designs is writing tools the way you’d write REST endpoints for other developers. Agents don’t read documentation the way humans do. Your tool names, descriptions, and schemas need to be optimized for an LLM to discover and use correctly.

Naming Conventions

Use descriptive, action-oriented names:

Good: search_codebase, create_jira_ticket, deploy_to_staging
Bad: exec, run, helper, util
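
A convention like this is easy to check mechanically. The following is a rough sketch of a lint rule, assuming that "descriptive" can be approximated as snake_case with at least two words; `isDescriptiveToolName` is a hypothetical helper, not part of any MCP tooling.

```typescript
// Hypothetical lint rule: a tool name must be lowercase snake_case with at
// least two words (for example verb_object). Single bare words are rejected.
function isDescriptiveToolName(name: string): boolean {
  const snakeCaseMultiWord = /^[a-z]+(_[a-z0-9]+)+$/;
  return snakeCaseMultiWord.test(name);
}

const good = ["search_codebase", "create_jira_ticket", "deploy_to_staging"];
const bad = ["exec", "run", "helper", "util"];

const goodResults = good.map(isDescriptiveToolName);
const badResults = bad.map(isDescriptiveToolName);
```

A check like this only catches shape problems. It cannot tell whether a two-word name is actually meaningful, so treat it as a first filter rather than a guarantee.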

Descriptions That Work

Your tool description is the agent’s documentation. Be explicit about when to use it, what it does, and edge cases.

{
  "name": "deploy_service",
  "description": "Deploy a service to the staging environment. Use when the user asks to deploy, push to staging, or test a deployment. Does NOT deploy to production — use deploy_to_production for that. Requires the service to have passed CI checks."
}
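
If you maintain many tools, it can help to assemble descriptions from the same parts every time so none of them gets forgotten. The sketch below is one possible approach; `DescriptionParts` and `buildDescription` are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper that assembles a tool description from the parts an
// agent needs: what the tool does, when to use it, what it does not do,
// and any prerequisites.
interface DescriptionParts {
  what: string;
  useWhen: string;
  doesNot?: string;
  requires?: string;
}

function buildDescription(parts: DescriptionParts): string {
  const sentences = [parts.what, `Use when ${parts.useWhen}.`];
  if (parts.doesNot) sentences.push(`Does NOT ${parts.doesNot}.`);
  if (parts.requires) sentences.push(`Requires ${parts.requires}.`);
  return sentences.join(" ");
}

const deployDescription = buildDescription({
  what: "Deploy a service to the staging environment.",
  useWhen: "the user asks to deploy, push to staging, or test a deployment",
  doesNot: "deploy to production; use deploy_to_production for that",
  requires: "the service to have passed CI checks",
});
```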

Input Schema Design

Keep required parameters minimal. Agents get confused by complex schemas with many required fields. Use sensible defaults wherever possible.

Aim for one required field plus optional parameters with clear defaults. That way the agent can succeed with minimal information and ask for more when needed.
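
As an illustration of that shape, here is a sketch of an input schema for a hypothetical `search_codebase` tool. The property names, limits, and default values are assumptions made up for this example, and `withDefaults` is a hand-written helper, not an SDK function.

```typescript
// Hypothetical input schema: one required field, everything else optional with defaults.
const searchCodebaseSchema = {
  type: "object",
  properties: {
    query: { type: "string", description: "Text or symbol to search for" },
    maxResults: { type: "integer", minimum: 1, maximum: 100, default: 20 },
    caseSensitive: { type: "boolean", default: false },
  },
  required: ["query"],
} as const;

interface SearchArgs {
  query: string;
  maxResults: number;
  caseSensitive: boolean;
}

// Fill in defaults from the schema so the handler always sees a complete object.
function withDefaults(input: { query: string; maxResults?: number; caseSensitive?: boolean }): SearchArgs {
  const props = searchCodebaseSchema.properties;
  return {
    query: input.query,
    maxResults: input.maxResults ?? props.maxResults.default,
    caseSensitive: input.caseSensitive ?? props.caseSensitive.default,
  };
}

// The agent only has to supply the query.
const resolved = withDefaults({ query: "parseConfig" });
```

Reading the defaults from the schema object keeps the documented default and the applied default from drifting apart.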

Streamable HTTP: The Production Transport

Stdio transport is fine for local development (Claude Desktop, VS Code), but for production deployments you need HTTP. As of 2026, Streamable HTTP (introduced in the 2025-03-26 revision of the spec) has replaced the standalone Server-Sent Events (SSE) transport as the recommended standard.
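
In Streamable HTTP, as I understand the spec, the client POSTs each JSON-RPC message to a single MCP endpoint and lists both `application/json` and `text/event-stream` in its Accept header. The server then answers with either one JSON body or an SSE stream. The sketch below shows only that decision and the SSE framing, with no SDK or HTTP server involved; `chooseResponseMode` and `formatSseEvent` are hypothetical helper names, and the fallback rules are a simplification.

```typescript
type ResponseMode = "json" | "sse" | "not-acceptable";

// Decide how to answer a POST based on the Accept header and whether the
// handler wants to stream (for example, to send progress notifications).
function chooseResponseMode(acceptHeader: string, wantsStreaming: boolean): ResponseMode {
  const accepted = acceptHeader
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase());
  const acceptsJson = accepted.includes("application/json") || accepted.includes("*/*");
  const acceptsSse = accepted.includes("text/event-stream") || accepted.includes("*/*");
  if (wantsStreaming && acceptsSse) return "sse";
  if (acceptsJson) return "json";
  if (acceptsSse) return "sse";
  return "not-acceptable";
}

// Frame one JSON-RPC message as a Server-Sent Event.
function formatSseEvent(message: object): string {
  return `event: message\ndata: ${JSON.stringify(message)}\n\n`;
}

const mode = chooseResponseMode("application/json, text/event-stream", true);
const frame = formatSseEvent({ jsonrpc: "2.0", id: 1, result: { ok: true } });
```

In a real server you would normally let the SDK's HTTP transport handle this negotiation. The sketch is only meant to show why one endpoint can serve both quick request/response calls and long-running streamed ones.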