OpenTelemetry Is Now a CNCF Graduate — and It's Coming for Your AI Stack

OpenTelemetry graduated as a CNCF (Cloud Native Computing Foundation) project on May 21, 2026. That is more than a badge: it is formal recognition that OTel has won the observability standards race. But graduation isn’t the finish line. The project is now squarely aimed at the AI infrastructure era, with GenAI semantic conventions already in use in VS Code Copilot, OpenAI Codex, and Claude Code.

“Graduation is not the finish line. The OpenTelemetry community remains committed to building interoperable, high-quality observability standards and tooling for cloud native software at global scale.” (OpenTelemetry project blog)

What actually changed

CNCF graduation: OTel moved from incubating to graduated, joining Kubernetes, Prometheus, and a handful of other foundational cloud-native projects. Graduation signals production readiness and long-term stewardship.
Origins: formed from the merger of OpenTracing and OpenCensus, OTel has since drawn thousands of contributors across language SDKs, semantic conventions, and the Collector.
Declarative configuration went stable: a quieter but significant win. You can now configure the OTel Collector declaratively, which matters for GitOps workflows and for platform teams managing collectors at scale.
GenAI semantic conventions are in active use: the gen_ai.* attribute namespace standardises how LLM operations are recorded, covering model name, input/output token counts, finish reasons, tool calls, and (when opted in) full prompt/response content.
Major AI tools already emit OTel: VS Code Copilot, OpenAI Codex, and Claude Code all export OTel telemetry today. This is not an aspiration; OTel is already the telemetry format of the most-used AI coding tools.
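
To make the gen_ai.* namespace concrete, here is a minimal sketch of the attributes one chat call might carry. It uses plain Python dicts rather than the OTel SDK, and the helper name, model name, and token counts are all made up for illustration. The attribute keys follow the GenAI semantic conventions as I understand them; since the conventions are still evolving, check the current spec before relying on any key.

```python
from typing import Any


def genai_span_attributes(
    model: str,
    input_tokens: int,
    output_tokens: int,
    finish_reasons: list[str],
    operation: str = "chat",
) -> dict[str, Any]:
    """Build a gen_ai.* attribute dict for one LLM call (hypothetical helper)."""
    return {
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.response.finish_reasons": finish_reasons,
    }


# Invented values: "example-model" is a placeholder, not a real model name.
attrs = genai_span_attributes("example-model", 812, 164, ["stop"])
for key, value in sorted(attrs.items()):
    print(f"{key} = {value}")
```

In a real application these key/value pairs would be set on a span through your OTel SDK instead of being held in a dict.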

Why this matters

OTel is the first observability framework to genuinely span both cloud-native infrastructure and AI workloads under a single standard. That’s a big deal. Before the GenAI semantic conventions, monitoring an AI agent meant vendor-specific dashboards, proprietary SDKs, or rolling your own spans. Now you get a common schema of attributes and metrics (gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.client.operation.duration) that any OTLP-compatible backend can ingest and visualise.
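
A rough illustration of what a shared schema buys you: the sketch below totals token usage per model from a list of span attribute dicts, without knowing which tool or vendor produced them. The function name and the sample data are hypothetical, and plain dicts stand in for spans exported by a real backend.

```python
from collections import defaultdict


def tokens_by_model(spans: list[dict]) -> dict[str, dict[str, int]]:
    """Sum input/output tokens per gen_ai.request.model (hypothetical helper)."""
    totals: dict[str, dict[str, int]] = defaultdict(lambda: {"input": 0, "output": 0})
    for attrs in spans:
        model = attrs.get("gen_ai.request.model")
        if model is None:
            continue  # not a GenAI span, e.g. an ordinary HTTP span
        totals[model]["input"] += attrs.get("gen_ai.usage.input_tokens", 0)
        totals[model]["output"] += attrs.get("gen_ai.usage.output_tokens", 0)
    return dict(totals)


# Invented sample data; model names are placeholders.
sample_spans = [
    {"gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 500, "gen_ai.usage.output_tokens": 120},
    {"gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 300, "gen_ai.usage.output_tokens": 80},
    {"gen_ai.request.model": "model-b", "gen_ai.usage.input_tokens": 1000, "gen_ai.usage.output_tokens": 400},
    {"http.request.method": "GET"},
]
usage = tokens_by_model(sample_spans)
print(usage)
```

Multiply the totals by your provider's per-token prices and you have a cost report that works the same way for every tool that follows the conventions.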

The practical upside: if your AI agent takes 45 seconds to answer a question, you can now tell whether the time went to the model, a slow tool call, or a retry loop, without guessing. Token costs, latency histograms, and tool invocation traces all flow through the same pipeline you already run for your services.
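
The 45-second question can be sketched as a small latency breakdown. The example below groups span durations by gen_ai.operation.name, assuming spans are sequential (no overlap) and that model calls are recorded as "chat" and tool calls as "execute_tool". The SpanRecord type and the timings are invented for illustration; a real trace would come from your backend's query API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SpanRecord:
    operation: str  # value of gen_ai.operation.name, e.g. "chat" or "execute_tool"
    start_s: float  # start offset within the trace, in seconds
    end_s: float    # end offset within the trace, in seconds


def latency_breakdown(spans: list[SpanRecord]) -> dict[str, float]:
    """Total seconds spent per operation; assumes spans do not overlap."""
    seconds: Counter = Counter()
    for span in spans:
        seconds[span.operation] += span.end_s - span.start_s
    return dict(seconds)


# Invented trace: one slow tool call between model calls, plus one repeated chat call.
trace = [
    SpanRecord("chat", 0.0, 4.0),
    SpanRecord("execute_tool", 4.0, 36.0),
    SpanRecord("chat", 36.0, 41.0),
    SpanRecord("chat", 41.0, 45.0),
]
breakdown = latency_breakdown(trace)
slowest = max(breakdown, key=breakdown.get)
chat_calls = sum(1 for span in trace if span.operation == "chat")
print(breakdown, slowest, chat_calls)
```

In this made-up trace the tool call accounts for 32 of the 45 seconds, and the count of chat spans gives a first hint of whether a retry loop is involved.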

The graduation timing is deliberate. OTel is establishing itself as the standard before the AI observability market fragments into proprietary tooling. That’s the same playbook it ran against Prometheus/Jaeger fragmentation in the cloud-native space.

What to do

If you’re building AI-powered apps: Instrument with the GenAI semantic conventions now. They’re in use and under active development, so your feedback shapes what gets standardised. Try the free Aspire Dashboard Docker image for local GenAI telemetry exploration; it is OTLP-native and needs no cloud account.
If you’re a platform/infra engineer: OTel Collector declarative config is now stable, so it is worth revisiting your collector setup if you deferred it while waiting for stability. Check whether your AI tooling already emits OTel (Copilot, Codex, and Claude Code do); you may have free telemetry sitting uncollected.
If you’re evaluating observability vendors: Prioritise OTLP-native backends. Vendor lock-in via proprietary agents is increasingly a bad bet when the standard is this mature.
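
For the first two items, pointing an OTel-emitting process at a local OTLP receiver usually comes down to a few standard environment variables. The sketch below builds them in Python. The variable names are the standard OTel SDK ones, but the helper, the service name, and the endpoint are assumptions: http://localhost:4317 is simply the conventional OTLP/gRPC port, and whether a given AI tool honours these variables (or needs its own opt-in switch) depends on that tool, so check its documentation.

```python
def otlp_env(
    service_name: str,
    endpoint: str = "http://localhost:4317",  # assumed local OTLP/gRPC receiver
    protocol: str = "grpc",                   # alternatives: "http/protobuf", "http/json"
) -> dict[str, str]:
    """Return standard OTel exporter environment variables (hypothetical helper)."""
    return {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
    }


# "my-ai-agent" is a placeholder service name.
env = otlp_env("my-ai-agent")
for name, value in env.items():
    print(f"export {name}={value}")
```

Paste the printed export lines into the shell that launches your app or tool, with a local dashboard or Collector listening on the chosen endpoint.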