Tracing HTTP Requests with Go's net/http/httptrace

Tracing HTTP Requests with Go’s net/http/httptrace

net/http/httptrace has been in the standard library since Go 1.7 and most Go developers I talk to have never used it. It exposes hooks for the points in an outgoing HTTP request that you usually cannot see from outside the transport: DNS resolution, connection acquisition, TLS handshake, the moment bytes go on the wire, the moment the first response byte comes back.

net/http/httptrace 自 Go 1.7 版本起就已包含在标准库中,但我交流过的大多数 Go 开发者从未用过它。它为传出 HTTP 请求中通常无法从传输层外部观察到的关键点提供了钩子(hooks):DNS 解析、连接获取、TLS 握手、数据发送瞬间以及接收到第一个响应字节的时刻。

The interesting part is how it plugs in. There is no Tracer interface on http.Client, no middleware to register. You attach a ClientTrace to a context.Context and the transport pulls it back out via httptrace.ContextClientTrace at the points where it matters. I want to walk through that design choice first because it explains how the package composes with the rest of the stdlib, then build two things with it: a curl --trace-style CLI and a reusable http.RoundTripper that logs timings for every request.

最有趣的部分在于它的接入方式。http.Client 上没有 Tracer 接口,也不需要注册中间件。你只需将一个 ClientTrace 附加到 context.Context 中,传输层就会在关键点通过 httptrace.ContextClientTrace 将其提取出来。我首先想探讨这种设计选择,因为它解释了该包如何与标准库的其他部分协同工作,然后我将用它构建两个工具:一个类似 curl --trace 的命令行工具,以及一个可复用的、能记录每次请求耗时的 http.RoundTripper

Why Context, Not an Interface

为什么使用 Context 而非接口?

The obvious design for request tracing would be to define a Tracer interface, add a Tracer field to http.Client or http.Transport, and call methods on it from inside the transport. That is roughly how most languages handle this. Go’s standard library does not work that way. Instead, httptrace.WithClientTrace returns a new context carrying a *ClientTrace, you attach that context to your request with req.WithContext(ctx), and the transport pulls the trace back out via httptrace.ContextClientTrace at the points where it matters.

请求追踪最直观的设计是定义一个 Tracer 接口,在 http.Clienthttp.Transport 中添加一个 Tracer 字段,并在传输层内部调用其方法。这大致是大多数语言处理此类问题的方式,但 Go 的标准库并非如此。相反,httptrace.WithClientTrace 会返回一个携带 *ClientTrace 的新上下文,你通过 req.WithContext(ctx) 将该上下文附加到请求中,传输层会在关键点通过 httptrace.ContextClientTrace 将追踪器提取出来。

trace := &httptrace.ClientTrace{
    DNSStart: func(info httptrace.DNSStartInfo) {
        fmt.Printf("DNS start: %s\n", info.Host)
    },
    DNSDone: func(info httptrace.DNSDoneInfo) {
        fmt.Printf("DNS done: %v\n", info.Addrs)
    },
}
ctx := httptrace.WithClientTrace(context.Background(), trace)
req, _ := http.NewRequestWithContext(ctx, http.MethodGet, "https://example.com", nil)
http.DefaultClient.Do(req)

This is unusual but it pays off. The trace travels with the request, so any middleware that forwards the context propagates tracing for free. Nothing on the client is shared mutable state, so concurrent requests from the same http.Client can carry different traces. And the transport ignores the trace entirely if no one attached one, so the cost when unused is a nil check.

这种做法虽然不寻常,但非常有效。追踪信息随请求传递,因此任何转发上下文的中间件都能自动传播追踪信息。客户端没有任何共享的可变状态,因此来自同一个 http.Client 的并发请求可以携带不同的追踪信息。此外,如果未附加追踪器,传输层会完全忽略它,因此在不使用时,其开销仅是一个空指针检查。

ClientTrace itself is a struct of optional function fields:

ClientTrace 本身是一个包含可选函数字段的结构体:

type ClientTrace struct {
    GetConn              func(hostPort string)
    GotConn              func(GotConnInfo)
    PutIdleConn          func(err error)
    GotFirstResponseByte func()
    Got100Continue       func()
    Got1xxResponse       func(code int, header textproto.MIMEHeader) error
    DNSStart             func(DNSStartInfo)
    DNSDone              func(DNSDoneInfo)
    ConnectStart         func(network, addr string)
    ConnectDone          func(network, addr string, err error)
    TLSHandshakeStart    func()
    TLSHandshakeDone     func(tls.ConnectionState, error)
    WroteHeaderField     func(key string, value []string)
    WroteHeaders         func()
    Wait100Continue      func()
    WroteRequest         func(WroteRequestInfo)
}

You set only the fields you care about. Unset fields are nil and skipped. This is also why the package can add new hooks without breaking anyone - existing code just leaves the new field nil.

你只需设置你关心的字段。未设置的字段为 nil 并会被跳过。这也是为什么该包可以在不破坏现有代码的情况下添加新钩子的原因——现有代码只需让新字段保持为 nil 即可。

Building a curl —trace

构建一个 curl —trace 工具

The first thing worth building is a CLI that takes a URL and prints a timing breakdown like curl -w does, but with the granularity httptrace exposes. The trick is recording timestamps in each hook and computing durations relative to a start time.

首先值得构建的是一个命令行工具,它接收一个 URL 并打印出类似 curl -w 的耗时明细,但具有 httptrace 所提供的细粒度。其技巧在于在每个钩子中记录时间戳,并计算相对于起始时间的持续时间。

(Code omitted for brevity, see original article for implementation details)

Run it against any URL and the output tells you where the time went. A slow DNS lookup, a slow TLS handshake, or a server that takes a long time to produce the first byte all show up as their own line in the breakdown. No proxy required, no APM agent, no instrumentation library in the dependency graph.

对任何 URL 运行该工具,输出结果会告诉你时间花在了哪里。缓慢的 DNS 解析、缓慢的 TLS 握手,或者服务器长时间未返回第一个字节,都会在明细中单独显示。无需代理,无需 APM 代理,依赖图中也不需要任何插桩库。

A few things to know. DNSStart and DNSDone only fire if Go’s resolver does the lookup - if the address is already in the kernel’s DNS cache or you passed an IP directly, the hooks stay quiet. TLSHandshakeStart and TLSHandshakeDone only fire on HTTPS. GotConn fires whether the connection was new or reused, and the GotConnInfo struct has a Reused field that tells you which.

有几点需要注意:DNSStartDNSDone 仅在 Go 的解析器执行查找时触发——如果地址已在内核的 DNS 缓存中,或者你直接传入了 IP 地址,这些钩子将不会触发。TLSHandshakeStartTLSHandshakeDone 仅在 HTTPS 请求中触发。GotConn 无论连接是新建的还是复用的都会触发,GotConnInfo 结构体中的 Reused 字段会告诉你具体情况。

Building a RoundTripper

构建一个 RoundTripper

A one-shot CLI is useful but most of the time you want every request through…

一次性的命令行工具很有用,但大多数时候你希望通过……(后续内容略)