How an HTTP header caused time.gov to skew from UTC

Built on Shards of Silicon

In the United States, the National Institute of Standards and Technology (NIST) maintains the official U.S. time reference. NIST distributes this reference to support applications ranging from meteorology to GPS satellites. Programmers are probably most familiar with time distribution via the Network Time Protocol (NTP), which NIST supports by operating several NTP servers. NIST also runs the beautiful time.gov website, which provides an official time reference via a web page. It’s an easy way to check the time if you don’t trust the clock on your computer’s taskbar.

On a recent project I needed a trustworthy clock, and time.gov was a convenient option. To validate that the provided reference was accurate, I opened time.gov in two browser windows side by side, but found that the clock offset estimates they reported disagreed by a margin larger than I could tolerate. When I compared against another source, an NTP client, I found even more disagreement.

[Figure: Side-by-side browser windows hint at problems on time.gov]

I dismissed time.gov as an option, but the inconsistency of the estimates kept bugging me. My intuition told me that more precision should be possible, so I had to circle back to figure out what was broken. This blog post explains the issue and how NIST fixed it.

For the sake of clarity: the issue described in this post affects the clocks displayed on the time.gov website only, not any of NIST’s other time services.

NTP and half round-trip time

Before I dig into the time.gov implementation, I’ll give an overview of how NTP works. I’ll simplify things for the sake of brevity; there are great explainers elsewhere if you want an accurate deep dive.

In NTP, the server responds with the current timestamp. This timestamp is accurate at the moment it is generated, but the client doesn’t see it immediately. It takes time for the response to reach the client, so the timestamp is already stale when it arrives. The NTP client needs to estimate how much time has elapsed since the timestamp was generated. It does this by measuring the round-trip time (RTT) of the request, which includes both network latency and server processing time. Adjusting the server-provided timestamp using these measurements can produce a very good estimate of the current time.

While the request and the response can take different amounts of time to cross the network, it’s reasonable to assume that the network delay is the same in both directions. As such, the network delay experienced by the NTP response can be estimated as half of the round-trip time.
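
The half round-trip adjustment can be sketched in a few lines of JavaScript. This is a simplified illustration, not the real NTP algorithm: it assumes the server processing time is negligible and that the delay is symmetric, and the function and parameter names are made up for this example.

```javascript
// Simplified NTP-style offset estimate. All times are in milliseconds.
// t0:         client clock when the request was sent
// serverTime: timestamp carried in the server's response
// t1:         client clock when the response arrived
// Assumption: server processing time is ignored (treated as zero).
function estimateOffset(t0, serverTime, t1) {
  const roundTripMs = t1 - t0;
  // Symmetric-delay assumption: the response spent half the round trip in flight.
  const responseDelayMs = roundTripMs / 2;
  // Best guess of the server's clock at the moment the response arrived.
  const serverNowMs = serverTime + responseDelayMs;
  return { offsetMs: serverNowMs - t1, roundTripMs };
}

// Example: the client clock is 250 ms behind and the one-way delay is 20 ms.
const example = estimateOffset(1000, 1270, 1040);
console.log(example.offsetMs, example.roundTripMs); // prints "250 40"
```

Adding `offsetMs` to the local clock gives the estimated server time; the estimate is only as good as the assumption that `t1 - t0` covers exactly one round trip.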

Distributing time over HTTP

The time.gov website synchronized time over HTTP, using JavaScript to perform the requests. The core functionality performed an HTTP request and collected timing information using “new Date()”. The calculated offset was displayed to the user, and the clock on the page was updated by adjusting the local time with that offset. The JavaScript looks like a reasonable approximation of NTP, and it’s believable that it could produce the correct time. The issue time.gov was facing happened at the network level.

What happened on the network?

Opening time.gov loads a bunch of resources from the time.gov domain, including the HTML, JavaScript, and CSS. The network request to fetch the timestamp also uses the time.gov domain. For performance reasons, web browsers typically send multiple HTTP requests over a single connection. However, the HTTP responses from time.gov all contained the Connection: close header, which told the web browser to close the connection after every response. Each time the web browser requested a resource from time.gov, it did so over a new network connection, which required a TCP handshake and TLS setup.

The JavaScript code assumed that, like NTP, a single network round trip occurred. Unfortunately, the Connection: close header forced three round trips to occur. This incorrect assumption was the root cause of the time.gov issue.
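
To see why the extra round trips matter, here is a small simulation of a single sync attempt. It is a sketch under stated assumptions rather than the time.gov code: every round trip takes the same time, server processing is instantaneous, connection setup happens inside the measured interval, and all names are hypothetical.

```javascript
// Simulate one sync attempt as seen by the client clock (milliseconds).
// trueOffsetMs:  how far the server clock is ahead of the client clock
// oneWayDelayMs: network delay in each direction (assumed symmetric)
// roundTrips:    round trips inside the measured interval
//                (1 = reused connection, 3 = new TCP + TLS connection)
function simulateSync({ trueOffsetMs, oneWayDelayMs, roundTrips }) {
  const t0 = 1000000; // client clock when the request starts (arbitrary)
  const rttMs = 2 * oneWayDelayMs;
  // Connection setup consumes the extra round trips before the
  // HTTP request itself leaves the client.
  const setupMs = (roundTrips - 1) * rttMs;
  // The server stamps the response one one-way delay after the request is sent.
  const serverTimeMs = t0 + setupMs + oneWayDelayMs + trueOffsetMs;
  // Client clock when the response arrives.
  const t1 = t0 + setupMs + rttMs;
  // NTP-style estimate that assumes the whole interval was one round trip.
  const estimatedOffsetMs = serverTimeMs + (t1 - t0) / 2 - t1;
  return { estimatedOffsetMs, errorMs: estimatedOffsetMs - trueOffsetMs };
}

const reused = simulateSync({ trueOffsetMs: 250, oneWayDelayMs: 30, roundTrips: 1 });
const fresh = simulateSync({ trueOffsetMs: 250, oneWayDelayMs: 30, roundTrips: 3 });
console.log(reused.errorMs, fresh.errorMs); // prints "0 60"
```

In this model the error equals half of the setup time, so it grows with network latency. Because latency varies from one request to the next, this is consistent with two browser windows reporting different offsets.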

How can this be fixed?

I see two approaches that could help fix this issue. The first would be to change the Connection header to keep-alive. This would allow the web browser to keep the connection open longer, reducing the time-synchronization HTTP request to a single network round trip in most circumstances. This is the approach used by the National Research Council (NRC) Canada web clock.
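
When checking a server for this problem, the header value is what decides whether the next request can reuse the connection. The helper below is a minimal sketch with a hypothetical name: it assumes HTTP/1.1, where connections are persistent unless the Connection header contains the close token, and it takes the raw header value as a string however you obtained it.

```javascript
// Returns true if an HTTP/1.1 response with this Connection header value
// leaves the connection open for reuse. A missing header counts as
// persistent, which is the HTTP/1.1 default.
function allowsConnectionReuse(connectionHeaderValue) {
  const tokens = (connectionHeaderValue || "")
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .filter((token) => token.length > 0);
  // The "close" token signals that the connection ends after this response.
  return !tokens.includes("close");
}

console.log(allowsConnectionReuse("close"));      // prints "false"
console.log(allowsConnectionReuse("keep-alive")); // prints "true"
```

A response that passes this check does not guarantee a single round trip, since the browser or server may still drop an idle connection, which matches the "in most circumstances" caveat above.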
