Amazon Thinks the Future of Data Centers Depends on a Technical Problem It Just Solved
Amazon says it recently achieved a major breakthrough in networking design and has been quietly deploying the new technology in its data centers since late last year. The company claims it has significantly increased data speeds while reducing energy use, potentially giving the tech giant an edge as companies race to build ever-faster systems in the cloud.
The new technology hinges on a “quasi-random” design that combines elements of traditional, structured data networks with the performance advantages of more random architectures. Researchers have explored random networks for decades, but the technology has never been successfully scaled. Now, Amazon thinks it has cracked the code.
The fact that Amazon is using this in the real world is “remarkable,” says Brighten Godfrey, a computer science professor at the University of Illinois Urbana-Champaign and an expert in networking, who was not involved in Amazon’s research. Godfrey coauthored a seminal 2012 paper on random network graphs, which he says are a “mind-bending problem to solve, in general.”
A team of engineers and researchers at Amazon Web Services, including several recruited from academia, has been working on the random networking problem since 2023. Amazon also designed a new piece of data center equipment it dubbed the ShuffleBox, which automatically shuffles the cables required for this kind of networking.
“By essentially flattening the network, we eliminated the bottlenecks that come with traditional networking designs,” Matt Rehder, vice president of AWS Network Engineering, said in an exclusive interview with WIRED. “We think we’re the only ones who have done this at scale.”
Amazon detailed its new networking design in a paper published last month titled “RNG: Flat Datacenter Networks at Scale.” RNG stands for “resilient network graphs,” which are neither entirely structured nor entirely random.
Interestingly, the Amazon team behind RNG isn’t making this networking pitch around generative AI. This is about making the company’s everyday data center architecture more efficient. “RNG is a great fit for our core demands, but AI training data patterns are far more coordinated and centrally orchestrated, so they don’t approximate a random graph,” Rehder says.
Since the mid-1980s, communications networks, from telecoms to data centers, have been predominantly designed with a “fat-tree” topology, which includes two or three vertical layers of switches and routers. These are connected by “fat” nodes at the top of the structure, where there are multiple routers of the same type, and thinner branches toward the bottom. Put very simply, in a fat-tree network, data moves up and down the stack. The increased bandwidth near the top of the structure, where the data bisects, helps eliminate bottlenecks.
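To make the “up and down the stack” picture concrete, here is a minimal sketch of one well-known fat-tree variant, the three-layer k-ary design built from identical k-port switches. This is an illustrative assumption, not a description of Amazon’s network; the function names and the host numbering scheme are hypothetical.

```python
def fat_tree_sizes(k):
    """Switch and host counts for a three-layer k-ary fat-tree of k-port switches."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even number of ports, at least 2")
    half = k // 2
    return {
        "core": half * half,        # top layer
        "aggregation": k * half,    # k pods, k/2 aggregation switches each
        "edge": k * half,           # k pods, k/2 edge switches each
        "hosts": k * half * half,   # k/2 hosts under every edge switch
    }


def switches_on_path(k, src_host, dst_host):
    """Switches crossed between two hosts, with hosts numbered 0..hosts-1 (assumed scheme)."""
    half = k // 2
    hosts_per_pod = half * half
    if src_host == dst_host:
        return 0
    if src_host // half == dst_host // half:
        return 1  # both hosts hang off the same edge switch
    if src_host // hosts_per_pod == dst_host // hosts_per_pod:
        return 3  # up to the aggregation layer and back down
    return 5      # up to the core layer and back down


if __name__ == "__main__":
    print(fat_tree_sizes(48))
    print(switches_on_path(48, 0, 27_647))
```

With 48-port switches this layout serves 27,648 hosts, and traffic between hosts in different pods always climbs to the core and comes back down, crossing five switches.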
Over time, the tech industry has developed and deployed variations on the fat-tree architecture. But the design has room for improvement. It’s generally reliable but also rigid and inefficient, and it requires complex cabling. As in, actual physical cables.
If you’ve ever been in a data center or an office building’s server room, you’ve likely seen nests of colorful cables spilling out of metal racks. Cabling is one of the greatest costs in networking, Rehder says, and Amazon’s global data centers are currently connected with 20 million kilometers of fiber-optic cables. That’s roughly the distance it would take to travel from Earth to the moon and back 25 times.
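The moon comparison can be checked with quick arithmetic. The sketch below assumes an average Earth-moon distance of about 384,400 kilometers, a figure that does not appear in the article; the variable names are illustrative.

```python
FIBER_KM = 20_000_000     # fiber length quoted in the article
EARTH_MOON_KM = 384_400   # assumed average Earth-moon distance

# One round trip covers the distance twice.
round_trips = FIBER_KM / (2 * EARTH_MOON_KM)
print(round(round_trips, 1))  # 26.0
```

Under that assumption the fiber covers about 26 round trips, in line with the article’s “roughly 25.”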
In 2012, as the demand for cloud computing services was exploding, a group of researchers at the University of Illinois Urbana-Champaign, including Godfrey, introduced a concept known as Jellyfish. Fixed network designs in use at the time were struggling to meet growing demand, so the researchers proposed a “high-capacity network interconnect which, by adopting a random graph topology, yields itself naturally to incremental expansion.” They believed this random approach could be more efficient and scalable than networks built using the fat-tree architecture.
“We gave it the name Jellyfish because it’s fluid,” Godfrey says. “You can connect the routers and switches randomly and it becomes this flexible pool of network capacity, which is very efficient.”
However, Jellyfish also introduced new challenges in layout, data routing, and cabling. Routing in random graphs is trickier, Godfrey says, because there are many more, and more varied, paths that data can take from its source to its destination. Cabling is harder because the endpoints of the cables are chosen randomly.
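The idea of wiring switches at random can be sketched in a few lines. The code below is a simplified illustration, not the Jellyfish authors’ exact procedure and not Amazon’s RNG design: it keeps linking random pairs of switches that both have a free port and are not yet connected, and it skips any repair step for leftover free ports. The function names, parameters, and example sizes are all hypothetical.

```python
import random
from collections import deque


def random_switch_graph(num_switches, ports_per_switch, seed=0):
    """Randomly wire switches until no two unlinked switches both have a free port."""
    rng = random.Random(seed)
    neighbors = {s: set() for s in range(num_switches)}

    while True:
        # Switches that still have at least one unused port.
        open_switches = [s for s in neighbors
                         if len(neighbors[s]) < ports_per_switch]
        # Every pair of open switches that is not already cabled together.
        candidates = [(a, b)
                      for i, a in enumerate(open_switches)
                      for b in open_switches[i + 1:]
                      if b not in neighbors[a]]
        if not candidates:
            return neighbors
        a, b = rng.choice(candidates)  # the cable endpoints are chosen at random
        neighbors[a].add(b)
        neighbors[b].add(a)


def mean_hops(neighbors):
    """Average shortest-path length, in links, over all reachable ordered pairs."""
    total = pairs = 0
    for src in neighbors:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    graph = random_switch_graph(num_switches=40, ports_per_switch=4, seed=1)
    print(mean_hops(graph))
```

Because each cable’s endpoints come from a random draw, two runs with different seeds give different wiring plans, which hints at why physical cabling is the hard part of this approach.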
A couple of years later, Google began toying with another solution: It started integrating optical circuit switching, or OCS, into its network designs. This approach uses tiny mirrors to reflect light from an input port to an output port, which lets Google reconfigure optical cabling in real time. But again, this adds a certain amount of engineering complexity as well as cost.
So Random
Amazon, meanwhile, was searching for the “holy grail,” says Giacomo Bernardi, who is one of the lead authors on the new paper, along with Amazon Scholars Ratul Mahajan and Seshadhri Comandur. In an ideal world, a data network would be flat and efficient, resilient to hardware failures, random enough to maximize performance, and scalable enough to grow without becoming unwieldy. It would also rely on simpler, streamlined cabling rather than increasingly complex fiber-optic systems.
When he and his colleagues began trying to build such a network, Bernardi says he had already become obsessed with Penrose tiling, a kind of aperiodic tiling named after the British physicist Roger Penrose. (Other researchers have been so inspired by Penrose tilings that they’ve tried to translate the patterns into error-correcting code in quantum computers.) Bernardi wondered if Amazon could use a similar construction and create a flat “mesh” by following a repeating pattern. He and his team tried building a simulation of…