Project Valhalla, Explained: How a Decade of Work Arrives in JDK 28

The new JVM Weekly is here… and Ragnarok seems to have arrived, as we finally have Valhalla in the JDK. However, the situation is a bit… nuanced.

On June 15, Oracle engineer Lois Foltan confirmed what a good chunk of the industry had stopped believing: JEP 401: Value Classes and Objects will be integrated into the main OpenJDK repository and is targeting JDK 28.

The change is so large that other committers were asked to hold off on bigger commits during the integration. The pull request alone adds more than 197,000 lines of code across 1,816 files.

Before we pop the champagne, though: this is a preview feature, disabled by default, and, as Brian Goetz was quick to remind everyone, “only the first part of Valhalla.” Goetz also observed that the “they’ll never ship it” crowd will now smoothly switch over to “but they didn’t ship the most important part.” (A joke has been going around the community for years that we’ll end up in Valhalla ourselves, the Norse-afterlife one, sooner than the project ships.)

So this is a good moment to tell the whole story. This issue is one big deep-dive, written on the assumption that you’ve never followed the work on Valhalla before: from the 2014 problem, through the evolution of ideas (a fair number of which ended up in the trash), all the way to what exactly we’ll be getting our hands on in JDK 28.

1. Introduction - what this is even about

The slogan Valhalla has carried from the start is: “codes like a class, works like an int.” In a single sentence it captures the whole point of the project: we want to write normal, readable classes with methods, constructor validation, and sensible field names, but we want the JVM to be able to treat them as efficiently as primitives.

To understand why this is a problem, you have to go back to Java’s foundation. In this language, with the exception of the eight primitives (int, long, double, boolean, and the rest), everything is a reference type. When you write Point p = new Point(1, 2), the variable p isn’t a point. The variable p is a pointer, a coat-check number: somewhere on the heap sits an object, and you’re holding a slip of paper with its address. Every time you want to read a field, the JVM has to “go to the coat check,” performing a hop through the pointer (pointer indirection).
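A minimal sketch of what "the variable is a slip of paper" means in practice. The class and method names here (ReferenceDemo, Point, sameReference, sameFields) are illustrative and not taken from any Valhalla material; the point is only that two objects holding identical data are still two separate heap objects behind two different references.

```java
class ReferenceDemo {
    // An ordinary identity class: each instance is a separate heap object
    // that variables reach through a reference.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Compares the "slips of paper": true only if both point at the same object.
    static boolean sameReference(Point a, Point b) {
        return a == b;
    }

    // Compares the data itself, which requires following each reference.
    static boolean sameFields(Point a, Point b) {
        return a.x == b.x && a.y == b.y;
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        Point alias = p; // copies the reference, not the object

        System.out.println(sameReference(p, q));     // false: two distinct objects
        System.out.println(sameFields(p, q));        // true: same data
        System.out.println(sameReference(p, alias)); // true: one object, two slips
    }
}
```

Contrast this with an int: two variables holding 1 are indistinguishable, because there is no object behind them, only the value.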

For a single object, that’s nothing. The problem starts at scale. Every object on the heap has its own header: a dozen or so bytes of metadata that tell the JVM, among other things, what type the object is and whether anyone is synchronizing on it. Incidentally, this is exactly the problem Project Lilliput has been tackling lately by shrinking object headers. But header size isn’t everything. Every object has to be allocated, and later garbage collected. And since objects are scattered across the heap, an array of a million Points is in practice a million slips of paper pointing at a million boxes strewn across the whole warehouse.
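To put rough numbers on the million-Point array, here is a back-of-the-envelope estimate. Every constant in it is an assumption: a 12-byte header, 4-byte compressed references, and 8-byte object alignment are common defaults on 64-bit HotSpot, but they vary with JVM version and flags. Array headers are ignored, and the class name FootprintEstimate is made up for this sketch.

```java
class FootprintEstimate {
    // ASSUMPTIONS: typical 64-bit HotSpot defaults, not guarantees.
    static final int HEADER_BYTES = 12;    // object header with a compressed class pointer
    static final int REFERENCE_BYTES = 4;  // compressed reference stored in the array
    static final int ALIGNMENT = 8;        // objects are padded to a multiple of 8 bytes
    static final int FIELD_BYTES = 2 * Integer.BYTES; // payload: int x and int y

    // Rounds size up to the next multiple of alignment.
    static long alignUp(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Array of references plus one separately allocated object per element.
    static long fluffyBytes(long count) {
        long perObject = alignUp(HEADER_BYTES + FIELD_BYTES, ALIGNMENT);
        return count * (perObject + REFERENCE_BYTES);
    }

    // Hypothetical dense layout: only the payload, stored side by side.
    static long denseBytes(long count) {
        return count * FIELD_BYTES;
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        System.out.println("references + objects: " + fluffyBytes(n) + " bytes"); // 28000000
        System.out.println("payload only:         " + denseBytes(n) + " bytes");  // 8000000
    }
}
```

Under these assumptions, the two ints that actually matter account for 8 of the 28 bytes spent per point. The rest is header, padding, and the reference.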

Brian Goetz, in his “State of Valhalla” documents, calls such a memory layout “fluffy”: puffed up, bloated. What we dream of is a dense layout, one where the data lies side by side.

Why does density matter? Because the hardware changed faster than Java did. In 1995, a memory access cost roughly the same as a CPU operation. Today the CPU is two orders of magnitude faster than main memory, and the whole gap is bridged by the cache. The processor reads memory in chunks called cache lines (usually 64 bytes). If the data lies densely and in order, one such chunk brings in a ton of useful values at once. If we’re hopping across pointers, every access risks a cache miss, and that can be a hundred times slower than a hit. This is locality of reference, and it’s the real stake in this whole game.
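The difference between the two layouts can be sketched in today's Java by laying the data out by hand. The names below (DenseLayoutDemo, sumFluffy, sumDense) are illustrative, and the 64-byte cache line is the "usual" figure rather than a guarantee. Both methods compute the same sum: one follows a reference per element, the other walks a single contiguous int[] in which each point occupies two adjacent slots.

```java
class DenseLayoutDemo {
    static final int CACHE_LINE_BYTES = 64;           // assumed typical cache-line size
    static final int POINT_BYTES = 2 * Integer.BYTES; // int x + int y

    // How many whole points one cache line can carry in the dense layout.
    static int pointsPerCacheLine() {
        return CACHE_LINE_BYTES / POINT_BYTES;
    }

    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // "Fluffy": every element is a reference that must be followed to reach x and y.
    static long sumFluffy(Point[] points) {
        long total = 0;
        for (Point p : points) {
            total += (long) p.x + p.y;
        }
        return total;
    }

    // "Dense": coordinates stored as x0, y0, x1, y1, ... in one contiguous array.
    static long sumDense(int[] xy) {
        long total = 0;
        for (int i = 0; i < xy.length; i += 2) {
            total += (long) xy[i] + xy[i + 1];
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000;
        Point[] fluffy = new Point[n];
        int[] dense = new int[2 * n];
        for (int i = 0; i < n; i++) {
            fluffy[i] = new Point(i, -i + 1);
            dense[2 * i] = i;
            dense[2 * i + 1] = -i + 1;
        }
        System.out.println(pointsPerCacheLine());                  // 8
        System.out.println(sumFluffy(fluffy) == sumDense(dense));  // true
    }
}
```

This is not a benchmark: freshly allocated objects often end up next to each other on the heap, so a small demo like this will not necessarily show the cache-miss penalty. It only illustrates the shape of the two layouts.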

“But the JVM has escape analysis,” someone sharp will say. True: the virtual machine can recognize that some object never “escapes” beyond a local fragment of code, and then it doesn’t allocate it at all. From the programmer’s point of view it looks as if the object exists, but in reality its fields get spread out into ordinary variables or CPU registers. In the best case, the cost of allocation and the later cleanup by the garbage collector drops to practically zero.

The trouble is that this optimization is unpredictable and fragile. It works only when the JIT compiler can trace the object’s entire flow with high confidence. But all it takes is for the object to land in a field of another class, get stored in an array, get passed into a more complex method, or appear beyond the boundary of code the JIT can analyze, and the whole trick stops working. The source code stays identical, but the performance behavior can change dramatically.
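A small sketch of how thin that line is. The names (EscapeDemo, sumLocal, sumEscaping, last) are hypothetical. Whether the JIT actually removes the allocation in the first method depends on the JVM, its flags, and how hot the code is, so the comments describe what the optimizer is permitted to do, not what it is guaranteed to do.

```java
class EscapeDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Written only so that the object has somewhere to escape to.
    static Point last;

    // The Point never leaves the loop body, so the JIT may replace it with
    // two plain ints and skip the heap allocation entirely.
    static long sumLocal(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += (long) p.x + p.y;
        }
        return total;
    }

    // Same arithmetic, but each Point is stored into a static field.
    // It is now reachable from outside the method, so it has to exist on the heap.
    static long sumEscaping(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            last = p; // the escape
            total += (long) p.x + p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumLocal(1_000) == sumEscaping(1_000)); // true: identical results
    }
}
```

The two methods differ by a single assignment and return the same value, yet only the first is a candidate for allocation-free execution.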

This is precisely why experienced JVM programmers treat escape analysis as a nice bonus, not a project’s foundation. If an application’s performance depends on whether a particular JIT version manages to apply this optimization, it’s very easy to fall into the trap of hard-to-predict regressions. A minor refactor, a JDK update, or a change in code structure can send objects back onto the heap, and the costs of allocation and garbage-collector work return in full force.

That leaves the brute-force option: give up on objects and encode the data by hand. Instead of a Color class, hold three bytes r, g, b. This isn’t just an academic example. The approach has been used for years in game engines, graphics libraries, image-processing systems, databases, analytics engines, and HPC code, where every byte of memory and every allocation matters. The trouble is that the speed comes at the cost of safety and readability. We lose names, private…