Recapping the London Mercurial sprint
The sprint is already over! Thanks to the gracious hosting of Jane Street, we gathered a little under 20 people every day and managed to discuss and work on a large variety of subjects, including truly riveting lunch-break discussions about the genetics of cross-breeding apples and oranges. The sprint was set in motion and organized by Pierre-Yves David and Raphaël Gomès, both maintainers of Mercurial working at Octobus, as well as Arun Kulshreshtha from Jane Street.
Day 1 - Wednesday, November 27th
The first day saw everyone get started on tasks that they had either wanted to get done for a long time or that were sparked by ad-hoc discussions. This was very much the point of this first day: a focus on bootstrapping occasional contributors and overall project maintenance. Counting everything submitted during the 3-day window, the visible changes include a few bug fixes (#1965, #1968, #1969, #1970, #1971, #1976), some documentation (#1967, #1975, poulpe#82) and website improvements (hg-website#31), a new debug command (#1973) to create a synthetic repo from a DAG, and some good progress on larger work that never gets enough attention (#1974, #1966, ci-images#66). We also hatched a plan with Matt Harbison, who joined remotely from the US, to fix the Windows console encoding deprecation problem.
Progress was also made on projects external to Mercurial but very much integral to its ecosystem. Manuel Jacob helped lay out a plan for hg-git’s tech debt, and Georges Racinet handled Heptapod’s 18.10.4 and 18.11.4 releases while working towards making Heptapod “cloud native” by GitLab’s definition (heptapod#1647). Finally, for people who weren’t busy with the above or getting familiar with new features of Mercurial, it was already time to start more high-level discussions. These discussions kept going for most of the rest of the sprint… which brings us to days 2 and 3!
Days 2 and 3 - Thursday, November 28th and Friday, November 29th
The agenda for days 2 and 3 was to get everyone familiar with the latest, ongoing and future developments of Mercurial, as well as to discuss concepts from the larger VCS ecosystem. Here are some of the larger discussions we can remember.
A Virtual File System (VFS) for Mercurial
As repositories grow larger and larger, filesystem overhead gets to be more and more noticeable, both in terms of disk usage and speed. Even a fast Rust parallel implementation of hg update can take up to several seconds for large working copies, with kernel writes and inode creation overhead at the center of the slowdown. It’s no secret that tech giants like Microsoft or Meta have used virtual file systems to fight this scale and improve their developer experience, and it’s time for Mercurial to grow its FOSS, fully integrated VFS. Upstream development of this effort was started earlier this year. The first experimental read-only and local version based on FUSE is already being used by real users in conjunction with an overlay filesystem to support writes. This has improved the time to first interaction for a new working copy in the worst cases from 20s+ to under 2s, with only a 10-20% overhead in normal operations. During the sprint, the discussions were mostly about planning what’s next for the VFS: faster update still, seamless support in hg status and hg update, and an integrated write layer.
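To make the "read-only layer plus overlay for writes" arrangement more concrete, here is a minimal Python sketch of the general overlay idea. It is purely illustrative and is not Mercurial's implementation: the class `OverlayView` and its methods are hypothetical names, and plain dictionaries stand in for the read-only FUSE mount and the writable layer.

```python
from typing import Dict, List, Optional

# Marker stored in the upper layer when a file is deleted (a "whiteout").
_WHITEOUT = object()


class OverlayView:
    """Toy overlay: a read-only lower layer plus a writable upper layer.

    Hypothetical illustration only; not Mercurial's VFS code.
    """

    def __init__(self, lower: Dict[str, bytes]) -> None:
        self._lower = lower                   # stands in for the read-only mount
        self._upper: Dict[str, object] = {}   # local writes and deletions

    def read(self, path: str) -> Optional[bytes]:
        # The upper layer wins; fall back to the lower layer otherwise.
        if path in self._upper:
            entry = self._upper[path]
            return None if entry is _WHITEOUT else entry  # type: ignore[return-value]
        return self._lower.get(path)

    def write(self, path: str, data: bytes) -> None:
        # Writes never touch the lower layer.
        self._upper[path] = data

    def remove(self, path: str) -> None:
        # Hide the lower-layer file without modifying it.
        self._upper[path] = _WHITEOUT

    def changed_paths(self) -> List[str]:
        # Only paths touched in the upper layer can differ from the lower one.
        return sorted(
            path for path in self._upper
            if self.read(path) != self._lower.get(path)
        )


lower = {"README": b"hello\n", "src/main.rs": b"fn main() {}\n"}
view = OverlayView(lower)
view.write("README", b"changed\n")
view.remove("src/main.rs")
print(view.changed_paths())
```

In this toy model, finding local changes only requires looking at the upper layer, which hints at why a write layer that the version control tool knows about could make status-like operations cheaper. How Mercurial's integrated write layer will actually work was still being planned at the sprint.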
Heptapod, our friendly GitLab fork
Heptapod is a major way that Mercurial stays relevant for both the FOSS community and professional users. Its maintainer Georges Racinet gave a small presentation about its current state and its future, right after he finished catching up with the latest GitLab releases. This discussion helped clear up a few misconceptions about Heptapod, fix a couple of small user problems, and move the hg-git effort forward. Finally, some of the blockers for upgrading Heptapod to Mercurial 7.2 have been identified and will be dealt with soon.
Scaling obsmarker exchange and bundle caching
Florian Horn, Laurent Bulteau and Pierre-Yves David presented new developments that are currently being upstreamed and discussed them with other attendees. We have an upcoming set of algorithms and formats that enable significant performance and storage improvements for both exchanging obsolescence markers and caching bundles. While the mathematical modelling has been underway for a long time, we have finally been able to start the upstream implementation, and we will most likely cover that whole topic in a separate post when it becomes usable.
First-class conflicts
As soon as you can do multiple things concurrently, you will have conflicts: they are an inevitable part of version control. A conflict is an ambiguity, and many version control systems give you neither a good model nor a good interface to help you with them. Pijul (the spiritual successor to Darcs) is the only active version control system that we know of with a mathematical model of conflicts. For our users, this model can be thought of as an extension of the Mercurial branching model: just as multiple heads on a branch are a natural consequence of things happening at the same time in a distributed branching model, conflicts are the natural consequence of things happening at the same time in a distributed version control system. Why should we model file changes any differently than we do branches? It turns out that this model’s contact with the real world is not without its share of headaches, and there are still a lot of things to iron out. Pierre-Étienne Meunier, creator and maintainer of Pijul, has been very open to collaboration. Of course, lately the Jujutsu VCS has become very popular, with its own flavor of first-class conflicts. Some users seem to get a lot of mileage out of it and we can definitely learn something from the use cases it covers. Nevertheless, a more general and complete model is needed, as there are edge cases where the approach suffers. Conflict handling turns out to be especially painful in the context…