A 16-byte x86 demo: Matrix rain with sound
wake up! 16b. Released at the Outline Demoparty in Ommen, NL, in May 2026. An exploration of algorithmic density in 16 bytes of x86 assembly. Watch Video | Demozoo Entry.
In the demoscene, exploring what can be achieved within extreme constraints is a rewarding technical challenge. The following 16 bytes of x86 real-mode DOS assembly are a careful exercise in algorithmic density. When executed, the program uses the computer's video memory as its calculation space to draw an infinite Sierpinski fractal, while simultaneously interpreting that geometry as audio data.
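int 10h             ; 2 bytes  set video mode (relies on AX = 0 at entry: AH = 0, mode 0)
mov bh, 0xb8        ; 2 bytes  BX = 0xB800 (relies on BL = 0 at entry)
mov ds, bx          ; 2 bytes  DS -> text buffer segment
L: lodsb            ; 1 byte   AL = [DS:SI], then SI += 1
sub si, byte 57     ; 3 bytes  net step of -56 bytes per iteration
xor [si], al        ; 2 bytes  fold the previous cell into the current one
out 61h, al         ; 2 bytes  send the same byte to the speaker port
jmp short L         ; 2 bytes  loop forever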
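As a sanity check on the 16-byte total, the listing can be compared against its machine code. The sketch below hard-codes the opcode bytes an assembler such as NASM would be expected to emit for each instruction. These encodings are an assumption and were not read from the released binary. The names `PROGRAM`, `total_size`, and `loop_displacement` are illustrative only.

```python
# Expected 8086 encodings per instruction (assumed, not taken from the released binary).
PROGRAM = [
    ("int 10h",         bytes([0xCD, 0x10])),
    ("mov bh, 0xb8",    bytes([0xB7, 0xB8])),
    ("mov ds, bx",      bytes([0x8E, 0xDB])),
    ("lodsb",           bytes([0xAC])),          # label L points here
    ("sub si, byte 57", bytes([0x83, 0xEE, 0x39])),
    ("xor [si], al",    bytes([0x30, 0x04])),
    ("out 61h, al",     bytes([0xE6, 0x61])),
    ("jmp short L",     bytes([0xEB, 0xF6])),
]


def total_size(program):
    """Sum of the encoded lengths of all instructions."""
    return sum(len(code) for _, code in program)


def loop_displacement(program, label_index):
    """8-bit displacement for a final `jmp short` back to instruction `label_index`.

    A short jump is relative to the address of the byte after the jump,
    which for the last instruction is the end of the program.
    """
    target = sum(len(code) for _, code in program[:label_index])
    return (target - total_size(program)) & 0xFF


if __name__ == "__main__":
    for text, code in PROGRAM:
        print(f"{text:<18} {code.hex(' ')}")
    print("total bytes:", total_size(PROGRAM))
```

The displacement helper recomputes the jump operand from the instruction lengths, which is a quick way to confirm that the loop really returns to `lodsb` rather than to the setup code.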
1. The Canvas: A Primed Void
The code begins with a standard BIOS interrupt: int 10h. This relies on AX being zero at program entry, as is typical for a DOS .COM file, so the call selects the Set Video Mode function (AH = 0) with mode 0, a 40x25 text grid. The next two instructions load 0xB8 into BH (BL is likewise assumed to be zero) and copy BX into the Data Segment register (DS). DS then holds 0xB800, the segment of the VGA/CGA text buffer, which begins at physical address 0xB8000. When the BIOS clears the screen during this interrupt, it does not fill the memory with zeroes. In text mode, every character cell consists of two bytes: the ASCII character and the color attribute. The BIOS initializes the cells uniformly, including the 1,000 visible cells (2,000 bytes) of the 40x25 page: the character byte is set to 0x20 (the space character), and the attribute byte is set to 0x07 (light gray text on a black background). Although the screen appears completely empty, it is actually a canvas primed with a uniform pattern of data.
The Importance of Uniformity: The mathematical progression we are about to explore relies on a predictable environment. If the memory contained leftover data, the calculation would ingest those discrepancies. In a cellular automaton, a single unexpected bit can disrupt the pattern. A reasonably uniform memory space provides the foundation for the fractal to emerge clearly.
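The layout of the primed text page can be sketched in a few lines. This is a model of the visible 40x25 page only, with character and attribute bytes interleaved as described in this section. The helper names `primed_page` and `cell_offset` are hypothetical and chosen for illustration.

```python
COLS, ROWS = 40, 25               # mode 0 text grid
SPACE = 0x20                      # character byte written by the BIOS clear
LIGHT_GRAY_ON_BLACK = 0x07        # attribute byte written by the BIOS clear


def primed_page():
    """Model of the visible page right after the mode set: (char, attr) pairs."""
    page = bytearray()
    for _ in range(COLS * ROWS):
        page += bytes([SPACE, LIGHT_GRAY_ON_BLACK])
    return page


def cell_offset(row, col):
    """Byte offset of a cell's character byte; its attribute byte is at offset + 1."""
    return (row * COLS + col) * 2


if __name__ == "__main__":
    page = primed_page()
    print(len(page), "bytes,", len(page) // 2, "cells")
    print("first cell:", hex(page[0]), hex(page[1]))
    print("offset of row 1, col 0:", cell_offset(1, 0))
```

Even offsets hold characters and odd offsets hold attributes, so no byte of the "empty" page is actually zero.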
2. The Engine: Additive Prefix Sums
To understand the pure mathematics of the fractal, let us temporarily simplify the setup. We model a perfectly zeroed memory instead of the 0x20 fill, substitute add for xor, and step forward 16 bytes at a time, with the accumulator AL initially holding the value 2. A real-mode segment spans exactly 65,536 bytes, so at 16 bytes per iteration it takes exactly 4,096 steps to traverse it ($65536 / 16 = 4096$). When the SI register advances past 0xFFFF, it wraps cleanly back to 0x0000. On each step, the loop adds the accumulator to the current memory cell and reads the updated value back into the accumulator, which makes every sweep a running prefix sum of the previous one. The first sweep writes 2 into every cell; the second produces 4, 6, 8, and so on. Because 4,096 is a multiple of 256 (the number of values an 8-bit register can hold), the sums taken modulo 256 line up when the segment wraps, returning AL to 2 at the end of a full sweep, at least for the early sweeps modeled here.
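The simplified additive model can be simulated directly. The sketch below follows the assumptions stated in this section (zeroed memory, add instead of xor, a 16-byte stride giving 4,096 cells, AL starting at 2) and is not an emulation of the real program. The function name `run_additive` is illustrative.

```python
SEGMENT_SIZE = 65536
STRIDE = 16
CELLS = SEGMENT_SIZE // STRIDE    # 4096 cells visited per sweep


def run_additive(sweeps, seed=2):
    """Return a list of (cells_after_sweep, al_after_sweep) for each sweep.

    Each step adds AL into the current cell (modulo 256, as an 8-bit add would)
    and then reloads AL from that cell, producing a running prefix sum.
    """
    cells = [0] * CELLS
    al = seed
    history = []
    for _ in range(sweeps):
        for k in range(CELLS):
            cells[k] = (cells[k] + al) & 0xFF
            al = cells[k]
        history.append((list(cells), al))
    return history


if __name__ == "__main__":
    for number, (cells, al) in enumerate(run_additive(4), start=1):
        print(f"sweep {number}: first cells {cells[:6]}  AL at end = {al}")
```

Printing the first few cells of each sweep gives the kind of additive table the discussion refers to: constant, then linear, then quadratic growth, all modulo 256.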
3. Crystallization: XOR and the Sierpinski Shift
A deeper pattern is present within those decimal values. In binary addition, carries spill from each bit-plane into the next. If we discard the arithmetic carry and add strictly modulo 2, we are left with the Exclusive OR (XOR) operation. By using xor instead of add, the algorithm keeps the bit-planes isolated. Because our modeled starting value is 2 (binary 00000010), only Bit 1 is ever affected by this calculation, and the cascading decimal numbers collapse into a pure toggle between 0x00 and 0x02. This progression is an in-place relative of Rule 60 in Stephen Wolfram's elementary cellular automata. Rule 60 XORs each cell with its left neighbor from the previous generation; here the left neighbor has already been updated in the current pass $p$, which shears the triangle but still yields a Sierpinski pattern:
$$Cell^{(p)}[k] = Cell^{(p-1)}[k] \oplus Cell^{(p)}[k-1]$$
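The values in the additive model are prefix sums of prefix sums, that is, binomial coefficients scaled by 2, so Bit 1 of each value is a binomial coefficient taken modulo 2. According to Lucas's Theorem, a binomial coefficient $\binom{n}{r}$ is odd exactly when every binary digit set in $r$ is also set in $n$. The XOR relationship is therefore guaranteed to match the state of Bit 1 in the additive model, and that bit traces out the Sierpinski pattern.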
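The relationship between the two models can be checked numerically. The sketch below reuses the simplified setup from the previous section (zeroed memory, 4,096 cells, seed 2) and compares three things for the first few sweeps: the XOR table, Bit 1 of the additive table, and the parity predicted by Lucas's Theorem. The names `simulate` and `lucas_odd` are illustrative, and the binomial indexing `C(k + p, p - 1)` is specific to this model.

```python
CELLS = 4096


def simulate(sweeps, combine, seed=2):
    """Run the in-place recurrence cell[k] = combine(cell[k], AL); AL = cell[k]."""
    cells = [0] * CELLS
    al = seed
    table = []
    for _ in range(sweeps):
        for k in range(CELLS):
            cells[k] = combine(cells[k], al) & 0xFF
            al = cells[k]
        table.append(list(cells))
    return table


def lucas_odd(n, r):
    """Lucas's Theorem for p = 2: C(n, r) is odd iff every set bit of r is set in n."""
    return (r & ~n) == 0


def tables_agree(sweeps=8):
    """Check XOR table == Bit 1 of additive table == Lucas parity, cell by cell."""
    additive = simulate(sweeps, lambda m, a: m + a)
    xored = simulate(sweeps, lambda m, a: m ^ a)
    for i in range(sweeps):            # i = p - 1, the zero-based sweep index
        for k in range(CELLS):
            predicted = 2 if lucas_odd(k + i + 1, i) else 0
            if not (xored[i][k] == (additive[i][k] & 2) == predicted):
                return False
    return True


if __name__ == "__main__":
    for row in simulate(8, lambda m, a: m ^ a):
        print("".join("#" if value else "." for value in row[:32]))
    print("tables agree:", tables_agree())
```

Rendering each sweep as a row of `#` and `.` makes the sheared triangle visible without any graphics hardware.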
4. The Voice of the Machine: Translating Data to Audio
A remarkably elegant detail lies in the instruction out 61h, al. Port 61h interfaces with the internal PC speaker. Bit 1 of this port directly controls the speaker cone, pushing it outward when set to 1 and letting it return when set to 0. Our routine computes the fractal via XOR, updates the memory, and immediately sends that same byte to the speaker port. Because the algorithm isolates and toggles Bit 1, the geometry of the Sierpinski triangle serves as a direct set of instructions for the speaker cone. The execution speed of the CPU sets the effective sample rate, and the runs of 1s and 0s generated by the fractal yield distinct square waves that vary naturally in pulse width and frequency.
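To make the data-to-audio idea concrete, the sketch below takes a sequence of bytes as they would be written to port 61h, extracts Bit 1 as the speaker level, and measures how long each level is held. It models only the bit described in this section and ignores timing and the other bits of the port. The helper names are illustrative.

```python
SPEAKER_BIT = 0x02   # Bit 1 of the byte written to port 61h


def speaker_levels(stream):
    """Map each output byte to the speaker level it requests (0 or 1)."""
    return [(value & SPEAKER_BIT) >> 1 for value in stream]


def pulse_widths(levels):
    """Collapse a level sequence into (level, run_length) pairs.

    Run lengths are counted in loop iterations; the CPU speed determines
    how long one iteration lasts in real time.
    """
    runs = []
    for level in levels:
        if runs and runs[-1][0] == level:
            runs[-1][1] += 1
        else:
            runs.append([level, 1])
    return [(level, count) for level, count in runs]


if __name__ == "__main__":
    sample = [0x00, 0x02, 0x02, 0x00, 0x02, 0x00, 0x00, 0x00]
    levels = speaker_levels(sample)
    print("levels:", levels)
    print("pulses:", pulse_widths(levels))
```

Short runs correspond to higher pitches and long runs to lower ones, which is why different regions of the fractal sound different.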
5. The 56-Byte Step: Octave Shifts and Diagonal Shears
Returning to the actual code, we notice it does not step by 16. lodsb increments SI by 1, and sub si, byte 57 then subtracts 57, for a net movement of -56 bytes per iteration. The routine therefore traverses memory in reverse. This adjustment alters both the pitch of the audio and the visual layout of the output.
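The pointer arithmetic can be traced with a small model of SI. The sketch below applies the two adjustments in order, wraps at 16 bits, and maps byte offsets to text cells in 40-column mode. The starting offset 0x0100 used in the demo loop is a hypothetical value for illustration; the listing itself does not set SI.

```python
from math import gcd

SEGMENT_SIZE = 0x10000   # SI is a 16-bit register, so offsets wrap at 65,536


def next_si(si):
    """SI after one loop iteration: lodsb adds 1, then `sub si, byte 57`."""
    si = (si + 1) & 0xFFFF
    si = (si - 57) & 0xFFFF
    return si


def cycle_length(step=56):
    """Iterations before SI returns to its starting value."""
    return SEGMENT_SIZE // gcd(step, SEGMENT_SIZE)


def cell_position(offset, cols=40):
    """(row, column) of the text cell containing this byte offset."""
    return divmod(offset // 2, cols)


if __name__ == "__main__":
    si = 0x0100   # hypothetical starting offset, for illustration only
    for _ in range(4):
        si = next_si(si)
        print(f"SI = {si:#06x}  cell (row, col) = {cell_position(si)}")
    print("iterations per full cycle:", cycle_length())
```

Since 56 and 65,536 share a factor of 8, the pointer returns to its start after 8,192 iterations and only ever touches every eighth byte. In the sample run, consecutive targets move one row up and twelve columns to the right, because -56 = -80 + 24 and a 40-column row occupies 80 bytes.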