An interactive introduction to the terrific experience of rendering Arabic typography and its technical debt
Jun 10, 2026
Once upon a time, a frontend ticket landed in my queue that was not properly mine, but the only other Arabic reader on the team was on leave. It went roughly as follows: a block of mixed-content Arabic prose on the customer-facing dashboard was rendering with a ragged left edge (the rag falls on the left in Arabic, since the lines set out from the right margin; the ticket said “ragged right”) when the design team had explicitly specified justified text. Attached were three screenshots from three browsers and a polite note from the product manager observing that the Latin-script version of the same block looked, I quote, “fine.”
In the same six months I had closed three other tickets against the same product, each of which had presented to its filer as the only bug. A customer’s name had appeared with its letters unjoined on a printed agreement, the way a sign-painter would have laid them out in 1962, because the PDF library on the receipt server pre-dated the existence of a shaping engine in its language runtime. A search index had been returning empty for accounts the customer service team could see in the database, because a 2017 import had encoded twelve thousand names using fossil Unicode codepoints from 1991 instead of regular ones from 1995, and the index, very reasonably, treated the two encodings as different strings. So that ragged-left ticket was the smallest of the four, but it sat on top of the same iceberg and pointed at the same thing.
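The search-index failure is easy to reproduce. Here is a minimal sketch in Python, assuming purely for illustration that the imported names were stored in the Arabic Presentation Forms-B block (one kind of legacy codepoint; which codepoints the import actually used is not recorded here). The helper `search_key` is hypothetical: it folds both spellings into one by applying NFKC normalization before a name is indexed or queried.

```python
import unicodedata


def search_key(name: str) -> str:
    """Hypothetical helper: fold compatibility codepoints before indexing.

    NFKC maps each Arabic presentation form (one codepoint per positional
    shape of a letter) back to the ordinary letter it stands for.
    """
    return unicodedata.normalize("NFKC", name)


# The same four-letter name (meem, hah, meem, dal) written two ways.
# Escapes are used so the two spellings stay distinguishable in source.
legacy = "\uFEE3\uFEA4\uFEE4\uFEAA"  # initial, medial, medial, final forms
modern = "\u0645\u062D\u0645\u062F"  # ordinary Arabic-block letters

# Compared naively, the two are different strings, as an index would see them.
print(legacy == modern)

# After normalization they compare equal.
print(search_key(legacy) == search_key(modern))
```

NFKC also folds other compatibility characters, so whether it is safe to apply to every indexed field is a separate decision; the sketch only shows why two visually identical names can miss each other.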
Here is the disagreement, reproduced live. I used random text; the original had more spacing, and I am too lazy to pick words that would maximize the ragging and spacing.
[PRODUCTION, ANY BROWSER]
الخط هندسة روحانية ظهرت بآلة جسمانية، وهو لسان اليد ورسول العقل، وسفير الضمير ووحي الفكر، وسلاح المعرفة وأنس الإخوان عند الفرقة.
[checkbox: apply the fix the ticket asks for]
[THE MOCKUP, AS DESIGN APPROVED IT]
الخـــــط هندســـــة روحانيـــــة ظهــــرت
بآلــــة جسمانيــــة، وهــــو لســـان اليد
ورســـــول العقـــــل، وسفيــــر الضميــــر
ووحـــــي الفكـــــر، وســـــلاح المعــــرفة
وأنـــــس الإخـــــوان عنــــد الفــــرقــــة.
On the right, the agreed design: both margins flush, every line filled by elongating the strokes inside the words, never the spaces between them. It renders in your browser only because I placed every elongation by hand, a confession I will expand on below. On the left, what production ships. Tick the box to apply the one tool CSS offers, text-align: justify.
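The hand-placed elongations in the mockup appear as runs of U+0640 ARABIC TATWEEL typed between letters. What follows is a naive sketch in Python of where such a character may go, assuming unvocalised text in the basic Arabic block. The helper names `elongation_points` and `elongate` are hypothetical, and the joining sets are a simplified subset of the Unicode joining data. Real kashida justification belongs to the font and the shaping engine, with priority rules this sketch ignores.

```python
TATWEEL = "\u0640"
LAM = "\u0644"

# Letters that join only to the preceding letter: no stroke leaves them
# toward the next letter, so nothing after them can be elongated.
RIGHT_JOINING = set("\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648")
NON_JOINING = {"\u0621"}  # hamza on the line joins to nothing
ALEF_FORMS = set("\u0622\u0623\u0625\u0627")


def is_arabic_letter(ch: str) -> bool:
    # Simplification: basic letters only, no diacritics, no extended letters.
    return "\u0620" <= ch <= "\u064A" and ch != TATWEEL


def joins_forward(ch: str) -> bool:
    return is_arabic_letter(ch) and ch not in RIGHT_JOINING and ch not in NON_JOINING


def joins_backward(ch: str) -> bool:
    return is_arabic_letter(ch) and ch not in NON_JOINING


def elongation_points(word: str) -> list:
    """Indices where a tatweel could sit between two joined letters."""
    points = []
    for i in range(len(word) - 1):
        a, b = word[i], word[i + 1]
        if a == LAM and b in ALEF_FORMS:
            continue  # a tatweel here would break the lam-alef ligature
        if joins_forward(a) and joins_backward(b):
            points.append(i + 1)
    return points


def elongate(word: str, extra: int) -> str:
    """Insert `extra` tatweels at the last eligible point (a naive choice)."""
    points = elongation_points(word)
    if not points or extra <= 0:
        return word
    cut = points[-1]
    return word[:cut] + TATWEEL * extra + word[cut:]


# The first word of the demo text: alef, lam, khah, tah.
khatt = "\u0627\u0644\u062E\u0637"
print(elongate(khatt, 5))
```

A word made only of right-joining letters has no eligible point and comes back unchanged, which is one reason a line cannot always be filled by elongating whichever word happens to be convenient.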
(For these demonstrations this site ships its first webfont ever: Amiri, self-hosted, a hundred and fifty kilobytes of one man’s unpaid evenings, redistributed under the OFL. That this is what it takes to show you something your operating system cannot do on its own is, I want to be clear, part of the argument. I think it is a delightful hundred and fifty kilobytes.)
It did look fine. I spent about half an hour with it: I walked the rendered DOM, I set text-align: justify under every combination of font-family and direction declarations I could think of, and at the end of the exercise I wrote a reply explaining, more or less honestly, that the problem was not a bug in our stylesheet but the state of Arabic typography on the web.
The reply and the closure of the ticket took half an hour or so. The reasons behind it took five hundred years to pile up, and they involve a twice-mutilated vizier, a Qurʾān that vanished for four centuries, a Beirut newspaperman with a deadline, and an Egyptian physician who taught himself font engineering for fun (or that is what I imagine about him). Walking through these ended up being the most enjoyable couple of weeks in that job, and I want to go through it here too.
What the scribes solved
The history deserves recording because most people outside the small world of Arabic font engineering don’t know it, and it is wonderful. Classical Arabic typography, by which I mean the manuscript tradition that the early printers of Istanbul and Bulaq spent their careers chasing, justifies a line of text without stretching the spaces between words at all. Stretched spaces are the Latin convention, and in Arabic they produce an effect the scribes would have found simply ugly. Instead the scribe extends the letterforms themselves along the baseline, using what is called taṭwīl or, in the modern technical vocabulary, kashida: the connecting strokes between certain pairs of letters can be lengthened, sometimes lavishly, to carry a line out to the margin. A well-set page of Naskh from the seventeenth century has every line flush at both margins, and the result is the dense, regular weave that anyone who has spent time with a good manuscript Qurʾān will recognise on sight.
Fig. 1. A Qurʾān folio, fourteenth century, now in the Metropolitan Museum of Art. Run your eye down the left edge: every line lands flush, and not one word-space was stretched to get it there. The justification lives inside the words. (Public domain, via Wikimedia Commons.)
And this was not improvisation but a system, with a paper trail. The system was written down by Ibn Muqla, Abbasid vizier and chief calligrapher, who served three caliphs in succession and was imprisoned by two of them; the third had his right hand amputated on a charge of treasonous correspondence. Ibn Muqla then kept writing for the next several months by lashing a reed pen to the stump of his wrist, was rewarded for what he wrote by having his tongue cut out, and died in prison around the year 940. His body was buried three times in three different places, his daughter moving it after each interment to keep the grave out of police hands. The system he wrote down outlasted everybody who hurt him by a thousand years. It is called al-khaṭṭ al-mansūb, the proportional script; every letterform measured in rhombic dots of the reed nib, every curve a defined arc of a defined circle, the alif a fixed number of dots high and anything…