The dashboard ticket that pointed at thirty years of Arabic web debt

A dashboard ticket arrived describing a small problem: a block of mixed Arabic and English text on a customer page rendered with a ragged left edge, while the design spec called for both margins flush. The polite product manager noted that the Latin version of the same block "looked fine." A developer writing on the personal blog La Vita Nouva treats that mismatch as the visible tip of a much larger problem, and lays out three other tickets that share its root (An interactive introduction to the terrific experience of rendering Arabic typography and its technical debt).

The four tickets landed in a six-month window against the same product. One was the ragged-left dashboard. A second came from a printed customer agreement where a name appeared with unjoined letters, the "1962 sign-painter" look that happens when a print pipeline renders Arabic without the software that connects letters inside a word. A third ticket pointed at a search index that returned nothing for roughly 12,000 accounts after a 2017 import: the import had encoded names using Unicode forms that the standard retired in 1995, and the index treated those forms as different strings from the modern ones. Read alongside these, the ragged-left ticket exposed the same layer from a different angle.

What all four share is a single, traceable debt. The author's central argument is that the production bug, the print-runtime gap, the 1991-versus-1995 Unicode history, and a 500-year manuscript tradition of Arabic justification are different surfaces of the same underlying problem: web software built first for Latin, with Arabic treated as a translation afterthought (An interactive introduction to the terrific experience of rendering Arabic typography and its technical debt).

The mechanisms are specific. The 1991-versus-1995 codepoint issue is one of them: as the post describes it, the earliest Unicode release for Arabic used presentation forms, full glyph shapes that the standard later folded back into their underlying letters. Code written against those early codepoints still ships. A 2017 importer pulling names from a legacy system could easily inherit the old forms and store them in a search index, where a modern query for the same name, encoded in the post-1995 form, simply misses.
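The mismatch can be reproduced in a few lines. The following is a minimal sketch, not the post's code: it assumes the legacy data used codepoints from the Arabic Presentation Forms-B block, and the helper name `index_key` is hypothetical. It relies on the fact that Unicode's NFKC normalization maps those presentation forms back to their base letters.

```python
import unicodedata


def index_key(name: str) -> str:
    """Fold compatibility characters, including Arabic presentation forms,
    back to their base letters (hypothetical helper for building index keys)."""
    return unicodedata.normalize("NFKC", name)


# The name "Muhammad" as a legacy import might have stored it:
# meem-initial, hah-medial, meem-medial, dal-final presentation forms.
legacy = "\uFEE3\uFEA4\uFEE4\uFEAA"

# The same name as typed today: plain meem, hah, meem, dal.
modern = "\u0645\u062D\u0645\u062F"

# A byte-for-byte comparison treats them as different strings.
raw_match = legacy == modern

# Comparing normalized keys treats them as the same name.
folded_match = index_key(legacy) == index_key(modern)

print(raw_match, folded_match)
```

NFKC also folds other compatibility characters (full-width digits, Latin ligatures, and so on), so applying it to both the stored key and the incoming query, rather than to the displayed value, is probably the safer design.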

The print-pipeline gap is also concrete. Generating a PDF that contains Arabic requires a runtime that can connect letters into their contextual forms. Some PDF libraries predate that capability, so they render each letter in its isolated form, producing the unjoined sign-painter look. The fix exists; the runtime is the missing piece.
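To make "isolated" versus "contextual" forms tangible, here is a small sketch that lists the shapes a single letter can take. It is only an illustration: it reads the compatibility codepoints in the Arabic Presentation Forms-B block through Python's `unicodedata` module, whereas a real shaping engine selects glyphs from the font. The function name `contextual_forms` is hypothetical.

```python
import unicodedata


def contextual_forms(base_letter: str) -> dict[str, str]:
    """Map form tags (isolated/final/initial/medial) to the presentation-form
    character that decomposes to base_letter."""
    target = f"{ord(base_letter):04X}"
    forms = {}
    # Arabic Presentation Forms-B occupies U+FE70..U+FEFF.
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Entries look like "<initial> 0645"; skip ligatures and unassigned slots.
        if len(parts) == 2 and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


meem = contextual_forms("\u0645")  # four shapes: the letter joins on both sides
dal = contextual_forms("\u062F")   # two shapes: dal never joins to the next letter

for tag, ch in meem.items():
    print(tag, f"U+{ord(ch):04X}")
```

A pipeline without shaping effectively prints the "isolated" entry for every letter, which is the unjoined look described above.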

CSS text-align: justify, the property web designers reach for to fill both margins, is the third mechanism. Classical Arabic typography fills lines by elongating letterforms along the baseline, a stretch called kashida or taṭwīl, rather than by stretching the spaces between words. Browsers implement justify as a Latin inter-word stretch, so an Arabic block ends up with uneven spacing and, often, a ragged margin. The post ships an interactive demo using the open-source Amiri typeface (about 150KB and OFL licensed; the author calls it "one man's unpaid evenings") that lets readers compare what justification is supposed to look like with what production actually ships.
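A crude version of kashida justification can be sketched by inserting the tatweel character (U+0640) between letters that join. This is not the post's demo and not how a typesetter would do it: it measures width in characters rather than rendered glyph widths, the list of non-joining letters is hand-picked and incomplete, and the function names are hypothetical.

```python
TATWEEL = "\u0640"
HAMZA = "\u0621"
LAM = "\u0644"
ALEFS = "\u0622\u0623\u0625\u0627"
# Letters that never connect to the letter after them (assumption: not exhaustive).
NO_FORWARD_JOIN = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648\u0649"
)


def is_arabic_letter(ch: str) -> bool:
    return "\u0620" <= ch <= "\u064A" and ch != TATWEEL


def kashida_slots(line: str) -> list[int]:
    """Indexes i where a tatweel may go between line[i] and line[i + 1]."""
    slots = []
    for i in range(len(line) - 1):
        left, right = line[i], line[i + 1]
        if not (is_arabic_letter(left) and is_arabic_letter(right)):
            continue
        if left in NO_FORWARD_JOIN or right == HAMZA:
            continue  # the pair does not join, so there is nothing to stretch
        if left == LAM and right in ALEFS:
            continue  # leave the lam-alef ligature intact
        slots.append(i)
    return slots


def justify_with_kashida(line: str, target_len: int) -> str:
    """Pad line to target_len characters by spreading tatweels over the slots."""
    slots = kashida_slots(line)
    missing = target_len - len(line)
    if missing <= 0 or not slots:
        return line
    base, extra = divmod(missing, len(slots))
    fill = {pos: base + (1 if rank < extra else 0) for rank, pos in enumerate(slots)}
    return "".join(ch + TATWEEL * fill.get(i, 0) for i, ch in enumerate(line))


word = "\u0643\u062A\u0627\u0628"  # kaf, teh, alef, beh ("book")
stretched = justify_with_kashida(word, 7)
print(stretched)
```

The word spaces are left untouched and all of the extra length goes into the joins, which is the opposite of what an inter-word stretch does.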

The historical backdrop the argument rests on is also source-asserted. The system of proportional Arabic script, al-khaṭṭ al-mansūb, is reported as codified by Ibn Muqla around 940 CE, then refined by Ibn al-Bawwāb, whose surviving Naskh Qurʾān is held in the Chester Beatty Library in Dublin, and by Yāqūt al-Mustaʿṣimī, the Six Pens calligrapher of 1258 (An interactive introduction to the terrific experience of rendering Arabic typography and its technical debt). Those historical claims belong to the post and should be read as the author's framing rather than independently verified biography.

The partial-fix picture matters as much as the diagnosis. The debt is real, but it is tractable. The Amiri font, paired with any modern shaping engine, will connect letters correctly when a runtime actually calls it. A search index can be re-encoded; the standard retired the old forms for a reason. A dashboard that switches from CSS justify to a kashida-aware typesetter, or accepts the ragged margin as the honest output of a Latin-shaped property, can ship a real fix. The iceberg frame holds because every layer points at a specific, named cause, and every cause has at least one known mitigation.

What to watch is whether the underlying libraries catch up. PDF runtimes that ship a shaping engine by default would close the print-pipeline gap. Browser-level kashida support, or a typesetting pass that runs before the browser does, would close the dashboard gap. A re-encoding pass on legacy data would close the search gap. None of these is speculative: the components already exist, and what holds them back is maintainer time, vendor priorities, and a long English-first default.

The post closes by treating the four tickets as a single argument, and the argument is constructive. The debt is traceable, the causes are named, and a meaningful share of the fix is already in the ecosystem. The web's English-default pipeline has compounding, user-visible costs for Arabic readers, and the path off that pipeline runs through specific, mostly-known upgrades rather than a fresh start.

