

Introduction to the experience of rendering Arabic typography & its technical debt
bookofjoe · 2026-06-13 · via HN's home page
lr0.org · 79 points by bookofjoe 4 hours ago · 15 comments


One thing I sometimes think about when I think about text layout problems is how the text we use also has a bunch of complexities that we can take for granted.

Think of variable-width characters, kerning, ligatures, hyphenation, and justification. Imagine computing had been won by a CJK language, which has none of these problems. You could imagine a similar article about how exotic and difficult English layout is.


Disclaimer: I’m not fluent in Arabic by any means, but the stretched out to both margins style looks very Quranic to me. I don’t think it looks appropriate for say, a message about my DoorDasher.


awesome article! i appreciated the alternatives and misc. workarounds to OT/unicode typesetting toward the end, very helpful :D (go blue!)


Is this a small typo?

> The relevant rule, W2 of UAX #9, reclassifies a digit as an ARABIC NUMBER if any of the previous strong characters in the paragraph were Arabic letters, and as a EUROPEAN NUMBER otherwise. Both render their internal digits left-to-right, which is correct: numbers everywhere on Earth are read most-significant-first.

Does the author mean most-significant-on-the-left? The statement as written is a statement about the order in which one reads or perhaps thinks the number, whereas I think the author is discussing how numbers, including collections of numbers delimited by hyphens and such, should be laid out on the page.
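
A minimal sketch of the rule under discussion, in Python, using the standard library's `unicodedata.bidirectional()` to look up bidi classes. It assumes a simplified reading of W2 in which a European number (EN) becomes an Arabic number (AN) when the nearest preceding strong character is an Arabic letter (AL). The helper name `resolve_w2` and its `sos` default are hypothetical, and the full algorithm applies this step per isolating run sequence after other rules, which this sketch ignores.

```python
import unicodedata

# Strong bidi classes: left-to-right, right-to-left, Arabic letter.
STRONG = {"L", "R", "AL"}


def resolve_w2(text, sos="L"):
    """Return the bidi class of each character after a simplified W2 pass.

    `sos` stands in for the start-of-sequence direction (an assumption here;
    the real algorithm derives it from embedding levels).
    """
    last_strong = sos
    resolved = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG:
            last_strong = cls
        # W2 (simplified): a European number after an Arabic letter
        # is reclassified as an Arabic number.
        if cls == "EN" and last_strong == "AL":
            cls = "AN"
        resolved.append(cls)
    return resolved


latin = resolve_w2("ab 12")
arabic = resolve_w2("\u0627\u0628 12")  # two Arabic letters, space, "12"
```

In both cases the digits keep their left-to-right internal order; only their class changes, which affects how the number is placed relative to the surrounding text.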


I think he's talking about the rendering algorithm with regard to the stream of text, essentially saying rendering direction should follow reading convention.

on the other hand, in formal Arabic, it's not unusual that numbers are read in clusters from least significant to most significant (right to left). 1984 would be read: eighty four and nine hundred and a thousand. not sure if the author is aware of this


This article is wonderful. It's interesting, captivating, and full of detail, and to think I never gave much thought to Arabic rendering before.

This part nearly had me chuckle audibly:

> He says yes. The result is "Simplified Arabic": initial fused into medial, final into isolated, ligatures dropped. It conquers the Arab newsroom in a generation. Mrowa is assassinated at his desk eight years later, by an unrelated faction, in an unrelated dispute.

Also, it's depressing how hundreds of millions of people couldn't even get their language typeset on a computer, and our industry meanwhile was busy building AI-native AI for your groceries (have we mentioned it has AI btw?) and similar performative bullshit.


> Internet Explorer 5.5 implements text-justify: kashida. For one brief, weird browser-quarter Microsoft is the only software vendor on earth that can justify Arabic correctly on a screen.

the entire article has llm tells all over it. i read it anyway, and i'm grateful for all the facts i learned (although i cannot trust all of them, for reasons aforementioned), but i genuinely (!) think it's a shame because the topic is an absolutely fascinating one

(i will permit myself to not explain in excruciating detail why i feel that way about this, as we have this discussion several times a day on this site)


Very interesting. I just implemented a text shaper and renderer from scratch with support for complex scripts like Arabic, Nastaliq and Indic (will soon post about it here on HN). Now that you write about it, the lack of stretching really is a deficiency in the OpenType spec.

If you want a solution for this it has to happen in the rendering step, not the shaping (which is HarfBuzz's main task). The shaper has no information about the available space, but when rendering you could stretch individual glyphs to the desired width, similar to adjusting the width of whitespace in Latin, but more complex, because you actually have to modify the glyphs with a scale transform. I am not an expert on Arabic script by any means, but this should be possible IMO. It would at least be an interesting experiment. Of course the JSTF table would be the right way to do it, but there seems to be a lot of confusion around it. Maybe in the age of LLMs we can give it another shot.
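
A rough sketch of that render-step idea in Python, assuming the shaper has already produced per-glyph advances and that some glyphs are marked as safe to stretch. The function name `stretch_factors` and its inputs are hypothetical, and uniform horizontal scaling is only a stand-in for whatever a real font or renderer would do to elongate a glyph.

```python
def stretch_factors(advances, stretchable, target_width):
    """Return a horizontal scale factor per glyph so the line fills target_width.

    advances:     natural advance width of each shaped glyph
    stretchable:  set of glyph indices that may be widened (assumed input)
    target_width: available line width, known only at render time
    """
    slack = target_width - sum(advances)
    stretch_total = sum(advances[i] for i in stretchable)
    # Nothing to do if the line already fills the space or nothing can stretch.
    if slack <= 0 or stretch_total == 0:
        return [1.0] * len(advances)
    # Spread the slack over stretchable glyphs in proportion to their width.
    factor = 1.0 + slack / stretch_total
    return [factor if i in stretchable else 1.0 for i in range(len(advances))]


factors = stretch_factors([10.0, 20.0, 10.0], {1}, 50.0)
scaled_width = sum(a * f for a, f in zip([10.0, 20.0, 10.0], factors))
```

The shaper never sees `target_width` here, which matches the point above: the decision to stretch needs layout information that only the rendering step has.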


It does seem like a Knuth-Plass-like (KP-like) algorithm ought to be able to optimize the break positions without extreme algorithmic difficulty, aside from the inputs being considerably more complex than for Latin block print. The cost function for a proposed line is a straightforward [0] calculable function of the contents of the line, and I think one could make a dynamic programming algorithm that tracks, for each input position, the cost of the optimal layout of all text up to that position with a break at that position. This gives an algorithm that takes cubic time. (For input length n, you need to fill in n values in the table. Each value scans the entire table before that position and does a calculation with complexity linear in the proposed line length.)

As a practical matter, there’s an input length n and there is some upper bound B on a credible line length as measured in code points, so there are at most n*B credible proposed lines to evaluate. That also limits the useful lookback in the table to B positions, so I think the time complexity could be reduced to O(n*B^2) without making the results worse on reasonable inputs, and this is probably quite tolerable.

[0] Straightforward once you’ve implemented the whole Arabic rendering stack, anyway. I am certainly not qualified to calculate this function :)
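
A minimal Python sketch of the bounded-lookback dynamic program described above. It assumes the text has already been split into unbreakable items with known widths, and it uses squared leftover space as a placeholder cost; a real Arabic cost function would be far more involved. The names `line_cost`, `optimal_breaks`, and `max_items` (playing the role of B) are hypothetical.

```python
import math


def line_cost(widths, start, end, line_width):
    """Placeholder cost of putting items[start:end] on one line (linear time)."""
    used = sum(widths[start:end])
    if used > line_width:
        return math.inf  # line overflows, so it is not a credible candidate
    return (line_width - used) ** 2


def optimal_breaks(widths, line_width, max_items):
    """Return break positions (end index of each line) minimizing total cost.

    best[j] is the cost of the best layout of items[0:j] with a break at j.
    Only the previous `max_items` positions are scanned, giving O(n * B^2).
    """
    n = len(widths)
    best = [math.inf] * (n + 1)
    prev = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_items), end):
            cost = best[start] + line_cost(widths, start, end, line_width)
            if cost < best[end]:
                best[end] = cost
                prev[end] = start
    if math.isinf(best[n]):
        raise ValueError("no feasible layout for the given line width")
    # Walk the back-pointers to recover the chosen breaks.
    breaks = []
    pos = n
    while pos > 0:
        breaks.append(pos)
        pos = prev[pos]
    return breaks[::-1]


even_breaks = optimal_breaks([3, 3, 3, 3], line_width=6, max_items=4)
single_breaks = optimal_breaks([5, 5, 5], line_width=6, max_items=2)
```

Unlike a production Knuth-Plass implementation, this sketch penalizes the last line the same as every other line and has no notion of stretch or shrink.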


very interesting, arabic is a good reminder that text rendering is mostly solved for the scripts that shaped the defaults.

The hard part is that typography, shaping, bidi behavior, font fallback, search, and the editor model all leak into each other.

You cannot fix one layer cleanly when the assumptions are wrong in all of them.