




















We present a vision-language model (VLM) that automatically edits website HTML to address violations of the Web Content Accessibility Guidelines 2 (WCAG2) while preserving the original design. We formulate this as a supervised image-conditioned program synthesis task, where the model learns to correct HTML given both the code and its visual rendering. We create WebAccessVL, a website dataset with manually corrected accessibility violations. We then propose a violation-conditioned VLM that further takes the detected violations' descriptions from a checker as input. This conditioning enables an iterative checker-in-the-loop refinement strategy at test time. We conduct extensive evaluation on both open API and open-weight models. Empirically, our method achieves 0.211 violations per website, a 96.0\% reduction from the 5.34 violations in raw data and 87\% better than GPT-5. A perceptual study also confirms that our edited websites better maintain the original visual appearance and content.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。