

















Abstract: Large language models (LLMs) are increasingly used to describe, evaluate and interpret places, yet it remains unclear whether they do so from a culturally neutral standpoint. Here we test urban perception in frontier LLMs using a balanced global street-view sample and prompts that either remain neutral or invoke different regional cultural standpoints. Across open-ended descriptions and structured place judgments, the neutral condition proved not to be neutral in practice. Prompts associated with Europe and Northern America remained systematically closer to the baseline than many non-Western prompts, indicating that model perception is organized around a culturally uneven reference frame rather than a universal one. Cultural prompting also shifted affective evaluation, producing sentiment-based ingroup preference for some prompted identities. Comparisons with regional human text-image benchmarks showed that culturally proximate prompting could improve alignment with human descriptions, but it did not recover human levels of semantic diversity and often preserved an affectively elevated style. The same asymmetry reappeared in structured judgments of safety, beauty, wealth, liveliness, boredom and depression, where model outputs were interpretable but only partly reproduced human group differences. These findings suggest that LLMs do not simply perceive cities from nowhere: they do so through a culturally uneven baseline that shapes what appears ordinary, familiar and positively valued.
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY)
Cite as: arXiv:2604.20048 [cs.CL] (or arXiv:2604.20048v2 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2604.20048 (arXiv-issued DOI via DataCite)
From: Rong Zhao
[v1] Tue, 21 Apr 2026 23:05:15 UTC (8,469 KB)
[v2] Tue, 26 May 2026 15:47:03 UTC (8,186 KB)