

















Abstract: We introduce interaction SSD, an extension of Supervised Semantic Differential (SSD) that models how semantic meaning varies across moderators such as groups, traits, or conditions, making this variation testable and interpretable. The method estimates a main semantic gradient, an interaction gradient, and conditional gradients, all interpretable through standard SSD tools. We illustrate it on the UC Berkeley Measuring Hate Speech corpus, testing whether annotator racial identity moderates hate-speech judgments of comments targeting people of color. The interaction model detects a significant moderation effect: the shared gradient contrasts dehumanizing hostility with counter-speech, while the interaction gradient reveals smaller group-linked differences in which semantic cues predict hate-speech ratings. Interaction SSD makes moderated meaning-outcome relationships statistically testable and interpretable.
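The three quantities named in the abstract (main, interaction, and conditional gradients) can be illustrated with an ordinary moderated regression. The sketch below is not the authors' implementation: it assumes each text is already represented by a small feature vector `X`, that the moderator `m` is a single numeric variable, and that plain least squares is an acceptable stand-in for whatever estimator the paper uses. The function names `fit_interaction_gradients` and `conditional_gradient`, and the synthetic data, are hypothetical.

```python
import numpy as np

def fit_interaction_gradients(X, m, y):
    """Least-squares fit of y ~ 1 + X + m + X*m (illustrative assumption).

    X: (n, d) text features, m: (n,) moderator, y: (n,) outcome ratings.
    Returns the main gradient and the interaction gradient, each of shape (d,).
    """
    n, d = X.shape
    # Design columns: intercept, features, moderator, feature-by-moderator products.
    design = np.column_stack([np.ones(n), X, m, X * m[:, None]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    main = coef[1:1 + d]        # gradient when the moderator equals 0
    interaction = coef[2 + d:]  # change in gradient per unit of the moderator
    return main, interaction

def conditional_gradient(main, interaction, m_value):
    """Gradient implied by the fitted model at a given moderator value."""
    return main + m_value * interaction

# Synthetic data, generated only to exercise the functions.
rng = np.random.default_rng(0)
n, d = 500, 4
X = rng.normal(size=(n, d))
m = rng.integers(0, 2, size=n).astype(float)  # binary moderator, e.g. group membership
true_main = np.array([1.0, -0.5, 0.0, 0.25])
true_inter = np.array([0.0, 0.5, 0.0, -0.25])
y = X @ true_main + (X * m[:, None]) @ true_inter + 0.3 * m
y = y + rng.normal(scale=0.01, size=n)

main, inter = fit_interaction_gradients(X, m, y)
gradient_group_1 = conditional_gradient(main, inter, 1.0)
```

With a binary moderator, `main` is the gradient for the reference group and `gradient_group_1` is the gradient for the other group; a test of the interaction coefficients is what would make the moderation claim statistically checkable in this simplified setting.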
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2605.27322 [cs.CL] (or arXiv:2605.27322v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2605.27322 (arXiv-issued DOI via DataCite, pending registration)
From: Felix Ostrowicki
[v1] Tue, 26 May 2026 17:33:02 UTC (36 KB)