RoTRAG: Rule of Thumb Reasoning for Conversation Harm Detection with Retrieval-Augmented Generation
Juhyeon Lee, Wonduk Seo, Junseo Koh, Seunghyun Lee, Haihua Chen · 2026-04-19 · via cs.AI updates on arXiv.org

Detecting harmful content in multi-turn dialogue requires reasoning over the full conversational context rather than isolated utterances. However, most existing methods rely mainly on the model's internal parametric knowledge, without explicit grounding in external normative principles. This often leads to inconsistent judgments in socially nuanced contexts, limited interpretability, and redundant reasoning across turns. To address this, we propose RoTRAG, a retrieval-augmented framework that incorporates concise, human-written moral norms, called Rules of Thumb (RoTs), into LLM-based harm assessment. For each turn, RoTRAG retrieves relevant RoTs from an external corpus and uses them as explicit normative evidence for turn-level reasoning and final severity classification. To improve efficiency, we further introduce a lightweight binary routing classifier that decides whether a new turn requires retrieval-grounded reasoning or can reuse existing context. Experiments on ProsocialDialog and Safety Reasoning Multi Turn Dialogue show that RoTRAG consistently improves both harm classification and severity estimation over competitive baselines, with an average relative gain of around 40% in F1 across benchmark datasets and an average relative reduction of 8.4% in distributional error, while reducing redundant computation without sacrificing performance.
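
The per-turn control flow described above (route, then either retrieve RoTs or reuse existing context) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-sentence corpus, the bag-of-words cosine similarity, the similarity threshold standing in for the learned binary routing classifier, and every function name are hypothetical choices made for illustration only. The LLM-based reasoning and severity classification steps are omitted.

```python
import math
from collections import Counter

# Hypothetical toy corpus of Rules of Thumb (illustrative, not from the paper).
ROT_CORPUS = [
    "It is wrong to threaten to hurt other people.",
    "It is good to help a friend who is struggling.",
    "You should not steal things from a store.",
]


def bag_of_words(text):
    """Lowercased word counts with surrounding punctuation stripped."""
    return Counter(word.strip(".,!?").lower() for word in text.split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_rots(turn, corpus=ROT_CORPUS, k=1):
    """Return the k corpus entries most similar to the turn."""
    query = bag_of_words(turn)
    ranked = sorted(corpus, key=lambda rot: cosine(query, bag_of_words(rot)),
                    reverse=True)
    return ranked[:k]


def needs_retrieval(turn, cached_rots, threshold=0.2):
    """Assumed stand-in for the routing classifier: retrieve only when
    no cached RoT is similar enough to the new turn."""
    query = bag_of_words(turn)
    return all(cosine(query, bag_of_words(rot)) < threshold
               for rot in cached_rots)


def process_dialogue(turns):
    """Walk the dialogue turn by turn, recording the routing decision
    and the RoTs that would ground the reasoning for that turn."""
    cached_rots = []
    trace = []
    for turn in turns:
        if needs_retrieval(turn, cached_rots):
            cached_rots = retrieve_rots(turn)
            trace.append(("retrieve", cached_rots))
        else:
            trace.append(("reuse", cached_rots))
    return trace


if __name__ == "__main__":
    dialogue = [
        "I want to steal things from that store",
        "I will steal from the store tonight",
        "Maybe I should threaten to hurt people there",
    ]
    for decision, rots in process_dialogue(dialogue):
        print(decision, rots)
```

In this toy dialogue the second turn stays on the topic of the first, so the cached rule is reused, while the third turn shifts topic and triggers a fresh retrieval. A real system would replace the threshold with a trained classifier and pass the retrieved RoTs to an LLM as evidence.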