惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

罗磊的独立博客
T
Tenable Blog
WordPress大学
WordPress大学
爱范儿
爱范儿
博客园 - 【当耐特】
Jina AI
Jina AI
T
Tailwind CSS Blog
博客园 - Franky
IT之家
IT之家
J
Java Code Geeks
博客园 - 叶小钗
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
大猫的无限游戏
大猫的无限游戏
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
博客园_首页
S
SegmentFault 最新的问题
月光博客
月光博客
阮一峰的网络日志
阮一峰的网络日志
博客园 - 聂微东
博客园 - 司徒正美
酷 壳 – CoolShell
酷 壳 – CoolShell
The Cloudflare Blog
有赞技术团队
有赞技术团队
小众软件
小众软件
Apple Machine Learning Research
Apple Machine Learning Research
Last Week in AI
Last Week in AI
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
V
Visual Studio Blog
V
V2EX
雷峰网
雷峰网
宝玉的分享
宝玉的分享
Hugging Face - Blog
Hugging Face - Blog
美团技术团队
腾讯CDC
NISL@THU
NISL@THU
人人都是产品经理
人人都是产品经理
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
T
Threat Research - Cisco Blogs
Scott Helme
Scott Helme
Simon Willison's Weblog
Simon Willison's Weblog
Spread Privacy
Spread Privacy
P
Palo Alto Networks Blog
T
Threatpost
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
博客园 - 三生石上(FineUI控件)
A
Arctic Wolf
量子位
I
Intezer
B
Blog RSS Feed
L
LangChain Blog

cs.CL updates on arXiv.org

Indexing Multimodal Language Models for Large-scale Image Retrieval SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments PersonaVLM: Long-Term Personalized Multimodal LLMs MedRCube: A Multidimensional Framework for Fine-Grained and In-Depth Evaluation of MLLMs in Medical Imaging Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning? Addressing Overthinking in Large Vision-Language Models via Gated Perception-Reasoning Optimization VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows (How) Learning Rates Regulate Catastrophic Overtraining Parameter Importance is Not Static: Evolving Parameter Isolation for Supervised Fine-Tuning $π$-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data A Domain-Specific Language for LLM-Driven Trigger Generation in Multimodal Data Collection The Consciousness Cluster: Emergent preferences of Models that Claim to be Conscious KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context Dental-TriageBench: Benchmarking Multimodal Reasoning for Hierarchical Dental Triage Detection Without Correction: A Robust Asymmetry in Activation-Based Hallucination Probing Better and Worse with Scale: How Contextual Entrainment Diverges with Model Size C2: Scalable Rubric-Augmented Reward Modeling from Binary Preferences Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference A Multi-Model Approach to English-Bangla Sentiment Classification of Government Mobile Banking App Reviews Mathematical Reasoning Enhanced LLM for Formula Derivation: A Case Study on Fiber NLI Modellin Red Skills or Blue Skills? A Dive Into Skills Published on ClawHub Can Large Language Models Reliably Extract Physiology Index Values from Coronary Angiography Reports? IWLV-Ramayana: A Sarga-Aligned Parallel Corpus of Valmiki's Ramayana Across Indian Languages Unleashing Implicit Rewards: Prefix-Value Learning for Distribution-Level Optimization Evaluating the Evaluator: Problems with SemEval-2020 Task 1 for Lexical Semantic Change Detection AgentSPEX: An Agent SPecification and EXecution Language TLoRA+: A Low-Rank Parameter-Efficient Fine-Tuning Method for Large Language Models Empirical Evidence of Complexity-Induced Limits in Large Language Models on Finite Discrete State-Space Problems with Explicit Validity Constraints CANVAS: Continuity-Aware Narratives via Visual Agentic Storyboarding Using reasoning LLMs to extract SDOH events from clinical notes ToolSpec: Accelerating Tool Calling via Schema-Aware and Retrieval-Augmented Speculative Decoding Synthesizing Instruction-Tuning Datasets with Contrastive Decoding Debate to Align: Reliable Entity Alignment through Two-Stage Multi-Agent Debate YOCO++: Enhancing YOCO with KV Residual Connections for Efficient LLM Inference MM-Doc-R1: Training Agents for Long Document Visual Question Answering through Multi-turn Reinforcement Learning Foresight Optimization for Strategic Reasoning in Large Language Models Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection Co-FactChecker: A Framework for Human-AI Collaborative Claim Verification Using Large Reasoning Models Learning the Cue or Learning the Word? Analyzing Generalization in Metaphor Detection for Verbs An Empirical Investigation of Practical LLM-as-a-Judge Improvement Techniques on RewardBench 2 From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding Reward Design for Physical Reasoning in Vision-Language Models Beyond Arrow's Impossibility: Fairness as an Emergent Property of Multi-Agent Collaboration IndicDB -- Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages Syn-TurnTurk: A Synthetic Dataset for Turn-Taking Prediction in Turkish Dialogues BenGER Platform: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks Training-Free Test-Time Contrastive Learning for Large Language Models Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments From Prediction to Justification: Aligning Sentiment Reasoning with Human Rationale via Reinforcement Learning Peer-Predictive Self-Training for Language Model Reasoning Giving Voice to the Constitution: Low-Resource Text-to-Speech for Quechua and Spanish Using a Bilingual Legal Corpus English is Not All You Need: Systematically Exploring the Role of Multilinguality in LLM Post-Training L2D-Clinical: Learning to Defer for Adaptive Model Selection in Clinical Text Classification Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs InfiniteScienceGym: An Unbounded, Procedurally-Generated Benchmark for Scientific Analysis Toward Generalized Cross-Lingual Hateful Language Detection with Web-Scale Data and Ensemble LLM Annotations Self-Calibrating Language Models via Test-Time Discriminative Distillation HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation Claim2Vec: Embedding Fact-Check Claims for Multilingual Similarity and Clustering Spoiler Alert: Narrative Forecasting as a Metric for Tension in LLM Storytelling Human vs. Machine Deception: Distinguishing AI-Generated and Human-Written Fake News Using Ensemble Learning Weird Generalization is Weirdly Brittle Mirroring Minds: Asymmetric Linguistic Accommodation and Diagnostic Identity in ADHD and Autism Reddit Communities Why Supervised Fine-Tuning Fails to Learn: A Systematic Study of Incomplete Learning in Large Language Models Reason Only When Needed: Efficient Generative Reward Modeling via Model-Internal Uncertainty SEPTQ: A Simple and Effective Post-Training Quantization Paradigm for Large Language Models Who Wrote This Line? Evaluating the Detection of LLM-Generated Classical Chinese Poetry Training-Free Cross-Lingual Dysarthria Severity Assessment via Phonological Subspace Analysis in Self-Supervised Speech Representations Simulating Organized Group Behavior: New Framework, Benchmark, and Analysis ODUTQA-MDC: A Task for Open-Domain Underspecified Tabular QA with Multi-turn Dialogue-based Clarification Nationality encoding in language model hidden states: Probing culturally differentiated representations in persona-conditioned academic text Relational Probing: LM-to-Graph Adaptation for Financial Prediction CodeComp: Structural KV Cache Compression for Agentic Coding FAITH: Factuality Alignment through Integrating Trustworthiness and Honestness Comparative Analysis of Large Language Models in Healthcare Adaptive Multi-Expert Reasoning via Difficulty-Aware Routing and Uncertainty-Guided Aggregation A Structured Clustering Approach for Inducing Media Narratives NameBERT: Scaling Name-Based Nationality Classification with LLM-Augmented Open Academic Data LASQ: A Low-resource Aspect-based Sentiment Quadruple Extraction Dataset BLUEmed: Retrieval-Augmented Multi-Agent Debate for Clinical Error Detection Turing or Cantor: That is the Question NOSE: Neural Olfactory-Semantic Embedding with Tri-Modal Orthogonal Contrastive Learning Instruction Data Selection via Answer Divergence EviCare: Enhancing Diagnosis Prediction with Deep Model-Guided Evidence for In-Context Reasoning Dynamic Adaptive Attention and Supervised Contrastive Learning: A Novel Hybrid Framework for Text Sentiment Classification Structure-Grounded Knowledge Retrieval via Code Dependencies for Multi-Step Data Reasoning Knowing What to Stress: A Discourse-Conditioned Text-to-Speech Benchmark HeceTokenizer: A Syllable-Based Tokenization Approach for Turkish Retrieval BlasBench: An Open Benchmark for Irish Speech Recognition OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language Environment Simulation How Robust Are Large Language Models for Clinical Numeracy? An Empirical Study on Numerical Reasoning Abilities in Clinical Contexts Evaluating Memory Capability in Continuous Lifelog Scenario Polyglot Teachers: Evaluating Language Models for Multilingual Synthetic Data Generation Hidden Measurement Error in LLM Pipelines Distorts Annotation, Evaluation, and Benchmarking LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling
No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions
[Submitted on 11 Jun 2026] · 2026-06-12 · via cs.CL updates on arXiv.org

View PDF HTML (experimental)

Abstract:As AI-generated reviews move from experimental tools into peer-review infrastructure, most robustness concerns have focused on explicit attacks such as hidden instructions and prompt injection. We study a harder and more policy-relevant failure mode: no hidden text, no prompt injection, and no changes to methods, experiments, figures, equations, proofs, or numerical results. The attacker modifies only presentation-level content, such as the abstract, contribution framing, related work, discussion, and narrative structure. We introduce adversarial repackaging: a closed-loop attack that uses AI-reviewer feedback to search for presentation-level revisions while keeping the scientific evidence fixed. Across three mainstream AI reviewers, adversarial repackaging achieves a 75.1% attack success rate and a mean score gain of +1.21/10. The effect is not explained by ordinary prose polishing. We also reveal that strategies that change how the reviewer interprets the paper, such as related-work repositioning and analytical discussion expansion, substantially outperform surface edits such as local polishing, table formatting, and algorithm boxes.
Our analysis reveals two deeper structural failure modes. First, AI reviewers are easier to impress than to convince: highlighting strengths reliably increases perceived merit, while attempts to dissolve weaknesses frequently backfire. Second, AI reviewers can confuse the appearance of addressing a limitation with actually resolving it, allowing unchanged evidence to be reinterpreted as stronger scientific contribution. These results show that the deployment risk is not only malicious hidden instructions, but the emergence of paper presentation itself as an optimization surface. We release a contamination-free rolling benchmark and attack framework for testing whether AI reviewers remain anchored to scientific content under presentation-only edits.

Submission history

From: Xu Yang [view email]
[v1] Thu, 11 Jun 2026 08:30:18 UTC (1,536 KB)