When Roleplaying, Do Models Believe What They Say? - 惯性聚合

When Roleplaying, Do Models Believe What They Say?

Benjamin Sturgeon, David Africa, Sid Black · 2026-06-10 · via cs.CL updates on arXiv.org

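This content was automatically aggregated and compiled by 惯性聚合 (an RSS reader) and is provided for reading reference only. Original source: —. Copyright belongs to the original authors.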