惯性聚合
Efficiently track and read the blogs, news, and tech updates you care about
Recommended feeds
Hackread – Cybersecurity News, Data Breaches, AI and More
OSCHINA 社区最新新闻
博客园 - 聂微东
钛媒体:引领未来商业与生活新知
Visual Studio Blog
PCI Perspectives
InfoQ
罗磊的独立博客
云风的 BLOG
Unit 42
The Last Watchdog
Google Online Security Blog
Troy Hunt's Blog
Exploit-DB.com RSS Feed
Help Net Security
Hacker News: Front Page
Comments on: Blog
Engineering at Meta
WeLiveSecurity
News | PayPal Newsroom
cs.CV updates on arXiv.org
Security Archives - TechRepublic
Hacker News - Newest: "LLM"
Hacker News: Ask HN
cs.AI updates on arXiv.org
www.infosecurity-magazine.com
The Exploit Database - CXSecurity.com
Google DeepMind News
Intezer
Privacy International News Feed
Cisco Talos Blog
Proofpoint News Feed
Privacy & Cybersecurity Law Blog
Project Zero
News and Events Feed by Topic
Simon Willison's Weblog
Threat Research - Cisco Blogs
AI
cs.CL updates on arXiv.org
LINUX DO - 热门话题
Security Affairs
V2EX - 技术
Vulnerabilities – Threatpost
Security Latest
SecWiki News
Threat Intelligence Blog | Flashpoint
Webroot Blog
Heimdal Security Blog
Threatpost
Arctic Wolf
cs.CL updates on arXiv.org
Indexing Multimodal Language Models for Large-scale Image Retrieval
UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding
SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments
Caption First, VQA Second: Knowledge Density, Not Task Format, Drives Multimodal Scaling
PersonaVLM: Long-Term Personalized Multimodal LLMs
MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments
MedRCube: A Multidimensional Framework for Fine-Grained and In-Depth Evaluation of MLLMs in Medical Imaging
Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking
Reward Design for Physical Reasoning in Vision-Language Models
When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
Failure Makes the Agent Stronger: Enhancing Accuracy through Structured Reflection for Reliable Tool Interactions
Addressing Overthinking in Large Vision-Language Models via Gated Perception-Reasoning Optimization
ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding
VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors
Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows
Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning
(How) Learning Rates Regulate Catastrophic Overtraining
Parameter Importance is Not Static: Evolving Parameter Isolation for Supervised Fine-Tuning
$\pi$-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data
From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space
A Domain-Specific Language for LLM-Driven Trigger Generation in Multimodal Data Collection
The Consciousness Cluster: Emergent preferences of Models that Claim to be Conscious
KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context
Dental-TriageBench: Benchmarking Multimodal Reasoning for Hierarchical Dental Triage
Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data
Detection Without Correction: A Robust Asymmetry in Activation-Based Hallucination Probing
LiveClawBench: Benchmarking LLM Agents on Complex, Real-World Assistant Tasks
Better and Worse with Scale: How Contextual Entrainment Diverges with Model Size
Functional Emotions or Situational Contexts? A Discriminating Test from the Mythos Preview System Card
C2: Scalable Rubric-Augmented Reward Modeling from Binary Preferences
Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference
WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain
Text-as-Signal: Quantitative Semantic Scoring with Embeddings, Logprobs, and Noise Reduction
A Multi-Model Approach to English-Bangla Sentiment Classification of Government Mobile Banking App Reviews
A Proactive EMR Assistant for Doctor-Patient Dialogue: Streaming ASR, Belief Stabilization, and Preliminary Controlled Evaluation
Token Statistics Reveal Conversational Drift in Multi-turn LLM Interaction
Mathematical Reasoning Enhanced LLM for Formula Derivation: A Case Study on Fiber NLI Modelling
Red Skills or Blue Skills? A Dive Into Skills Published on ClawHub
Correct Chains, Wrong Answers: Dissociating Reasoning from Output in LLM Logic
Curation of a Palaeohispanic Dataset for Machine Learning
EVE: A Domain-Specific LLM Framework for Earth Intelligence
OmniTrace: A Unified Framework for Generation-Time Attribution in Omni-Modal LLMs
DeEscalWild: A Real-World Benchmark for Automated De-Escalation Training with SLMs
Document-tuning for robust alignment to animals
Can Large Language Models Reliably Extract Physiology Index Values from Coronary Angiography Reports?
IWLV-Ramayana: A Sarga-Aligned Parallel Corpus of Valmiki's Ramayana Across Indian Languages
Unleashing Implicit Rewards: Prefix-Value Learning for Distribution-Level Optimization
InfiniteScienceGym: An Unbounded, Procedurally-Generated Benchmark for Scientific Analysis
Evaluating the Evaluator: Problems with SemEval-2020 Task 1 for Lexical Semantic Change Detection
Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs
L2D-Clinical: Learning to Defer for Adaptive Model Selection in Clinical Text Classification
English is Not All You Need: Systematically Exploring the Role of Multilinguality in LLM Post-Training
Giving Voice to the Constitution: Low-Resource Text-to-Speech for Quechua and Spanish Using a Bilingual Legal Corpus
AgentSPEX: An Agent SPecification and EXecution Language
Peer-Predictive Self-Training for Language Model Reasoning
TLoRA+: A Low-Rank Parameter-Efficient Fine-Tuning Method for Large Language Models
Empirical Evidence of Complexity-Induced Limits in Large Language Models on Finite Discrete State-Space Problems with Explicit Validity Constraints
From Prediction to Justification: Aligning Sentiment Reasoning with Human Rationale via Reinforcement Learning
CANVAS: Continuity-Aware Narratives via Visual Agentic Storyboarding
Using reasoning LLMs to extract SDOH events from clinical notes
ToolSpec: Accelerating Tool Calling via Schema-Aware and Retrieval-Augmented Speculative Decoding
Synthesizing Instruction-Tuning Datasets with Contrastive Decoding
Debate to Align: Reliable Entity Alignment through Two-Stage Multi-Agent Debate
Training-Free Test-Time Contrastive Learning for Large Language Models
YOCO++: Enhancing YOCO with KV Residual Connections for Efficient LLM Inference
MM-Doc-R1: Training Agents for Long Document Visual Question Answering through Multi-turn Reinforcement Learning
BenGER Platform: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks
Foresight Optimization for Strategic Reasoning in Large Language Models
Syn-TurnTurk: A Synthetic Dataset for Turn-Taking Prediction in Turkish Dialogues
IndicDB -- Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages
Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection
Beyond Arrow's Impossibility: Fairness as an Emergent Property of Multi-Agent Collaboration
Co-FactChecker: A Framework for Human-AI Collaborative Claim Verification Using Large Reasoning Models
Learning the Cue or Learning the Word? Analyzing Generalization in Metaphor Detection for Verbs
An Empirical Investigation of Practical LLM-as-a-Judge Improvement Techniques on RewardBench 2
Toward Generalized Cross-Lingual Hateful Language Detection with Web-Scale Data and Ensemble LLM Annotations
Self-Calibrating Language Models via Test-Time Discriminative Distillation
HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation
Generating High Quality Synthetic Data for Dutch Medical Conversations
GIANTS: Generative Insight Anticipation from Scientific Literature
Claim2Vec: Embedding Fact-Check Claims for Multilingual Similarity and Clustering
Spoiler Alert: Narrative Forecasting as a Metric for Tension in LLM Storytelling
Should We be Pedantic About Reasoning Errors in Machine Translation?
Human vs. Machine Deception: Distinguishing AI-Generated and Human-Written Fake News Using Ensemble Learning
Weird Generalization is Weirdly Brittle
Computational Implementation of a Model of Category-Theoretic Metaphor Comprehension
CoSToM: Causal-oriented Steering for Intrinsic Theory-of-Mind Alignment in Large Language Models
Mirroring Minds: Asymmetric Linguistic Accommodation and Diagnostic Identity in ADHD and Autism Reddit Communities
ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models
Why Supervised Fine-Tuning Fails to Learn: A Systematic Study of Incomplete Learning in Large Language Models
Reason Only When Needed: Efficient Generative Reward Modeling via Model-Internal Uncertainty
SEPTQ: A Simple and Effective Post-Training Quantization Paradigm for Large Language Models
Who Wrote This Line? Evaluating the Detection of LLM-Generated Classical Chinese Poetry
CircuitSynth: Reliable Synthetic Data Generation
Training-Free Cross-Lingual Dysarthria Severity Assessment via Phonological Subspace Analysis in Self-Supervised Speech Representations
Simulating Organized Group Behavior: New Framework, Benchmark, and Analysis
Think in Sentences: Explicit Sentence Boundaries Enhance Language Model's Capabilities
ODUTQA-MDC: A Task for Open-Domain Underspecified Tabular QA with Multi-turn Dialogue-based Clarification
Nationality encoding in language model hidden states: Probing culturally differentiated representations in persona-conditioned academic text
Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
Sahil Sen, A · 2026-05-15 · via cs.CL updates on arXiv.org
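This content was automatically aggregated by 惯性聚合 (an RSS reader) and is provided for reading reference only. Copyright belongs to the original author.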