Human-AI Collaboration in Science at Scale: A Global Large-scale Randomized Field Experiment
Binglu Wang, · 2026-05-26 · via cs.AI updates on arXiv.org

Abstract: Collaboration is the defining mode of modern science, yet its core mechanism -- feedback -- remains hard to observe, difficult to scale, and unequally distributed. Here we test whether large language models (LLMs) can contribute to this hidden but vital practice and reallocate scientific feedback, an essential yet scarce resource for knowledge production. In a global large-scale randomized field experiment, we delivered customized LLM-generated feedback for over 31,000 arXiv preprints across 150 fields and more than 45,000 researchers from 133 geographic regions. Relative to controls, authors who received feedback had a significantly higher likelihood of revising their manuscripts, corresponding to a 12.55% relative increase over the baseline revision rate. Exposure to AI feedback also increased authors' subsequent use of LLM tools in their future papers, suggesting longer-run shifts in scientific practice. These effects were strongest among authors from non-English-dominant research regions, manuscripts less embedded in the scholarly literature, and teams with lower h-indexes and earlier career stages, consistent with the idea that AI feedback may provide the greatest benefit where access to timely critique is otherwise limited. Together, these findings provide causal evidence that structured AI-based interventions can transform access to scientific feedback from a largely private advantage into a more widely distributed resource, with broader implications for productivity, equity, and capacity across the global research system.
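The abstract reports the effect as a relative increase over the baseline revision rate, not as a percentage-point change. As a minimal sketch of that distinction, the snippet below computes a relative increase from two rates. The function name and both rates are hypothetical and chosen only for illustration; the abstract does not state the underlying control or treated revision rates.

```python
def relative_increase(treated_rate: float, control_rate: float) -> float:
    """Return the lift of treated_rate over control_rate as a fraction of control_rate."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treated_rate - control_rate) / control_rate


# Hypothetical rates for illustration only; they are NOT taken from the paper.
control_rate = 0.20    # assumed share of control-group authors who revise
treated_rate = 0.2251  # assumed share of feedback-group authors who revise

lift = relative_increase(treated_rate, control_rate)
points = (treated_rate - control_rate) * 100

print(f"relative increase: {lift:.2%}")
print(f"absolute change: {points:.2f} percentage points")
```

With these assumed inputs, the same gap reads as a relative increase of about 12.55% but an absolute change of only about 2.51 percentage points, which is why the choice of measure matters when interpreting the headline figure.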
Subjects: Physics and Society (physics.soc-ph); Artificial Intelligence (cs.AI); Digital Libraries (cs.DL); Human-Computer Interaction (cs.HC)
Cite as: arXiv:2605.24180 [physics.soc-ph]
  (or arXiv:2605.24180v1 [physics.soc-ph] for this version)
  https://doi.org/10.48550/arXiv.2605.24180

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Binglu Wang
[v1] Fri, 22 May 2026 20:06:17 UTC (702 KB)