A comparison of human and LLM-simulated participants in a writing style task
[Submitted on 15 Jun 2026] · 2026-06-16 · via cs.HC updates on arXiv.org

Abstract: Because large language models (LLMs) can produce natural language that is sometimes indistinguishable from texts produced by people, some researchers are starting to consider replacing human participants with LLM simulations. In this study, we test the extent to which the findings of a simulation with an LLM prompted to act as a synthetic participant match those obtained from 30 human participants. In our experiments, we evaluated how well writing style preference inference algorithms adapted to a participant over repeated interactions, compared to a baseline. We discover hints of bias and a lack of depth in GPT-4o's text generation and judgement that prevent it from accurately simulating people's behavior. Our results also hint at human biases that highlight the importance of considering human factors in the evaluation of systems that depend on human-automation interaction. Rather than treating these discrepancies as evidence for or against the validity of LLM-simulated participants, we present this study as a case analysis of methodological and design challenges.

Submission history

From: Felix Gröner
[v1] Mon, 15 Jun 2026 14:24:08 UTC (940 KB)