Direct Advantage Regression: Aligning LLMs with Online AI Reward

Li He, He Zhao, Stephen Wan, Dadong Wang, Lina Yao, Tongliang Liu · 2025-04-19 · via cs.HC updates on arXiv.org
