MoralityGym: A Benchmark for Evaluating Hierarchical Moral Alignment in Sequential Decision-Making Agents



Simon Rosen · 2026-05-23 · via cs.AI updates on arXiv.org
