Recursive Entropic Risk Optimization in Discounted MDPs: Sample Complexity Bounds with a Generative Model

Oliver Morte · 2026-05-20 · via cs.AI updates on arXiv.org
