On the Sample Complexity of Discounted Reinforcement Learning with Optimized Certainty Equivalents

Memory-Guided Trust-Region Bayesian Optimization (MG-TuRBO) for High Dimensions

EngageTriBoost: Predictive Modeling of User Engagement in Digital Mental Health Intervention Using Explainable Machine Learning

Reservoir observer enhanced with residual calibration and attention mechanism

Efficient RL Training for LLMs with Experience Replay

Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning

Adversarial Sensor Errors for Safe and Robust Wind Turbine Fleet Control

IKKA: Inversion Classification via Critical Anomalies for Robust Visual Servoing

Adaptive Simulation Experiment for LLM Policy Optimization

EvoLen: Evolution-Guided Tokenization for DNA Language Model

Smartwatch-Based Sitting Time Estimation in Real-World Office Settings

Structural Evaluation Metrics for SVG Generation via Leave-One-Out Analysis

Loom: A Scalable Analytical Neural Computer Architecture

Spectral Geometry of LoRA Adapters Encodes Training Objective and Predicts Harmful Compliance

Finite-Sample Analysis of Nonlinear Independent Component Analysis: Sample Complexity and Identifiability Bounds

How does Chain of Thought decompose complex tasks?

Uncertainty-Aware Transformers: Conformal Prediction for Language Models

Adaptive Candidate Point Thompson Sampling for High-Dimensional Bayesian Optimization

Using Synthetic Data for Machine Learning-based Childhood Vaccination Prediction in Narok, Kenya

Delve into the Applicability of Advanced Optimizers for Multi-Task Learning

Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning

Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication

Predictive Entropy Links Calibration and Paraphrase Sensitivity in Medical Vision-Language Models

Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning

Modality-Aware Zero-Shot Pruning and Sparse Attention for Efficient Multimodal Edge Inference

The nextAI Solution to the NeurIPS 2023 LLM Efficiency Challenge

Feature-Label Modal Alignment for Robust Partial Multi-Label Learning

Integrated electro-optic attention nonlinearities for transformers

Toward World Models for Epidemiology

Tracing the Chain: Deep Learning for Stepping-Stone Intrusion Detection

Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods

Predicting Metabolic Dysfunction-Associated Steatotic Liver Disease using Machine Learning Methods: A Retrospective Cohort Study

Adaptive Tuning of Parameterized Traffic Controllers via Multi-Agent Reinforcement Learning

Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning

Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem

SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning

Continuous Orthogonal Mode Decomposition: Haptic Signal Prediction in Tactile Internet

Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories

PhysInOne: Visual Physics Learning and Reasoning in One Suite

Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer

FIRE-CIR: Fine-grained Reasoning for Composed Fashion Image Retrieval

Detecting Diffusion-generated Images via Dynamic Assembly Forests

PDE-regularized Dynamics-informed Diffusion with Uncertainty-aware Filtering for Long-Horizon Dynamics

Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection

Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA

Identification and Anonymization of Named Entities in Unstructured Information Sources for Use in Social Engineering Detection

Hypergraph Neural Networks Accelerate MUS Enumeration

ASTRA: Adaptive Semantic Tree Reasoning Architecture for Complex Table Question Answering

Neighbourhood Transformer: Switchable Attention for Monophily-Aware Graph Learning

WOMBET: World Model-Based Experience Transfer for Robust and Sample-efficient Reinforcement Learning

Low-Data Supervised Adaptation Outperforms Prompting for Cloud Segmentation Under Domain Shift

Revisiting the Capacity Gap in Chain-of-Thought Distillation from a Practical Perspective

A Mathematical Framework for Temporal Modeling and Counterfactual Policy Simulation of Student Dropout

Temporal Dropout Risk in Learning Analytics: A Harmonized Survival Benchmark Across Dynamic and Early-Window Representations

MedFormer-UR: Uncertainty-Routed Transformer for Medical Image Classification

Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs

Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis

Post-Hoc Guidance for Consistency Models by Joint Flow Distribution Learning

SenBen: Sensitive Scene Graphs for Explainable Content Moderation

R2G: A Multi-View Circuit Graph Benchmark Suite from RTL to GDSII

Policy-Aware Design of Large-Scale Factorial Experiments

$p1$: Better Prompt Optimization with Fewer Prompts

A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need

Deep Learning-Based Tracking and Lineage Reconstruction of Ligament Breakup

Every Response Counts: Quantifying Uncertainty of LLM-based Multi-Agent Systems through Tensor Decomposition

Unified Multimodal Uncertain Inference

EfficientSign: An Attention-Enhanced Lightweight Architecture for Indian Sign Language Recognition

Skip-Connected Policy Optimization for Implicit Advantage

PRAGMA: Revolut Foundation Model

3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding

Creator Incentives in Recommender Systems: A Cooperative Game-Theoretic Approach for Stable and Fair Collaboration in Multi-Agent Bandits

Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Post-hoc Uncertainty Estimation

StructRL: Recovering Dynamic Programming Structure from Learning Dynamics in Distributional Reinforcement Learning

Generative 3D Gaussian Splatting for Arbitrary-Resolution Atmospheric Downscaling and Forecasting

From Selection to Scheduling: Federated Geometry-Aware Correction Makes Exemplar Replay Work Better under Continual Dynamic Heterogeneity

Needle in a Haystack: One-Class Representation Learning for Detecting Rare Malignant Cells in Computational Cytology

Detection of Hate and Threat in Digital Forensics: A Case-Driven Multimodal Approach

Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines

Joint Interference Detection and Identification via Adversarial Multi-task Learning

HaloProbe: Bayesian Detection and Mitigation of Object Hallucinations in Vision-Language Models

R3PM-Net: Real-time, Robust, Real-world Point Matching Network

From Dispersion to Attraction: Spectral Dynamics of Hallucination Across Whisper Model Scales

AlphaLab: Autonomous Multi-Agent Research Across Optimization Domains with Frontier LLMs

Act or Escalate? Evaluating Escalation Behavior in Automation with Language Models

Multivariate Time Series Anomaly Detection via Dual-Branch Reconstruction and Autoregressive Flow-based Residual Density Estimation

On the Spectral Geometry of Cross-Modal Representations: A Functional Map Diagnostic for Multimodal Alignment

Structured Exploration and Exploitation of Label Functions for Automated Data Annotation

MolPaQ: Modular Quantum-Classical Patch Learning for Interpretable Molecular Generation

CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention

Reinforcement-aware Knowledge Distillation for LLM Reasoning

SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework

A Horizon-Aware Decision-Support Framework for Demand Forecasting Model Selection in Resilient Production Planning

Measurement-Consistent Langevin Corrector for Stabilizing Latent Diffusion Inverse Problem Solvers

Multi-agent Adaptive Mechanism Design

When & How to Write for Personalized Demand-aware Query Rewriting in Video Search

Relational Visual Similarity

From Navigation to Refinement: Revealing the Two-Stage Nature of Flow-based Diffusion Models through Oracle Velocity

On-the-Fly Adaptation to Quantization: Configuration-Aware LoRA for Efficient Fine-Tuning of Quantized LLMs

STCast: Adaptive Boundary Alignment for Global and Regional Weather Forecasting

OmniPrism: Learning Disentangled Visual Concept for Image Generation

FIT-GNN: Faster Inference Time for GNNs that 'FIT' in Memory Using Coarsening

Recommended feeds

cs.LG updates on arXiv.org