惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
D
Docker
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
WordPress大学
WordPress大学
Hugging Face - Blog
Hugging Face - Blog
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
The Cloudflare Blog
N
News and Events Feed by Topic
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
W
WeLiveSecurity
TaoSecurity Blog
TaoSecurity Blog
Y
Y Combinator Blog
小众软件
小众软件
有赞技术团队
有赞技术团队
S
SegmentFault 最新的问题
E
Exploit-DB.com RSS Feed
F
Full Disclosure
Hacker News - Newest:
Hacker News - Newest: "LLM"
博客园 - Franky
宝玉的分享
宝玉的分享
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
罗磊的独立博客
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
Forbes - Security
Forbes - Security
L
LangChain Blog
V
Visual Studio Blog
腾讯CDC
V
V2EX
N
News | PayPal Newsroom
博客园_首页
L
LINUX DO - 热门话题
博客园 - 【当耐特】
P
Privacy & Cybersecurity Law Blog
P
Proofpoint News Feed
Google Online Security Blog
Google Online Security Blog
S
Security Archives - TechRepublic
博客园 - 司徒正美
T
Threatpost
SecWiki News
SecWiki News
C
Cyber Attacks, Cyber Crime and Cyber Security
人人都是产品经理
人人都是产品经理
AWS News Blog
AWS News Blog
Scott Helme
Scott Helme
博客园 - 三生石上(FineUI控件)
D
Darknet – Hacking Tools, Hacker News & Cyber Security
Recorded Future
Recorded Future
V
Vulnerabilities – Threatpost
H
Hackread – Cybersecurity News, Data Breaches, AI and More
量子位

cs.LG updates on arXiv.org

Memory-Guided Trust-Region Bayesian Optimization (MG-TuRBO) for High Dimensions EngageTriBoost: Predictive Modeling of User Engagement in Digital Mental Health Intervention Using Explainable Machine Learning Reservoir observer enhanced with residual calibration and attention mechanism Creator Incentives in Recommender Systems: A Cooperative Game-Theoretic Approach for Stable and Fair Collaboration in Multi-Agent Bandits Efficient RL Training for LLMs with Experience Replay Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need Adversarial Sensor Errors for Safe and Robust Wind Turbine Fleet Control IKKA: Inversion Classification via Critical Anomalies for Robust Visual Servoing Adaptive Simulation Experiment for LLM Policy Optimization EvoLen: Evolution-Guided Tokenization for DNA Language Model Smartwatch-Based Sitting Time Estimation in Real-World Office Settings Structural Evaluation Metrics for SVG Generation via Leave-One-Out Analysis Loom: A Scalable Analytical Neural Computer Architecture Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis Spectral Geometry of LoRA Adapters Encodes Training Objective and Predicts Harmful Compliance Finite-Sample Analysis of Nonlinear Independent Component Analysis:Sample Complexity and Identifiability Bounds How does Chain of Thought decompose complex tasks? Uncertainty-Aware Transformers: Conformal Prediction for Language Models Adaptive Candidate Point Thompson Sampling for High-Dimensional Bayesian Optimization Using Synthetic Data for Machine Learning-based Childhood Vaccination Prediction in Narok, Kenya Delve into the Applicability of Advanced Optimizers for Multi-Task Learning Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication Predictive Entropy Links Calibration and Paraphrase Sensitivity in Medical Vision-Language Models Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning Modality-Aware Zero-Shot Pruning and Sparse Attention for Efficient Multimodal Edge Inference The nextAI Solution to the NeurIPS 2023 LLM Efficiency Challenge Feature-Label Modal Alignment for Robust Partial Multi-Label Learning Integrated electro-optic attention nonlinearities for transformers Toward World Models for Epidemiology Tracing the Chain: Deep Learning for Stepping-Stone Intrusion Detection Policy-Aware Design of Large-Scale Factorial Experiments Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer Continuous Orthogonal Mode Decomposition: Haptic Signal Prediction in Tactile Internet Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods Predicting Metabolic Dysfunction-Associated Steatotic Liver Disease using Machine Learning Methods: A Retrospective Cohort Study Adaptive Tuning of Parameterized Traffic Controllers via Multi-Agent Reinforcement Learning Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories PhysInOne: Visual Physics Learning and Reasoning in One Suite FIRE-CIR: Fine-grained Reasoning for Composed Fashion Image Retrieval Detecting Diffusion-generated Images via Dynamic Assembly Forests PDE-regularized Dynamics-informed Diffusion with Uncertainty-aware Filtering for Long-Horizon Dynamics Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA Identification and Anonymization of Named Entities in Unstructured Information Sources for Use in Social Engineering Detection Hypergraph Neural Networks Accelerate MUS Enumeration ASTRA: Adaptive Semantic Tree Reasoning Architecture for Complex Table Question Answering Neighbourhood Transformer: Switchable Attention for Monophily-Aware Graph Learning WOMBET: World Model-Based Experience Transfer for Robust and Sample-efficient Reinforcement Learning Low-Data Supervised Adaptation Outperforms Prompting for Cloud Segmentation Under Domain Shift Revisiting the Capacity Gap in Chain-of-Thought Distillation from a Practical Perspective A Mathematical Framework for Temporal Modeling and Counterfactual Policy Simulation of Student Dropout Temporal Dropout Risk in Learning Analytics: A Harmonized Survival Benchmark Across Dynamic and Early-Window Representations MedFormer-UR: Uncertainty-Routed Transformer for Medical Image Classification Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs Post-Hoc Guidance for Consistency Models by Joint Flow Distribution Learning SenBen: Sensitive Scene Graphs for Explainable Content Moderation R2G: A Multi-View Circuit Graph Benchmark Suite from RTL to GDSII $p1$: Better Prompt Optimization with Fewer Prompts Deep Learning-Based Tracking and Lineage Reconstruction of Ligament Breakup Every Response Counts: Quantifying Uncertainty of LLM-based Multi-Agent Systems through Tensor Decomposition Unified Multimodal Uncertain Inference EfficientSign: An Attention-Enhanced Lightweight Architecture for Indian Sign Language Recognition Skip-Connected Policy Optimization for Implicit Advantage PRAGMA: Revolut Foundation Model 3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Post-hoc Uncertainty Estimation StructRL: Recovering Dynamic Programming Structure from Learning Dynamics in Distributional Reinforcement Learning Generative 3D Gaussian Splatting for Arbitrary-ResolutionAtmospheric Downscaling and Forecasting From Selection to Scheduling: Federated Geometry-Aware Correction Makes Exemplar Replay Work Better under Continual Dynamic Heterogeneity Needle in a Haystack: One-Class Representation Learning for Detecting Rare Malignant Cells in Computational Cytology Detection of Hate and Threat in Digital Forensics: A Case-Driven Multimodal Approach Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines Joint Interference Detection and Identification via Adversarial Multi-task Learning HaloProbe: Bayesian Detection and Mitigation of Object Hallucinations in Vision-Language Models R3PM-Net: Real-time, Robust, Real-world Point Matching Network From Dispersion to Attraction: Spectral Dynamics of Hallucination Across Whisper Model Scales AlphaLab: Autonomous Multi-Agent Research Across Optimization Domains with Frontier LLMs Act or Escalate? Evaluating Escalation Behavior in Automation with Language Models Multivariate Time Series Anomaly Detection via Dual-Branch Reconstruction and Autoregressive Flow-based Residual Density Estimation On the Spectral Geometry of Cross-Modal Representations: A Functional Map Diagnostic for Multimodal Alignment Structured Exploration and Exploitation of Label Functions for Automated Data Annotation MolPaQ: Modular Quantum-Classical Patch Learning for Interpretable Molecular Generation CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention Reinforcement-aware Knowledge Distillation for LLM Reasoning SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework A Horizon-Aware Decision-Support Framework for Demand Forecasting Model Selection in Resilient Production Planning Measurement-Consistent Langevin Corrector for Stabilizing Latent Diffusion Inverse Problem Solvers Multi-agent Adaptive Mechanism Design When & How to Write for Personalized Demand-aware Query Rewriting in Video Search Relational Visual Similarity From Navigation to Refinement: Revealing the Two-Stage Nature of Flow-based Diffusion Models through Oracle Velocity On-the-Fly Adaptation to Quantization: Configuration-Aware LoRA for Efficient Fine-Tuning of Quantized LLMs STCast: Adaptive Boundary Alignment for Global and Regional Weather Forecasting OmniPrism: Learning Disentangled Visual Concept for Image Generation FIT-GNN: Faster Inference Time for GNNs that 'FIT' in Memory Using Coarsening
DAL: A Practical Prior-Free Black-Box Framework for Piecewise Stationary Bandits
Argyrios Gerogiannis, Yu-Han Huang, Subhonmesh Bose, Venugopal V · 2025-02-01 · via cs.LG updates on arXiv.org

We introduce a practical, black-box framework termed Detection Augmented Learning (DAL) for the problem of piecewise stationary bandits without knowledge of the underlying non-stationarity. DAL accepts any stationary bandit algorithm with order-optimal regret as input and augments it with a change detector, enabling applicability to all common bandit variants. Extensive experimentation demonstrates that DAL consistently surpasses all state-of-the-art methods across diverse non-stationary scenarios, including synthetic benchmarks and real-world datasets, underscoring its versatility and scalability. We provide theoretical insights into DAL's strong empirical performance, complemented by thorough empirical validation.