惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
Hugging Face - Blog
Hugging Face - Blog
博客园 - 司徒正美
The GitHub Blog
The GitHub Blog
WordPress大学
WordPress大学
小众软件
小众软件
云风的 BLOG
云风的 BLOG
S
Security Archives - TechRepublic
Google DeepMind News
Google DeepMind News
D
Darknet – Hacking Tools, Hacker News & Cyber Security
Blog — PlanetScale
Blog — PlanetScale
J
Java Code Geeks
A
Arctic Wolf
Security Latest
Security Latest
P
Palo Alto Networks Blog
T
Threat Research - Cisco Blogs
G
GRAHAM CLULEY
Know Your Adversary
Know Your Adversary
博客园 - 叶小钗
Recent Announcements
Recent Announcements
U
Unit 42
P
Proofpoint News Feed
Simon Willison's Weblog
Simon Willison's Weblog
S
Securelist
博客园 - 【当耐特】
Apple Machine Learning Research
Apple Machine Learning Research
C
CERT Recently Published Vulnerability Notes
C
Cisco Blogs
S
Secure Thoughts
P
Privacy International News Feed
Schneier on Security
Schneier on Security
C
Check Point Blog
雷峰网
雷峰网
量子位
美团技术团队
N
News | PayPal Newsroom
腾讯CDC
酷 壳 – CoolShell
酷 壳 – CoolShell
PCI Perspectives
PCI Perspectives
G
Google Developers Blog
E
Exploit-DB.com RSS Feed
MyScale Blog
MyScale Blog
S
SegmentFault 最新的问题
博客园 - Franky
www.infosecurity-magazine.com
www.infosecurity-magazine.com
Stack Overflow Blog
Stack Overflow Blog
H
Hackread – Cybersecurity News, Data Breaches, AI and More
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
Webroot Blog
Webroot Blog
F
Fortinet All Blogs

cs.LG updates on arXiv.org

Synthetic Tabular Generators Fail to Preserve Behavioral Fraud Patterns: A Benchmark on Temporal, Velocity, and Multi-Account Signals Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates Automated co-design of high-performance thermodynamic cycles via graph-based hierarchical reinforcement learning Does Dimensionality Reduction via Random Projections Preserve Landscape Features? Analog Optical Inference on Million-Record Mortgage Data ASTER: Latent Pseudo-Anomaly Generation for Unsupervised Time-Series Anomaly Detection Context Sensitivity Improves Human-Machine Visual Alignment Artificial intelligence application in lymphoma diagnosis with Vision Transformer using weakly supervised training Design and Behavior of Sparse Mixture-of-Experts Layers in CNN-based Semantic Segmentation Automatic Charge State Tuning of 300 mm FDSOI Quantum Dots Using Neural Network Segmentation of Charge Stability Diagram MyoVision: A Mobile Research Tool and NEATBoost-Attention Ensemble Framework for Real Time Chicken Breast Myopathy Detection The Spectrascapes Dataset: Street-view imagery beyond the visible captured using a mobile platform Deep Spatially-Regularized and Superpixel-Based Diffusion Learning for Unsupervised Hyperspectral Image Clustering DroneScan-YOLO: Redundancy-Aware Lightweight Detection for Tiny Objects in UAV Imagery Rethinking Uncertainty in Segmentation: From Estimation to Decision A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction Depth-Resolved Coral Reef Thermal Fields from Satellite SST and Sparse In-Situ Loggers Using Physics-Informed Neural Networks Spatial Atlas: Compute-Grounded Reasoning for Spatial-Aware Research Agent Benchmarks Spectral Entropy Collapse as a Phase Transition in Delayed Generalisation: An Interventional and Predictive Framework for Grokkin LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling Evaluating Cooperation in LLM Social Groups through Elected Leadership Towards Autonomous Mechanistic Reasoning in Virtual Cells Symmetry Reveals Layerwise Dynamics: How Transformers Perform In-Context Classification A Triadic Suffix Tokenization Scheme for Numerical Reasoning Not All Forgetting Is Equal: Architecture-Dependent Retention Dynamics in Fine-Tuned Image Classifiers Revisiting Compositionality in Dual-Encoder Vision-Language Models: The Role of Inference From Attribution to Action: A Human-Centered Application of Activation Steering THEIA: Learning Complete Kleene Three-Valued Logic in a Pure-Neural Modular Architecture Cost-optimal Sequential Testing via Doubly Robust Q-learning Lightweight Low-Light Image Enhancement via Distribution-Normalizing Preprocessing and Depthwise U-Net A Faster Path to Continual Learning Where Hindsight Credit Can Reside: A Signed-Capacity View of Token Updates in RLVR Optimal Stability of KL Divergence under Gaussian Perturbations Memory-Guided Trust-Region Bayesian Optimization (MG-TuRBO) for High Dimensions EngageTriBoost: Predictive Modeling of User Engagement in Digital Mental Health Intervention Using Explainable Machine Learning Reservoir observer enhanced with residual calibration and attention mechanism Creator Incentives in Recommender Systems: A Cooperative Game-Theoretic Approach for Stable and Fair Collaboration in Multi-Agent Bandits Efficient RL Training for LLMs with Experience Replay Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need Adversarial Sensor Errors for Safe and Robust Wind Turbine Fleet Control IKKA: Inversion Classification via Critical Anomalies for Robust Visual Servoing Adaptive Simulation Experiment for LLM Policy Optimization EvoLen: Evolution-Guided Tokenization for DNA Language Model Smartwatch-Based Sitting Time Estimation in Real-World Office Settings Structural Evaluation Metrics for SVG Generation via Leave-One-Out Analysis Loom: A Scalable Analytical Neural Computer Architecture Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis Spectral Geometry of LoRA Adapters Encodes Training Objective and Predicts Harmful Compliance Finite-Sample Analysis of Nonlinear Independent Component Analysis:Sample Complexity and Identifiability Bounds How does Chain of Thought decompose complex tasks? Uncertainty-Aware Transformers: Conformal Prediction for Language Models Adaptive Candidate Point Thompson Sampling for High-Dimensional Bayesian Optimization Using Synthetic Data for Machine Learning-based Childhood Vaccination Prediction in Narok, Kenya Delve into the Applicability of Advanced Optimizers for Multi-Task Learning Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication Predictive Entropy Links Calibration and Paraphrase Sensitivity in Medical Vision-Language Models Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning Modality-Aware Zero-Shot Pruning and Sparse Attention for Efficient Multimodal Edge Inference The nextAI Solution to the NeurIPS 2023 LLM Efficiency Challenge Feature-Label Modal Alignment for Robust Partial Multi-Label Learning Integrated electro-optic attention nonlinearities for transformers Toward World Models for Epidemiology Tracing the Chain: Deep Learning for Stepping-Stone Intrusion Detection Policy-Aware Design of Large-Scale Factorial Experiments Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer Continuous Orthogonal Mode Decomposition: Haptic Signal Prediction in Tactile Internet Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods Predicting Metabolic Dysfunction-Associated Steatotic Liver Disease using Machine Learning Methods: A Retrospective Cohort Study Adaptive Tuning of Parameterized Traffic Controllers via Multi-Agent Reinforcement Learning Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem Mini-Batch Covariance, Diffusion Limits, and Oracle Complexity in Stochastic Gradient Descent: A Sampling-Design Perspective A Quantitative Definition of Intelligence SpectralLoRA: Is Low-Frequency Structure Sufficient for LoRA Adaptation? A Spectral Analysis of Weight Updates A Queueing-Theoretic Framework for Dynamic Attack Surfaces: Data-Integrated Risk Analysis and Adaptive Defense The Amazing Agent Race: Strong Tool Users, Weak Navigators MAVEN-T: Reinforced Heterogeneous Distillation for Real-Time Multi-Agent Trajectory Prediction Reproduction Beyond Benchmarks: ConstBERT and ColBERT-v2 Across Backends and Query Distributions COMPOSITE-Stem SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories PhysInOne: Visual Physics Learning and Reasoning in One Suite FIRE-CIR: Fine-grained Reasoning for Composed Fashion Image Retrieval Detecting Diffusion-generated Images via Dynamic Assembly Forests PDE-regularized Dynamics-informed Diffusion with Uncertainty-aware Filtering for Long-Horizon Dynamics Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA Identification and Anonymization of Named Entities in Unstructured Information Sources for Use in Social Engineering Detection Hypergraph Neural Networks Accelerate MUS Enumeration ASTRA: Adaptive Semantic Tree Reasoning Architecture for Complex Table Question Answering Neighbourhood Transformer: Switchable Attention for Monophily-Aware Graph Learning WOMBET: World Model-Based Experience Transfer for Robust and Sample-efficient Reinforcement Learning Low-Data Supervised Adaptation Outperforms Prompting for Cloud Segmentation Under Domain Shift Revisiting the Capacity Gap in Chain-of-Thought Distillation from a Practical Perspective A Mathematical Framework for Temporal Modeling and Counterfactual Policy Simulation of Student Dropout
Hierarchical Deep Reinforcement Learning Approach for Multi-Objective Scheduling With Varying Queue Sizes
Yoni Birman, Ziv Ido, Gilad Katz, Asaf Shabtai · 2020-07-18 · via cs.LG updates on arXiv.org

Multi-objective task scheduling (MOTS) is the task scheduling while optimizing multiple and possibly contradicting constraints. A challenging extension of this problem occurs when every individual task is a multi-objective optimization problem by itself. While deep reinforcement learning (DRL) has been successfully applied to complex sequential problems, its application to the MOTS domain has been stymied by two challenges. The first challenge is the inability of the DRL algorithm to ensure that every item is processed identically regardless of its position in the queue. The second challenge is the need to manage large queues, which results in large neural architectures and long training times. In this study we present MERLIN, a robust, modular and near-optimal DRL-based approach for multi-objective task scheduling. MERLIN applies a hierarchical approach to the MOTS problem by creating one neural network for the processing of individual tasks and another for the scheduling of the overall queue. In addition to being smaller and with shorted training times, the resulting architecture ensures that an item is processed in the same manner regardless of its position in the queue. Additionally, we present a novel approach for efficiently applying DRL-based solutions on very large queues, and demonstrate how we effectively scale MERLIN to process queue sizes that are larger by orders of magnitude than those on which it was trained. Extensive evaluation on multiple queue sizes show that MERLIN outperforms multiple well-known baselines by a large margin (>22%).