惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

WordPress大学
WordPress大学
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
小众软件
小众软件
美团技术团队
腾讯CDC
C
CERT Recently Published Vulnerability Notes
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
博客园 - 三生石上(FineUI控件)
Security Latest
Security Latest
T
Threat Research - Cisco Blogs
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
Spread Privacy
Spread Privacy
Cloudbric
Cloudbric
T
Tor Project blog
Google Online Security Blog
Google Online Security Blog
SecWiki News
SecWiki News
D
Darknet – Hacking Tools, Hacker News & Cyber Security
Webroot Blog
Webroot Blog
W
WeLiveSecurity
Google DeepMind News
Google DeepMind News
Latest news
Latest news
P
Privacy International News Feed
P
Privacy & Cybersecurity Law Blog
Attack and Defense Labs
Attack and Defense Labs
T
Threatpost
N
News and Events Feed by Topic
Know Your Adversary
Know Your Adversary
www.infosecurity-magazine.com
www.infosecurity-magazine.com
The Last Watchdog
The Last Watchdog
P
Proofpoint News Feed
L
LINUX DO - 最新话题
Recent Commits to openclaw:main
Recent Commits to openclaw:main
L
LINUX DO - 热门话题
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
The Hacker News
The Hacker News
Cyberwarzone
Cyberwarzone
Project Zero
Project Zero
S
Security Affairs
T
The Exploit Database - CXSecurity.com
C
Cybersecurity and Infrastructure Security Agency CISA
Scott Helme
Scott Helme
V
Vulnerabilities – Threatpost
I
Intezer
S
Security @ Cisco Blogs
TaoSecurity Blog
TaoSecurity Blog
PCI Perspectives
PCI Perspectives
Schneier on Security
Schneier on Security
Hugging Face - Blog
Hugging Face - Blog
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
Jina AI
Jina AI

cs.LG updates on arXiv.org

Synthetic Tabular Generators Fail to Preserve Behavioral Fraud Patterns: A Benchmark on Temporal, Velocity, and Multi-Account Signals Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates Automated co-design of high-performance thermodynamic cycles via graph-based hierarchical reinforcement learning Does Dimensionality Reduction via Random Projections Preserve Landscape Features? Analog Optical Inference on Million-Record Mortgage Data Bias-Corrected Adaptive Conformal Inference for Multi-Horizon Time Series Forecasting Counterfactual Peptide Editing for Causal TCR--pMHC Binding Inference Binomial Gradient-Based Meta-Learning for Enhanced Meta-Gradient Estimation Enhancing Confidence Estimation in Telco LLMs via Twin-Pass CoT-Ensembling MOONSHOT : A Framework for Multi-Objective Pruning of Vision and Large Language Models Physics-informed reservoir characterization from bulk and extreme pressure events with a differentiable simulator Some Theoretical Limitations of t-SNE Concrete Jungle: Towards Concreteness Paved Contrastive Negative Mining for Compositional Understanding Multi-Task LLM with LoRA Fine-Tuning for Automated Cancer Staging and Biomarker Extraction ASTER: Latent Pseudo-Anomaly Generation for Unsupervised Time-Series Anomaly Detection Context Sensitivity Improves Human-Machine Visual Alignment Artificial intelligence application in lymphoma diagnosis with Vision Transformer using weakly supervised training Design and Behavior of Sparse Mixture-of-Experts Layers in CNN-based Semantic Segmentation Automatic Charge State Tuning of 300 mm FDSOI Quantum Dots Using Neural Network Segmentation of Charge Stability Diagram MyoVision: A Mobile Research Tool and NEATBoost-Attention Ensemble Framework for Real Time Chicken Breast Myopathy Detection Beyond Uniform Sampling: Synergistic Active Learning and Input Denoising for Robust Neural Operators The Spectrascapes Dataset: Street-view imagery beyond the visible captured using a mobile platform Deep Spatially-Regularized and Superpixel-Based Diffusion Learning for Unsupervised Hyperspectral Image Clustering DroneScan-YOLO: Redundancy-Aware Lightweight Detection for Tiny Objects in UAV Imagery Rethinking Uncertainty in Segmentation: From Estimation to Decision Out of Context: Reliability in Multimodal Anomaly Detection Requires Contextual Inference A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction Depth-Resolved Coral Reef Thermal Fields from Satellite SST and Sparse In-Situ Loggers Using Physics-Informed Neural Networks Spatial Atlas: Compute-Grounded Reasoning for Spatial-Aware Research Agent Benchmarks Spectral Entropy Collapse as a Phase Transition in Delayed Generalisation: An Interventional and Predictive Framework for Grokkin LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling Evaluating Cooperation in LLM Social Groups through Elected Leadership Towards Autonomous Mechanistic Reasoning in Virtual Cells Symmetry Reveals Layerwise Dynamics: How Transformers Perform In-Context Classification A Triadic Suffix Tokenization Scheme for Numerical Reasoning Not All Forgetting Is Equal: Architecture-Dependent Retention Dynamics in Fine-Tuned Image Classifiers Revisiting Compositionality in Dual-Encoder Vision-Language Models: The Role of Inference From Attribution to Action: A Human-Centered Application of Activation Steering THEIA: Learning Complete Kleene Three-Valued Logic in a Pure-Neural Modular Architecture Cost-optimal Sequential Testing via Doubly Robust Q-learning Lightweight Low-Light Image Enhancement via Distribution-Normalizing Preprocessing and Depthwise U-Net A Faster Path to Continual Learning Where Hindsight Credit Can Reside: A Signed-Capacity View of Token Updates in RLVR Optimal Stability of KL Divergence under Gaussian Perturbations Memory-Guided Trust-Region Bayesian Optimization (MG-TuRBO) for High Dimensions EngageTriBoost: Predictive Modeling of User Engagement in Digital Mental Health Intervention Using Explainable Machine Learning Reservoir observer enhanced with residual calibration and attention mechanism Creator Incentives in Recommender Systems: A Cooperative Game-Theoretic Approach for Stable and Fair Collaboration in Multi-Agent Bandits Efficient RL Training for LLMs with Experience Replay Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need Adversarial Sensor Errors for Safe and Robust Wind Turbine Fleet Control IKKA: Inversion Classification via Critical Anomalies for Robust Visual Servoing Adaptive Simulation Experiment for LLM Policy Optimization EvoLen: Evolution-Guided Tokenization for DNA Language Model Smartwatch-Based Sitting Time Estimation in Real-World Office Settings Structural Evaluation Metrics for SVG Generation via Leave-One-Out Analysis Loom: A Scalable Analytical Neural Computer Architecture Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis Spectral Geometry of LoRA Adapters Encodes Training Objective and Predicts Harmful Compliance Finite-Sample Analysis of Nonlinear Independent Component Analysis:Sample Complexity and Identifiability Bounds How does Chain of Thought decompose complex tasks? Uncertainty-Aware Transformers: Conformal Prediction for Language Models Adaptive Candidate Point Thompson Sampling for High-Dimensional Bayesian Optimization Using Synthetic Data for Machine Learning-based Childhood Vaccination Prediction in Narok, Kenya Delve into the Applicability of Advanced Optimizers for Multi-Task Learning Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication Predictive Entropy Links Calibration and Paraphrase Sensitivity in Medical Vision-Language Models Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning Modality-Aware Zero-Shot Pruning and Sparse Attention for Efficient Multimodal Edge Inference The nextAI Solution to the NeurIPS 2023 LLM Efficiency Challenge Feature-Label Modal Alignment for Robust Partial Multi-Label Learning Integrated electro-optic attention nonlinearities for transformers Toward World Models for Epidemiology Tracing the Chain: Deep Learning for Stepping-Stone Intrusion Detection Policy-Aware Design of Large-Scale Factorial Experiments Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer Continuous Orthogonal Mode Decomposition: Haptic Signal Prediction in Tactile Internet FIT-GNN: Faster Inference Time for GNNs that 'FIT' in Memory Using Coarsening Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods Predicting Metabolic Dysfunction-Associated Steatotic Liver Disease using Machine Learning Methods: A Retrospective Cohort Study Adaptive Tuning of Parameterized Traffic Controllers via Multi-Agent Reinforcement Learning Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem Mini-Batch Covariance, Diffusion Limits, and Oracle Complexity in Stochastic Gradient Descent: A Sampling-Design Perspective A Quantitative Definition of Intelligence SpectralLoRA: Is Low-Frequency Structure Sufficient for LoRA Adaptation? A Spectral Analysis of Weight Updates A Queueing-Theoretic Framework for Dynamic Attack Surfaces: Data-Integrated Risk Analysis and Adaptive Defense The Amazing Agent Race: Strong Tool Users, Weak Navigators MAVEN-T: Reinforced Heterogeneous Distillation for Real-Time Multi-Agent Trajectory Prediction Reproduction Beyond Benchmarks: ConstBERT and ColBERT-v2 Across Backends and Query Distributions COMPOSITE-Stem SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories PhysInOne: Visual Physics Learning and Reasoning in One Suite FIRE-CIR: Fine-grained Reasoning for Composed Fashion Image Retrieval
Neural Scaling Laws of Deep ReLU and Deep Operator Network: A Theoretical Study
Hao Liu, Zecheng Zhang, Wenjing Liao, Hayden Schaeffer · 2024-10-01 · via cs.LG updates on arXiv.org

Neural scaling laws play a pivotal role in the performance of deep neural networks and have been observed in a wide range of tasks. However, a complete theoretical framework for understanding these scaling laws remains underdeveloped. In this paper, we explore the neural scaling laws for deep operator networks, which involve learning mappings between function spaces, with a focus on the Chen and Chen style architecture. These approaches, which include the popular Deep Operator Network (DeepONet), approximate the output functions using a linear combination of learnable basis functions and coefficients that depend on the input functions. We establish a theoretical framework to quantify the neural scaling laws by analyzing its approximation and generalization errors. We articulate the relationship between the approximation and generalization errors of deep operator networks and key factors such as network model size and training data size. Moreover, we address cases where input functions exhibit low-dimensional structures, allowing us to derive tighter error bounds. These results also hold for deep ReLU networks and other similar structures. Our results offer a partial explanation of the neural scaling laws in operator learning and provide a theoretical foundation for their applications.