Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives
Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder · 2020-01-18 · via stat.ML updates on arXiv.org

We consider a discrete optimization formulation for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to solve (to optimality) $\ell_0$-regularized regression problems at scales much larger than what was conventionally considered possible. Despite their usefulness, MIP-based global optimization approaches are significantly slower than the relatively mature algorithms for $\ell_1$-regularization and the heuristics for nonconvex regularized problems. We aim to bridge this gap in computation times by developing new MIP-based algorithms for $\ell_0$-regularized classification. We propose two classes of scalable algorithms: an exact algorithm that can handle $p \approx 50{,}000$ features in a few minutes, and approximate algorithms that can address instances with $p \approx 10^6$ in times comparable to the fast $\ell_1$-based algorithms. Our exact algorithm is based on the novel idea of "integrality generation", which solves the original problem (with $p$ binary variables) via a sequence of mixed integer programs that involve a small number of binary variables. Our approximate algorithms are based on coordinate descent and local combinatorial search. In addition, we present new estimation error bounds for a class of $\ell_0$-regularized estimators. Experiments on real and synthetic data demonstrate that our approach leads to models with considerably improved statistical performance (especially in variable selection) when compared to competing methods.
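
To make the coordinate-descent idea mentioned above more concrete, here is a minimal sketch of cyclic coordinate descent for $\ell_0$-regularized logistic regression. The abstract does not spell out the authors' algorithm, so everything here is an assumption for illustration: the logistic loss, the objective $\frac{1}{n}\sum_i \log(1 + e^{-y_i x_i^\top \beta}) + \lambda_0 \|\beta\|_0$, the function name `l0_logistic_cd`, and the update rule (a gradient step on one coordinate followed by hard thresholding). It omits the local combinatorial search and the MIP-based exact method entirely, and it returns a local solution rather than a certified optimum.

```python
import math


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def l0_logistic_cd(X, y, lam0, n_sweeps=50, tol=1e-12):
    """Hypothetical helper: cyclic coordinate descent for
    (1/n) * sum_i log(1 + exp(-y_i * <x_i, beta>)) + lam0 * ||beta||_0,
    with labels y_i in {-1, +1}. X is a list of rows."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    margins = [0.0] * n  # margins[i] = <x_i, beta>, kept in sync with beta
    # Per-coordinate Lipschitz constants of the logistic-loss gradient.
    lips = [sum(X[i][j] ** 2 for i in range(n)) / (4.0 * n) for j in range(p)]
    for _ in range(n_sweeps):
        changed = False
        for j in range(p):
            if lips[j] == 0.0:
                continue  # an all-zero column cannot reduce the loss
            grad = sum(
                -y[i] * X[i][j] * sigmoid(-y[i] * margins[i]) for i in range(n)
            ) / n
            cand = beta[j] - grad / lips[j]
            # Hard threshold: keep the coordinate only if the decrease in the
            # quadratic upper bound exceeds the lam0 cost of a nonzero entry.
            new = cand if 0.5 * lips[j] * cand * cand > lam0 else 0.0
            if abs(new - beta[j]) > tol:
                delta = new - beta[j]
                for i in range(n):
                    margins[i] += delta * X[i][j]
                beta[j] = new
                changed = True
        if not changed:
            break
    return beta


# Toy data (made up): the label is the sign of feature 0; feature 1 is irrelevant.
X_toy = [[1.0, 1.0], [2.0, -1.0], [-1.0, 1.0], [-2.0, -1.0]]
y_toy = [1, 1, -1, -1]
print(l0_logistic_cd(X_toy, y_toy, lam0=0.05))
```

On this toy input the sketch keeps the informative feature and zeroes out the irrelevant one. Raising `lam0` makes the threshold harder to clear, so more coordinates are set to zero.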