Task-Driven Subspace Decomposition for Knowledge Sharing and Isolation in LoRA-based Continual Learning
Lingfeng He, · 2026-05-05 · via cs.CV updates on arXiv.org

Abstract: Continual Learning (CL) requires models to adapt sequentially to new tasks without forgetting old knowledge. Recently, Low-Rank Adaptation (LoRA), a representative Parameter-Efficient Fine-Tuning (PEFT) method, has gained increasing attention in CL. Several LoRA-based CL methods reduce interference across tasks by separating their update spaces, typically building the new space from the estimated null space of past tasks. However, they (i) overlook task-shared directions, which suppresses knowledge transfer, and (ii) fail to capture truly effective task-specific directions, since these "null bases" of old tasks can remain nearly inactive for a new task when the tasks are correlated. To address this, we study the learning capability of LoRA from a projection-energy perspective and propose Low-rank Decomposition and Adaptation (LoDA). LoDA performs a task-driven decomposition that builds general and truly task-specific LoRA subspaces by solving two energy-based objectives, decoupling the directions used for knowledge sharing from those used for isolation. It fixes the LoRA down-projections on these two subspaces and learns robust up-projections via a Gradient-Aligned Optimization (GAO) approach. After each task, before integrating the LoRA updates into the backbone, LoDA derives a closed-form recalibration for the general update, approximating a feature-level joint optimum along this task-shared direction. Experiments indicate that LoDA outperforms existing CL methods. Our code is available at this https URL.
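The point about null-space bases staying inactive under correlated tasks can be illustrated with a small NumPy sketch. This is not LoDA's implementation: the abstract does not spell out the two energy-based objectives, GAO, or the recalibration step, so none of them appear here. The toy feature model (six shared directions, two new-task directions), the helper names (`projection_energy`, `least_active_directions`, `most_active_directions`), and the definition of projection energy as the fraction of squared feature norm captured by a subspace are all assumptions made for illustration.

```python
import numpy as np


def orthonormal_columns(matrix):
    """Return an orthonormal basis for the column space via reduced QR."""
    q, _ = np.linalg.qr(matrix)
    return q


def projection_energy(features, basis):
    """Fraction of squared feature norm captured by span(basis).

    features: (n, d) array; basis: (d, k) array with orthonormal columns.
    This definition is an assumption made for the sketch.
    """
    projected = features @ basis
    return float(np.sum(projected ** 2) / np.sum(features ** 2))


def least_active_directions(features, rank):
    """Approximate null-space basis: the `rank` lowest-variance directions."""
    _, eigvecs = np.linalg.eigh(features.T @ features)  # ascending eigenvalues
    return eigvecs[:, :rank]


def most_active_directions(features, rank):
    """The `rank` highest-variance directions of the features."""
    _, eigvecs = np.linalg.eigh(features.T @ features)
    return eigvecs[:, -rank:]


rng = np.random.default_rng(0)
dim, n_samples, rank, out_dim = 16, 200, 4, 8

# Toy assumption: both tasks share 6 directions; the new task adds 2 weaker ones.
frame = orthonormal_columns(rng.standard_normal((dim, 8)))
shared, specific = frame[:, :6], frame[:, 6:]


def noise():
    return 0.01 * rng.standard_normal((n_samples, dim))


old_feats = rng.standard_normal((n_samples, 6)) @ shared.T + noise()
new_feats = (
    rng.standard_normal((n_samples, 6)) @ shared.T
    + 0.3 * rng.standard_normal((n_samples, 2)) @ specific.T
    + noise()
)

# Null-space-style basis estimated from the old task only.
null_basis = least_active_directions(old_feats, rank)
energy_null = projection_energy(new_feats, null_basis)
energy_top = projection_energy(new_feats, most_active_directions(new_feats, rank))

# LoRA-style update with a frozen down-projection on the old task's null basis.
# `up` is random here and merely stands in for learned up-projection weights.
down = null_basis.T                          # (rank, dim), frozen
up = rng.standard_normal((out_dim, rank))    # (out_dim, rank)
delta_w = up @ down                          # (out_dim, dim)
old_shift = np.linalg.norm(old_feats @ delta_w.T) / np.linalg.norm(old_feats)

print(f"new-task energy on old null basis:  {energy_null:.4f}")
print(f"new-task energy on its top-{rank} basis: {energy_top:.4f}")
print(f"relative change in old-task outputs: {old_shift:.4f}")
```

In this toy setup the update barely changes old-task outputs, which is the isolation that null-space methods aim for, but the same basis also captures little of the new task's feature energy, so an update confined to it has little to work with. The abstract's task-driven decomposition is motivated by exactly this gap, though how LoDA resolves it is not shown here.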
Comments: Accepted by ICML 2026
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2603.00191 [cs.LG]
  (or arXiv:2603.00191v4 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2603.00191

arXiv-issued DOI via DataCite

Submission history

From: Lingfeng He
[v1] Fri, 27 Feb 2026 02:31:00 UTC (7,144 KB)
[v2] Sat, 2 May 2026 01:48:44 UTC (7,411 KB)
[v3] Tue, 12 May 2026 07:35:28 UTC (7,411 KB)
[v4] Sun, 24 May 2026 03:29:33 UTC (7,413 KB)