

NestedKV: Nested Memory Routing for Long-Context KV Cache Compression
Hong Chen, X · 2026-05-27 · via cs updates on arXiv.org


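Abstract: Long-context language models are limited by the memory footprint of the key-value (KV) cache. Existing training-free KV compression methods usually rank tokens by a single importance signal (attention, recency, layer-wise allocation, or key distinctiveness), and that signal becomes brittle when the useful context is globally distinctive, locally episodic, or immediately relevant. We introduce NestedKV, a key-only KV cache compression method inspired by the Continuum Memory System of Nested Learning. NestedKV maintains global, block-level, and sliding-window key anchors, scores tokens by multi-timescale cosine anomalies, and combines the resulting rankings using a training-free outer learner with head-adaptive mixing and surprise-gated token routing. The scores are paired with adaptive per-head budgets and require no training or LLM modification. Across RULER (4k--32k), LooGLE, LongBench, LongBench-E, InfiniteBench, and MMLU-Pro on Qwen3 and Llama-3.2 models, NestedKV is strongest when the retained cache is small. On Qwen3-4B at $r=0.75$, it improves over KeyDiff by up to 19.10 points on RULER and 19.29 points on LongBench; at $r=0.95$, it retains 37.32 on LongBench, while KeyDiff scores 17.55.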
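The abstract gives no formulas, so the following is only a rough sketch of what key-only scoring against anchors at several timescales might look like, not the authors' implementation. Everything here is an assumption made for illustration: the function name `select_tokens`, anchors computed as means of unit-normalized keys, a single attention head, one trailing window at the end of the sequence, and reading $r$ as the fraction of tokens dropped. Equal-weight rank averaging stands in for the paper's head-adaptive mixing and surprise-gated routing, which are not reproduced.

```python
import math


def _unit(v):
    """Return v scaled to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def _mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    d = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]


def _anomaly(key, anchor):
    """Cosine anomaly: 1 - cosine similarity between a key and an anchor."""
    a, b = _unit(key), _unit(anchor)
    return 1.0 - sum(x * y for x, y in zip(a, b))


def _ranks(scores):
    """Map scores to normalized ranks in [0, 1]; the highest score gets 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    denom = max(len(scores) - 1, 1)
    ranks = [0.0] * len(scores)
    for pos, i in enumerate(order):
        ranks[i] = pos / denom
    return ranks


def select_tokens(keys, r, block_size=4, window=4):
    """Return sorted indices of tokens to keep for one head (hypothetical API).

    keys: list of key vectors, one per cached token.
    r: ASSUMED to be the fraction of tokens to drop (0 keeps everything).
    """
    t = len(keys)
    unit_keys = [_unit(k) for k in keys]
    global_anchor = _mean(unit_keys)             # slowest timescale
    window_anchor = _mean(unit_keys[-window:])   # fastest timescale

    global_s, block_s, window_s = [], [], []
    for i, k in enumerate(unit_keys):
        start = (i // block_size) * block_size
        block_anchor = _mean(unit_keys[start:start + block_size])
        global_s.append(_anomaly(k, global_anchor))
        block_s.append(_anomaly(k, block_anchor))
        window_s.append(_anomaly(k, window_anchor))

    # ASSUMPTION: plain equal-weight rank averaging, not the paper's
    # head-adaptive mixing or surprise-gated routing.
    combined = [
        sum(triple) / 3.0
        for triple in zip(_ranks(global_s), _ranks(block_s), _ranks(window_s))
    ]
    keep = max(1, round(t * (1.0 - r)))
    top = sorted(range(t), key=lambda i: combined[i], reverse=True)[:keep]
    return sorted(top)


if __name__ == "__main__":
    # Seven near-duplicate keys plus one distinctive key at index 3.
    demo_keys = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] + [[1.0, 0.0]] * 4
    print(select_tokens(demo_keys, r=0.75))
```

Averaging ranks rather than raw anomaly values keeps the three timescales comparable without tuning scale factors; a per-head budget, as the abstract describes, would call something like this once per head with a different `r`.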
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2605.26678 [cs.CL]
  (or arXiv:2605.26678v1 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2605.26678

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Hong Chen
[v1] Tue, 26 May 2026 08:14:39 UTC (393 KB)