Forgetting Alternation and Blossoms: A New Framework for Fast Matching Augmentation and Its Applications to Sequential/Distributed/Streaming Computation
Taisuke Izumi, Naoki Kitamura, Yutaro Yamaguchi · 2025-11-11 · via cs.DC updates on arXiv.org

Finding a maximum cardinality matching in a graph is one of the most fundamental problems. An algorithm proposed by Micali and Vazirani (1980) is well known to solve the problem in $O(m\sqrt{n})$ time, where $n$ and $m$ are the numbers of vertices and edges, and it is still one of the fastest algorithms in general. While the MV algorithm itself is not especially complicated and is indeed convincing, its correctness proof is extremely challenging, as its history shows: after the algorithm paper first appeared in 1980, Vazirani made several attempts over more than 40 years to give a complete proof. Roughly speaking, the difficulty seems to stem from the elegant but highly complex structure of shortest alternating paths in general graphs, which are deeply intertwined with the so-called (nested) blossoms. In this paper, we propose a new structure theorem on shortest alternating paths in general graphs that does not go into the details of blossoms. The high-level idea is to forget the alternation (of matching and non-matching edges) as early as possible. A key ingredient is the notion of alternating base trees (ABTs), introduced by Izumi, Kitamura, and Yamaguchi (2024) to develop a nearly linear-time distributed algorithm. Our structure theorem refines the properties of ABTs exploited in their algorithm, and we also give simpler alternative proofs of them. Based on our structure theorem, we propose a new algorithm that is slightly slower than the MV algorithm but more implementable, and whose correctness is much easier to confirm. As applications of our framework, we also present new $(1 - \epsilon)$-approximation algorithms in the distributed and semi-streaming settings. Both algorithms are deterministic and substantially improve the best known upper bounds on the running time. The algorithms are built on top of a novel framework for amplifying the approximation factors of given matchings, which is of independent interest.
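
To make the notion of matching augmentation concrete, the following is a minimal sketch of the textbook approach based on Berge's theorem: a matching is maximum exactly when it admits no augmenting path, i.e. no simple path between two unmatched vertices whose edges alternate between non-matching and matching edges. This is not the algorithm of the paper, nor the MV algorithm. It searches simple alternating paths exhaustively, which takes exponential time in the worst case and sidesteps blossoms only by brute force. The function names and the adjacency-set representation are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets (illustrative choice)


def find_augmenting_path(graph: Graph, mate: Dict[int, int]) -> Optional[List[int]]:
    """Exhaustively search for a simple augmenting path (exponential time)."""

    def extend(path: List[int], need_matched: bool) -> Optional[List[int]]:
        u = path[-1]
        for v in sorted(graph[u]):
            if v in path:
                continue  # keep the path simple
            if (mate.get(u) == v) != need_matched:
                continue  # edge type must alternate
            if not need_matched and v not in mate:
                return path + [v]  # reached another free vertex
            found = extend(path + [v], not need_matched)
            if found is not None:
                return found
        return None

    for start in sorted(graph):
        if start not in mate:  # augmenting paths start at a free vertex
            found = extend([start], need_matched=False)
            if found is not None:
                return found
    return None


def maximum_matching(graph: Graph) -> Set[Tuple[int, int]]:
    """Augment repeatedly until no augmenting path remains (Berge's theorem)."""
    mate: Dict[int, int] = {}
    while True:
        path = find_augmenting_path(graph, mate)
        if path is None:
            break
        # Flip the path: edges at even positions become the matching edges.
        for i in range(0, len(path) - 1, 2):
            a, b = path[i], path[i + 1]
            mate[a], mate[b] = b, a
    return {(min(u, v), max(u, v)) for u, v in mate.items()}


def make_graph(edges: List[Tuple[int, int]]) -> Graph:
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


if __name__ == "__main__":
    # A path 0-1-2 attached to a triangle 2-3-4 with a pendant vertex 5.
    g = make_graph([(0, 1), (1, 2), (2, 3), (3, 4), (4, 2), (4, 5)])
    print(len(maximum_matching(g)))
```

Each augmentation increases the matching size by exactly one, so the loop terminates after at most $n/2$ rounds. The cost lies entirely in the path search, which is the part that the shortest-alternating-path structure discussed in the abstract is meant to make efficient.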