Forgetting Alternation and Blossoms: A New Framework for Fast Matching Augmentation and Its Applications to Sequential/Distributed/Streaming Computation

cs.DC updates on arXiv.org

Taisuke Izumi, Naoki Kitamura, Yutaro Yamaguchi · 2025-11-11 · via cs.DC updates on arXiv.org

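This content was automatically aggregated by 惯性聚合 (an RSS reader) and is provided for reading reference only. Copyright belongs to the original authors.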
