Higher-Order Group Synchronization
Adriana L. Duncan, Joe Kileel · 2025-05-28 · via cs.CV updates on arXiv.org

Group synchronization is the problem of determining reliable global estimates from noisy local measurements on networks. The typical task is to assign elements of a group to the nodes of a graph in a way that respects the group elements given on the edges, which encode local pairwise relationships between the nodes. In this paper, we introduce a novel higher-order group synchronization problem, which operates on a hypergraph and seeks to synchronize higher-order local measurements on the hyperedges to obtain global estimates on the nodes. Higher-order group synchronization is motivated by applications to computer vision and image processing, among other computational problems. First, we define the problem of higher-order group synchronization and discuss its mathematical foundations. Specifically, we give necessary and sufficient synchronizability conditions, which establish the importance of cycle consistency in higher-order group synchronization. Then, we propose the first computational framework for general higher-order group synchronization; it acts globally and directly on higher-order measurements using a message passing algorithm. We discuss theoretical guarantees for our framework, including convergence analyses under outliers and noise. Finally, we show potential advantages of our method through numerical experiments. In particular, we show that, in certain cases, our higher-order method applied to rotational and angular synchronization outperforms standard pairwise synchronization methods and is more robust to outliers. We also show that, on simulated cryo-electron microscopy (cryo-EM) data, our method performs comparably to a standard cryo-EM reconstruction package.
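
To make the pairwise setting described above concrete, the following is a minimal sketch of ordinary (pairwise) angular synchronization, not the higher-order method of the paper. It assumes the group is the planar rotations, that every pair of nodes is measured, and that the well-known spectral (top-eigenvector) relaxation serves as the solver; the function names `make_measurements`, `cycle_error`, `spectral_sync`, and `aligned_error` are hypothetical and chosen only for illustration. It also shows what cycle consistency means in this simple case: the product of measurements around a closed cycle should be the identity.

```python
import numpy as np

def make_measurements(theta, noise_std=0.0, rng=None):
    """Pairwise offsets theta_i - theta_j, stored as unit complex numbers H[i, j]."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(theta)
    diff = theta[:, None] - theta[None, :]
    noise = np.triu(rng.normal(0.0, noise_std, size=(n, n)), 1)
    noise = noise - noise.T  # antisymmetric noise keeps H Hermitian
    return np.exp(1j * (diff + noise))

def cycle_error(H, cycle):
    """Absolute angle of the product of measurements around a closed cycle.

    For cycle-consistent measurements this is 0 (the product is the identity).
    """
    prod = 1.0 + 0j
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        prod *= H[a, b]
    return abs(np.angle(prod))

def spectral_sync(H):
    """Estimate node angles from the top eigenvector of the Hermitian matrix H."""
    _, vecs = np.linalg.eigh(H)  # eigenvalues are returned in ascending order
    return np.angle(vecs[:, -1])

def aligned_error(est, truth):
    """Largest angular error after removing the unavoidable global rotation."""
    shift = np.angle(np.sum(np.exp(1j * (truth - est))))
    resid = np.angle(np.exp(1j * (est + shift - truth)))
    return np.max(np.abs(resid))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = rng.uniform(-np.pi, np.pi, size=8)  # hypothetical ground-truth angles
    H = make_measurements(theta, noise_std=0.1, rng=rng)
    print("cycle error on (0, 1, 2):", cycle_error(H, [0, 1, 2]))
    print("max aligned error:", aligned_error(spectral_sync(H), theta))
```

The estimates are only determined up to a single global rotation, which is why the error is computed after alignment. With noiseless measurements every cycle error is zero and the spectral estimate is exact up to that rotation; corrupting a single edge makes every cycle through that edge inconsistent.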