Towards Shared Embodied Intelligence in Humanoid Robots through Optimization Development and Testing of the Human Aware ergoCub Robot
Carlotta Sartore · 2026-05-27 · via cs updates on arXiv.org

Abstract: Collaboration is central to human behavior, enabling tasks beyond individual capability. This ability arises from coordinating actions through internal representations of others, a concept known as shared intelligence. Additionally, humans are characterized by physical bodies and cognitive abilities that are optimized in response to their environment, a phenomenon referred to as embodied cognition. Designing humanoid robots that collaborate safely and effectively with people requires unifying these principles. Here we propose an architecture that integrates shared intelligence and embodied cognition to enable robots to physically collaborate with humans, where robot hardware and control are optimized for human metrics, using representations of the human body and motion intelligence. The ultimate goal is to achieve a form of shared embodied intelligence. Specifically, our architecture optimizes robot hardware and physical intelligence parameters with respect to human ergonomic metrics. This is accomplished by modeling human-robot interaction as a function of hardware configurations and embedding human models into the robot's physical intelligence. As a concrete implementation, we present the humanoid robot ergoCub, whose morphology and control have been optimized for collaborative tasks with humans. Our approach provides a framework for designing humanoid robots that prioritize human ergonomics at both the hardware and physical intelligence levels, with applications in industrial and assistive robotics.
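The abstract describes optimizing hardware and physical intelligence parameters against human ergonomic metrics. One generic way to write such a co-design problem is sketched below. This is an illustration only, not the formulation used in the paper: the symbols $\theta_h$ (hardware parameters), $\theta_c$ (control parameters), $E$ (an ergonomic cost), $x^{\mathrm{human}}_k$ (the human's state in collaborative task $k$), and the feasible sets $\Theta_h$, $\Theta_c$ are all assumptions introduced here.

```latex
% Hypothetical co-design objective: choose robot hardware and control
% parameters that minimize a human ergonomic cost summed over K tasks.
% The human state in each task depends on the robot through the
% modeled human-robot interaction.
\begin{equation*}
  (\theta_h^{\star}, \theta_c^{\star})
  \;=\;
  \operatorname*{arg\,min}_{\theta_h \in \Theta_h,\; \theta_c \in \Theta_c}
  \;\sum_{k=1}^{K} E\!\left( x^{\mathrm{human}}_k(\theta_h, \theta_c) \right)
\end{equation*}
```

Under this reading, the robot's own performance enters only through constraints or feasibility of $\Theta_h$ and $\Theta_c$, while the objective is expressed entirely in terms of the human.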
Subjects: Robotics (cs.RO)
Cite as: arXiv:2605.26991 [cs.RO]
  (or arXiv:2605.26991v1 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2605.26991

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Carlotta Sartore
[v1] Tue, 26 May 2026 13:13:41 UTC (24,243 KB)