4S-DT: Self Supervised Super Sample Decomposition for Transfer learning with application to COVID-19 detection
Asmaa Abbas, Mohammed M. Abdelsamea, Mohamed Gaber · 2020-06-27 · via cs.CV updates on arXiv.org

Due to the high availability of large-scale annotated image datasets, knowledge transfer from pre-trained models has shown outstanding performance in medical image classification. However, building a robust image classification model for datasets with data irregularity or imbalanced classes can be very challenging, especially in the medical imaging domain. In this paper, we propose a novel deep convolutional neural network, which we call the Self Supervised Super Sample Decomposition for Transfer learning (4S-DT) model. 4S-DT encourages coarse-to-fine transfer learning from large-scale image recognition tasks to a specific chest X-ray image classification task using a generic self-supervised sample decomposition approach. Our main contribution is a novel self-supervised learning mechanism guided by a super sample decomposition of unlabelled chest X-ray images. 4S-DT improves the robustness of knowledge transformation via a downstream learning strategy with a class-decomposition layer that simplifies the local structure of the data. 4S-DT can deal with irregularities in the image dataset by investigating its class boundaries using a downstream class-decomposition mechanism. We used 50,000 unlabelled chest X-ray images to achieve our coarse-to-fine transfer learning, with an application to COVID-19 detection as an exemplar. 4S-DT achieved an accuracy of 99.8% (95% CI: 99.44%, 99.98%) in the detection of COVID-19 cases on a large dataset, and an accuracy of 97.54% (95% CI: 96.22%, 98.91%) on an extended test set enriched by augmented images of a small dataset, in which all real COVID-19 cases were detected. This was the highest accuracy obtained when compared to other methods.
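The class-decomposition idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how classes are split, so this example assumes a simple k-means clustering of each class's feature vectors. The names `decompose_classes` and `recompose`, the value of `k`, and the labelling scheme `label * k + cluster` are all hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import List, Sequence

Point = Sequence[float]


def _sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _kmeans(points: List[Point], k: int, iters: int = 20) -> List[int]:
    """Tiny deterministic Lloyd's k-means (assumed clustering method).

    Centres start from the first point, then repeatedly add the point
    farthest from the centres chosen so far.
    """
    centres = [list(points[0])]
    while len(centres) < k:
        far = max(points, key=lambda p: min(_sq_dist(p, c) for c in centres))
        centres.append(list(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        assign = [min(range(k), key=lambda j: _sq_dist(p, centres[j]))
                  for p in points]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def decompose_classes(features: List[Point], labels: List[int],
                      k: int = 2) -> List[int]:
    """Split every class into k sub-classes: sub_label = label * k + cluster."""
    sub_labels = [0] * len(labels)
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        clusters = _kmeans([features[i] for i in idx], k)
        for i, j in zip(idx, clusters):
            sub_labels[i] = c * k + j
    return sub_labels


def recompose(sub_label: int, k: int = 2) -> int:
    """Map a predicted sub-class back to its original class."""
    return sub_label // k


# Toy 1-D "features": each original class contains two obvious sub-groups.
features = [[0.0], [0.1], [5.0], [5.1], [10.0], [10.1], [20.0], [20.1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
sub = decompose_classes(features, labels, k=2)
restored = [recompose(s, k=2) for s in sub]
```

In a sketch like this, a classifier would be trained on the finer sub-labels, and its predictions mapped back to the original classes with `recompose`. The intent is that each sub-class has a simpler local structure than the class it came from.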