

Leveraging Diffusion For Strong and High Quality Face Morphing Attacks
Zander W. Blasingame, Chen Liu · 2023-01-11 · via cs.CV updates on arXiv.org

Face morphing attacks seek to deceive a Face Recognition (FR) system by presenting a morphed image that combines the biometric qualities of two different identities, with the aim of triggering a false acceptance with one of the two identities. Such attacks pose a significant threat to biometric systems. The success of a morphing attack depends on how well the morphed image represents the biometric characteristics of both identities used to create it. We present a novel morphing attack that uses a Diffusion-based architecture to improve both the visual fidelity of the image and the ability of the attack to represent characteristics from both identities. We demonstrate the effectiveness of the proposed attack by evaluating its visual fidelity via the Fréchet Inception Distance (FID). We also conduct extensive experiments to measure the vulnerability of FR systems to the proposed attack. The ability of a morphing attack detector to detect the proposed attack is measured and compared against two state-of-the-art GAN-based morphing attacks and two Landmark-based attacks. Additionally, a novel metric for measuring the relative strength of different morphing attacks is introduced and evaluated.
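
For readers unfamiliar with the fidelity metric named above, the FID is commonly defined as the Fréchet distance between two Gaussians fitted to deep features of the reference images and of the generated (here, morphed) images. The abstract does not spell the formula out, so the following is the standard definition rather than a statement of the authors' exact evaluation setup. The symbols are assumptions introduced for illustration: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the feature mean and covariance of the reference and generated sets, with features typically taken from an Inception-v3 network.

```latex
\[
\mathrm{FID}
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
      - 2\,\bigl(\Sigma_r \Sigma_g\bigr)^{1/2} \right)
\]
```

A lower FID indicates that the feature statistics of the morphed images are closer to those of the reference images. The value is zero when the two fitted Gaussians coincide.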