4M Models Scanned: Protect AI + Hugging Face 6 Months In

Hugging Face - Blog

Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs ALTK‑Evolve: On‑the‑Job Learning for AI Agents Safetensors is Joining the PyTorch Foundation Holo3: Breaking the Computer Use Frontier Any Custom Frontend with Gradio's Backend A New Framework for Evaluating Voice Agents (EVA) Bringing Robotics AI to Embedded Platforms: Dataset Recording, VLA Fine‑Tuning, and On‑Device Optimizations One-Shot Any Web App with Gradio's gr.HTML CUGA on Hugging Face: Democratizing Configurable AI Agents New in llama.cpp: Model Management Building Deep Research: How we Achieved State of the Art OVHcloud on Hugging Face Inference Providers 🔥 20x Faster TRL Fine-tuning with RapidFire AI Building for an Open Future - our new partnership with Google Cloud Aligning to What? Rethinking Agent Generalization in MiniMax M2 Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac Sentence Transformers is joining Hugging Face! Unlock the power of images with AI Sheets Supercharge your OCR Pipelines with Open Models Google Cloud C4 Brings a 70% TCO improvement on GPT OSS with Intel and Hugging Face Get your VLM running in 3 simple steps on Intel CPUs Nemotron-Personas-India: Synthesized Data for Sovereign AI Introducing RTEB: A New Standard for Retrieval Evaluation Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models VibeGame: Exploring Vibe Coding Games Nemotron-Personas-Japan: ソブリン AI のための合成データセット Swift Transformers Reaches 1.0 – and Looks to the Future Smol2Operator: Post-Training GUI Agents for Computer Use SyGra: The One-Stop Framework for Building Data for LLMs and SLMs Gaia2 and ARE: Empowering the community to study agents Scaleway on Hugging Face Inference Providers 🔥 Democratizing AI Safety with RiskRubric.ai Public AI on Hugging Face Inference Providers 🔥 `LeRobotDataset:v3.0`: Bringing large-scale datasets to `lerobot` Visible Watermarking with Gradio Introducing the Palmyra-mini family: Powerful, lightweight, and ready to reason! Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers Fine-tune Any LLM from the Hugging Face Hub with Together AI Jupyter Agents: training LLMs to reason with notebooks mmBERT: ModernBERT goes Multilingual Welcome EmbeddingGemma, Google's new efficient embedding model SAIR: Accelerating Pharma R&D with AI-Powered Structural Intelligence Make your ZeroGPU Spaces go brrr with ahead-of-time compilation NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset Generate Images with Claude and Hugging Face From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels MCP for Research: How to Connect AI to Research Tools Kimina-Prover-RL Arm & ExecuTorch 0.7: Bringing Generative AI to the masses Neural Super Sampling is here! TextQuests: How Good are LLMs at Text-Based Video Games? 🇵🇭 FilBench - Can LLMs Understand and Generate Filipino? Introducing AI Sheets: a tool to work with datasets using open AI models! Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training Vision Language Model Alignment in TRL ⚡️ Welcome GPT OSS, the new open-source model family from OpenAI! Measuring Open-Source Llama Nemotron Models on DeepResearch Bench 📚 3LM: A Benchmark for Arabic LLMs in STEM and Code Implementing MCP Servers in Python: An AI Shopping Assistant with Gradio Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ Parquet Content-Defined Chunking TimeScope: How Long Can Your Video Large Multimodal Model Go? Fast LoRA inference for Flux with Diffusers and PEFT Accelerate a World of LLMs on Hugging Face with NVIDIA NIM Arc Virtual Cell Challenge: A Primer Consilium: When Multiple LLMs Collaborate Back to The Future: Evaluating AI Agents on Predicting Future Events Five Big Improvements to Gradio MCP Servers Ettin Suite: SoTA Paired Encoders and Decoders Migrating the Hub from Git LFS to Xet Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models Asynchronous Robot Inference: Decoupling Action Prediction and Execution ScreenEnv: Deploy your full stack Desktop Agent Building the Hugging Face MCP Server Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Creating custom kernels for the AMD MI300 Upskill your LLMs With Gradio MCP Servers SmolLM3: smol, multilingual, long-context reasoner Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure Efficient MultiModal Data Pipeline Announcing NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models Training and Finetuning Sparse Embedding Models with Sentence Transformers Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub Gemma 3n fully available in the open-source ecosystem! Transformers backend integration in SGLang (LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware Groq on Hugging Face Inference Providers 🔥 How Long Prompts Block Other Requests - Optimizing LLM Performance Learn the Hugging Face Kernel Hub in 5 Minutes Convert Transformers to ONNX with Hugging Face Optimum Intel and Hugging Face Partner to Democratize Machine Learning Hardware Acceleration Director of Machine Learning Insights [Part 3: Finance Edition] The Annotated Diffusion Model Deep Q-Learning with Space Invaders Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers Introducing Pull Requests and Discussions 🥳 Efficient Table Pre-training without Real Data: An Introduction to TAPEX An Introduction to Q-Learning Part 2/2 How Sempre Health is leveraging the Expert Acceleration Program to accelerate their ML roadmap

Sean Morgan · 2025-04-14 · via Hugging Face - Blog

Back to Articles

Hugging Face and Protect AI partnered in October 2024 to enhance machine learning (ML) model security through Guardian’s scanning technology for the community of developers who explore and use models from the Hugging Face Hub. The partnership has been a natural fit from the start—Hugging Face is on a mission to democratize the use of open source AI, with a commitment to safety and security; and Protect AI is building the guardrails to make open source models safe for all.

4 new threat detection modules launched

Since October, Protect AI has significantly expanded Guardian's detection capabilities, improving existing threat detection capabilities and launching four new detection modules:

PAIT-ARV-100: Archive slip can write to file system at load time
PAIT-JOBLIB-101: Joblib model suspicious code execution detected at model load time
PAIT-TF-200: TensorFlow SavedModel contains architectural backdoor
PAIT-LMAFL-300: Llamafile can execute malicious code during inference

With these updates, Guardian covers more model file formats and detects additional sophisticated obfuscation techniques, including the high severity CVE-2025-1550 vulnerability in Keras. Through enhanced detection tooling, Hugging Face users receive critical security information via inline alerts on the platform and gain access to comprehensive vulnerability reports on Insights DB. Clearly labeled findings are available on each model page, empowering users to make more informed decisions about which models to integrate into their projects.


Figure 1: Protect AI’s inline alerts on Hugging Face

By the numbers

As of April 1, 2025, Protect AI has successfully scanned 4.47 million unique model versions in 1.41 million repositories on the Hugging Face Hub.

To date, Protect AI has identified a total of 352,000 unsafe/suspicious issues across 51,700 models. In just the last 30 days, Protect AI has served 226 million requests from Hugging Face at a 7.94 ms response time.

Maintaining a Zero Trust Approach to Model Security

Protect AI’s Guardian applies a zero trust approach to AI/ML security. This especially comes into play when treating arbitrary code execution as inherently unsafe, regardless of intent. Rather than just classifying overtly malicious threats, Guardian flags execution risks as suspicious on InsightsDB, recognizing that even harmful code can look innocuous through obfuscation techniques (see more on payload obfuscation below). Attackers can disguise payloads within seemingly benign scripts or extensibility components of a framework, making payload inspection alone insufficient for ensuring security. By maintaining this cautious approach, Guardian helps mitigate risks posed by hidden threats in machine learning models.

Evolving Guardian’s Model Vulnerability Detection Capabilities

AI/ML security threats are evolving every single day. That's why Protect AI leverages both in-house threat research teams and huntr—the world's first and largest AI/ML bug bounty program powered by our community of over 17,000 security researchers.

Coinciding with our partnership launch in October, Protect AI launched a new program on huntr to crowdsource research on new Model File Vulnerabilities. Since the launch of the program, they have received over 200 reports that Protect AI teams have worked through and incorporated into Guardian—all of which are automatically applied to the model scans here on Hugging Face.


Figure 2: huntr’s bug bounty program

Common attack themes

As more huntr reports come in and more independent threat research is conducted, certain trends have emerged.

Library-dependent attack chains: These attacks focus on a bad actor’s ability to invoke functions from libraries present in the ML workstations environment. These are reminiscent of the “drive-by download” style of attacks that afflicted browsers and systems when common utilities like Java and Flash were present. Typically, the scale of impact of these attacks are proportional to the pervasiveness of a given library, with common ML libraries like Pytorch having a far wider potential impact than lesser used libraries.

Payload obfuscation: Several reports have highlighted ways to insert, obfuscate, or “hide” a payload in a model that bypasses common scanning techniques. These vulnerabilities use techniques like compression, encoding, and serialization to obfuscate the payload and are not easily detectable. Compression is an issue since libraries like Joblib allow compressed payloads to be loaded directly. Container formats like Keras and NeMo embed additional model files, each potentially vulnerable to their own specific attack vectors. Compression exposes users to TarSlip or ZipSlip vulnerabilities. While the impacts of these will often be limited to Denial of Service, in certain circumstances these vulnerabilities can lead to Arbitrary Code Execution by leveraging path traversal techniques, allowing malicious attackers to overwrite files that are often automatically executed.

Framework-extensibility vulnerabilities: ML frameworks provide numerous extensibility mechanisms that inadvertently create dangerous attack vectors: custom layers, external code dependencies, and configuration-based code loading. For example, CVE-2025-1550 in Keras, reported to us by the huntr community, demonstrates how custom layers can be exploited to execute arbitrary code despite security features. Configuration files with serialization vulnerabilities similarly allow dynamic code loading. These deserialization vulnerabilities make models exploitable through crafted payloads embedded in formats that users load without suspicion. Despite security improvements from vendors, older vulnerable versions and insecure dependency handling continue to present significant risk in ML ecosystems.

Attack vector chaining: Recent reports demonstrate how multiple vulnerabilities can be combined to create sophisticated attack chains that can bypass detection. By sequentially exploiting vulnerabilities like obfuscated payloads and extension mechanisms, researchers have shown complex pathways for compromise that appear benign when examined individually. This approach significantly complicates detection and mitigation efforts, as security tools focused on single-vector threats often miss these compound attacks. Effective defense requires identifying and addressing all links in the attack chain rather than treating each vulnerability in isolation.

Delivering Comprehensive Threat Detection for Hugging Face Users

The industry-leading Protect AI threat research team, with help from the huntr community, is continuously gathering data and insights in order to develop new and more robust model scans as well as automatic threat blocking (available to Guardian customers). In the last few months, Guardian has:

Enhanced detection of library-dependent attacks: Significant expansion of Guardian’s scanning capabilities for detecting library-dependent attack vectors. The scanners for PyTorch and Pickle now perform deep structure analysis of serialized code, examining execution paths and identifying potentially malicious code patterns that could be triggered through library dependencies. For example, the PyTorch torchvision.io functions can overwrite any file on the victim’s system to either include a payload or delete all of its content. Guardian can now detect many more of these dangerous functions in popular libraries such as PyTorch, Numpy, and Pandas.

Uncovered obfuscated attacks: Guardian performs multi-layered analyses across various archive formats, decompressing nested archives and examining compressed payloads for malicious models. This approach detects attempts to hide malicious code through compression, encoding, or serialization techniques. Joblib, for example, supports saving models using different compression formats which can obfuscate Pickle deserialization vulnerabilities, and the same can be done in other formats like Keras which can include Numpy weights files that have deserialization payloads in them.

Detected exploits in framework extensibility components: Guardian’s constantly improving detection modules alerted users on Hugging Face to models that were impacted by CVE-2025-1550 (a critical security finding) before the vulnerability was publicly disclosed. These detection modules comprehensively analyze ML framework extension mechanisms, allowing only standard or verified components and blocking potentially dangerous implementations, regardless of their apparent intent.

Identified additional architectural backdoors: Guardian’s architectural backdoor detection capabilities were expanded beyond ONNX formats to include additional model formats like TensorFlow.

Expanded model format coverage: Guardian’s true strength comes from the depth of its coverage, which has driven substantial expansion of detection modules to include additional formats like Joblib and an increasingly popular llamafile format, with support for additional ML frameworks coming soon.

Provided deeper model analysis: Actively research on additional ways to augment current detection capabilities for better analysis and detection of unsafe models. Expect to see significant enhancements in reducing both false positives and false negatives in the near future.

It Only Gets Better from Here

Through the partnership with Protect AI and Hugging Face, we’ve made third-party ML models safer and more accessible. We believe that having more eyes on model security can only be a good thing. We’re increasingly seeing the security world pay attention and lean in, making threats more discoverable and AI usage safer for all.

此内容由惯性聚合(RSS阅读器)自动聚合整理，仅供阅读参考。原文来自 — 版权归原作者所有。

推荐订阅源

Hugging Face - Blog

4 new threat detection modules launched

By the numbers

Maintaining a Zero Trust Approach to Model Security

Evolving Guardian’s Model Vulnerability Detection Capabilities

Common attack themes

Delivering Comprehensive Threat Detection for Hugging Face Users

It Only Gets Better from Here