惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

WordPress大学
WordPress大学
Recent Announcements
Recent Announcements
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
Microsoft Azure Blog
Microsoft Azure Blog
S
Security @ Cisco Blogs
P
Proofpoint News Feed
博客园 - 三生石上(FineUI控件)
T
Tailwind CSS Blog
www.infosecurity-magazine.com
www.infosecurity-magazine.com
The Last Watchdog
The Last Watchdog
AI
AI
Webroot Blog
Webroot Blog
aimingoo的专栏
aimingoo的专栏
Hacker News: Ask HN
Hacker News: Ask HN
B
Blog RSS Feed
小众软件
小众软件
T
The Blog of Author Tim Ferriss
博客园 - 叶小钗
W
WeLiveSecurity
C
CXSECURITY Database RSS Feed - CXSecurity.com
H
Hackread – Cybersecurity News, Data Breaches, AI and More
T
Troy Hunt's Blog
云风的 BLOG
云风的 BLOG
P
Privacy International News Feed
Application and Cybersecurity Blog
Application and Cybersecurity Blog
Vercel News
Vercel News
Y
Y Combinator Blog
P
Proofpoint News Feed
V2EX - 技术
V2EX - 技术
AWS News Blog
AWS News Blog
F
Fortinet All Blogs
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
The GitHub Blog
The GitHub Blog
A
Arctic Wolf
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
Hugging Face - Blog
Hugging Face - Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
V
V2EX
MongoDB | Blog
MongoDB | Blog
SecWiki News
SecWiki News
The Register - Security
The Register - Security
博客园_首页
T
Threat Research - Cisco Blogs
Hacker News - Newest:
Hacker News - Newest: "LLM"
Recorded Future
Recorded Future
V
Vulnerabilities – Threatpost
I
InfoQ
雷峰网
雷峰网
C
Check Point Blog

The latest security news for developers - The GitHub Blog

Inside the Advisory Database and what happens when vulnerability volume breaks records Making secret scanning more trustworthy: Reducing false positives at scale Investigation update: GitHub Enterprise Server signing key rotation Raising the bar: Quality, shared responsibility, and the future of GitHub’s bug bounty program Securing the git push pipeline: Responding to a critical remote code execution vulnerability How exposed is your code? Find out in minutes—for free Securing the open source supply chain across GitHub A year of open source vulnerability trends: CVEs, advisories, and malware GitHub expands application security coverage with AI‑powered detections Investing in the people shaping open source and securing the future together How to scan for vulnerabilities with GitHub Security Lab’s open source AI-powered framework
Hack the AI agent: Build agentic AI security skills with the GitHub Secure Code Game
Joseph Katsioloudes · 2026-04-15 · via The latest security news for developers - The GitHub Blog

I was scrolling through my feed one evening when I came across OpenClaw, an open source personal AI assistant that people were calling everything from “Jarvis” to “a portal to a new reality.” The idea is beautiful: an AI that lives on your machine or in the cloud, talks to you over WhatsApp or Telegram, clears your inbox, manages your calendar, browses the web, runs shell commands, and even writes its own plugins. Users were having it check them in for flights, build entire websites from their phones, and automate things they never thought possible.

My first reaction was the same as everyone else’s: this is incredible.

My second reaction was…different. I started thinking about what happens when that kind of power meets a malicious prompt. What if someone tricks the agent into reading files it should not access? What if a poisoned web page rewrites the agent’s instructions? What if one agent in a multi-agent chain passes bad data to another that blindly trusts it?

Those questions became Season 4 of the Secure Code Game.

The Secure Code Game: Learn secure coding and have fun doing it

The Secure Code Game is a free, open source in-editor course where players exploit and fix intentionally vulnerable code. When I created the first season in March 2023, the goal was straightforward: make security training that developers would enjoy. Fix the vulnerable code, keep it functional, level up. That core philosophy has not changed across any season.

Season 2 expanded into multi-stack challenges with community contributions across JavaScript, Python, Go, and GitHub Actions. Season 3 took players into LLM security, where they learned to hack and then harden large language models. Along the way, over 10,000 developers across the industry, open source, and academia have played to sharpen their skills.

What has changed with each season is the landscape. When we launched Season 1, AI coding assistants were just starting to become mainstream. By Season 3, we were teaching players to craft malicious prompts and then defend against them. Now, with Season 4, we are tackling the security challenges of AI systems that can act autonomously. They can browse the web, call APIs, coordinate with other agents, and act on your behalf.

Why agentic AI security matters right now

The timing is not a coincidence. AI agents have moved from research prototypes to production tools at remarkable speed, and the security community is racing to keep up.

The OWASP Top 10 for Agentic Applications 2026, developed with input from over 100 security researchers, now catalogues risks like agent goal hijacking, tool misuse, identity abuse, and memory poisoning as critical threats. A Dark Reading poll found that 48% of cybersecurity professionals believe agentic AI will be the top attack vector by the end of 2026. And Cisco’s State of AI Security 2026 report highlighted that while 83% of organizations planned to deploy agentic AI capabilities, only 29% felt ready to do so securely.

The gap between adoption and readiness is exactly where vulnerabilities thrive. And the best way to close that gap is by learning to think like an attacker.

Meet ProdBot: your deliberately vulnerable AI assistant

Season 4 puts you inside ProdBot, your productivity bot, a deliberately vulnerable agentic coding assistant for your terminal. Inspired by tools like OpenClaw and GitHub Copilot CLI, ProdBot turns natural language into bash commands, browses a simulated web, connects to MCP (Model Context Protocol) servers, runs org-approved skills, stores persistent memory, and orchestrates multi-agent workflows.

Your mission across five progressive levels is simple: use natural language to get ProdBot to reveal a secret it should never expose. If you can read the contents of password.txt, you have found a security vulnerability.

No AI or coding experience is needed…just curiosity and willingness to experiment. Everything happens through natural language in the CLI.

Five levels, five upgrades, five vulnerabilities

Each level of the game mirrors a stage in how real AI-powered tools evolve. As ProdBot gains new capabilities, the upgrade opens a new attack surface for you to discover. Here is what ProdBot looks like as it grows:

  • Level 1 starts with the basics: ProdBot generates and executes bash commands inside a sandboxed workspace. Can you break out of the sandbox?
  • Level 2 gives ProdBot web access. It can now browse a simulated internet of news, finance, sports, and shopping sites. What could go wrong when an AI reads untrusted content?
  • Level 3 connects ProdBot to MCP servers…external tool providers for stock quotes, web browsing, and cloud backup. More tools, more power, more ways in.
  • Level 4 adds org-approved skills and persistent memory. ProdBot can now run pre-built automation plugins and remember your preferences across sessions. Trust is layered…but is it earned?
  • Level 5 is everything coming together: six specialized agents, three MCP servers, three skills, and a simulated open-source project web. The platform claims all agents are sandboxed and all data is pre-verified. Time to put that to the test.

Each level builds on the previous one, and that progression is the point.

We aren’t going to tell you exactly which vulnerabilities you will find at each level as that would ruin the fun. But we will say this: the attack patterns you will discover in Season 4 are not theoretical. They reflect the kinds of risks that security teams are grappling with right now as organizations deploy autonomous AI systems into production.

Think about CVE-2026-25253 (CVSS 8.8 – High): Known as “ClawBleed” or the one-click Remote Code Execution (RCE) vulnerability. It allowed attackers to steal authentication tokens via a malicious link and gain full control of the OpenClaw instance.

The goal is not just to learn a specific exploit. It is to build the instinct that helps you spot these patterns in the wild, whether you are reviewing an agent’s architecture, auditing a tool integration, or simply deciding how much autonomy to give the AI assistant that just landed on your team.

Get started in under 2 minutes

This entire experience runs in GitHub Codespaces, so there is nothing to install, nothing to configure, and it doesn’t cost you a penny (Codespaces offers up to 60 hours of free usage per month). You can be inside ProdBot’s terminal in under two minutes, and each season is self-contained, so you can jump straight into Season 4 without covering the earlier ones.

You may find Season 3 to be a helpful foundation since it builds the basics of AI security. But it is not required. Just bring your hacker mindset.

Special thanks to Rahul Zhade, Staff Product Security Engineer at GitHub, and Bartosz Gałek, creator of Season 3, for testing and improving Season 4.

FAQ

Do I need AI or coding experience to play Season 4?

No. Everything happens through natural language in the CLI. You type plain English, or any language, prompts and ProdBot responds. Curiosity and a willingness to experiment are all you need.

 

Do I need to complete previous seasons first?

No. Each season is self-contained. You can jump directly into Season 4 by running ProdBot and typing level <N>. That said, Season 3 builds a helpful foundation in AI security and takes about 1.5 hours.

 

How long does Season 4 take?

Approximately two hours, though it varies depending on how deeply you explore each level. Some players like to try multiple approaches per level.

 

Is this free?

Yes. The Secure Code Game is open source and free to play. It runs in GitHub Codespaces, which provides up to 60 hours of free usage per month.

 

What are the rate limits?

Season 4 uses GitHub Models, which have rate limits. If you hit a limit, wait for it to reset and resume. Learn more about responsible use of GitHub Models.


Written by

Joseph Katsioloudes

Explore more from GitHub

Docs

Docs

Everything you need to master GitHub, all in one place.

Go to Docs

GitHub

GitHub

Build what’s next on GitHub, the place for anyone from anywhere to build anything.

Start building

Customer stories

Customer stories

Meet the companies and engineering teams that build with GitHub.

Learn more

The GitHub Podcast

The GitHub Podcast

Catch up on the GitHub podcast, a show dedicated to the topics, trends, stories and culture in and around the open source developer community on GitHub.

Listen now