Where CC Stands on Pay-to-Crawl - Creative Commons
Annemarie Eayrs · 2025-12-12 · via Policy Archives - Creative Commons

As we’ve discussed before, the rise of large artificial intelligence (AI) models has fundamentally disrupted the social contract governing machine use of web content. Today, machines don’t just access the web to make it more searchable or to help unlock new insights; they feed algorithms that fundamentally change (and threaten) the web we know. What once functioned as a mostly reciprocal ecosystem now risks becoming extractive by default.

In response, new approaches are emerging to help creators, publishers, and stewards of content reclaim agency over how their works are used.

Pay-to-crawl is one approach beginning to come into focus. Pay-to-crawl refers to emerging technical systems used by websites to automate compensation when their digital content (such as text, images, and structured data) is accessed by machines. We’ve recently published our interpretation and observations of pay-to-crawl systems in this dedicated issue brief.

A bird's eye view photo of an orange sand mine with transport lorries, but the image is slightly distorted by digital artefacts.
“Distorted Sand Mine” by Lone Thomasky & Bits&Bäume, licensed under CC BY 4.0.

CC’s Position on Pay-to-Crawl

Implemented responsibly, pay-to-crawl could represent a way for websites to sustain the creation and sharing of their content, and manage substitutive uses, keeping content publicly accessible where it might otherwise not be shared or would disappear behind even more restrictive paywalls.

However, we do have significant reservations.

Pay-to-crawl may represent an appropriate strategy for independent websites seeking to prevent AI crawlers from knocking them offline or to generate supplementary revenue. But elsewhere, pay-to-crawl systems could be cynically exploited by rightsholders to generate excessive profits, at the expense of human access and without necessarily benefiting the original creators.

Pay-to-crawl systems themselves could become new concentrations of power, with the ability to dictate how we experience the web. They could seek to watch and control how content is used in ways that resemble the worst of Digital Rights Management (DRM), turning the web from a medium of sharing and remixing into a tightly monitored content delivery channel.

We’re also concerned that indiscriminate use of pay-to-crawl systems could block off access to content for researchers, nonprofits, cultural heritage institutions, educators, and other actors working in the public interest. Legal rights to access content afforded by exceptions and limitations to copyright law, such as noncommercial research (in the EU) or fair use exemptions (in the US), as well as provisions for translation and accessibility tools, have been carefully negotiated and adjusted over time. These rights could be impeded by the introduction of blunt, poorly designed pay-to-crawl systems.

Proposed Principles for Responsible Pay-to-Crawl 

Pay-to-crawl systems are not neutral infrastructure. It’s vital that these systems are built and used in ways that serve the interests of creators and the commons, rather than simply create barriers to the sharing of knowledge and creativity, and benefit the few.

We’re proposing the following set of principles as a way to guide the development of pay-to-crawl systems in alignment with this vision:

  1. Pay-to-crawl should not become a default setting.
    Pay-to-crawl represents a strategy that may work for some websites, and not all websites share the same underlying concerns. Pay-to-crawl systems should not be deployed as an automatic or assumed setting on behalf of websites by others, such as domain hosts, content delivery networks, and other web service providers.
  2. Pay-to-crawl systems should enable choice and nuance, not blanket rules.
    Pay-to-crawl systems should enable websites to distinguish between—and set variable controls for—different types of content users (such as commercial AI companies, nonprofits, researchers, or even specific organizations), as well as types and purposes of machine use (such as model training, indexing for search, and inference/retrieval). Systems should not affect direct human browsing and use of content, including by restricting translation or accessibility services.
  3. Pay-to-crawl systems should allow for throttling, not just blocking.
    Pay-to-crawl systems should enable websites to manage hosting costs and other impacts of heavy machine traffic without walling off content entirely. For instance, systems could allow websites to throttle traffic driven by ‘agentic browsing’ or ‘inference’ undertaken by large AI models, while permitting other forms of machine access that involve far lower traffic, such as for research or archival.
  4. Pay-to-crawl systems should preserve public interest access and legal rights.
    Pay-to-crawl systems should not obstruct access to content for researchers, nonprofits, cultural heritage institutions, educators and other actors working in the public interest. Nor should these systems block lawful uses of content protected by copyright exceptions and limitations, and other legal rights afforded in the public interest. The act of deciding not to abide by a pay-to-crawl system should not, by itself, convert an otherwise lawful use into an illegal act.
  5. Pay-to-crawl systems should use open, interoperable, and standardized components.
    Pay-to-crawl systems should not become proprietary chokepoints or gatekeepers. We urge particular caution in the use of proprietary components for authentication and payment that might result in websites getting locked into a particular pay-to-crawl system.
  6. Pay-to-crawl systems should enable collective contributions to the commons.
    Pay-to-crawl systems that only enable financial transactions between singular websites and content users risk creating a highly transactional future, where the value of content is atomized. Pay-to-crawl systems should support collective forms of payment, such as to coalitions of creators and publishers, and wider conceptions of what it means to contribute to the digital commons.
  7. Pay-to-crawl systems should avoid surveillance and DRM-like architectures.
    Pay-to-crawl systems must not introduce excessive logging, fingerprinting, or behavioral tracking related to the use of content. Systems should minimize data collection to only what is needed to authenticate users and settle payments, rather than seek to follow content downstream or dictate how it can be used.
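
To make principles 2, 3, and 4 more concrete, here is a minimal, purely illustrative sketch of how a website's access policy might distinguish between requester types and purposes, and throttle rather than block. It does not describe any existing pay-to-crawl product: the names `Request`, `Action`, `PUBLIC_INTEREST`, and `decide`, and the category labels themselves, are all assumptions invented for this example. It also takes the requester's declared type and purpose at face value; how those would be authenticated is outside the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"        # serve the content normally
    THROTTLE = "throttle"  # serve, but at a reduced rate (principle 3)
    CHARGE = "charge"      # request payment before serving


@dataclass(frozen=True)
class Request:
    requester: str  # hypothetical label, e.g. "human", "commercial_ai", "researcher"
    purpose: str    # hypothetical label, e.g. "browse", "training", "search_index"


# Hypothetical set of public-interest requesters whose access is preserved (principle 4).
PUBLIC_INTEREST = {"researcher", "nonprofit", "educator", "cultural_heritage"}

# Hypothetical high-traffic purposes that a site might slow down instead of blocking.
HIGH_TRAFFIC_PURPOSES = {"inference", "agentic_browsing"}


def decide(req: Request) -> Action:
    """Return the action one site's (assumed) policy takes for a request."""
    # Direct human browsing is never affected (principle 2).
    if req.requester == "human":
        return Action.ALLOW
    # Public-interest access is preserved regardless of purpose (principle 4).
    if req.requester in PUBLIC_INTEREST:
        return Action.ALLOW
    # Heavy machine traffic is throttled, not walled off (principle 3).
    if req.purpose in HIGH_TRAFFIC_PURPOSES:
        return Action.THROTTLE
    # Payment applies only to the narrow case this site opted into.
    if req.requester == "commercial_ai" and req.purpose == "training":
        return Action.CHARGE
    # Everything else, such as indexing for search, stays open.
    return Action.ALLOW


if __name__ == "__main__":
    for r in [
        Request("human", "browse"),
        Request("researcher", "training"),
        Request("commercial_ai", "inference"),
        Request("commercial_ai", "training"),
    ]:
        print(r.requester, r.purpose, "->", decide(r).value)
```

In this sketch the fallback is to allow access, and payment is an explicit, narrow opt-in chosen by the site, which is one way to read principle 1. A different site could reasonably choose different rules; the point is only that the decision takes both who is asking and why into account.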

The Path Forward: Showing Up Where the Future Is Being Decided

We believe now is the moment to engage, to influence, and to infuse pay-to-crawl systems with values that prioritize reciprocity, openness, and the commons.

We welcome feedback and dialogue on the principles outlined here. Your input will help guide our engagement with pay-to-crawl systems and related initiatives moving forward, as well as inform the wider CC community’s understanding of them.

Thank you to Jack Hardinges for his contributions to this post.

Posted 12 December 2025