Give AI Agents the Markdown They Actually Want
2026-05-28 · via Fastly Blog

AI crawlers are going to ask for your pages whether you're ready for them or not. Today they get HTML, same as any browser, and they spend CPU cycles stripping your nav and footer to find the article underneath. This post walks through a small Fastly Compute service that meets them in the middle: normal requests still get your site, while agents get a clean Markdown version of the same content.

We can easily accomplish this with about 200 lines of JavaScript, which you can find in the repo here. You can skim the pipeline section to see the shape, or clone and deploy if you want to get there faster.

Why This Matters

Our own Security Research report found that bots account for 49% of requests. The vast majority is unwanted traffic, and verified AI is only a sliver of what's left, but that sliver carries outsized business impact. A single hit from GPTBot, PerplexityBot, or ChatGPT-User isn't one user. It's every real user who'll eventually see your content through a large language model instead of on your site. Getting that experience right is worth a little engineering.

The problem with serving those crawlers HTML: they don't want it. LLM training pipelines and retrieval systems operate on text. So when a crawler pulls your product documentation and needs to turn it into answers, HTML is overhead for them. It has to be parsed, stripped of boilerplate, de-noised of tracking pixels and menu chrome, and flattened into plain text. Some of that cleanup is lossy, especially tables, code blocks, and footnotes, which often show up mangled in downstream summaries.

Markdown sidesteps most of that because it's what those pipelines already speak natively. It's also small: a typical article comes out at 20-30% of its HTML size, which means less bandwidth and fewer tokens burned on your structure instead of your ideas.

The catch is that rewriting everything to serve Markdown at origin isn't realistic for most teams, and you don't want to anyway. Browsers still need the HTML. What you want is a transform that runs on the request path, doesn't slow things down, and caches well so you're not paying for the same work twice.

What We're Building

A small JavaScript service on Fastly Compute that sits in front of your origin and does three things based on who's asking:

  • A normal browser request gets HTML, passed through from origin untouched.

  • An AI crawler user-agent (we detect 17 of them by default) or a request with Accept: text/markdown gets a Markdown version of the same page.

  • An explicit /md/<path> request always returns Markdown. Useful for debugging, internal tooling, and content teams who want to spot-check what crawlers see.

Here's what the output looks like for a request to /md/blog/rate-limits:

---
title: "Rate limits — API docs"
description: "How rate limits work, per-tier quotas, and the headers to inspect."
author: "Platform team"
date: "2026-03-02T00:00:00Z"
url: "https://example.com/docs/rate-limits"
source: "https://your-site.edgecompute.app/md/blog/rate-limits"
---

# Rate limits

Every API key is subject to a request budget per minute and per day...

## Quotas by tier

| Tier | Requests / min | Requests / day |
| --- | --- | --- |
| Free | 60 | 10,000 |
| Pro | 600 | 500,000 |
| Enterprise | Custom | Custom |

Clean headings, a real Markdown table, and YAML frontmatter a downstream pipeline can parse without heuristics. Nav, footer, related articles, newsletter prompts, and inline scripts are all stripped away.

The Stack

Four pieces do all the work:

  • Fastly Compute runs the whole thing as WebAssembly, close to the user. We use the JavaScript SDK (@fastly/js-compute).

  • linkedom parses the origin HTML into a DOM. It's a lightweight, standards-adjacent implementation that compiles cleanly to WASM, unlike jsdom, which pulls in a lot of Node-specific machinery.

  • Defuddle extracts the main content. It's a newer extractor from the Obsidian Web Clipper team, purpose-built for agent-facing Markdown. It handles site-specific quirks (per-site extractors for known publications), standardizes code blocks and footnotes into consistent HTML, and falls back to heuristic scoring when it has to.

  • Turndown walks the extracted DOM and emits Markdown. We add the GFM plugin for tables and strikethrough, plus one small custom rule to handle a linkedom quirk (more on that below).

Edge caching uses SimpleCache from fastly:cache. There are no other dependencies.

The Conversion Pipeline

Everything that turns HTML into Markdown lives in one file, src/converter.js:

import Defuddle from 'defuddle';
import { parseHTML } from 'linkedom';
import TurndownService from 'turndown';
import { gfm } from '@joplin/turndown-plugin-gfm';

const turndown = new TurndownService({
  headingStyle: 'atx',
  codeBlockStyle: 'fenced',
  bulletListMarker: '-',
});
turndown.use(gfm);

export function htmlToMarkdown(html, sourceUrl) {
  const { document } = parseHTML(html);

  const result = new Defuddle(document, { url: sourceUrl }).parse();
  const articleDoc = parseHTML(result?.content || '').document;
  const markdown = turndown.turndown(articleDoc.documentElement).trim();

  if (!markdown) {
    throw new Error('Could not extract readable content from page');
  }

  const frontmatter = buildFrontmatter(result, document, sourceUrl);
  return `${frontmatter}\n\n${markdown}\n`;
}

The pipeline is linear: parse with linkedom, hand the Document to Defuddle, let Defuddle do its extraction and standardization, then re-parse its HTML output through linkedom one more time so Turndown has a real DOM node to walk. That second parse feels redundant, but it matters and we'll get to why in a moment.

The buildFrontmatter helper pulls title, description, author, and published date from Defuddle's metadata, falling back to standard <meta> tags when Defuddle doesn't have them. We also emit the canonical URL, so whatever consumes this Markdown can point back to the original page.
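The post doesn't show buildFrontmatter itself, so here is a minimal sketch of what it might look like. The property names read from the Defuddle result (title, description, author, published), the <meta> selectors, and the metaContent helper are assumptions for illustration, not the repo's actual code. The sketch also leaves out the source field from the sample output.

```javascript
// Hypothetical helper: first non-empty `content` attribute among the selectors.
function metaContent(document, selectors) {
  for (const selector of selectors) {
    const value = document.querySelector(selector)?.getAttribute('content');
    if (value && value.trim()) return value.trim();
  }
  return '';
}

// Sketch only: prefer the extractor's metadata, fall back to <meta> tags.
function buildFrontmatter(result, document, sourceUrl) {
  const fields = {
    title: result?.title
      || document.querySelector('title')?.textContent?.trim()
      || '',
    description: result?.description
      || metaContent(document, ['meta[name="description"]', 'meta[property="og:description"]']),
    author: result?.author || metaContent(document, ['meta[name="author"]']),
    date: result?.published
      || metaContent(document, ['meta[property="article:published_time"]']),
    // Canonical URL if the page declares one, otherwise the URL we fetched.
    url: document.querySelector('link[rel="canonical"]')?.getAttribute('href') || sourceUrl,
  };

  // JSON.stringify yields a double-quoted, escaped string, which YAML accepts.
  const lines = Object.entries(fields)
    .filter(([, value]) => value)
    .map(([key, value]) => `${key}: ${JSON.stringify(String(value))}`);

  return ['---', ...lines, '---'].join('\n');
}
```

Empty fields are dropped rather than emitted as blank strings, so a consumer never has to tell "missing" apart from "empty".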

The DOM-node-not-string gotcha

If you read Defuddle's docs, you'll notice a markdown: true option that looks like it should do everything Turndown does for us. It does in Node, but it doesn't in Compute.

The reason: Defuddle's built-in Markdown step calls turndownService.turndown(htmlString). Turndown, given a string, parses it internally by calling document.implementation.createHTMLDocument. The Compute JS runtime is SpiderMonkey with linkedom providing the DOM, and linkedom doesn't expose document.implementation. Turndown throws, Defuddle swallows the throw, and you get a fallback message like "Partial conversion completed with errors" with the raw HTML appended.

Handing Turndown a DOM node sidesteps that parser entirely. It walks the tree we give it. That's why the second parseHTML call is there.

The Table Rule

One more linkedom quirk: HTMLTableElement.rows isn't populated. The GFM plugin's table rule checks node.rows[0] to decide whether to convert the table or skip it, and since rows is undefined, every table becomes flattened text.

The fix is a small custom rule registered after GFM:

turndown.addRule('linkedom-table', {
  filter: (node) => node.nodeName === 'TABLE',
  replacement: (_content, node) => {
    const rows = Array.from(node.querySelectorAll('tr'));
    if (!rows.length) return '';
    const cells = (tr) =>
      Array.from(tr.querySelectorAll('th, td')).map((c) =>
        c.textContent.replace(/\s+/g, ' ').trim().replace(/\|/g, '\\|'),
      );
    const header = cells(rows[0]);
    const body = rows.slice(1).map(cells);
    const sep = header.map(() => '---');
    const fmt = (row) => `| ${row.join(' | ')} |`;
    return `\n\n${[fmt(header), fmt(sep), ...body.map(fmt)].join('\n')}\n\n`;
  },
});

querySelectorAll('tr') works where .rows doesn't. Since our custom rule is registered last, Turndown picks it over the GFM default. A few extra lines that save any page with a table.
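To see what the rule emits without a DOM in the loop, the same cell cleanup and row formatting can be exercised on plain arrays. This is a sketch for experimentation. The names cleanCell and formatMarkdownTable are ours and not part of Turndown or the repo.

```javascript
// Collapse whitespace and escape pipes so a cell cannot break its row.
function cleanCell(text) {
  return text.replace(/\s+/g, ' ').trim().replace(/\|/g, '\\|');
}

// rows: array of string arrays; the first row is treated as the header.
function formatMarkdownTable(rows) {
  if (!rows.length) return '';
  const [header, ...body] = rows.map((row) => row.map(cleanCell));
  const separator = header.map(() => '---');
  const fmt = (row) => `| ${row.join(' | ')} |`;
  return [fmt(header), fmt(separator), ...body.map(fmt)].join('\n');
}
```

A cell such as "Pro | Team" comes out as "Pro \| Team", and a cell that wraps across lines in the HTML is collapsed onto one line, which is what keeps the GFM table intact.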

Routing and content negotiation

The Compute fetch handler lives in src/index.js. The whole routing layer is about 50 lines:

async function handleRequest(event) {
  const req = event.request;
  const url = new URL(req.url);

  if (url.pathname === '/health') return jsonResponse({ status: 'ok' });
  if (url.pathname === '/__html-2-md__') return landingResponse();

  if (url.pathname.startsWith('/md/') || url.pathname === '/md') {
    const originPath = url.pathname.replace(/^\/md/, '') || '/';
    return await convertAndRespond(req, url, originPath);
  }

  const ua = req.headers.get('User-Agent') || '';
  const accept = req.headers.get('Accept') || '';

  if (isAiCrawler(ua) || wantsMarkdown(accept)) {
    return await convertAndRespond(req, url, url.pathname);
  }

  return fetch(req, { backend: 'origin' });
}

Four decision points, in order. Health and debug routes are served locally. A /md/<path> prefix forces Markdown regardless of headers. After that, we look at the request: if it's from a known AI crawler or explicitly asks for Markdown, we convert. Otherwise, a straight pass-through to origin.
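The /md prefix check is the part that's easiest to get subtly wrong, so here it is isolated as a small helper. The name markdownOriginPath is hypothetical, but the logic mirrors the two lines in the handler above.

```javascript
// Returns the origin path for a forced-Markdown route, or null when the
// request is not under the /md prefix.
function markdownOriginPath(pathname) {
  if (pathname !== '/md' && !pathname.startsWith('/md/')) return null;
  // Strip the prefix; a bare /md (or /md/) maps to the site root.
  return pathname.replace(/^\/md/, '') || '/';
}
```

Matching on '/md/' with the trailing slash means a path like /mdx/page is not mistaken for a Markdown route.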

The crawler detection is a small list in src/agents.js, 17 user-agent patterns covering the mainstream ones: GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, PerplexityBot, GoogleOther, cohere-ai, and so on. It's a case-insensitive substring match. Agents evolve, so treat the list as a starting point and prune or extend based on what actually shows up in your logs.
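The two predicates the handler calls could look something like the sketch below. The pattern list is an illustrative subset containing only the seven agents named above, not the full 17. The q-value handling in wantsMarkdown is an assumption, since the post doesn't show how the Accept header is parsed.

```javascript
// Illustrative subset: only the agents named in the text, lowercased.
const AI_AGENT_PATTERNS = [
  'gptbot',
  'chatgpt-user',
  'claudebot',
  'anthropic-ai',
  'perplexitybot',
  'googleother',
  'cohere-ai',
];

// Case-insensitive substring match against the User-Agent header.
function isAiCrawler(userAgent) {
  const ua = (userAgent || '').toLowerCase();
  return AI_AGENT_PATTERNS.some((pattern) => ua.includes(pattern));
}

// True when Accept lists text/markdown with a non-zero q-value (assumed rule).
function wantsMarkdown(accept) {
  return (accept || '').split(',').some((part) => {
    const [type, ...params] = part.split(';').map((s) => s.trim().toLowerCase());
    if (type !== 'text/markdown') return false;
    const q = params.find((p) => p.startsWith('q='));
    return q ? parseFloat(q.slice(2)) > 0 : true;
  });
}
```

Wildcards such as */* deliberately don't count here, so a browser sending Accept: */* still gets HTML.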

Caching

Markdown conversion takes a few hundred milliseconds on a cold request, most of it in Defuddle's scoring. That's fine for the first crawler hit, painful for the hundredth. SimpleCache reduces the fix to a few lines:

const cacheKey = `html-2-md:${originUrl.pathname}${originUrl.search}`;
const cached = SimpleCache.get(cacheKey); // null on a miss

let body;
if (cached) {
  body = await cached.text();
} else {
  body = await fetchAndConvert(originUrl, url);
  SimpleCache.set(cacheKey, body, CACHE_TTL); // 5 minutes
}

Five minutes is a reasonable default for most content sites. Tune it to how often you publish. The cache is per-POP, so you'll see one cold conversion per region on the first request, then cached responses after that.

We also set Vary: Accept, User-Agent on the response. Any downstream caches (yours, the crawler's) will respect the same content negotiation we do.

Testing Locally

The converter is a pure function: HTML in, Markdown out. That makes it trivial to test with plain Node, no Compute runtime required:

import { test } from 'node:test';
import assert from 'node:assert/strict';
import { readFile } from 'node:fs/promises';
import { htmlToMarkdown } from '../src/converter.js';

test('docs page: preserves tables and nested lists', async () => {
  const html = await readFile('test/fixtures/docs-page.html', 'utf8');
  const md = htmlToMarkdown(html, 'https://example.com/docs/rate-limits');

  assert.match(md, /# Rate limits/);
  assert.match(md, /\|\s*Tier\s*\|/);  // markdown table header
  assert.match(md, /\|\s*Free\s*\|\s*60\s*\|/);
});

Drop a handful of representative fixtures into test/fixtures/ (a blog post, a docs page with tables, a news article with boilerplate), and assert on the properties you care about. Our companion repo ships with three. npm test runs in about 200ms, which means you can iterate on extraction quirks without rebuilding WASM.

For the full edge pipeline, fastly compute serve boots Viceroy (Fastly's local Compute emulator) on 127.0.0.1:7676:

curl -s "http://127.0.0.1:7676/" -H "Accept: text/markdown" | head -30
curl -s "http://127.0.0.1:7676/" -H "User-Agent: GPTBot/1.0" | head -30
curl -s "http://127.0.0.1:7676/md/blog/my-post" | head -30
curl -sI "http://127.0.0.1:7676/"   # confirm HTML pass-through

Point [local_server.backends.origin] in fastly.toml at whatever origin you want to proxy, and you've got a working end-to-end loop.

Deploying

Same two commands as any other Compute service:

npm run build        # compile to bin/main.wasm
fastly compute deploy

First run prompts you to create a service and configure your production origin backend. After that, you've got a Compute endpoint that'll respond at <service>.edgecompute.app. Point a custom domain at it, or front it with your existing Fastly service as a shielding config, whichever fits your topology.

What's actually happening on the wire

For a request from GPTBot to /blog/my-post:

  1. Compute gets the request. User-Agent matches GPTBot → route to conversion path.

  2. Check SimpleCache for html-2-md:/blog/my-post. Miss.

  3. Fetch HTML from origin (the origin backend declared in fastly.toml).

  4. Parse with linkedom → run Defuddle → re-parse → Turndown → frontmatter.

  5. Store in SimpleCache with 5-minute TTL. Return.

  6. Response: Content-Type: text/markdown; charset=utf-8, Vary: Accept, User-Agent, X-Markdown-Tokens: <estimate>.

A regular browser hitting the same URL at the same time never enters this path: there is no cache lookup and no conversion. It gets HTML straight from origin, same as always.
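The response in step 6 could be assembled along these lines. The header names and values come from the list above, and the token estimate uses the length / 4 heuristic the post mentions under token counting. The function names and the choice to round up are assumptions of this sketch.

```javascript
// Rough token estimate: characters divided by four, rounded up (assumed).
// Note that .length counts UTF-16 code units, not Unicode characters.
function estimateTokens(markdown) {
  return Math.ceil(markdown.length / 4);
}

// Build the Markdown response with the headers listed in step 6.
function markdownResponse(markdown) {
  return new Response(markdown, {
    status: 200,
    headers: {
      'Content-Type': 'text/markdown; charset=utf-8',
      'Vary': 'Accept, User-Agent',
      'X-Markdown-Tokens': String(estimateTokens(markdown)),
    },
  });
}
```

Response is a web-standard global, so the same function runs unchanged in Compute and in recent Node versions, which keeps it testable alongside the converter.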

Where to Take it From Here

A few directions worth considering once it's running:

Token counting: Our heuristic (length / 4) is a rough approximation of GPT-style tokenization. If you care about accurate accounting, swap in a real tokenizer. There are WASM-compatible tiktoken builds that work in Compute.

Link rewriting: The current output preserves relative URLs from origin, which means a crawler has to resolve them against the request URL. You can rewrite relative links to absolute ones in the Defuddle result before Turndown runs.

Per-site extractors: Defuddle supports custom extractors for sites with unusual structure. If you're proxying a specific publication or docs site, writing a one-off extractor produces much cleaner output than the generic heuristics.

Streaming: For very long articles, the current implementation buffers the whole body before emitting the response. Streaming the conversion would reduce TTFB. It's more complex (Defuddle wants the full document to score) but feasible by chunking on section boundaries.

Rate limiting by agent: If you want to serve GPTBot but throttle a noisier bot, pair this service with our Edge Rate Limiting offering.
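As a starting point for the link-rewriting direction above, here is a minimal sketch that resolves href and src attributes against the page URL. The helper names are hypothetical, and leaving fragment-only links untouched is a design choice of this sketch, not something the original service does. It only relies on querySelectorAll, hasAttribute, getAttribute, and setAttribute, so it should work on the linkedom document that gets handed to Turndown.

```javascript
// Resolve one URL against the page URL; leave unparseable values untouched.
function toAbsolute(value, baseUrl) {
  // Fragment-only links (#section) stay relative to the document itself.
  if (!value || value.startsWith('#')) return value;
  try {
    return new URL(value, baseUrl).href;
  } catch {
    return value;
  }
}

// Rewrite every link and image under `root` in place.
function absolutizeLinks(root, baseUrl) {
  for (const el of root.querySelectorAll('a[href], img[src]')) {
    const attr = el.hasAttribute('href') ? 'href' : 'src';
    el.setAttribute(attr, toAbsolute(el.getAttribute(attr), baseUrl));
  }
}
```

In the converter, a call like absolutizeLinks(articleDoc, sourceUrl) would sit between the second parseHTML call and the Turndown step.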

Wrapping up

Serving Markdown to AI agents is one of those small efforts that can have an outsized impact. It respects the agent’s workload as well as your bandwidth (and ultimately your bottom line). Compute is a good fit for it because the work is close to the request, cacheable, and measured in milliseconds.

Feel free to clone the service here. If you build something interesting on top of this (a token counter, a custom extractor, a link rewriter), we'd like to hear about it.