The append-only AST trick that makes Flutter AI chat actually smooth
jay limbani · 2026-05-23 · via DEV Community

flutter_markdown re-parses the entire response string on every streamed token. The fix is an append-only AST with monotonic node IDs used as Flutter widget keys. I packaged it as streamdown — a drop-in replacement that's 188× faster on chunked input and produces zero visible flicker. Live on pub.dev today.

[Demo: streamdown vs flutter_markdown, split screen]


The problem

Every ChatGPT-style Flutter app I built had the same broken-feeling moment: code blocks flashing unstyled → styled → unstyled, tables jittering as new cells arrive, scroll position breaking, and the cursor jumping around like the UI is fighting itself.

The root cause is one line, repeated thousands of times during a single streamed response:

StreamBuilder<String>(
  stream: openai.responseStream,
  builder: (_, snap) => Markdown(data: snap.data ?? ''),
)


flutter_markdown does exactly what its API promises — it takes a complete string and renders it. The problem is that every new chunk produces a new data value, and the entire string gets re-tokenized, re-parsed, and re-rendered from scratch. That O(n²) work is invisible on a 200-char response; on a 5KB code-heavy answer it's the source of every visible glitch.
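To put a rough number on that quadratic growth, here is a small counting sketch. It assumes "5KB" means 5,120 characters and uses the 4-character chunk size from the benchmark setup described later in this post. The function names are mine, and the count is characters scanned, not wall-clock time.

```dart
/// Characters scanned if the whole buffer is re-parsed after every chunk.
int reparsedChars(int totalChars, int chunkSize) {
  var scanned = 0;
  for (var end = chunkSize; end < totalChars; end += chunkSize) {
    scanned += end; // full re-parse of everything received so far
  }
  return scanned + totalChars; // last chunk: parse the complete buffer
}

/// Characters scanned if every character is tokenized exactly once.
int incrementalChars(int totalChars) => totalChars;
```

With these assumptions, `reparsedChars(5120, 4)` returns 3,279,360 while `incrementalChars(5120)` returns 5,120, a ratio of about 640. Real timings depend on the parser and on Flutter's rebuild cost, so treat this as an illustration of the growth rate rather than a prediction.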

You can confirm this in five minutes: feed an OpenAI completion into flutter_markdown with chunk_size=1 and watch a syntax-highlighted code block strobe like it's having a seizure.

The three tricks (and why all three are needed)

Fixing this needs three changes that have to land together — fixing only one or two doesn't move the needle.

Trick 1 — Incremental token-level parser (append-only)

Instead of re-tokenizing the full buffer on every chunk, keep the tokenizer's state machine alive across chunks. New characters extend the trailing token; characters already emitted as tokens are never revisited.

class Tokenizer {
  final List<Token> _tokens = [];
  _State _state = _State.start;
  String _buffer = '';

  void feed(String chunk) {
    _buffer += chunk;
    while (_canEmit()) {
      _tokens.add(_emit()); // never touches _tokens already emitted
    }
  }

  void complete() { /* flush trailing token if any */ }
}


The block tokenizer is line-based and stateful — fences, lists, blockquotes, and tables all need to know "are we still inside the previous structure?" The inline tokenizer (emphasis, links, code spans) is pure and runs on short paragraph text, so it's fine to re-run from scratch when a paragraph's text changes.
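A minimal sketch of what a line-based, stateful block tokenizer can look like. This is not streamdown's actual code: `LineTokenizer`, `LineToken`, and `LineKind` are hypothetical names, only fenced code blocks are tracked, and a line is emitted only once its newline arrives (simpler than extending a trailing token as described above).

```dart
enum LineKind { fenceOpen, fenceClose, code, blank, text }

class LineToken {
  const LineToken(this.kind, this.text);
  final LineKind kind;
  final String text;
}

class LineTokenizer {
  // Three backticks, built this way so the sketch can sit inside a fence.
  static final String _fence = '`' * 3;

  final List<LineToken> tokens = []; // append-only
  String _pending = ''; // trailing partial line, not yet a token
  bool _inFence = false; // block state that survives across chunks

  void feed(String chunk) {
    _pending += chunk;
    var newline = _pending.indexOf('\n');
    while (newline != -1) {
      _emit(_pending.substring(0, newline));
      _pending = _pending.substring(newline + 1);
      newline = _pending.indexOf('\n');
    }
  }

  /// Flushes the trailing line once the stream has ended.
  void complete() {
    if (_pending.isNotEmpty) {
      _emit(_pending);
      _pending = '';
    }
  }

  void _emit(String line) {
    if (line.trimLeft().startsWith(_fence)) {
      final kind = _inFence ? LineKind.fenceClose : LineKind.fenceOpen;
      tokens.add(LineToken(kind, line.trim()));
      _inFence = !_inFence;
    } else if (_inFence) {
      tokens.add(LineToken(LineKind.code, line));
    } else if (line.trim().isEmpty) {
      tokens.add(const LineToken(LineKind.blank, ''));
    } else {
      tokens.add(LineToken(LineKind.text, line));
    }
  }
}
```

The point to notice is that a fence marker split across two chunks still tokenizes correctly, because `_pending` and `_inFence` carry over between `feed` calls and nothing in `tokens` is ever rewritten.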

Trick 2 — Append-only AST construction

The parser converts tokens into AST nodes — but only ever mutates the trailing path. A closed paragraph becomes immutable. A new paragraph node gets appended. A list keeps growing items until a blank line closes it.

sealed class AstNode { final int id; ... }
class Paragraph extends AstNode { final List<InlineSpan> spans; ... }
class CodeBlock extends AstNode { final String? lang; final String code; final bool isComplete; }
// ...

class Parser {
  int _nextId = 0;
  final List<AstNode> _nodes = [];

  void feed(Token token) {
    // Mutate ONLY the trailing node, or append a new node.
    // Closed nodes never get their `id` reassigned.
  }
}


This is also where provisional rendering falls out for free: an unclosed code block becomes a CodeBlock(isComplete: false) node immediately. The renderer sees it, picks up the language from the fence info string, and starts syntax-highlighting in real time. No flash of unstyled content.
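Here is one way the append-only rule and the provisional code block could fit together, as a self-contained Dart 3 sketch. It is an illustration under my own assumptions, not the package's parser: it consumes whole lines instead of tokens, handles only paragraphs and fenced code, and every name in it (`AppendOnlyAst`, `ParagraphNode`, `CodeBlockNode`) is hypothetical.

```dart
sealed class Node {
  Node(this.id);
  final int id; // assigned once, never reused
}

class ParagraphNode extends Node {
  ParagraphNode(super.id, this.text);
  String text;
}

class CodeBlockNode extends Node {
  CodeBlockNode(super.id, this.lang);
  final String? lang;
  final StringBuffer code = StringBuffer();
  bool isComplete = false; // false while the closing fence is still missing
}

class AppendOnlyAst {
  static final String _fence = '`' * 3; // three backticks

  final List<Node> nodes = [];
  int _nextId = 0;
  Node? _open; // the only node that may still change

  void feedLine(String line) {
    final open = _open;
    if (open is CodeBlockNode) {
      if (line.trim() == _fence) {
        open.isComplete = true;
        _open = null; // closed: never touched again
      } else {
        open.code.writeln(line);
      }
    } else if (line.startsWith(_fence)) {
      final info = line.substring(_fence.length).trim();
      _append(CodeBlockNode(_nextId++, info.isEmpty ? null : info));
    } else if (line.trim().isEmpty) {
      _open = null; // a blank line closes the paragraph
    } else if (open is ParagraphNode) {
      open.text += ' $line';
    } else {
      _append(ParagraphNode(_nextId++, line));
    }
  }

  void _append(Node node) {
    nodes.add(node);
    _open = node;
  }
}
```

After feeding an opening fence and one line of code, the list already contains a `CodeBlockNode` with `isComplete == false` and its `lang` set, so a renderer could start highlighting before the closing fence arrives. Earlier nodes keep their `id` no matter how much is appended later.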

Trick 3 — Diff-stable widget keys

Here's the part that makes Flutter actually behave. Every AST node carries a monotonically increasing id. The renderer uses ValueKey(node.id) for the widget at each AST position:

ListView(
  children: [
    for (final node in nodes) _buildBlock(node, key: ValueKey(node.id)),
  ],
)


Closed nodes never have their id reassigned. So when a new chunk arrives, Flutter's element diff sees the same key in the same slot and reuses the existing element. No teardown, no rebuild, no flicker. Only the trailing (open) node's widget rebuilds — which is exactly the work we wanted to do anyway.

This is the line that turns "incremental parser" into "actually smooth UI." Without it, even a perfect parser still gets all its widgets thrown away on every frame.

The benchmark

Test rig: 5KB markdown response with a mix of paragraphs, two code blocks, a table, and bold/italic — chunked at 4 characters per delivery (about OpenAI's typical streaming cadence). 100 trials, median time.

| Approach | Time to render full stream |
| --- | --- |
| Naive flutter_markdown re-parse | 940 ms |
| streamdown (incremental + stable keys) | 5 ms |

That's a 188× speedup end-to-end. The bigger story isn't the raw number — it's that the cost stops scaling with response length the way the naive approach does. A 100KB response parsed end-to-end in under 10ms.

The micro-benchmark is in test/perf_benchmark_test.dart if you want to reproduce or tweak the chunk size.
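If you want to build a similar measurement for your own parser, the pieces are small. The sketch below is not the contents of `test/perf_benchmark_test.dart`; it is a generic harness with hypothetical helper names (`chunked`, `median`, `medianMicros`) that follows the setup described above: fixed-size chunks, repeated trials, median time. It times whatever consumer you hand it, so it measures parsing only unless your consumer also drives rendering.

```dart
/// Splits [text] into pieces of at most [size] UTF-16 code units.
List<String> chunked(String text, int size) {
  final out = <String>[];
  for (var i = 0; i < text.length; i += size) {
    final end = i + size < text.length ? i + size : text.length;
    out.add(text.substring(i, end));
  }
  return out;
}

/// Median of a non-empty list; averages the two middle values when the
/// length is even.
double median(List<int> samples) {
  final sorted = [...samples]..sort();
  final mid = sorted.length ~/ 2;
  return sorted.length.isOdd
      ? sorted[mid].toDouble()
      : (sorted[mid - 1] + sorted[mid]) / 2;
}

/// Feeds every chunk to a fresh consumer per trial and returns the median
/// elapsed time in microseconds.
double medianMicros(
  List<String> chunks,
  void Function(String) Function() makeConsumer, {
  int trials = 100,
}) {
  final samples = <int>[];
  for (var t = 0; t < trials; t++) {
    final consume = makeConsumer(); // fresh parser state for each trial
    final sw = Stopwatch()..start();
    for (final chunk in chunks) {
      consume(chunk);
    }
    sw.stop();
    samples.add(sw.elapsedMicroseconds);
  }
  return median(samples);
}
```

Creating the consumer inside the loop matters: reusing one parser across trials would make later trials operate on an ever-growing buffer and skew the median.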

What it looks like to use

The whole point was a drop-in replacement, so here's the entire common-case usage:

import 'package:streamdown/streamdown.dart';

Streamdown(stream: openai.responseStream)


For static content:

Streamdown.text(fullMarkdown)


Options you'll actually reach for:

Streamdown(
  stream: chunks,
  syntaxTheme: SyntaxTheme.githubDark,
  latex: true,                    // enables $..$ / $$..$$ via flutter_math_fork
  selectable: true,               // default
  onLinkTap: (uri) => launchUrl(uri),
  codeBlockBuilder: (lang, code, isComplete) => MyCustomCodeBlock(...),
)


Streaming semantics: chunks are deltas (newly arrived tokens), not cumulative — matching OpenAI/Anthropic/Gemini SDK conventions and the entire point of not re-parsing. If you need cumulative mode, that's a v0.2 constructor.
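If your source only gives you cumulative snapshots, a small adapter can turn them into deltas before they reach the widget. This is a sketch of my own, not something shipped in the package: `DeltaExtractor` and `toDeltas` are hypothetical names, and it assumes each snapshot extends the previous one.

```dart
/// Turns cumulative snapshots ("He", "Hello", "Hello wor") into deltas
/// ("He", "llo", " wor").
class DeltaExtractor {
  String _seen = '';

  String next(String snapshot) {
    if (!snapshot.startsWith(_seen)) {
      throw StateError('Snapshot does not extend the previous one');
    }
    final delta = snapshot.substring(_seen.length);
    _seen = snapshot;
    return delta;
  }
}

/// Stream adapter built on the extractor; empty deltas are dropped.
Stream<String> toDeltas(Stream<String> cumulative) {
  final extractor = DeltaExtractor();
  return cumulative.map(extractor.next).where((d) => d.isNotEmpty);
}
```

Usage would then be `Streamdown(stream: toDeltas(cumulativeStream))`, where `cumulativeStream` stands for whatever snapshot stream your SDK exposes.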

Things I cut from v0.1 on purpose

Shipping in 5 days meant being honest about scope:

  • Loose-list distinction: any blank line closes a list. Predictable, easy to mentally model, and AI markdown uses blank lines liberally anyway.
  • Nested blockquotes: flattened to depth=1 in the AST. The tokenizer captures depth, so v0.2 can add this without a breaking change.
  • CommonMark "process emphasis" algorithm: replaced with stack-based delimiter pairing. Pathological inputs like *foo**bar*baz** won't render the way the spec prescribes, but real-world AI markdown almost always nests cleanly.
  • Mermaid, footnotes, definition lists: all v0.2+ candidates.

These were deliberate tradeoffs documented in the decision log, not oversights. Predictable behavior on the 95% case beats half-implemented spec compliance.
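To show what stack-based delimiter pairing means in practice, here is a minimal Dart 3 sketch. It is my own illustration, not the package's inline parser: `renderEmphasis` is a hypothetical function, it only knows `*` and `**`, and it emits HTML-like tags to keep the output easy to read.

```dart
/// Pairs `*` and `**` delimiters with a plain stack. Unmatched delimiters
/// stay in the output as literal text.
String renderEmphasis(String input) {
  final parts = <String>[];
  final openers = <({String delim, int index})>[]; // stack of open delimiters
  final text = StringBuffer();

  void flushText() {
    if (text.isNotEmpty) {
      parts.add(text.toString());
      text.clear();
    }
  }

  var i = 0;
  while (i < input.length) {
    if (input[i] != '*') {
      text.write(input[i++]);
      continue;
    }
    final isDouble = i + 1 < input.length && input[i + 1] == '*';
    final delim = isDouble ? '**' : '*';
    i += delim.length;
    flushText();
    if (openers.isNotEmpty && openers.last.delim == delim) {
      // Same delimiter on top of the stack: close it.
      final opener = openers.removeLast();
      parts[opener.index] = isDouble ? '<strong>' : '<em>';
      parts.add(isDouble ? '</strong>' : '</em>');
    } else {
      // Anything else is treated as a new opener.
      openers.add((delim: delim, index: parts.length));
      parts.add(delim);
    }
  }
  flushText();
  return parts.join();
}
```

Cleanly nested input such as `*foo **bar** baz*` pairs up as expected. The pathological `*foo**bar*baz**` comes back unchanged in this sketch because no closer ever matches the top of the stack, which may differ from what a spec-compliant parser produces. That is the tradeoff: simple and predictable on well-nested input, and knowingly off-spec on interleaved delimiters.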

What's next

I'm tracking ideas and edge cases in GitHub Discussions. The v0.2 list right now:

  1. Nested blockquotes
  2. Loose-list distinction
  3. Mermaid diagrams behind a flag
  4. Per-line span caching for code blocks (the OPEN code block currently re-highlights on every chunk — fine for ~50-line code blocks, worth caching for longer)
  5. Golden-file tests for visual regression

If you're building AI features in Flutter and hit edge cases — markdown that flickers, breaks, or renders wrong — drop them in Discussions with the input and what you expected. That feedback shapes v0.2 more than my roadmap does.

Try it

dependencies:
  streamdown: ^0.0.1


If this saves your week, ⭐ the repo. If it doesn't, open an issue and tell me what broke.