Release v4.9 · oobabooga/textgen
oobabooga · 2026-05-21 · via Tags from textgen

Changes

  • MTP speculative decoding support: Add draft-mtp as a new --spec-type option. Auto-enabled when loading MTP GGUFs (e.g. Qwen 3.6 MoE MTP builds).
  • Web search improvements:
    • Add snippet support to the web_search tool: results now include a short text excerpt that often answers the query directly, eliminating the need for a follow-up fetch_webpage call (#7548).
    • Drop link URLs from fetch_webpage output (links now appear as plain text instead of [text](url) markdown), significantly reducing tokens used per page.
    • Prettier rendering of web_search results in the chat, with a spinner during the call.
    • Add an info message to the "Activate web search" checkbox.
  • Show live generation speed (tokens/s) and context size while generating (#7563).
  • DGX Spark support: Add Linux aarch64 portable builds.
  • Electron
    • Add "Check for updates" button in the Session tab.
    • Add a folder picker for the models directory.
    • Add right-click context menu for copying text.
    • Add a spellcheck toggle in the Session tab (#7550).
    • Store app data in user_data/cache/electron instead of the OS default location.
    • Disable DNS-over-HTTPS probes.
  • One-click installer: Track the latest release tag instead of bleeding-edge main.
  • Auto-detect and auto-select sibling mmproj files when loading a model (#7564).
  • Detect mmproj-*.gguf files in the main models folder: They appear in the mmproj dropdown and are hidden from the regular model dropdown.
  • Project icon: Add an icon, courtesy of LMLocalizer on Reddit.
  • Treat negative --ctx-size values as auto (0).
  • UI
    • Add drag-and-drop file upload support to the chat input (Gradio fork).
    • Reorganize the right sidebar with Mode/Character/Chat style on top.
    • Hide reasoning and tools controls in chat mode (only shown in instruct / chat-instruct).
    • Fade in new messages, fix scroll-up jump on send.
    • Rename "Send dummy message/reply" to "Insert user/assistant message".
    • Polish character dropdown in chat tab.
    • Tighten spacing between dropdowns and refresh buttons.
    • Improve the looks of the Session tab.
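The fetch_webpage change above (links rendered as plain text instead of [text](url) markdown) can be illustrated with a small sketch. This is not the project's actual implementation: the function name `strip_link_urls` and the regular expression are assumptions chosen for illustration, and image syntax is deliberately left untouched.

```python
import re

# Hypothetical helper, not taken from the textgen codebase.
# Matches inline markdown links such as [text](url); the negative
# lookbehind skips image syntax like ![alt](url).
_LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\([^)]*\)")


def strip_link_urls(markdown: str) -> str:
    """Replace every [text](url) with just its visible text."""
    return _LINK_RE.sub(r"\1", markdown)


page = (
    "See the [install guide](https://example.com/docs/install?ref=nav) "
    "and the [FAQ](https://example.com/faq)."
)
print(strip_link_urls(page))
# See the install guide and the FAQ.
```

On link-heavy pages the URLs often account for a large share of the text, which is presumably why dropping them reduces the token count per page.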

Security

  • Restrict CORS to localhost by default to prevent drive-by API access. --listen and --public-api opt into network exposure.
  • Sanitize character name in load_character to prevent path traversal.
  • Prevent path traversal in load_template_by_name (#7562). Thanks, @Allen930311.
  • UI: Improve web search security by rejecting non-HTTP links.
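As a rough illustration of two of the hardening ideas listed above, the sketch below shows one common way to reject names that escape a base directory and one way to accept only HTTP(S) links. The helper names `resolve_inside` and `is_http_url`, the `.yaml` suffix, and the directory path are assumptions for the example; the release notes do not describe how the project implements these checks.

```python
from pathlib import Path
from urllib.parse import urlparse


def resolve_inside(base_dir: str, name: str, suffix: str = ".yaml") -> Path:
    """Return base_dir/name+suffix, or raise ValueError if it escapes base_dir.

    Hypothetical helper: resolves the final path and checks that it still
    lives under the base directory (Path.is_relative_to needs Python 3.9+).
    """
    base = Path(base_dir).resolve()
    candidate = (base / f"{name}{suffix}").resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"unsafe name: {name!r}")
    return candidate


def is_http_url(url: str) -> bool:
    """Accept only http:// and https:// URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(resolve_inside("/tmp/characters", "Assistant").name)  # Assistant.yaml
print(is_http_url("file:///etc/passwd"))                    # False
```

Resolving the path before comparing it is what catches inputs like "../../etc/passwd" as well as absolute paths, both of which would otherwise slip past a simple string check.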

Bug fixes

  • Fix llama-server not being killed when the parent process exits on Windows, e.g. when closing the console window or killing python.exe (#7574).
  • Fix streaming output leaking across chats when switching mid-stream (#7555).
  • Fix continue-mode regressions across template families.
  • Fix incorrect prompts generated with continue mode. Thanks, @MeemeeLab.
  • Fix thinking channel being lost across tool-call turns (#7578).
  • Fix API model load silently dropping hyphenated arg keys (#7577).
  • Fix chat deletion failing when user_data/logs is a symlink (#7579).
  • Fix token count not being set in non-streaming mode.
  • Keep web search blocks closed when the user closes them mid-stream.
  • Windows: Set PYTHONUTF8 for compatibility with non-ASCII locales (#7560). Thanks, @jerry78424.
  • Set TORCH_VERSION to 2.9.0 to match xformers 0.0.33's torch pin (#7581). Thanks, @AJ-Gazin.
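The hyphenated-key fix (#7577) refers to a class of bug that is easy to reproduce in miniature: a request carries "ctx-size" while the receiving code only knows "ctx_size", so the value is dropped without any error. The sketch below shows one way to normalize keys and report unknown ones instead of discarding them silently. The function names are hypothetical and the approach is an assumption, not the project's actual fix; the key names are derived from the --ctx-size and --spec-type flags mentioned in these notes.

```python
def normalize_arg_keys(args: dict) -> dict:
    """Map keys like 'ctx-size' or '--ctx-size' to 'ctx_size'."""
    return {key.lstrip("-").replace("-", "_"): value for key, value in args.items()}


def apply_known_args(args: dict, known: set) -> tuple[dict, list]:
    """Split normalized args into accepted ones and a sorted list of unknown keys."""
    normalized = normalize_arg_keys(args)
    accepted = {k: v for k, v in normalized.items() if k in known}
    rejected = sorted(k for k in normalized if k not in known)
    return accepted, rejected


accepted, rejected = apply_known_args(
    {"ctx-size": 8192, "spec_type": "draft-mtp", "bogus-key": 1},
    known={"ctx_size", "spec_type"},
)
print(accepted)  # {'ctx_size': 8192, 'spec_type': 'draft-mtp'}
print(rejected)  # ['bogus_key']
```

Returning the rejected keys lets the caller surface them in an error or warning, so a mistyped argument is visible rather than ignored.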

Dependency updates

Portable builds

TextGen is now a desktop app for local LLMs. Download, unzip, double-click.

Note: for NVIDIA GPUs, if nvidia-smi reports CUDA Version >= 13.1, use the cuda13.1 build. Otherwise, use cuda12.4.
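The NVIDIA rule above can be expressed as a short check. This is a sketch only: `pick_cuda_build` is a hypothetical helper, and it assumes the nvidia-smi header contains a "CUDA Version: X.Y" field. The version is compared as a pair of integers rather than a float so that, for example, 13.10 would not be misread as 13.1.

```python
import re


def pick_cuda_build(nvidia_smi_output: str) -> str:
    """Return 'cuda13.1' if the reported CUDA version is >= 13.1, else 'cuda12.4'."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        raise ValueError("no CUDA version found in nvidia-smi output")
    version = (int(match.group(1)), int(match.group(2)))
    return "cuda13.1" if version >= (13, 1) else "cuda12.4"


sample = "| NVIDIA-SMI 550.54.14   Driver Version: 550.54.14   CUDA Version: 12.4 |"
print(pick_cuda_build(sample))  # cuda12.4
```

To use it on a real machine, the text could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`.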

ik_llama.cpp is a llama.cpp fork with new quant types. If unsure, use the llama.cpp column.

Windows

GPU/Platform          llama.cpp            ik_llama.cpp
NVIDIA (CUDA 12.4)    Download (936 MB)    Download (1.24 GB)
NVIDIA (CUDA 13.1)    Download (840 MB)    Download (1.33 GB)
AMD/Intel (Vulkan)    Download (336 MB)    -
AMD (ROCm 7.2)        Download (617 MB)    -
CPU only              Download (319 MB)    Download (335 MB)

Linux

GPU/Platform                llama.cpp            ik_llama.cpp
NVIDIA (CUDA 12.4)          Download (893 MB)    Download (1.21 GB)
NVIDIA (CUDA 13.1)          Download (826 MB)    Download (1.33 GB)
NVIDIA ARM64 (CUDA 13.1)    Download (910 MB)    -
AMD/Intel (Vulkan)          Download (324 MB)    -
AMD (ROCm 7.2)              Download (409 MB)    -
CPU only                    Download (307 MB)    Download (338 MB)

macOS

macOS note: You need to run xattr -cr /path/to/your/textgen-folder on the extracted folder before launching. See #7558.

Architecture llama.cpp
Apple Silicon (arm64) Download (272 MB)
Intel (x86_64) Download (284 MB)

Updating a portable install:

  1. Download and extract the latest version.
  2. Replace the user_data folder in the new install with the one from your existing install. All your settings and models carry over.

Starting with 4.0, you can also move user_data one folder up, next to the install folder. It will be detected automatically, making updates easier:

textgen-4.6/
textgen-4.7/
user_data/    <-- shared by both installs
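The shared layout above implies some lookup of where user_data lives. The sketch below shows one plausible lookup, under stated assumptions: `find_user_data` is a hypothetical helper, and the precedence (shared folder next to the install first, then the folder inside the install) is a guess, since the release notes only say the shared folder is detected automatically.

```python
from pathlib import Path
from typing import Optional


def find_user_data(install_dir: str) -> Optional[Path]:
    """Return the user_data folder for an install, or None if neither exists.

    Assumed precedence: a shared user_data one level up wins over the
    one inside the install folder.
    """
    install = Path(install_dir)
    for candidate in (install.parent / "user_data", install / "user_data"):
        if candidate.is_dir():
            return candidate
    return None


# With the layout shown above, both installs would resolve to the same folder:
#   find_user_data("textgen-4.6") and find_user_data("textgen-4.7")
#   each return the shared user_data directory next to them.
```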