TokenAdvisor — Free LLM token analyzer with savings advice
Emadiali83 · 2026-05-27 · via Hacker News - Newest: "LLM"

Why Tokens Matter

Every API call to an LLM is billed by tokens: chunks of text that the model reads and generates. A single word might be one token or several, depending on the provider's tokenizer. At high request volumes, the difference between 100 and 130 tokens per request can add up to thousands of dollars per month.
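To make that arithmetic concrete, here is a minimal sketch. The price ($3 per million input tokens) and the volume (1 million requests per day) are illustrative assumptions, not figures from any provider's price list, and `monthly_token_cost` is a hypothetical helper rather than part of TokenAdvisor.

```python
def monthly_token_cost(tokens_per_request: int,
                       requests_per_day: int,
                       usd_per_million_tokens: float,
                       days: int = 30) -> float:
    """Return the monthly input-token cost in USD."""
    total_tokens = tokens_per_request * requests_per_day * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Assumed workload: 1M requests/day at an assumed $3 per million input tokens.
REQUESTS_PER_DAY = 1_000_000
PRICE_USD_PER_MILLION = 3.00

lean = monthly_token_cost(100, REQUESTS_PER_DAY, PRICE_USD_PER_MILLION)
bloated = monthly_token_cost(130, REQUESTS_PER_DAY, PRICE_USD_PER_MILLION)

print(f"100 tokens/request: ${lean:,.2f}/month")
print(f"130 tokens/request: ${bloated:,.2f}/month")
print(f"Extra cost:         ${bloated - lean:,.2f}/month")
```

Under these assumptions the 30 extra tokens per request come to $2,700 per month. The result scales linearly with both volume and price, so smaller workloads see proportionally smaller differences.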

Most developers don't realize that the same prompt costs different amounts across Claude, GPT, and Gemini — not just because of pricing, but because each provider tokenizes your text differently. A prompt that's 142 tokens on GPT might be 156 tokens on Claude.
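The two effects, price and tokenization, can be combined in a few lines. The sketch below reuses the 142 and 156 token counts from the example above. The per-million-token prices are placeholders chosen for illustration only, and the `Quote` class and helper functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    provider: str
    prompt_tokens: int             # token count for the same prompt text
    usd_per_million_tokens: float  # placeholder input price (assumption)


def cost_per_request(q: Quote) -> float:
    """Input cost in USD of sending the prompt once."""
    return q.prompt_tokens / 1_000_000 * q.usd_per_million_tokens


def cheapest(quotes: list[Quote]) -> Quote:
    """Return the quote with the lowest per-request cost."""
    return min(quotes, key=cost_per_request)


quotes = [
    Quote("GPT", 142, 2.50),     # token count from the text; price assumed
    Quote("Claude", 156, 3.00),  # token count from the text; price assumed
]

for q in quotes:
    print(f"{q.provider}: {q.prompt_tokens} tokens -> "
          f"${cost_per_request(q):.6f} per request")
print("Cheapest:", cheapest(quotes).provider)
```

Even if both placeholder prices were identical, the 14 extra tokens would still make the second quote roughly 10% more expensive for this prompt, which is the tokenization effect on its own.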

TokenAdvisor shows you exactly where your tokens go. It counts tokens using the same official methods the APIs use — tiktoken for OpenAI (client-side, exact), Anthropic's count_tokens API, and Google's countTokens API. Then it analyzes your prompt for common patterns that waste tokens and translates the waste into specific dollar amounts at your volume.

The result: you see what to cut, how much you'll save, and which provider is cheapest for your specific prompt. No signup, no data stored, completely free.

For full pricing comparison across 20+ models with batch discounts and prompt caching calculations, see RealAICost.

Frequently Asked Questions

What is a token in an LLM API?

A token is a chunk of text that language models process. It can be a word, part of a word, or punctuation. Models like Claude, GPT, and Gemini each use different tokenizers, so the same text produces different token counts — and different costs. For example, "tokenization" might be split into ["token", "ization"] (2 tokens) by one model and ["tok", "en", "ization"] (3 tokens) by another.
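The "tokenization" example can be reproduced with a toy tokenizer. This is a deliberately simplified sketch: it uses greedy longest-match against two made-up vocabularies, whereas production tokenizers rely on learned schemes such as byte-pair encoding. It only illustrates why different vocabularies give different counts for the same text.

```python
def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Split text by repeatedly taking the longest vocabulary entry that matches.

    Characters not covered by the vocabulary become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab:
                tokens.append(piece)
                i = end
                break
        else:
            tokens.append(text[i])  # unknown character: emit it on its own
            i += 1
    return tokens


# Two made-up vocabularies standing in for two providers' tokenizers.
vocab_a = {"token", "ization"}
vocab_b = {"tok", "en", "ization"}

print(greedy_tokenize("tokenization", vocab_a))  # ['token', 'ization']
print(greedy_tokenize("tokenization", vocab_b))  # ['tok', 'en', 'ization']
```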

Why do Claude, GPT, and Gemini have different token counts for the same text?

Each provider uses a different tokenizer. OpenAI uses the o200k_base encoding (via tiktoken) for its recent models, Anthropic uses its own proprietary tokenizer, and Google uses SentencePiece. Each tokenizer splits text into tokens differently, so the same input produces different counts. This means the same prompt can cost more or less depending on which provider you use.

How do I reduce my API costs?

The most effective strategies are: (1) Remove verbose filler like "I would like you to please ensure that"; models generally respond just as well to concise instructions. (2) Enable prompt caching to avoid re-processing repeated system prompts. (3) Specify output formats (JSON, XML tags) to prevent rambling responses. (4) Reduce few-shot examples to 2-3 instead of 5+. (5) Remove duplicate instructions that restate the same thing. TokenAdvisor's Advisor section identifies these patterns automatically in your prompt.
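Two of these checks, filler phrases and duplicated instructions, are easy to approximate with regular expressions. The sketch below is a heuristic illustration only, not TokenAdvisor's actual implementation. The phrase list and the function names are assumptions made up for this example.

```python
import re

# Hypothetical filler phrases; a real checker would need a longer, tuned list.
# Longer phrases come first so they win over shorter overlapping ones.
FILLER_PATTERNS = [
    r"\bI would like you to\b",
    r"\bplease ensure that\b",
    r"\bit is important that\b",
    r"\bplease\b",
]
FILLER_RE = re.compile("|".join(FILLER_PATTERNS), flags=re.IGNORECASE)


def find_filler(prompt: str) -> list[str]:
    """Return filler phrases found in the prompt, in order of appearance."""
    return [m.group(0) for m in FILLER_RE.finditer(prompt)]


def find_duplicate_sentences(prompt: str) -> list[str]:
    """Return sentences that repeat an earlier sentence (case-insensitive)."""
    seen = set()
    duplicates = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", prompt.strip()):
        key = sentence.lower()
        if key in seen:
            duplicates.append(sentence)
        seen.add(key)
    return duplicates


prompt = ("I would like you to please ensure that the answer is JSON. "
          "Reply in JSON. Reply in JSON.")
print(find_filler(prompt))               # ['I would like you to', 'please ensure that']
print(find_duplicate_sentences(prompt))  # ['Reply in JSON.']
```

Exact-match duplicate detection misses paraphrased repeats, so treat the output as a list of candidates to review rather than text that is safe to delete automatically.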

Is this tool free?

Yes, TokenAdvisor is completely free with no signup required. OpenAI token counting happens entirely in your browser using the tiktoken library. Claude and Gemini counts use their official free token-counting APIs, proxied through our server to protect the API key.

Does TokenAdvisor send my prompts anywhere?

OpenAI token counting is 100% client-side: your text never leaves your browser. For Claude and Gemini counts, your text is sent to the providers' token-counting APIs (Anthropic's count_tokens and Google's countTokens) via our Cloudflare proxy. These are dedicated counting endpoints (not the chat/completion API) that only return a number; they do not store, log, or train on your content. We don't store your prompts either.