How to Track AI Usage Without Losing Revenue (Complete Guide)
Ciroandrea · 2026-05-25 · via DEV Community

Most AI products eventually run into the same problem:

Tracking usage sounds simple.

Until it isn't.

At first, all you need is a counter.

A request comes in.

You decrement a credit.

You process the request.

Done.

Or at least that's what most teams think.

As usage grows, things start breaking:

  • duplicate requests
  • retries
  • race conditions
  • timeout failures
  • inconsistent balances
  • billing mismatches

And suddenly a simple counter becomes a revenue problem.


The Naive Implementation

Most products start with something similar to this:

if (credits > 0) {
  credits--;
  executeRequest();
}

Looks harmless.

The user has credits.

A request arrives.

A credit is consumed.

The request is executed.

Simple.

The problem is that real-world systems are rarely this simple.


What Starts Breaking

The moment real users start using your product at scale, unexpected situations appear.

Retries

Networks fail.

Browsers retry requests.

Mobile apps resend actions.

Background jobs run again.

A single user action can generate multiple identical requests.

Without deduplication, the same action can consume credits multiple times.


Race Conditions

Imagine a user has one credit remaining.

Two requests arrive at exactly the same time.

Both processes check the balance.

Both see one available credit.

Both proceed.

Now the user has consumed two requests while paying for one.

Or worse:

Your balance becomes negative.
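
That interleaving can be sketched in a few lines of JavaScript. The `check` and `commit` helpers and the in-memory `credits` variable are hypothetical stand-ins for a balance read and a balance write against a real datastore. The two requests are interleaved by hand so the outcome is deterministic:

```javascript
// Hypothetical in-memory balance; a real system would read and write a database row.
let credits = 1;
const executed = [];

// Step 1 of the naive flow: read the balance and decide.
function check() {
  return credits > 0;
}

// Step 2 of the naive flow: consume a credit and run the request.
function commit(requestName) {
  credits--;
  executed.push(requestName);
}

// Both requests run step 1 before either runs step 2.
const requestAAllowed = check();
const requestBAllowed = check();

if (requestAAllowed) commit("A");
if (requestBAllowed) commit("B");

console.log(executed); // [ 'A', 'B' ]
console.log(credits);  // -1
```

Both requests pass the check because neither has written yet, which is exactly the gap an atomic check-and-consume step is meant to close.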


Partial Failures

One of the most dangerous situations looks like this:

Consume credit
↓
Call AI provider
↓
Timeout

Did the AI provider process the request?

Maybe.

Did the user receive the result?

Maybe not.

Should you refund the credit?

Should you charge again?

These situations become surprisingly difficult to handle consistently.


How Revenue Leaks Happen

Most revenue leaks don't come from pricing mistakes.

They come from tracking mistakes.

A few common examples:

Free Usage

The request succeeds.

The credit is never consumed.

The user receives value for free.


Double Charging

A retry consumes credits twice.

The user gets charged more than expected.

Now support tickets start arriving.


Billing Mismatch

Your billing dashboard shows one number.

Your usage records show another.

Your invoices show a third.

Nobody knows which number is correct.


Missing Audit Trail

A customer asks:

Why was I charged?

You have no record explaining exactly what happened.

Now you're forced to guess.


A Safer Architecture

Reliable usage tracking requires more than a simple counter.

The goal is to create a system that is:

  • auditable
  • idempotent
  • atomic
  • reliable under concurrency

Use a Usage Ledger

Instead of simply decrementing balances, record every consumption event.

Example:

ID          USER      UNITS
--------------------------------
1           user_1    -10
2           user_1    -20
3           user_1    -15

This creates a complete history.

You always know:

  • what happened
  • when it happened
  • how many units were consumed

A balance becomes the result of ledger events rather than a standalone number.
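
As a rough illustration, a ledger-derived balance might look like the sketch below. The `record` and `balanceOf` helpers are hypothetical names, the array is an in-memory stand-in for a database table, and the initial top-up of 100 units is an assumption added so the balance has something to subtract from (it also shifts the entry IDs by one compared with the table above):

```javascript
// Append-only ledger: entries are added, never updated or deleted.
const ledger = [];

function record(user, units) {
  const entry = { id: ledger.length + 1, user, units, at: new Date().toISOString() };
  ledger.push(entry);
  return entry;
}

// The balance is derived by summing a user's entries.
function balanceOf(user) {
  return ledger
    .filter((entry) => entry.user === user)
    .reduce((sum, entry) => sum + entry.units, 0);
}

record("user_1", 100); // assumed top-up, not part of the table above
record("user_1", -10);
record("user_1", -20);
record("user_1", -15);

console.log(balanceOf("user_1")); // 55
```

Because the balance is computed from the entries, any disputed number can be traced back to the individual events that produced it.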


Make Consumption Idempotent

Every usage operation should have a unique identifier.

Example:

request_id = 9f7d3c2a

If the same request arrives again:

  • do not consume credits again
  • return the original result

This prevents duplicate charges caused by retries.
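
One way to sketch this, assuming an in-memory store: the `consume` function and the `balances` and `processed` maps below are hypothetical, and a production system would more likely rely on a unique constraint on the request id in its database.

```javascript
// Hypothetical in-memory stores; in production these would live in a database
// with a unique constraint on the request id.
const balances = new Map([["user_1", 30]]);
const processed = new Map(); // requestId -> original result

function consume(requestId, user, units) {
  // A repeated request id returns the stored result and consumes nothing.
  if (processed.has(requestId)) {
    return processed.get(requestId);
  }

  const current = balances.get(user) ?? 0;
  const result =
    current >= units
      ? { ok: true, balance: current - units }
      : { ok: false, balance: current };

  if (result.ok) balances.set(user, result.balance);
  processed.set(requestId, result);
  return result;
}

const first = consume("9f7d3c2a", "user_1", 10);
const retry = consume("9f7d3c2a", "user_1", 10); // same request id, e.g. a client retry

console.log(first.balance, retry.balance); // 20 20
console.log(retry === first);              // true
```

The retry gets back the exact result of the first call, and the balance is only reduced once.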


Consume Credits Atomically

Checking balances and consuming usage should happen inside a single transaction.

Bad:

Read balance
↓
Check balance
↓
Update balance

Good:

Transaction
↓
Verify balance
↓
Consume units
↓
Commit

Because the check and the write commit together, two concurrent requests cannot both spend the same last credit.
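
A minimal sketch of the idea, using an in-memory map instead of a real database. The `consumeAtomically` function and `accountCredits` map are hypothetical, and the SQL in the comment uses assumed table and column names purely to show what the equivalent conditional update could look like:

```javascript
// Hypothetical in-memory balance store standing in for a database table.
const accountCredits = new Map([["user_1", 1]]);

// Verify and consume in one synchronous step. Nothing can run between the
// check and the write, which is the property a database transaction or a
// conditional update such as
//   UPDATE balances SET credits = credits - 1
//   WHERE user_id = 'user_1' AND credits >= 1
// would provide in a real deployment (table and column names are assumed).
function consumeAtomically(user, units) {
  const current = accountCredits.get(user) ?? 0;
  if (current < units) return false; // rejected, balance untouched
  accountCredits.set(user, current - units);
  return true;
}

const firstAttempt = consumeAtomically("user_1", 1);
const secondAttempt = consumeAtomically("user_1", 1);

console.log(firstAttempt, secondAttempt);  // true false
console.log(accountCredits.get("user_1")); // 0
```

With one credit left, the second attempt is rejected and the balance stops at zero instead of going negative.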


Design for Auditability

Sooner or later a customer will ask:

Why was I charged for this?

You should be able to answer immediately.

Store:

  • request id
  • timestamp
  • user id
  • consumed units
  • operation type

A complete audit trail saves countless support hours.
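
The fields listed above could be stored and queried roughly as follows. This is only a sketch: `recordUsage`, `explainCharge`, and the `auditLog` map are hypothetical names, and the `image_generation` operation label is an assumed example value.

```javascript
// Hypothetical audit log keyed by request id.
const auditLog = new Map();

function recordUsage({ requestId, userId, units, operation }) {
  auditLog.set(requestId, {
    requestId,
    userId,
    units,
    operation,
    timestamp: new Date().toISOString(),
  });
}

// Turns a stored record into a one-line answer for support.
function explainCharge(requestId) {
  const entry = auditLog.get(requestId);
  if (!entry) return `No charge recorded for request ${requestId}`;
  return `${entry.userId} was charged ${entry.units} units for ${entry.operation} at ${entry.timestamp}`;
}

recordUsage({ requestId: "9f7d3c2a", userId: "user_1", units: 20, operation: "image_generation" });

console.log(explainCharge("9f7d3c2a"));
console.log(explainCharge("unknown")); // No charge recorded for request unknown
```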


Why Counting Requests Isn't Enough

Many teams assume:

1 request = 1 unit

But AI products rarely work this way.

Different operations have different costs.

For example:

Text generation     = 1 credit
Image generation    = 20 credits
Video generation    = 100 credits

What matters isn't request count.

What matters is billable usage.

That's the metric that should drive monetization.
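
To make the difference concrete, here is a small sketch that prices a batch of requests using the example costs above. The `CREDIT_COSTS` map, the `billableUnits` function, and the operation identifiers are hypothetical names chosen for illustration:

```javascript
// Credit cost per operation type, taken from the example prices above.
const CREDIT_COSTS = new Map([
  ["text_generation", 1],
  ["image_generation", 20],
  ["video_generation", 100],
]);

function billableUnits(operations) {
  return operations.reduce((total, operation) => {
    const cost = CREDIT_COSTS.get(operation);
    if (cost === undefined) throw new Error(`Unknown operation: ${operation}`);
    return total + cost;
  }, 0);
}

const requests = [
  "text_generation",
  "text_generation",
  "text_generation",
  "image_generation",
  "video_generation",
];

console.log(requests.length);         // 5
console.log(billableUnits(requests)); // 123
```

Five requests translate into 123 billable credits, so a per-request counter would understate usage by a wide margin.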


Final Thoughts

Tracking AI usage seems easy when your product has ten users.

It becomes infrastructure when your product has thousands.

The challenge isn't counting requests.

The challenge is building a system that remains correct when:

  • requests are duplicated
  • jobs retry
  • users scale
  • revenue depends on every consumption event

Because once usage becomes your pricing model, tracking usage becomes part of your business model.

And every mistake eventually turns into lost revenue.


Learn More

If you're building AI credits, usage-based billing, or prepaid consumption systems, one of the most important concepts is maintaining an auditable usage history through a usage ledger.

I wrote more about the architecture behind credits, consumption tracking, entitlements and billing synchronization in the Licenzy documentation:

https://licenzy.app/docs/usage-metering

It includes examples for:

  • consumption tracking
  • idempotency
  • usage packs
  • credit-based monetization