June 15, 2026
Like many of you in the IBM i community, we have been treading the AI waters for a couple of years now, starting with small proof-of-concept applications such as chatbots but progressing to far more advanced and complex code-to-AI interfaces.
One such project leveraged AI for recurring Jaccard similarity index processing over large sets of business data. The small projects fool a person into thinking that the large projects are just more of the same but, in reality, large projects take you into the world of proprietary AI APIs and can be complex engineering efforts.
We mostly use ChatGPT because it has a well-defined API and, fittingly, has been the most useful tool for helping us learn to use that API. We have used AI to construct applications and solve specific programming problems. Leveraging the ChatGPT API was largely a matter of asking the AI to create interfacing logic in free-format RPG with embedded SQL code utilizing HTTP hooks. (This AI-to-API-to-AI construct reminds us of the proverbial snake eating its own tail.)
What bothers us, and we know we are not alone in this, is the pricing model for ChatGPT and similar AIs. Here we usually encounter a tiered per-seat pricing plan with monthly or annual billing. Behind this, however, is the concept of limits based on token usage. Any astute engineer quickly says, “Okay, but what is a token?” A reasonable expectation is that a token is based on something defined and quantifiable. A number of bytes processed or requests made or something similar. Tokens, however, are not so easy to understand.
Why do we care? Well, because we pay for tokens and therefore a cost associated with a project is token usage. Managers and executives like predictability when it comes to project costs and it seems altogether, well, wrong, to have a cost factor of a modern project be something that seems to squirm away from any attempts to nail it down.
How bad is the problem? Pretty bad. We recently collaborated on a deep dive into this “token problem,” starting with the basic assumption that tokens are arbitrary units of measurement. How can one survive in a development world where an arbitrary unit of measure rules the day? For starters, it is incorrect to say that tokens are arbitrary. A token does have a definition (we will get there in a minute) and therefore can be considered part of a convention, and a measurable convention at that. We refer you to any billable metered thing. The meter is arbitrary in origin but not in measurement. Suppose we are talking about water. The meter turns as the water flows past it and the gallons of water are consistently measured. The meter, if accurate, does its job of measuring the gallons of water, wherein a gallon is a thing for which we agree on a definition. (We know that even this example has complications! Are we talking U.S. gallons or Imperial gallons?)
An oft-quoted definition is that a token is “roughly three-quarters of an English word.” If you push hard enough and dig deeply enough you’ll find that tokens are based on linguistics and lean heavily on lexical analysis. That “roughly” and “English word” are not obfuscation, as they might appear at first blush. An AI token is a discrete symbol index produced by a tokenizer, mapping from input text into a model vocabulary. This mapping has several elements to it. The tokenizer is the first thing, but there is also the so-called “normalized” input, a packaging protocol, the aforementioned vocabulary, and a proprietary model.
Put more formally, a token is the output unit of tokenizer T applied to normalized input N under serialization protocol S with vocabulary V for model M. Try to pin down the exact definition of T, N, S, V, or M and you will quickly find that they are all proprietary constructs of the given AI provider.
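To make the dependence on T, N, and V concrete, here is a minimal sketch of a toy tokenizer. Everything in it is an assumption for illustration: the greedy longest-match rule, the `normalize` step, and the two tiny vocabularies are hypothetical stand-ins, not how any vendor actually tokenizes. The point is only that the same text yields a different token count when the vocabulary changes.

```python
def normalize(text: str) -> str:
    # Hypothetical stand-in for N: lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def tokenize(text: str, vocabulary: set[str]) -> list[str]:
    # Hypothetical stand-in for T: greedy longest-match against the vocabulary.
    # Any character not covered by a vocabulary entry becomes its own token.
    normalized = normalize(text)
    max_len = max((len(entry) for entry in vocabulary), default=1)
    tokens = []
    i = 0
    while i < len(normalized):
        # Try the longest candidate first, shrinking down to one character.
        for length in range(min(max_len, len(normalized) - i), 0, -1):
            piece = normalized[i:i + length]
            if length == 1 or piece in vocabulary:
                tokens.append(piece)
                i += length
                break
    return tokens


# Two made-up vocabularies (stand-ins for V under two different models).
VOCAB_A = {"token", "s", " ", "cost", "money"}
VOCAB_B = {"tok", "en", "co", "st", " "}

text = "Tokens cost money"
print(len(tokenize(text, VOCAB_A)))  # 6 tokens
print(len(tokenize(text, VOCAB_B)))  # 12 tokens
```

Same input, same tokenizer logic, and the billable count doubles purely because the vocabulary differs. A payer who cannot see V has no way to predict which of those numbers will show up on the invoice.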

Some parts of some tokenization mechanisms are public and reproducible (for example, frameworks such as SentencePiece or byte-pair encoding implementations), but the billing abstractions built atop them often are not. Vendors may layer in preprocessing, protocol overhead, caching rules, or model-specific accounting behaviors that make real-world cost predictions all the more difficult.
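For budgeting purposes, about the only estimator a payer can run without vendor tooling is the three-quarters rule of thumb quoted earlier. The sketch below applies it; the function names and the price passed in are hypothetical, and the result is a rough planning figure, not a prediction of what any vendor will actually bill.

```python
import math

# The oft-quoted rule of thumb: one token is roughly 0.75 English words.
# This is a heuristic, not a vendor guarantee.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    # Count whitespace-separated words and scale by the rule of thumb.
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def estimate_cost(text: str, price_per_million_tokens: float) -> float:
    # price_per_million_tokens is an assumed input; use your own contract rate.
    return estimate_tokens(text) * price_per_million_tokens / 1_000_000


prompt = "Compare these two customer records and report the similarity"
print(estimate_tokens(prompt))
print(estimate_cost(prompt, price_per_million_tokens=2.00))
```

An estimate like this ignores everything listed above (preprocessing, protocol overhead, caching rules, model-specific accounting), which is exactly why comparing it against the billed figure is a useful exercise.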
If the mechanism for computing tokens is opaque, doesn’t the unit being measured become unverifiable or at least dangerously unpredictable? IBM has upped the ante on this whole question with Bob, where we find not just monthly per-instance subscriptions but also a limit defined as a fixed number of “Bobcoin,” with tokens being just a part of the “stuff” being counted. While IBM arguably took a bad situation and made it worse, we do have to give kudos to IBM for the Bob Dashboard, which looks like a good start at exposing the reality behind the Bobcoin. But a lot more transparency is still needed.
Ironically, it has been our observation that senior management doesn’t really mind the opacity. Oh, they do care when token expense doubles and then doubles again and they yearn for a way to tie this expense in a meaningful way to profit (or loss) on a given project or for the enterprise. But there is an acceptance of sorts that tokens, in all their glorious opacity, are here to stay. Organizations typically then pick a mitigation path like a child with a choose-your-own-adventure story:
- Hire fewer devs because these token costs must surely be indicating greater productivity.
- Measure dev productivity by individual token usage. (Remember those fools who once used software lines of code to determine dev productivity?) Hire/fire/reward/punish devs based on this measure. Turn a blind eye to “tokenmaxxing,” the process of a dev burning as many tokens as possible to rank more highly in this measurement scheme (see also Goodhart’s Law).
- Blithely tie token cost to profitability and dare the board to question it.
- Require back-level model usage by devs because the tokens are usually cheaper in older models and this helps to control costs.
All of these paths feel like the real-world equivalent of a favorite emoticon:
¯\_(ツ)_/¯
In this choose-your-own-adventure, there are really only two endings. Either token usage is good and leads to a better product, in which case one should accept the cost of using the most up-to-date models and incentivize AI usage; or tokens are expensive and costs must be controlled, in which case one should use older models and, in some cases, even require justification for using AI at all on a given project.
One thing that would help: guidance that developers can use as standards for making the best use of AI, so that costs are mitigated and results are predictable. What do we have instead? The promotion of “vibe coding,” where a developer “feels” their way through an interaction with AI with an end goal in mind but little or no guidance along the way. All of our decades of sharing best practices are out the window; we can easily run into projects that can’t be finished when limits are reached and the project budget has vibed its way into oblivion.
In researching this piece, we were struck by the lack of developers and engineers discussing the token problem. It must be discussed or else we are at the mercy of the AI providers in a way that minimizes everything we have learned as engineers about software development. Our ultimate thesis is that a token is non-arbitrary only if its definition is stable, inspectable, and constrained by something outside the authority that benefits from changing it. Today, the issuer controls definition, conversion, accounting, and auditability. In what world have we ever accepted such a bag of counter-intuitive nonsense?
Further, as engineers, we cannot inspect it, cannot reproduce it, cannot verify whether it changes, and cannot compare it across systems. An opaque proprietary unit used for billing by an entity (the same entity that controls the definitions) is functionally arbitrary from the payer’s perspective. Seeing IBM take the bad and make it worse with Bobcoin means that AI tokens, already opaque, are only being augmented by further obfuscation such that everything in this AI realm is a vendor-defined billing abstraction rather than a transparent universal unit of measurement.
We need to demand predictability of AI tokens instead of accepting their inherent unpredictability. This is not just to help us as engineers but is supremely important lest we give over our responsibility as engineers, ethically supporting the industries we inhabit, to AI providers who are not showing any signs of placing the interest of our industries above their own thirst for profits via ever-more-abstract proprietary billing practices.
Dan Darnell is a co-owner and vice president at Arkansas Data Services. Dan loves to write code, talk about code, design things, and, most of all, to work with other people to solve problems and create solutions. Dan has worked successfully as a consultant, managed companies large and small, and has written for international publications (including two books on Java for the IBM i). You can reach Dan at ddarnell@ark-data-services.com.
Eric Whitcomb is a freelance author, software developer, and thinker. Eric loves to dissect problems in realms across disciplines and dimensions. He excels at taking these thought experiments and turning them into real-world applications, wherein the application might be everything from software to a screenplay. Eric has extensive experience with all-things-AI. If you Grok, Claude, Gemini, ChatGPT, or slide into the Wild West of open source AI, Eric has been there too. You can reach Eric at eric@rootmedium.com.























