

























I didn't suggest it was. I pointed out that some of the subscriptions offered by the Chinese labs probably are. Not the per token API prices. |
Okay interesting. I presume that China also has low cost areas too no? Their grid at least seems more stable. Datacenter construction is more likely to raise prices in the US than there. |
China's grid has had some serious issues over the past decade that didn't get widely reported for all the reasons you can think of. Some of them were exacerbated by poor planning and censorship making it hard to hold anybody accountable. Not to say that they don't/didn't eventually work on it, but there was a widely held belief that the people at the top weren't even aware of the issue until foreign firms were directly impacted. This is not to say they can't or won't expand come hell or high water, though. https://www.bbc.com/news/business-58733193 https://www.cbc.ca/news/business/china-power-cuts-1.6193281 |
Quebec has lower rates than 7¢/kWh at the data center / wholesale level. The Quebec spot market apparently runs negative sometimes. And Oklahoma has cheap power, as probably do other places. Not sure your utility bill is the place to get accurate numbers.
https://www.ferc.gov/sites/default/files/2025-03/25_State-of... If my math is right, divide those by 10 for cents per kWh |
Yeah, this argument is bullshit. You can head over to Openrouter and look at the token cost for deepseek-v4-flash and deepseek-v4-pro. They are very competitive on the open market |
Add MiMo 2.5 to the list. Priced like DeepSeek, performs similarly but it also has vision capability. |
It’s true: once built, the data center can operate right up to a financed data center value of zero. The investors will lose money, but the costs of AI will go down as they do. |
Has there ever been a market for cloud gaming apart from middle class people with macbooks who casually want to play one particular game but not enough to pay for a whole PC or console? |
I'm surprised people think LLMs, a thing which mainly excels at advertising, spam and writing code, are going to generate that much economic activity. |
I would go as far as to say writing more code has almost no impact on their economic productivity. What drives those companies is infrastructure and networks. |
Not necessarily. European grocery shoppers report higher satisfaction with the shopping experience than American grocery shoppers do. |
If we're talking about Meta, Google, etc., code is only incidental to them earning money. |
But what if it kills current ad-tech as we know it (paying to show ads on random sites without any way to verify that the site is legit), and the flow of ad money for legitimate goods turns back to journalism, magazines and other publications? That would be half a trillion[1] redirected to regular people just from Google Ads. [1] snatched my number from here: https://pixis.ai/blog/2025-google-advertising-benchmarks-for... |
I mean, that says a lot about the kind of crisis our current economy is in. How much longer can the United States be a world leader when its primary function is social media and advertising? |
The $1T number seems more promises than reality, which is closer to the $300B to $500B level. Still a big number, but between a third and a half of the value used in the popular media. |
If you have a good model router, you can route to older, cheaper models that run on older hardware, for simpler tasks. That helps labs extend the economic life of their hardware investments. They will likely fight it at first though as they see it as reducing ASP. This is why I'm building role-model, a routing protocol and a router runtime: https://role-model.dev/ |
The other part of that is that while price per token may be going down, tokens per task is going up |
For ~equivalent tasks/results, or because we’re expecting more or better from tasks? The real measure should be cost per ~equivalent task result, not cost per token nor tokens per task. |
For better performance of ~equivalent tasks. That's what all the harness tooling people are using does: (often) increasing output quality by significantly increasing token counts. |
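To make the "cost per task, not cost per token" point concrete, here is a minimal sketch. The prices and token counts are hypothetical numbers picked only for illustration; they are not taken from any provider's price list.

```python
def cost_per_task(price_per_million_tokens: float, tokens_per_task: int) -> float:
    """Dollar cost of one task: per-token price times tokens consumed."""
    return price_per_million_tokens * tokens_per_task / 1_000_000


# Hypothetical "older" setup: pricier tokens, but fewer of them per task.
old_cost = cost_per_task(price_per_million_tokens=10.0, tokens_per_task=20_000)

# Hypothetical "newer" setup: tokens at half the price, but a harness that
# burns four times as many of them on the same task.
new_cost = cost_per_task(price_per_million_tokens=5.0, tokens_per_task=80_000)

print(f"old: ${old_cost:.2f} per task, new: ${new_cost:.2f} per task")
```

With these made-up inputs the per-token price halves while the per-task cost doubles, which is why the two metrics can move in opposite directions.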
Using a shittier model is just more work for the user, I’m not sure why anyone does it, unless they’re playing with it like a toy. |
> I use a local model to log everything I do all week to automate my timesheet. Isn’t that just more work than logging it yourself? |
I sometimes let Claude Opus create plans, DeepSeek v4 pro implements and writes tests. Claude reviews and corrects. Saves like $2-3 per session. Same quality code. |
What you call a shittier model is what was considered frontier and fantastic one generation ago… |
Except for, you know, the enterprise customers who won't change their code and will pay to run old, inefficient hardware just to keep from dealing with upgrades? |
What do you think are running on the T4 GPUs in AWS? A lot of the use cases I know of for them are mid-level computer vision models that don't need to be frontier level. |
Gradually, and especially when hot. Modern chips are pretty close to the physical limits of how small they can be made, and that means atomic/chemical effects like electromigration are accounted for and determine the lifetime. Every extra 10 degrees Celsius of temperature doubles the speed of chemical reactions. When they stray too close to the line ... you get Intel's 13/14th gen chips that wear out after 1-2 years instead of 10-20 years. Intel calls it "Vmin drift" because that doesn't sound scary, but the actual point is that various wear-out mechanisms push the chip outside of its design envelope - increasing the voltage or lowering the clock speed may get it to run for a while longer, but you're living on borrowed time as the various circuits just stop working right and you get unpredictable instruction mis-execution: https://fgiesen.wordpress.com/2025/05/21/oodle-2-9-14-and-in... |
sounds like planned depreciation on Intel's part, they definitely do not design server grade chips for longevity since that would harm their own revenues |
They didn't replace all the chips like with the FDIV bug though. What did it cost them? Only reputation? |
My understanding is that a lot of AI data centers are still heavily relying on spinning HDDs, which is why Seagate and Western Digital are selling more HDDs than ever before. |
I think it's reasonable to give up 15% of speed for a decade more lifetime. This depreciation change alters the economics of GPUs. |
Nothing is stopping them, it's just not worth it. Have a look at e.g. vast.ai's pricing (https://vast.ai/pricing). The V100 (2017 -> 9 years old) can be rented for $0.02 to $0.37/h (right now I can find a V100 with a Xeon Gold 6140 and 48GB RAM for $0.165/h). Let's assume the guy you rent it to pins it at its 250W TDP, and let's ignore the running costs of CPU/RAM/etc. Then you draw 1/4 kWh for that compute hour. Industrial electricity prices in the US vary between 7.5 and 25 ct per kWh (depending on state, time of day, etc.), so at 100% efficiency, assuming nothing ever breaks and the CPU consumes 0W, you earn about 14ct/h at the cheap end of that range. And remember: V100 hours are sometimes sold at 1/10th the price. If I pick average conditions you need to start thinking about whether it is worth it to rent them out. Usually it isn't, unless you have them anyway and just sell idle capacity. It's barely worth it to run them in a pure "is it profitable" sense; if we also account for the opportunity cost of taking up a slot in your datacenter, it ceases to be worth it really quickly. |
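The arithmetic in that comment can be sketched as below. It uses only the figures quoted there ($0.165/h and $0.02/h rental, 250W TDP, 7.5 to 25 ct/kWh) and keeps the same simplifications: electricity is the only cost, and CPU, cooling, failures and datacenter slot cost are ignored. The function name is just an illustrative choice.

```python
def hourly_margin_usd(rental_usd_per_hour: float,
                      gpu_watts: float,
                      electricity_usd_per_kwh: float) -> float:
    """Rental income minus GPU electricity for one hour at full load."""
    energy_kwh = gpu_watts / 1000.0  # 250 W for one hour is 0.25 kWh
    return rental_usd_per_hour - energy_kwh * electricity_usd_per_kwh


# $0.165/h listing with the cheapest quoted power (7.5 ct/kWh).
cheap_power = hourly_margin_usd(0.165, 250, 0.075)

# Same listing with the most expensive quoted power (25 ct/kWh).
pricey_power = hourly_margin_usd(0.165, 250, 0.25)

# Bottom-of-market listing ($0.02/h), even with the cheapest power.
floor_listing = hourly_margin_usd(0.02, 250, 0.075)

for label, margin in [("cheap power", cheap_power),
                      ("pricey power", pricey_power),
                      ("floor listing", floor_listing)]:
    print(f"{label}: {margin * 100:.2f} ct/h")
```

The roughly 14 ct/h figure corresponds to the cheap-power case; at the $0.02/h end of the market the margin is close to zero before any other cost is counted.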
GPUs do depreciate indeed, but here the depreciating commodity is the token, not the hardware. You sell cheaper tokens with the same hardware. |
the hardware itself is still useful, but random failures happen every so often, so if you're trying to run a fixed sized fleet then your fleet shrinks when you can't get spares any more |
If you do the math, they don't have a choice. If China captures America's AI market it'll cause a major depression. They'll give it the BYD treatment, though it'll be a lot less effective. |
They'll ban them because (unless run locally or self-hosted) they are just data capture tools for China. |
You will see soon that china uses illegal uyghur children labor to train these models so we should all boycott them |
If it’s open weight then anyone can run it for you. Presumably someone you trust just as much as US proprietary models. |
You don't think the CIA and NSA are reading the data Asian and European companies and individuals send to OpenAI and Anthropic? |
China is the worst trading partner in the world. They banned most companies from functioning in their country for decades |
> Once a model is open-weight, safeguards that do exist can be removed Safeguards trained into the model (ie exist in the weights) can’t be removed. |
Raise them, more likely. NVidia says that GPU hardware prices won't decrease until at least 2030. The world is out of fab capacity. |
Seriously, they’re trying to justify trillion+ IPOs while setting piles of money on fire; prices aren’t going DOWN. |
Today's frontier models will be tomorrow's low-end option. I think whatever model you are using today will be less expensive to use a year or two from now. |
Last year's o3 was more expensive than 5.5 is. Whatever model we are using now will probably be more expensive than next year's leading models will be. |
Price per million tokens is also a fuzzy metric when newer models reason longer and burn more tokens while doing so. |
Isn't 5.5 a router, though? As in, some prompts get automatically sent to a cheaper model? |
im implementing this now. thanks. the guides specify the exact intention of more determinism. |
Most sane US companies will disallow use of cloud-based Chinese AI providers, because everything including code, data, PII, etc is being sent to them. |
Then don't use the cloud-based Chinese providers; use cloud-based US/EU providers running Chinese models. The interesting Chinese models are all open, making this issue mostly moot. |
Crof.ai seems to be: https://crof.ai/ Of course, they are not necessarily a viable solution for companies with security requirements etc., given that it is just a single-person project, but they still serve as proof that it can be done. |
By "cost" I think the parent means the provider's own costs, not the cost of inference to the customer. The costs of land, labor, and electricity are significantly lower in China than in the US. |
Have you heard of openrouter? There's 1000 of these companies already. Do something else. |
> The models themselves are the problem -- most large US companies are not going to touch them. Can you expand on this? |
Deepseek has some models in Bedrock. There is definitely a huge market for a "good enough" model running within the country of the company |
> Deepseek has some models in Bedrock. Just looked into it, seems like at most they have just 3.2, not 4: https://aws.amazon.com/bedrock/pricing/ Looking around their catalogue more, most of their models seem quite outdated, aside from the OpenAI and Anthropic ones (but those get more expensive). I wouldn't willingly pick Bedrock and would instead throw money at OpenRouter, that has both a bunch of providers, as well as almost any model for you to try. |
Do you trust OpenAI with your code, data, PII? What makes you so sure it's not all part of the next training set anyway? |
> Do we know that AI providers are going to keep these per-token prices, or eventually lower them because of competition from China? Are they even making money off them now? |
Why would I even pay for deepseek? I get deepseek v4 flash for free with opencode. If I somehow run out of tokens for the day, I can just turn on my VPN. |
They're going to need to bring in a few trillion dollars fast to meet wall street expectations. Expect prices to rise. |
personal take? token prices are a race to the bottom, as long as open source models remain competitive. that's why OSS is so important |
The majority of Deepseek providers on OpenRouter for v4 pro are in the US. Especially interesting is that they are in the same ballpark for pricing. |
They are in the same ballpark for deepseek-v4-flash, but deepseek-v4-pro from deepseek is still around 1/2 of the alternatives. |
I'm pretty sure that Deepseek said that pricing was promotional. Be curious to see if it lasts. V3 pricing from them was right in line with what the commodity providers are charging. |
“Any” is a very high bar Unless laws prevent it, I don’t see why a substantial minority wouldn’t buy services from where they can get them at a similar quality and much lower price. |
Together.ai provides many open-weights models, and as far as I’m aware their servers are US based (the company certainly is). |
API prices of Anthropic, OpenAI, and Google are massively inflated. https://martinalderson.com/posts/no-it-doesnt-cost-anthropic... There's no way that all AI inference providers are colluding and/or all running at a massive loss, meaning the cheap Chinese model prices must be the real cost it takes to run frontier-class models PLUS their margin. Look at Deepseek 4 Pro. https://openrouter.ai/deepseek/deepseek-v4-pro/providers Deepseek and Baidu are subsidising prices but they probably train on inputs. I have no model training and ZDR in OpenRouter enabled, and the first provider that shows up there is Deepinfra, significantly more expensive than Deepseek. BUT much cheaper than Sonnet 4.6 and ChatGPT GPT-5.4. |
Which organizations? Uber is not representative of any trend beyond big tech and VC over funded startups. |
Is your argument that $1500 / mo is too much? Why would the engineering team not be more rigorous in their model selection given a constraint? |
Nearly no one is doing anything that is “only possible with AI”. This doesn’t seem like a relevant calculation. People spend on AI as an investment in their current productivity. |
I agree, outside of the AI bubble, there's a lot of wait-and-see happening in the B2B world right now, I'd say we're currently 6-8 months into that 14 months. |
It also presupposes that open models will bridge that gap towards opus4.5, which was really when I drank the AI coding koolaid |
> Don't ask LLMs for big changes > Review everything and point them in the right direction Sorry upper management doesn't care. That's an engineering problem that you need to solve. |
They were proposing a solution: use flash models, and use them in a way that best amplifies your work. |
opus to produce workflows, flash 3.5 to do them. Chinese models prob work too, but idk since i cant use them at work |
Why would double be a good rule of thumb for typical US SWEs? Most of the costs aren't proportional to salary, and the ones which are aren't anywhere approaching 50%, much less double. |
Genuinely thank you. I’ve had sleep issues my whole life and no one has mentioned these. Not saying I will pay that, but more info is always good. |
There is a tier just outside of FAANG that pays similarly or better, prominent examples being Uber, Airbnb, Stripe, Block, Databricks, Datadog, Pinterest, Snowflake, etc. |
250k for base pay is about in line with median I'd say. If 250k was the total comp (taking into account bonus/stocks/what have you) then yeah, you definitely should have negotiated. |
I’ve even heard the rule “twice the salary” being used here in EU, but the tax and insurance burden may be higher. All kinds of those are based primarily on total payroll amount. |
Both metrics are valuable. If one uses AI minimally and is able to out perform peers who are maxing out AI spend, one might want to use that in salary negotiations. |
Quoting the article : > Levels.fyi lists the median yearly compensation package for Uber software engineers in the USA at $330,000. |
> But you can't talk to them about the flow of the code. You can't ask them for their thinking as to why certain things are. You can absolutely do this. It's even right most of the time. |
When you criticize AI, always remember that the alternative is the average employee. Today's models are pretty good. |
A lot of people think they're above average. A lot of them are wrong. A lot of average people are producing gigantic messes. At least previous to this they were gated by their mediocrity. |
I'm not American or ever worked in the USA. It's not a judgement of human value. It's a judgement of work output. |
and have they totally got rid of the average employees? They can blame the models for the production outages already? |
when you criticize the average employee, always remember that the alternative is the average employee with AI. |
Happened to me at least three times the past 14 days. I point out where it made a design decision that causes data loss. «Oops my mistake» |
And you can certainly tell it the flow you want (and any other constraints) in the prompt. |
Literally in the middle of ripping apart a vibe coded mess at work to figure out what's even worth keeping. Not fun :( |
> Because companies are betting that this spending will allow them to reduce cost by firing people. I've never worked at a company that didn't have a technical backlog measured in years. |
If they don't hire to get it done it means they don't think it's really important to get it done. |
Not a service, but do you remember Scrum Masters? We had them as full time employees not so long ago. Pure fad. |
I hope this is sarcasm, or "half" my job doesn't exist or something. Or you talking about full time non-dev scrum masters? |
Hey I'm a consultant. They pay me to be a regular developer but they cannot hire since they just fired thousands of people which they apparently did need, turns out. |
All the NLP experts that companies bring in to run those seminars, despite it having been debunked decades ago, for example… |
There is a whole spectrum between "ai coding is a fad" and "unlimited tokens for every employee, we don't even care if it actually ends up being a net positive financially" |
I like the presentation I heard from a Principal, that AI tools amplify your competence. If you start out incompetent, it'll just allow you to be incompetent with greater scope and (negative) impact. |
When I started I learnt something about coding from VBA macros to automate Excel. Often that started with the macro recorder. Then you worked out what that "recorded" code/sludge did, removed the crud you didn't need or want, improved the logic and so on. I bought books to understand it better. Now you can ask a (different) LLM "what is this? why is it used? How would I?" etc which is probably a faster learning curve than books, newsgroups and old school personal home pages with good info. I would have been quite surprised when I first used a VBA macro in anger just how far I would go down the rabbit hole. C, asm, verilog, Linux were no part of what I originally signed up for! Some people will specialise in the equivalent of recording macros and go no further. And this will be fine for code that gets it done but doesn't matter too much in the other dimensions (security, reliability, usefulness without the authors' support, etc.) Much like VBA utilities inside companies that were useful way back when. Other people will want what they produce to be better, even good, and they will learn about floating point [1] and all the rest, much as I did. Probably learn pretty fast too. [2] [1] https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.h... [2] Working out how to write an Excel VBA webserver and using it to collect and collate summary data from various divisions into reports was seedy as hell, solved the actual business problem (given ridiculous but intractable constraints) and isn't something you can record. We all have stories from a misspent youth that we're simultaneously ashamed and yet somehow proud of. |
And, you don't have to vibe code. A competent developer can make great use of AI. I think a developer that can develop the system themselves is the most accelerated user. |
> You don't actually need to know the answer to those questions in order to vibe code No, but you do need to know the answer to respond to that 3AM page about prod being down. |
But is it enough of an improvement to justify the cost? (Since the current raises are probably just the beginning) |
Uber cutting back to ~$1,500/engineer/tool/month makes it look to me like they think there's at least $1,500 of monthly ROI to be had per engineer. |
"I suspect this is a first pass at figuring how to budget them and there will be a second pass." Completely agree with that. |
Yes you can, and this is still the most realistic use of LLMs, but this is a 2x multiplier, not 10x or 20x. |
Oh, it won't get any better. LLMs have already been trained on every bit of code ever published; they won't get any more material. |
If anything the snake is eating its own tail because now it’s training on vast amounts of its new slop…dragging down the average bar of quality. |
> Which other tool went from nothing to this level of acceptance so quickly? NFTs? My company had nothing to do with blockchain but I ended up working on NFT integration regardless. |
I still believe Scrum is a fad, and yet companies have been spending obscene amounts to push it down developers' throats for decades now. |
Why are there so many people who mistake simple anecdotes for actionable data? Why do the majority of businesses fail rather than succeed? |
yea, and understanding that too is important. the idea you don't need to read code or analysis seems to align with the dependency addiction being shoved in the pipe. |
Even if the laptop costs $5k and you upgrade it every year with the latest hardware and run local models (assuming your workload can tolerate smaller models at slower tok/s), you win. |
Sure, but has their rate of value added increased as a result? It's a good question to ask. They added value before LLM coding, and now are more expensive than before thanks to token costs. |
you don't get promotion for supporting existing things, but for "inventing" you can get promoted. also for large migrations |
In Latvia, the net salary for a Java dev is around 1729 - 4314 EUR, based on https://www.algas.lv/algu-informacija/informacijas-tehnologi... (crowd sourced data) For the employer those employees cost between 2945 - 7736 EUR per month based on https://kalkulatori.lv/lv/algas-kalkulators (income and social taxes). So on the lower end that's (1500 USD ~ 1300 EUR) close to half the total expenses of such a developer, on the high end here around 15-20%. That's quite significant, depends on whether their productivity also improves (if that's what the orgs care about). And we’re not even the country with the worst pay out there, but pay the same for tokens, cause regional pricing isn’t a thing! |
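The percentages in that comment can be reproduced with a few lines. The inputs are the commenter's own figures (about 1300 EUR of monthly AI spend, and employer costs of 2945 to 7736 EUR per month); the constant and function names are only illustrative.

```python
AI_SPEND_EUR = 1300            # ~1500 USD per month, as converted in the comment
EMPLOYER_COST_LOW_EUR = 2945   # monthly employer cost, low end
EMPLOYER_COST_HIGH_EUR = 7736  # monthly employer cost, high end


def spend_share(spend_eur: float, employer_cost_eur: float) -> float:
    """AI spend as a fraction of what the developer costs the employer."""
    return spend_eur / employer_cost_eur


share_low_paid = spend_share(AI_SPEND_EUR, EMPLOYER_COST_LOW_EUR)
share_high_paid = spend_share(AI_SPEND_EUR, EMPLOYER_COST_HIGH_EUR)

print(f"low end:  {share_low_paid:.0%} of employer cost")
print(f"high end: {share_high_paid:.0%} of employer cost")
```

Those shares come out at roughly 44% and 17%, in line with the "close to half" and "15-20%" quoted above.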
I wonder how this plays out. Perhaps programmers in these countries will use cheaper models like Deepseek and they will be able to compete better, so offshoring continues? |
That depends on where you are. $18K is the equivalent of paying around 15% more for your developer. |
In HCOL locations yes, but in the south of Spain you can get full-time talent for that figure. It's also an entry-level salary in Eastern Europe, with Ukraine and Turkey being even somewhat cheaper. |
That was badly worded on my part; my intent was to indicate that there is no way they can or will pay $1500 per month per seat. |
There's models for every price point. What was SOTA and stupid expensive to run a year ago is a cheap flash model today. |
That's going to stop eventually, and I think at that point we're going to see business models more like the major CAD providers. |
I don't think they'll have a choice, open weights models are not far behind. At some point it's essentially a commodity game |
More importantly, it's questionable how much extra revenue improving the design of an internal tool brings. |
> doing a days work in an hour then fucking off in a variety of ways Until companies start hiring 5x fewer engineers than they did before, and well.. we are clearly moving in that direction |
How much more software does Uber need? Unless they are iteratively replacing expensive vendors and optimizing other headcount costs? |
There's waayyyy too much money betting on that not happening, to the point I feel there'll be regulations popping up for "safety reasons" etc to ensure the big players control this. |
>WTF did Uber build with all of that spend? How did it meaningfully impact their revenue in a positive direction? Uber (and quite a few bay area companies and startups) can afford to spend that money. There is no expectation of profit, Uber lost ~62B and growing: https://uberlosses.com/ |
$18k a year is nearly half of my salary as a junior verging on senior developer in the conservation field. Not everyone works in FAANG. |
Can you share some examples that you would say justify that price? Not a gotcha, I’m genuinely curious where you’re seeing a return at that level. |
I also randomly wrote some code in a bind yesterday, while I was on the toilet, and it felt so strange. That was the first I'd written in probably 6 months. |
You don't even make small tweaks by hand? There's so many things that are honestly faster to do by hand than wait for agents to do. |
Nope I'm a couple levels too far removed from the code at this point for that. Closest I get is during meta-management (modularizing, complexity reduction, etc) with agents |
Just to put this in context. If every company did this, all over the world, with that same limit, we are talking about something around $45B monthly in revenue for all AI companies to share. |
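As a rough check on that figure, the sketch below back-solves the head count it implies. The seat count is derived from the two numbers in the comment ($45B per month, $1,500 per seat); the commenter does not state it, so treat it as an inference rather than a sourced statistic.

```python
MONTHLY_REVENUE_USD = 45_000_000_000  # "$45B monthly" from the comment
SPEND_PER_SEAT_USD = 1_500            # the per-engineer monthly limit discussed


def implied_seats(monthly_revenue_usd: int, spend_per_seat_usd: int) -> int:
    """Number of paying seats needed to reach the stated monthly revenue."""
    return monthly_revenue_usd // spend_per_seat_usd


seats = implied_seats(MONTHLY_REVENUE_USD, SPEND_PER_SEAT_USD)
annual_revenue_usd = MONTHLY_REVENUE_USD * 12

print(f"implied seats: {seats:,}")
print(f"annualised revenue: ${annual_revenue_usd / 1e9:.0f}B")
```

So the estimate amounts to about 30 million seats each spending the full limit every month, or about $540B a year.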
There are a lot of places in Europe where 1.5k$ is more than 50% of the total cost of an employee. And the obvious question: what is the cost of that revenue? Because it looks huge but ... |
Don't forget about India and Latin America... No way I see companies paying that much for outsourced employees. |
> Putting the total addressable market at around 67T if all of us spend USD 1.5k on tokens every month However, that's an absurd scenario. |
> didn't even come to $600. That's still in the ballpark. A modest change in your usage habits or workload could easily get you there. |
Isn't all the "knowledge" just text files? I've transitioned between services easily by simply copying the text files. |
You can even just instruct the LLM to create a context file for you! They are surprisingly good at that as well. |
that's an interesting approach and something i also considered (using git to avoid conflicts). one thing i needed was a "database" (basically a folder of markdowns) with a fixed schema so i can let the agents record their decisions in (for example when the code conflicts with product design spec). this combined with search has been a real lifesaver. this is how it works: https://help.markbase.cloud/humans/collections/overview |
Believe it or not, after writing this comment I was doing some more reading on the task. I'm planning to reorganize our context repo after finding this paper (it argues that AI generated context files can stunt the performance of models): https://arxiv.org/abs/2602.11988 For what it's worth, if you were considering building context out. |
Very interesting. Anecdotally I’ve found the opposite to be the case. But I’m very interested in understanding more. Thanks for sharing |
My favorite solution to this is to use the Cline coding agent, which is open and allows you to easily switch between different providers and models. |
Not worried at all. Switching is trivial. Rebuilding context isn't very difficult and harnesses are a dime-a-dozen. |
Knowledge in there? Where is the knowledge stored? All of my knowledge typically gets stored in plans outside of the agent? And each agent window gets archived regularly, anyways. |
Genuine question: what would make me a "paid shill"? Who do you think would be paying me, and what would they expect in return? |
The asset I most value is my credibility. The reason so many people read my writing and find it useful is that they see me as a credible source of information: in a world full of clickbait and misinformation, I have a reputation for providing an independent voice that occupies that rare middle ground between "AI will kill us all" doomerism and "AI will solve everything" hype. Credibility is hard to earn and easy to squander. I've been blogging for 24 years now, which has helped me build credibility with a large array of people across many different interest areas. The modern influencer business model is to grow an audience and then sell things to them, through partnerships and sponsored content. I refuse to do that, because it strikes directly at that credibility. The moment you say "I've partnered with X to tell you about product Y" you're no longer an independent voice. Nilay Patel of the Verge (and the excellent Decoder podcast) refuses to read ads from sponsors himself, at significant financial cost to his publication. I've adopted the same policy - I will not let anyone else pay me to put words in my mouth, because it strikes directly at the credibility I value so much. Until a few months ago the only money I made from my blog was an https://ethicalads.io banner which pulled in a few hundred dollars a month (more if I had a high traffic piece). It helped cover some of my hosting costs for my various projects. That changed in February - https://simonwillison.net/2026/Feb/19/sponsorship/ - when I added a Troy Hunt-style sponsor banner to my site (no cookies, no JavaScript) - currently sold by an agency called Freeman & Forrest. Sponsored slots are sold on a weekly basis and get a mention in my email newsletter in addition to the blog banner. I'm earning enough from those that I no longer feel the opportunity cost of not going and getting a proper Silicon Valley engineering job. 
If I was a publication like the Verge I'd have a complete firewall between editorial and advertising. I don't have a team, but I've tried to replicate that as much as I can by having Freeman & Forrest sort out the sponsors while I stay hands off. I'll veto sponsors if I have to (no prediction markets etc) but thankfully that hasn't been necessary so far. I maintain a disclosures section on my blog here: https://simonwillison.net/about/#disclosures - which was inspired by Molly White's: https://www.mollywhite.net/crypto-disclosures/ I'm currently considering extending that to more of an ethics statement like this one on the Verge: https://www.theverge.com/ethics-statement The Verge policy I'm currently not fulfilling is "Our policy against receiving anything of value from companies we cover includes, but is not limited to, things like gifts, meals, discounted services, or paid trips and junkets. Vox Media and The Verge pay for all travel expenses to all events, including transportation, food, and hotels." - I've occasionally accepted flights, dinners, accommodation and some pretty absurd swag (Microsoft just gave me a jacket with my name stitched onto it as part of the GitHub Stars programme, and a bunch of gadgets in a pelican case) which didn't bother me so much when the blog was a side project, but I think I need to start refusing those kind of gifts. The day after the jacket I wrote a piece about their new models - https://simonwillison.net/2026/Jun/2/microsofts-new-models/ - which I later had to update because I missed some crucial details. Was I subconsciously influenced by the freebies? I don't think so, but the whole point of "subconsciously" is you don't know for sure. |
The issue is he’s not actually balanced at all. I’ve never seen him say anything negative about an AI product. |
Days ago he said… “I'm finding that coding agents can take me from a vague idea to a working solution, one with tests and documentation and that looks like a carefully considered project evolved over the course of many weeks... in less than an hour. Even if the code is rock solid, there's a limit to how many projects like that I can sensibly care for - and if they're instantly abandoned, what value was there from creating them in the first place?” https://simonwillison.net/2026/May/31/the-solution-might-be-... Here is Simon questioning a fundamental belief held by the pro-LLM lobby. Would a paid shill question that? Simon is, without question, an enthusiastic pro-LLM person. I disagree with what he says often, the product market fit post was a bad take. But I don’t believe he is shying away from sharing his thoughts when they’re not favorable to the industry. |
Yes, a paid shill. You can find a clear point in time where he shifted from sceptic to 1000% fully onboard non-stop praise, with no reason. |
TypeScript is also hugely represented. My projects are TS in a big way, even though I have no experience with it at all. |
Only people who do pay-per-use optimize this. Most heavy users have their use covered by an employer. |
These are still at currently subsidized prices. We'll see if they think they're getting $1500/month of value when that buys significantly fewer tokens. |
You think Fireworks is subsidizing token spend? And Friendli? And Baseten? Every single provider on OpenRouter is offering their service at a loss? What? |
> there's no evidence that they aren't (or can't) use profitable inference to subsidise those other expenses
As far as we know, there's no evidence that they can produce any profits at all. |
Yes; they ban various uses of their subscriptions but say you can do whatever if you’re paying for the API without limits |
That's just market segmentation and them trying to maximize revenue; it doesn't really say anything about their costs. |
That's not evidence. Very likely though, but the only evidence we get one way or another is when they IPO. |
This story isn't about those subscriptions - enterprise customers like Uber are paying the full API prices. |
AFAIK, enterprise plans are not subsidized. It's $20/seat plus API pricing. Unless you are saying API pricing itself is subsidized. |
It's mostly R&D though, not inference. If LLMs effectively become a commodity then they are screwed anyway. |
The inference prices for very large open models would indicate that Anthropic's and OpenAI's margins are quite large. |
It's not. They recently forced enterprise customers onto API billing instead of the cheap consumer pricing. Now the pricing is brutal. |
Uber engineers reported that loading their workspace and pulling recent commits exhausted that AI limit for Claude Code (4.8 x-high) immediately. |
I don't think loading up a single context window costs $1,500. Which limit are you talking about? |
That's a lot. On my usual day I burn less than $1 on Opus. I could get beyond $10 only if I have a complex and well-defined problem, which is rare (the second part at least). |
Yea, I’m sure the personal plans are subsidized. I have $200 Claude Max at home and straight API pricing at work and equivalent work would easily cost me 5x if not more on the API. |
Uber is likely on an enterprise plan - these charge tokens at API cost, which can be much more expensive than the $20 flat rate. |
Do you think companies are gonna be like: "Wait a minute. We didn't save money by adding AI. We just added an expense. Now we have to pay for employees AND AI."? |
It's also a useful signal for AI value. Looks like it's a max value add of $18,000 per engineer per year. |
I'm sure if a dev can show useful results at 1k they won't have trouble getting permission for a higher cap as well. |
It means Uber thinks they can sustain that level of expense. Whether engineers at Uber are representative of the rest of the work force is an easily debatable question. |
Not really. There are clearly diminishing marginal returns, so it's likely that the first $2,400/engineer/year adds >>$2,400 of value, even if the 18,001st $/engineer/year adds <$1 of value. |
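To make the diminishing-returns point concrete, here is a minimal sketch in Python. The logarithmic value curve and the constants `A` and `K` are purely hypothetical assumptions chosen for illustration; nothing in the thread says what the real curve looks like. It only shows that a cap of $18,000/year is compatible with the first $2,400 being worth far more than it costs.

```python
import math

# Hypothetical concave value curve: value(spend) = A * ln(1 + spend / K).
# A and K are made-up constants, not measured data.
A = 15_000.0  # assumed scale of value, in dollars
K = 500.0     # assumed spend level at which returns start to flatten


def value(spend: float) -> float:
    """Total yearly value (in $) produced by `spend` dollars of tokens."""
    return A * math.log(1 + spend / K)


def marginal_value(spend: float) -> float:
    """Extra value produced by one more dollar on top of `spend`."""
    return value(spend + 1) - value(spend)


first_tranche = value(2_400)           # value of the first $2,400/year
last_dollar = marginal_value(18_000)   # value of the 18,001st dollar

print(f"first $2,400 is worth about ${first_tranche:,.0f}")
print(f"the 18,001st dollar is worth about ${last_dollar:.2f}")
```

Under these assumed numbers the first tranche returns many times its cost while the last dollar returns less than a dollar, so the cap marks where marginal value crosses marginal cost, not the total value of the tool.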
It's among a wave of fresh "non-insane" takes on AI in the enterprise. Maybe we can reel things in to a sustainable level before a giant bubble bursts. |
Let's just say their performance (OKR, KPI, whatever "impact" metric you want) was indistinguishable from a peer that used the AI/LLM monthly allowance in full. Maybe a $10k raise would be nice? |
They'd get a bad review for leaving performance on the table. When has finishing your work ever resulted in anything other than more work? |
It's disturbingly anti-meritocratic. You're not allowed to prove that you're more useful without AI because they just assume that AI is a 10x multiplier on everyone. |
You are paying account pricing. Uber is paying API pricing. Your $100/month plan is likely equivalent to thousands of dollars of API pricing. You are being subsidized by the companies using AI. |
And this is why as the freeloader (includes me) volume goes up, they add more and more rules to constrain us. |
But is it an accurate number? Does AI reach diminishing returns after $1,500/month, or is that all they are willing to risk/burn to stay in this game? |
It's probably a good thing that Uber developers are now forced to do some coding on their own. Only use AI where it absolutely helps. |
I don't think at $1,500 you're forced to code on your own at all, in the sense of typing code. You're simply forced to not yolo-max twelve parallel agents at all times. |
It’s not just about the model but also setting up the system to create and share compute (GPUs), which is quite complicated on its own. Uber's primary business focus isn’t infrastructure. |
Eventually tokens will cost the price of energy, and China is miles ahead. China will be a major token exporter soon. Mark my words. |
Electricity actually is only a small part of the data center costs. There are challenges in getting enough electricity that create problems, but the cost of the electricity really isn’t an issue. |
Until SaaS providers run another price racket and there will be many mental exercises why the new price is also justified. |
It still probably produces better results than some junior engineers in a lot of cases. But yeah, for a company at Uber’s scale, I can see why they would want real engineering discipline around it. |
When blue-collar workers were losing jobs they were told to learn to code, and now engineers are vilifying AI for taking jobs. |
Do you believe the same people were saying those things? (Were they really?) The idea that "different attitudes towards labor have been expressed by different people" doesn't feel too remarkable |
They want to replace employees with AI, then replace paid AI with unpaid AI. Their wet dream was never automation. It was zero marginal cost labor. And that dream is starting to rot. |
If I were paying API rates this year, I would have already burned through $20k in tokens. Looking forward to the costs of this level of capability coming down. |
I think the logical follow up will be for Uber to lay off a bunch of people so that the remaining ones can token maxx. To the mooooon! |
Is anyone doing story point estimation in terms of tokens? If you have a token budget, does this change how you prioritize? |
If budgeted at $1,500/month per user, power users still can get 5-10x of that allocation if the user pool is large enough. |
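A rough sketch of the pooling arithmetic in Python, assuming the budget is shared across the whole user pool and not enforced per seat. The head count, the split between light and power users, and the light-user spend are all hypothetical numbers picked for illustration.

```python
def power_user_allowance(n_users: int, n_power: int,
                         budget_per_user: float, light_spend: float) -> float:
    """Monthly budget left for each power user once light users are paid for.

    Assumes one shared pool of n_users * budget_per_user, with every
    non-power user spending exactly `light_spend` per month.
    """
    pool = n_users * budget_per_user
    light_total = (n_users - n_power) * light_spend
    return (pool - light_total) / n_power


# Hypothetical org: 100 engineers, 10 power users, light users spend $600/month.
allowance = power_user_allowance(n_users=100, n_power=10,
                                 budget_per_user=1_500, light_spend=600)
multiple = allowance / 1_500

print(f"each power user can spend ${allowance:,.0f}/month ({multiple:.1f}x the average)")
```

With these assumed inputs each power user gets $9,600/month, or 6.4x the per-user average. The multiple grows as the share of light users grows or as their spend drops.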
I'm curious how much of the usage comes from vibe coding vs using agents/harnesses in internal tooling |
Yeah, where I work has been at $150 a week with a pretty generous override if you ask. People self-limit when there are caps. If you give people unlimited, they won't even use Sonnet for easy things. |
Outside of coding, what other tools expend that many tokens? People are not creating that many slide decks or videos, are they? |
If you have more than x seats, you have to use Enterprise pricing as far as I know, which is pay-as-you-go with a pool. |
Why are people getting these high spending numbers? A 200 USD subscription for either Codex or Claude should give you plenty of usage. What am I missing? Are they just being dumb? |
The subscriptions are not available to enterprise users. Enterprise users must pay per-token. A $200 subscription gives you roughly the equivalent of $1500 in per-token billing. |
It's wild; at my shop in Silicon Valley they dropped us from unlimited use to 60% prem budget on copilot. People are walking around like zombies. |
Exactly my experience. I always refactor first myself, then delegate boring tasks to AI. It saves me energy, time and also tokens. If code is not prepared for easy implementation, agents always fail. |