Big Tech’s AI Datacenter Investments Might Be In Big Trouble
Amir Husain · 2026-06-18 · via Forbes - Innovation
[Photo: Chinese startup DeepSeek releases preview version of V4. Jinan, China, April 25. DeepSeek-v4 is an excellent model available for a fraction of the cost of competitive US models. Credit: VCG via Getty Images]

On June 16, the Chinese AI lab Z.ai released GLM-5.2. It is an open weights model under an MIT license, which means anyone can download it, modify it and run it commercially with no restriction. Its performance is incredibly impressive. It scores 81.0 on Terminal-Bench 2.1, which is one of the most commonly used model performance benchmarks. What’s worth noting is the rapid rate of improvement. The previous version of GLM, version 5.1, scored a mere 62 on the same benchmark. This is a serious jump and it’s been achieved in weeks, not years.

GLM-5.2 continues to dazzle with its performance on benchmarks. It scores 62.1 on SWE-bench Pro, which means that it edges past GPT-5.5. On FrontierSWE it trails the widely considered leader, Opus 4.8, by a mere point. This new Chinese model carries a one million token context window that holds up across long agentic sessions. And it costs roughly one sixth of what the leading American closed model charges per token.

Read that paragraph again. An open model you can run yourself now trades blows with the frontier on the tasks that matter most to engineers. At a sixth the cost.

China’s Winning Model

This is not an anomaly. This is China’s strategy. Real performance, fast iteration, and costs that will put a damper on every revenue projection sold to investors by US-based AI companies.

If my conclusion were supported merely by third-party benchmarks that someone else ran, you would be justified in your skepticism. But I have been using Chinese models for hundreds of problems across many machines and a variety of workflows. My current favorite “jack of all trades” model is DeepSeek-V4, along with its cheaper v4-flash cousin. In my own work, outside of the horrendously expensive top Opus tier, it has been the most broadly capable model I use. V4-Pro is a 1.6 trillion parameter mixture of experts model that activates 49 billion parameters per token. It posts 80.6 percent on the SWE-bench Verified benchmark. It costs about 87 cents per million output tokens. That is roughly one thirtieth of frontier pricing. A smidgeon over 3%! The weights are open. You can do with it as you please. Please note that I am not describing a research curiosity. This is a model that does much of my real work.

Earlier I mentioned speed. Now let’s look at the cadence of this brave new open frontier. GLM-5 arrived in February. GLM-5.1 arrived in March and lifted the internal coding score from 35.4 to 45.3, a 28 percent jump in a single point release. GLM-5.2 arrived in June and lifted the Terminal-Bench result from 62 to 81.0, a jump of roughly 31 percent. Three steps. Four months. Each step was trained on Chinese silicon. There is still some argument about whether all of it was Nvidia-free, but I am inclined to believe that Chinese labs are now able to deliver frontier-class models on an entirely domestic stack.
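The size of each step can be checked from the scores quoted in this article. The following is a minimal sketch that uses only those numbers; the helper name pct_change is illustrative and not taken from any benchmark tooling.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# GLM-5 -> GLM-5.1: internal coding score, as quoted in the article.
coding_jump = pct_change(35.4, 45.3)

# GLM-5.1 -> GLM-5.2: Terminal-Bench score, as quoted in the article.
terminal_jump = pct_change(62.0, 81.0)

print(f"Internal coding score: +{coding_jump:.1f}%")
print(f"Terminal-Bench:        +{terminal_jump:.1f}%")
```

Both releases land at a gain of roughly 30 percent over the version before them.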

All this speed means that the open frontier is not crawling toward the closed frontier. It is sprinting. In 2023 open models were two years behind. In 2024 one year. In 2025 six months. Today the gap on the benchmarks that decide real engineering work is measured in mere weeks.

Cost Curves And The Price of Intelligence

Compare this to the cost of intelligence itself. For three years the price of a unit of model output fell roughly tenfold each year, which compounds to close to a thousandfold decline. As one data point, a GPT-4 class result that cost twenty dollars per million tokens in late 2022 costs around forty cents today. It is one of the fastest cost collapses in computing.
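The compounding behind the thousandfold figure is worth spelling out. The sketch below uses only the numbers quoted above; the function name is illustrative. Note that the two quoted price points on their own give a ratio of 50, so the thousandfold figure follows from the tenfold-per-year rate rather than from that single pair of prices.

```python
def cumulative_decline(annual_factor: float, years: int) -> float:
    """Total price reduction factor after compounding an annual factor."""
    return annual_factor ** years


# Tenfold per year, sustained for three years.
compounded = cumulative_decline(10.0, 3)

# Ratio of the two price points quoted in the article
# (dollars per million tokens).
quoted_ratio = 20.00 / 0.40

print(f"Tenfold per year over three years: {compounded:,.0f}x")
print(f"Ratio of the quoted price points:  {quoted_ratio:,.0f}x")
```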

But that curve stalled this year. Not because the technology stopped improving. But because of the supply chain. With the Iran war and the AI datacenter boom, the world ran short of memory. DRAM and high bandwidth memory went into acute shortage. Supplier inventories fell from months of stock to weeks. Server memory prices are on track to double by the end of 2026. The per token price kept drifting down while the cost to own or rent the hardware underneath it climbed. The deflation paused for a supply reason. Not a technology, physics or demand reason.

Fear The Coming Surprise

But this pause is not permanent. What happens when the dam breaks? Here is how two surprises landing at once can spell doom for many optimistic investors in massive datacenters.

The first surprise is new capacity coming online. The memory shortage is a cycle, not a ceiling. Fabs are being built. When that supply materializes, hardware costs fall back toward trend and the thousandfold curve picks up where it left off. Intelligence resumes getting cheaper on schedule, and the pause looks in hindsight like a single bad period on an otherwise steep slope.

The second surprise is the advent of the edge. While the cloud waits on memory, the desktop can quietly cross an important performance threshold. Nvidia now ships DGX Spark, a Grace Blackwell machine with 128 gigabytes of unified memory that runs models up to 200 billion parameters at four bit precision, for about 4,700 dollars. Link two of them and you have 256 gigabytes. Open weight models in the right size class already run on it. The software stack that supports all this, from distributed inference to fast interconnects to model and machine management, has matured in months. Quite literally, a box that fits next to a monitor now does work that required a rack of rented accelerators two years ago.
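A back-of-the-envelope check shows why 200 billion parameters at four bit precision fits in 128 gigabytes. This is a rough sketch under stated assumptions: it counts model weights only (no KV cache, activations or operating system overhead), treats a gigabyte as 10^9 bytes, and uses a hypothetical helper name.

```python
def weight_footprint_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# One box: 128 GB of unified memory, figure taken from the article.
single_box_gb = 128
footprint_gb = weight_footprint_gb(200, 4)
headroom_gb = single_box_gb - footprint_gb

print(f"200B parameters at 4-bit: {footprint_gb:.0f} GB of weights")
print(f"Headroom on one box:      {headroom_gb:.0f} GB")
```

The weights take about 100 gigabytes, which leaves roughly 28 gigabytes on a single box for everything else. That margin is why context length and runtime overhead still matter in practice.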

Put the two together. Frontier grade open models. A cost curve about to resume its fall. Consumer hardware that can host real models locally. Within three to four years the most capable model most people touch every day will not live in someone else’s data center. It will live on a machine they own. Cloud models may be more powerful at the margins, but that difference will be made up by unlimited run time, local network and document access, privacy and much else.

Will The Datacenter Bet Survive?

If excellent models run locally, this can become a problem for one specific bet. The bet is that demand for centralized inference will grow fast enough, for long enough, to justify hardware depreciated over five and six year schedules.

Michael Burry has made the accounting case loudly. Hyperscalers write down Nvidia silicon over five or six years while the real economic life of a chip is closer to two or three. He puts the understated depreciation across the industry at roughly 176 billion dollars through 2028. Goldman frames the same risk plainly. A fifty thousand dollar accelerator on a five year schedule carries ten thousand dollars a year in depreciation. If a new generation makes it uneconomic to run in year two, the operator still carries an asset that no longer earns. Multiply that across hundreds of thousands of units.
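The Goldman example is simple straight-line arithmetic. Here is a minimal sketch using the figures quoted above. The three-year comparison is my own illustration of a shorter economic life, not a number from Burry or Goldman, and the function names are hypothetical.

```python
def annual_depreciation(cost: float, schedule_years: int) -> float:
    """Straight-line depreciation expense per year."""
    return cost / schedule_years


def book_value(cost: float, schedule_years: int, years_elapsed: int) -> float:
    """Remaining carrying value under straight-line depreciation."""
    remaining = cost - annual_depreciation(cost, schedule_years) * years_elapsed
    return max(remaining, 0.0)


cost = 50_000  # accelerator price quoted in the article

per_year_5 = annual_depreciation(cost, 5)  # five year schedule
stranded = book_value(cost, 5, 2)          # value still on the books after year two

# Illustrative assumption: the real economic life is three years.
per_year_3 = annual_depreciation(cost, 3)
understated = per_year_3 - per_year_5

print(f"Depreciation on a five year schedule: ${per_year_5:,.0f} per year")
print(f"Book value left after year two:       ${stranded:,.0f}")
print(f"Understated expense vs three years:   ${understated:,.0f} per year")
```

If the chip stops earning at the end of year two, 30,000 dollars per unit is still sitting on the balance sheet.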

The first lease renewal cliff for the 2023 and 2024 build out hits late this year and next. Roughly half the data centers planned in the United States for 2026 already face delay or cancellation. A town in Wisconsin just passed the first voter referendum requiring approval for large data center incentives. Prediction markets put the odds of a federal moratorium before 2027 at about one in three.

Now consider the demand. If open models keep closing the gap, and if the cheapest place to run them becomes the device on your desk, the centralized inference demand curve that underwrites a five year depreciation schedule does not need to collapse. It only needs to grow slower than the spreadsheets shared with investors assumed. That is enough for major trouble to ensue.

Arguably, even a move to the edge and to local models may not hurt Nvidia. The company sells the accelerators in the data center and it sells the silicon at the edge. DGX Spark is Nvidia. The chips in the next generation of workstations and consumer cards are Nvidia. If inference migrates from the rack to the desk, Nvidia simply follows the workload. The risk does not sit with the company selling shovels. It sits with the operators who borrowed against a single mine and wrote a six year schedule for a two year asset.

Scary, But Good

There is a final reason this potentially bubble-popping shift is not just likely but also good.

Every time you send a prompt to a hosted model you hand over information. Not only the question. The context. The document you pasted. The code base you are debugging. The deal you are modeling. The diagnosis you are worried about. The strategy you have told no one. We are pouring the most sensitive material of our professional and personal lives into systems we do not control, governed by terms we do not write, subject to retention and access rules that change without our consent. Recent export actions that cut foreign users off from specific models overnight should end any illusion that the hosted relationship is stable.

The DoD has just disclosed that Grok’s models were used in military action against Iran. Imagine a user in a particular country sharing information about their house, location, office and street, only to have it used as training data that could one day cause that street to be bombed. It’s a horrendous thought, but many people now see these risks clear as day.

The only environment you can fully trust is the one you own. A model running on your machine, on your weights, behind your firewall, leaks nothing. The prompt never leaves the building. For a clinician, a lawyer, an intelligence operative, a weaponeer, an engineer on restricted work, a founder guarding a thesis, that is not a nice-to-have. It is the whole game.

The technology is converging on local. The economics are converging on local. And the question of trust was always going to converge on local, because the most private thinking you do all day should not require a stranger's server to complete.

Data centers will continue to have their time in the sun. But I’d rather bet on open models at the edge. Build accordingly.