The New Stack | DevOps, Open Source, and Cloud Native News

Your AI isn't broken. Your data is.
Darryl K. Taft · 2026-06-17 · via The New Stack | DevOps, Open Source, and Cloud Native News

Enterprises are pouring billions into AI and getting garbage back. A new startup says it knows why and has built the first platform designed to fix it.

Clario launched from stealth Wednesday with $6 million in seed funding to tackle what co-founder and CEO Yousuf Khan calls data ROT: redundant, obsolete, and trivial files inflating storage costs and poisoning AI projects at the source.

“Four years post-ChatGPT, enterprises have spent billions on projects that are failing to make a meaningful impact,” Khan says in a statement. “Garbage in, garbage out isn’t a cliché, it’s an incredibly costly mistake.”

Industry estimates put more than a third of all stored enterprise data in the garbage category, and Gartner projects that 60% of AI projects will be abandoned by the end of the year due to poor data quality. Clario’s own early customer work suggests the garbage share can run well above a third: in tests with design partners, the company has found garbage rates as high as 60%, Khan says.

Khan, a five-time CIO who held the role at Pure Storage and Moveworks before becoming a general partner at Ridge Ventures, says he kept running into the same wall at every stop. “I tried to solve this multiple times with all the big file systems. I couldn’t do it,” he tells The New Stack. The problem only compounded as AI-generated content began flooding enterprise repositories after ChatGPT’s launch.

Co-founder and CTO Madhu Vohra brings the infrastructure side of the equation. She spent her career building the systems where this data ends up — architecting clustered SAN at NetApp, scaling engineering teams at Nutanix, and leading Oracle’s block and object storage in OCI.

“I’ve built major systems that enabled people to accumulate,” she tells The New Stack. “So here I am atoning.”

How it works

Clario connects directly to enterprise file and content systems, including Google Drive, SharePoint, OneDrive, Box, and Confluence, and scans metadata to surface garbage without ever opening the files themselves. Classification is currently heuristics-based, using file checksums, naming patterns, access timestamps, and format support status, Vohra says. AI and embedding-based detection are on the roadmap, she notes.

When Clario flags a file, it triggers a workflow via Slack or Teams, notifying the person who created or owns the content and asking them to keep, archive, or delete it. The system learns from those decisions to build an increasingly autonomous cleanup engine over time. Clario only gets paid when customers act on a flagged file. This is an outcome-based model that aligns the company’s incentives with actual data reduction.

The ROT breaks down into three buckets: redundant files (duplicates and near-duplicates), obsolete files (legacy formats no one can open, documents untouched for years, content from departed employees), and trivial files (hidden files, noise). Early customer analysis has uncovered terabytes of junk, including knowledge base articles for discontinued product lines and full-length feature films downloaded by former employees, Vohra says.
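The three buckets can be illustrated with a minimal sketch of metadata-only classification. This is not Clario’s implementation: the FileMeta fields, the three-year staleness cutoff, the list of unsupported extensions, and the hidden-file prefixes are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical thresholds and lists, chosen only for illustration.
STALE_AFTER = timedelta(days=3 * 365)
UNSUPPORTED_EXTENSIONS = {".wps", ".lwp"}
HIDDEN_PREFIXES = (".", "~$")


@dataclass
class FileMeta:
    name: str
    checksum: str            # content hash reported by the storage system
    last_accessed: datetime
    owner_active: bool       # False if the owner has left the company


def classify(files, now):
    """Map each file name to 'redundant', 'obsolete', 'trivial', or None.

    Only metadata is inspected; file contents are never opened.
    """
    seen_checksums = set()
    result = {}
    for f in files:
        ext = os.path.splitext(f.name)[1].lower()
        if f.name.startswith(HIDDEN_PREFIXES):
            result[f.name] = "trivial"
            continue
        if f.checksum in seen_checksums:
            # A later copy of content already seen; the first copy is kept.
            result[f.name] = "redundant"
        elif (
            ext in UNSUPPORTED_EXTENSIONS
            or now - f.last_accessed > STALE_AFTER
            or not f.owner_active
        ):
            result[f.name] = "obsolete"
        else:
            result[f.name] = None
        seen_checksums.add(f.checksum)
    return result
```

Matching checksums only catches exact duplicates. Near-duplicates, which the article also counts as redundant, would need a similarity measure that this sketch does not attempt.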

To avoid false positives, Clario’s model is tuned for precision over recall: it is designed to flag only what it is confident is garbage.
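The trade-off can be shown with a small sketch. It assumes each file gets a confidence score between 0 and 1, which the article does not describe, and the labelled sample is invented for illustration. Raising the flagging threshold removes false positives (higher precision) at the cost of missing some real garbage (lower recall).

```python
def precision_recall(scored, threshold):
    """Compute precision and recall for a flagging threshold.

    scored: list of (confidence, is_garbage) pairs, where confidence is a
    hypothetical score and is_garbage is the ground-truth label.
    """
    flagged = [is_garbage for confidence, is_garbage in scored
               if confidence >= threshold]
    true_positives = sum(flagged)
    total_garbage = sum(is_garbage for _, is_garbage in scored)
    # When nothing is flagged (or nothing is garbage), report 1.0 by convention.
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / total_garbage if total_garbage else 1.0
    return precision, recall


# Invented sample: four garbage files, two files worth keeping.
sample = [(0.95, True), (0.90, True), (0.80, False),
          (0.70, True), (0.40, False), (0.30, True)]

loose = precision_recall(sample, threshold=0.5)    # (0.75, 0.75)
strict = precision_recall(sample, threshold=0.85)  # (1.0, 0.5)
```

With the stricter threshold every flagged file is truly garbage, but half of the garbage goes unflagged, which matches the stated goal of starting with the low-hanging fruit.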

“Anything which we think is difficult to decipher, we want to bring up,” Khan says, adding that the goal is to tackle low-hanging fruit first and build confidence before moving into more ambiguous territory.

The AI cost angle

The timing argument is about more than storage bills. As enterprises build internal agents and RAG-based systems, the quality of the underlying data directly determines whether those systems work. Vohra puts it bluntly: “Did my AI hallucinate or did it because you fed it all 15 million files?”

Khan says he sees the issue in token economics: internal agents built on unclean knowledge bases force LLMs to sift through outdated policies, discontinued product documentation, and obsolete support articles, burning compute budget on noise.

“You’re literally processing tokens on garbage,” he notes.
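Khan’s point about tokens can be made concrete with a back-of-the-envelope sketch. The function name and the document sizes below are hypothetical; it simply measures what fraction of the tokens placed in a prompt come from documents already flagged as ROT.

```python
def wasted_token_share(retrieved_docs):
    """Return the fraction of prompt tokens that come from ROT documents.

    retrieved_docs: list of (token_count, is_rot) pairs for the documents
    a retrieval step placed into an LLM prompt.
    """
    total = sum(tokens for tokens, _ in retrieved_docs)
    wasted = sum(tokens for tokens, is_rot in retrieved_docs if is_rot)
    return wasted / total if total else 0.0


# Invented example: one current policy and two outdated documents.
docs = [(1200, False), (800, True), (2000, True)]
share = wasted_token_share(docs)  # 0.7
```

In this invented case, 2,800 of 4,000 tokens are spent on outdated material before the model has produced a single word of its answer.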

One early customer with 5.5 million files found that more than 20% was data ROT — and that it traced back largely to four departed employees.

Competitive landscape

Khan acknowledges the field is thin. Backup vendors and archiving companies have touched the edges of data cleanup, but none has built an end-to-end workflow from classification through employee notification to action and learning, he says. “If they were there, I would have used them,” he says. “I haven’t seen a company that’s done this.”

Vohra notes that compression and storage efficiency tools address the cost of bits, not the number of them. “The core of the problem continues to be that the 15 million files you have continue to be exactly those 15 million problems.”

Investors and customers

“The enterprise data crisis isn’t new, but the cost of ignoring it today is becoming impossible to justify,” Saad Siddiqui, partner at Preface Ventures, says in a statement. “We backed Clario because they are the only company working to get enterprises to AI-ready on a foundational level.”

Clario has about a dozen customers in early analysis and deployment. The company is about six months old and plans to expand beyond file and content systems to image repositories, video stores, and knowledge bases in platforms like ServiceNow and Salesforce Service Cloud.

Khan puts the product vision simply: “Our goal is to make sure that data hygiene is a continuous process in the enterprise.”
