惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

I
Intezer
Jina AI
Jina AI
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
有赞技术团队
有赞技术团队
J
Java Code Geeks
人人都是产品经理
人人都是产品经理
博客园 - 叶小钗
M
MIT News - Artificial intelligence
月光博客
月光博客
C
Check Point Blog
Y
Y Combinator Blog
S
SegmentFault 最新的问题
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
C
Cybersecurity and Infrastructure Security Agency CISA
A
Arctic Wolf
S
Security Archives - TechRepublic
S
Securelist
美团技术团队
SecWiki News
SecWiki News
H
Help Net Security
V
Vulnerabilities – Threatpost
S
Secure Thoughts
F
Fortinet All Blogs
量子位
aimingoo的专栏
aimingoo的专栏
T
Tor Project blog
大猫的无限游戏
大猫的无限游戏
Scott Helme
Scott Helme
MyScale Blog
MyScale Blog
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
D
Docker
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
L
Lohrmann on Cybersecurity
F
Fox-IT International blog
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
博客园 - 三生石上(FineUI控件)
Engineering at Meta
Engineering at Meta
Microsoft Security Blog
Microsoft Security Blog
Recorded Future
Recorded Future
V
Visual Studio Blog
WordPress大学
WordPress大学
S
Schneier on Security
Stack Overflow Blog
Stack Overflow Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
Apple Machine Learning Research
Apple Machine Learning Research
N
News | PayPal Newsroom
GbyAI
GbyAI
T
Threat Research - Cisco Blogs

VentureBeat

MiniMax teases upcoming M3 model with new sparse attention mechanism and 15.6X long-context response speed boost DataGrail report finds your vendor may be sending data to AI models you never approved DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5, and finds Claude Opus exploiting a benchmark loophole How attackers bypass MFA in financial services Why prompt debt, retrieval debt, and evaluation debt are quietly reshaping enterprise AI risk AI agents are quietly generating chaos engineering failures enterprises don’t track yet npm supply chain: valid certificates, stolen accounts Replacing RAG with bash cut AI retrieval costs 30% AI agent identity: D&B rebuilt its 642M-company graph Alibaba's proprietary Qwen3.7-Max can run for 35 hours autonomously and supports external harnesses like Anthropic's Claude Code LLM agent memory at 0.12% of model parameters Americans can’t spot a deepfake, and that’s a business crisis, not just a consumer problem MFA verifies who logged in. It has no idea what they do next. Kore.ai launches Artemis AI agent platform, expands challenge to Microsoft and Salesforce Resolve AI says the AI coding boom is breaking production systems. It wants to fix that. AI didn’t kill brand consistency — it made it mission-critical Google Managed Agents API: fast deployment, Google runtime Cohere cracks lossless quantization and native citations with first full Apache 2.0 licensed open model Command A+ Cerebras says its chips run a trillion-parameter AI model nearly 7 times faster than GPU clouds Enterprise AI agents fail because they forget GitHub confirms 3,800 internal repos stolen through poisoned VS Code extension as supply chain worm hits Microsoft's Python SDK NanoClaw's creators are turning the secure, open source AI agent harness into an enterprise 'second brain' Corti's new Symphony for Speech-to-Text model beats OpenAI at medical terminology accuracy, highlighting the value of specialized AI AWS nabs white hot gen AI media creation startup fal, becoming its preferred cloud provider Securing AI agent credentials with MCP tunnels Google says Gemini 3.5 Flash can slash enterprise AI costs by more than $1 billion a year Google just redesigned the search box for the first time in 25 years — here’s why it matters more than you think. Google’s new AI agent can draft your emails, monitor your inbox and eventually spend your money Google unveils Gemini Omni 'any-to-any' AI model: what enterprises should know Influential AI researcher Andrej Karpathy announces he's joining Anthropic Context architecture is replacing RAG in AI AI supply-chain attacks bypass model red teams LangSmith Engine closes the agent debugging loop automatically — but multi-model enterprises still need a neutral layer Architectural patterns for graph-enhanced RAG: Moving beyond vector search in production The enterprise risk nobody is modeling: AI is replacing the very experts it needs to learn from Intercom, now called Fin, launches an AI agent whose only job is managing another AI agent RecursiveMAS cuts multi-agent AI costs by 75%: researchers Claude’s next enterprise battle is not models: it’s the agent control plane Developers can now debug and evaluate AI agents locally with Raindrop's open source tool Workshop Cerebras stock nearly doubles on day one as AI chipmaker hits $100 billion — what it means for AI infrastructure Agent authorization gap: why verified agents are still a risk Anthropic's Claude Code adds a built-in evaluator to catch agents that quit too soon Enterprises are training their own AI models from production workflows — without a machine learning team AI IQ is here: a new site scores frontier AI models on the human IQ scale. The results are already dividing tech. Anthropic reinstates OpenClaw and third-party agent usage on Claude subscriptions — with a catch Anthropic finally beat OpenAI in business AI adoption — but 3 big threats could erase its lead Frontier AI models corrupt 25% of document content Protect your enterprise now from the Shai-Hulud worm and npm vulnerability in 6 actionable steps Perceptron Mk1 shocks with highly performant video analysis AI model 80-90% cheaper than Anthropic, OpenAI & Google Claude Code and Claude in Chrome have four security blind spots. Here's the audit Is your enterprise adaptive to AI? Turning AI cost spikes into strategic growth opportunities Thinking Machines shows off preview of near-realtime AI voice and video conversation with new 'interaction models' AI agent IAM: why enterprise identity governance is broken AI tool poisoning exposes a major flaw in enterprise agent security Intent-based chaos testing is designed for when AI behaves confidently — and wrongly Anthropic says it hit a $30 billion revenue run rate after 'crazy' 80x growth OpenAI voice models get GPT-5-class reasoning Vibe coding exposed 380,000 corporate apps — 5,000 held sensitive data AI agent identity: how to govern agentic AI in 6 stages Anthropic wants to own your agent's memory, evals, and orchestration — and that should make enterprises nervous Enterprise GPU utilization: why 95% of AI infrastructure spend is wasted Governance, not gatekeeping: How SAP brings enterprise‑grade safety to AI connectivity Anthropic introduces "dreaming," a system that lets AI agents learn from their own mistakes RL orchestration: how a 7B model routes tasks across GPT-5, Claude, and Gemini Meet ZAYA1-8B, a super efficient open reasoning model trained on AMD Instinct MI300 GPUs Anthropic Skill scanners passed every check. The malicious code rode in on a test file. Why AI breaks without context — and how to fix it Market research is too slow for the AI era, so Brox built 60,000 identical 'digital twins' of real people you can survey instantly, repeatedly The app store for robots has arrived: Hugging Face launches open-source Reachy Mini App Store with 200+ apps Scaling AI into production is forcing a rethink of enterprise infrastructure Miami startup Subquadratic claims 1,000x AI efficiency gain with SubQ model; researchers demand independent proof. GPT-5.5 Instant shows you what it remembered — just not all of it One command turns any open-source repo into an AI agent backdoor. OpenClaw proved no supply-chain scanner has a detection category for it AI agents are missing all the discussions your team is having. SageOX has an answer: agentic context infrastructure OpenAI turns its sold-out GPT-5.5 party into a monthlong Codex giveaway for 8,000 developers Inside AMEX’s agentic commerce stack: How intent contracts and single-use tokens enforce AI transactions Microsoft takes Agent 365 out of preview as shadow AI becomes an enterprise threat The RAG era is ending for agentic AI — a new compilation-stage knowledge layer is what comes next Salesforce Agentforce Operations fixes workflows breaking enterprise AI MCP command execution flaw: what security teams need to know The scaffolding era is over. LlamaIndex says context is the new moat xAI launches Grok 4.3 at an aggressively low price and a new, fast, powerful voice cloning suite Hidden IT problems are quietly creating risk, shadow IT, and lost productivity Alibaba's HDPO cuts AI agent tool overuse from 98% to 2% One tool call to rule them all? New open source Python tool Runpod Flash eliminates containers for faster AI dev Why OpenAI's 'goblin' problem matters — and how you can release the goblins on your own AI coding agents breached: attackers targeted credentials, not models | VentureBeat Writer launches AI agents that can act without prompts, taking on Amazon, Microsoft and Salesforce Netomi raises $110 million as Accenture and Adobe bet on AI for customer service Cheaper tokens, bigger bills: The new math of AI infrastructure Amazon’s OpenAI gambit signals a new phase in the cloud wars — one where exclusivity no longer applies Enterprise RAG rebuild: hybrid retrieval adoption tripled in Q1 2026 IBM launches Bob with multi-model routing and human checkpoints to turn AI coding into a secure production system AWS Quick's knowledge graph creates an orchestration blind spot Why enterprise GPU utilization is stuck at 5% — and why the fix makes it worse Definity embeds agents inside Spark pipelines to catch failures before they reach agentic AI systems How to build custom reasoning agents with a fraction of the compute American AI startup Poolside launches free, high-performing open model Laguna XS.2 for local agentic coding Mistral AI launches Workflows, a Temporal-powered orchestration engine already running millions of daily executions
Agentic AI in production: Merck, Mastercard on what works
taryn.plumb@ · 2026-05-28 · via VentureBeat

Merck is using AI agents to cut drug discovery cycles by a third and ship compliant marketing materials up to 80% faster — but VP of Digital Platforms Sean Finnerty says the only reason it's working is because they built the infrastructure first.

And the pharmaceutical manufacturer is seeing promising early results: AI is generating marketing drafts that are “99% right” when it comes to compliance, shrinking review cycles from months to days and accelerating delivery by 70% to 80%. In the company’s medical research, meanwhile, one AI-assisted discovery cycle was reduced by 33%.

Still, agentic AI only works if companies first build the underlying “plumbing,” Finnerty said of digital platforms and services at a recent AI Impact Series event.

“If we do one-offs, we're gonna end up with thousands and thousands of things that are ultimately just gonna be debt that we'll have to deal with later,” he said. “And that's gonna be a drag on any further innovation.”

Starting with the plumbing

Merck’s plumbing-first strategy comes from lessons learned during the early days of cloud in the 2010s “when nobody knew what the heck was going on,” Finnerty said.

Getting the cloud right meant building from the ground up; at Merck, that infrastructure now supports 2,500 AWS accounts, numerous Microsoft Azure subscriptions, and new Google Cloud Platform (GCP) integrations.

“AI is gonna be the same exact thing,” Finnerty said. “We're going to have thousands and thousands of agents.” The questions then pile up: How do you register them? How do you secure them? How do you ensure they're connected to the right tools, and have access to the right data and the right context?

Context delivery is also critical; Merck works with three hyperscalers and has forty-seven edge locations and hundreds of databases. “Many, many petabytes” of structured and unstructured data are stored in Oracle databases, SQL databases, Excel spreadsheets, phone transcripts, and other repositories, Finnerty said.

His team is building scaffolding to deliver meaningful context in various situations, he explained. Data must be organized and ingested into various platforms, because “there’s no one solution to solve every single problem.” Sometimes it's Databricks, other times it's Amazon Redshift, “plus four other things.”

The goal is: “Let's make that easy and frictionless for people to do, and secure it, and make sure it's well integrated with MCP [model context protocol], and A2A [Agent2Agent], and upstream compute,” Finnerty said. “If you wanna run stuff on GCP or you wanna run stuff on AWS, we've got the plumbing in place so you can run your adjacent workloads wherever you want.”

How Merck is using agents

As it builds out its technical plumbing, Merck is experimenting with agents across regulated enterprise operations, scientific discovery workflows, and app modernization.

Notably, AI is accelerating drug discovery. Finnerty explained that scientists look at molecular structures and disease states to determine if a given condition is druggable. But even if a disease state is known, developing a drug to target it can take years.

Now with AI, teams are starting to see “very promising things,” such as cutting one particular research cycle down by one-third. “That's a year off of the life of the discovery cycle,” Finnerty said. “Which means, theoretically, we can get it to a patient who needs that therapy a year faster.”

Once developed and approved, these products are regulated and marketing materials around them must be clearly and explicitly articulated. “The way you communicate that information per market, per country, per state, per region, is all very carefully governed and regulated,” Finnerty said. It’s also variable: An ad campaign for a vaccine in the state of Georgia looks much different from one launched in Canada.

Historically, humans did the due diligence to make sure the company complied with various laws. Draft materials go through iterations of reviews; when a mistake is discovered, it gets “kicked back to the beginning, and it goes through it again, and then it takes another however many weeks and months,” Finnerty said.

But now, AI can do that “much, much more effectively,” and the process is increasingly evolving from a human-in-the-loop to essentially a "human-as-governor." With human oversight, AI can deliver a first draft in a day or week that is 99% there, allowing teams to ship materials up to 80% faster.

Meanwhile, when it comes to app modernization, AI can discover architecture, document data interactions, APIs, network paths, and do authentication checks and authorization; it can also write code for Terraform for deployment and refactor JavaScript into Python.

Where the company would have previously spent weeks and months and hundreds of thousands of dollars to update one application, Finnerty said, agents are now handling the work through prompts.

Running into "wackiness"

That’s not to say there aren’t significant challenges; Finnerty noted that his team has run into some “wackiness”; for example in automated code and scenario testing. AI has blatantly made up scenarios, whether due to incorrect context, infrastructure, “or if it was just getting creative with, ‘You should be testing these three functions that don't even exist in the code that you're trying to test.’”

“That surprised me a little bit because I thought we were further past some of the hallucination challenges in these later models,” he said.

To address this, his team has engineered guardrails to keep hallucinations to a minimum, essentially using AI to supervise AI and applying confidence scores. So if Claude created the first output, they’ll instruct Microsoft Copilot to assess it.

“So if you ask something once, have AI check it, then ask it a third time, the confidence increases every time, and it minimizes some of the garbage that gets created in the early runs,” Finnerty said.

Use cases for agentic AI in financial services

Meanwhile, at Mastercard, Chief Data Officer Andrew Reiskind and his team are focusing agentic experimentation on highly orchestrated transaction and dispute workflows. As he noted, a chargeback or fraud dispute is not a single event.

When a consumer disputes a charge (typically online), that “kicks off an entire other process on the back-end that tends to be very labor-intensive,” Reiskind said.

Mastercard has to collect specifics about the actual dispute; then the merchant has its own investigations (Was the card reported as lost or stolen? Does the consumer dispute charges often?). Further, the network sitting in the middle has its own rules for timing and information submission.

“You have each and every one of these steps, many of which are unstructured, but there are also structured data elements to this,” Reiskind said. Whether a card was lost or stolen tends to be structured, but the consumer complaint is “unstructured data of questionable reliability.”

“So you're sitting there with a decisioning system that has deterministic decisions, but also probabilistic decisions,” he said.

This problem can be sped up and potentially solved by AI agents, but that can be a complex process: Which tasks are you handing off to agents? When are they kicking things back to human reps? How many agents are you ultimately using? What are the cost implications?

Then there are reputational questions and costs: Have you just called a consumer potentially a liar when they weren't lying?

“It's an exact problem where you want to, as a bank, maintain trust with your consumer,” Reiskind said. “But you also wanna make this efficient and take costs out of the system.”

The PB&J versus turkey mistake: Determine what risks are acceptable

There’s always going to be risk with AI, and enterprises should assess it from the beginning of product design, Reiskind said. There’s also the question of acceptable risk.

As an example: Did you serve a customer a peanut butter jelly sandwich instead of a turkey sandwich (a minor inconvenience)? Or did you serve gluten to someone with celiac disease?

“Is it an acceptable risk if one percent of the time it makes the mistake? If it is, let's go to the next stage of how you're mitigating that risk,” Reiskind said.

Leaders must perform cost-benefit analysis, break problems down to their “constituent pieces,” and calculate cost for each one. But these are estimates; it’s near-impossible to forecast real usage, Reiskind said. “It is not a simple process to get to the cost,” he said. “But it is doable.”