Exclusive: Early look at the next Gemini desktop upgrade
Alexey Shaba · 2026-05-18 · via TestingCatalog

Just days before Google I/O kicks off, fresh signals from inside the desktop Gemini build point to a sweeping upgrade for the recently launched Mac client, which has lagged behind the web version. The initial release was deliberately pared back, but the next wave appears ready to close that gap.

Gemini Live Overlay

A Gemini Live mode is being prepared as a floating desktop overlay, allowing Gemini to observe what's happening on screen and respond in real time via a voice model. This positions Google directly against ChatGPT's macOS companion mode and Anthropic's screen-aware Claude experiments. A second addition, internally framed as Stream to Cursor, appears to plug into the Magic Pointer concept previewed at The Android Show. Rather than waiting for a prompt, the cursor itself would read the context around whatever element it hovers over and surface relevant suggestions, blurring the line between pointing device and agent trigger.

Gemini Desktop

Video generation is also being threaded into the desktop client through what is internally labeled "Veo4 Omni". The naming hints at a single omni-modal output system rolling up under the broader Gemini Omni umbrella.

Gemini Spark

The most consequential thread is Gemini Spark on desktop. Users would be able to point Spark at local folders and let the agent edit, analyze, move, and rename files within them, with support for skills and connector access to Google Drive and the broader Google services layer. That would extend Spark from a proactive web assistant to a local file-system agent, territory currently being pursued by OpenAI's Codex desktop work and Anthropic's Claude Code.

Taken together, these signals suggest Google is preparing the desktop app to host its full agentic stack rather than serve as a thin wrapper around the chat window. With I/O opening tomorrow, much of this should surface on stage!

Credits: Anonymous Contributor