[Promotion] Google announced Gemini Omni Flash at I/O: my experience with conversational video editing
WickedZX · 2026-05-24 · via Promotion

I recently tried out Gemini Omni Flash, which Google released at I/O 2026. Here are my thoughts.

The biggest difference with this model is that you can edit videos through conversation. After generating a clip, you can simply say "change the background to a beach," "slow down the footage," or "add a person on the right," and it modifies only the part you specified while keeping the rest intact. Unlike with Sora, you don't have to regenerate the entire clip each time.

Key points:

- Supports multimodal input: text + images + audio + video can be fed in together
- Outputs 10-second clips with synchronized audio
- Free to use in YouTube Shorts; the Gemini app requires AI Plus ($7.99/month)
- The developer API is not available yet, with a release expected "within a few weeks"
- All outputs carry a mandatory SynthID watermark

Compared to Sora 2: Sora has better character consistency and can generate 25-second clips; Omni Flash is stronger in multimodal input and conversational editing, with much lower iteration costs.

It also has limitations: a 10-second cap on clip length, no audio editing (to prevent deepfakes), inaccurate text rendering, and complex motion scenes that occasionally fall apart.

If you want to try video generation quickly, you can check out [gemini omni](https://www.veol.ai?utm_source=v2ex), which supports up to 4K output and uses pay-per-use pricing starting at $0.15.

Have any fellow V2EX members tried it? I think conversational editing is the right direction, but the 10-second limit is indeed a bit short.