惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

F
Full Disclosure
Recorded Future
Recorded Future
T
Tenable Blog
S
Securelist
C
CERT Recently Published Vulnerability Notes
T
Threatpost
S
Schneier on Security
A
Arctic Wolf
The Hacker News
The Hacker News
C
CXSECURITY Database RSS Feed - CXSecurity.com
Know Your Adversary
Know Your Adversary
P
Privacy International News Feed
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
The Register - Security
The Register - Security
Cisco Talos Blog
Cisco Talos Blog
AWS News Blog
AWS News Blog
K
Kaspersky official blog
T
True Tiger Recordings
T
Threat Research - Cisco Blogs
V
Vulnerabilities – Threatpost
P
Palo Alto Networks Blog
T
The Exploit Database - CXSecurity.com
小众软件
小众软件
B
Blog
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
Microsoft Azure Blog
Microsoft Azure Blog
Cyberwarzone
Cyberwarzone
C
Cybersecurity and Infrastructure Security Agency CISA
T
Tor Project blog
Spread Privacy
Spread Privacy
Malwarebytes
Malwarebytes
P
Proofpoint News Feed
F
Fox-IT International blog
F
Fortinet All Blogs
P
Privacy & Cybersecurity Law Blog
G
GRAHAM CLULEY
量子位
Latest news
Latest news
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
博客园 - 叶小钗
Project Zero
Project Zero
T
Tailwind CSS Blog
N
Netflix TechBlog - Medium
Martin Fowler
Martin Fowler
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
I
Intezer
博客园_首页
腾讯CDC
H
Hackread – Cybersecurity News, Data Breaches, AI and More
D
Darknet – Hacking Tools, Hacker News & Cyber Security

DEV Community

I love MJML — I just didn't want a whole templating engine for two tiny things Are we still in the Console Era of AI? Building a Senior-Level DevOps / SRE / Infrastructure Engineer Terminal Setup (macOS) Media Queries, Transitions, Positions, and Units (rem vs em) Explained Vibe Coding Will Destroy Your Software Engineering Career Your Payment API Wasn't Built for AI Agents. Open Banking Might Be the Fix. The Amazon Interview Process in 2026: Every Round Decoded (With Copy-Paste Scripts) Why Most Social Platforms Optimize Engagement Instead of Emotional Safety How to Build Your Own AI API Gateway (70x Cheaper Than GPT-4o) Announcing LightningChart JS Trader v.4.1 TensorCircuit-NG: Quantum Software On AI, For AI, With AI Open-Source Multi-Agent Orchestration: Lessons from AgentForge AI Agents in Practice — Part 3: How the Control Loop Actually Works Polymarket vs Kalshi: Who Actually Wins on Volume and Liquidity I Wired 8 MCP Servers Into One Claude Agent. 3 Pairs Quietly Fought Over the Same Tool Name. Twenty Minutes, Seventeen Organizations DNSControl + CoreDNS Container Example - Announcement Tech Talks Weekly #106 Umka Parental Control CI/CD for Side Projects: 3 Pragmatic Design Choices Why Agentic AI Is the Most Over-Hyped — and Under-Delivering — Trend of 2026 How teams can add a custom LineageLens adapter — a practical, code-free guide What Engineers Learn After Building Enterprise Chatbots That Actually Go Live The case for compiled, typed CSS (blame AI) Your Terraform estate documents itself now: meet iac-cartographer Vector‑native RAG on Oracle: embeddings, HNSW/IVF, and hybrid search under database governance I Stumbled Into a 40x Cost Reduction by Switching to Chinese AI Models China vs US AI Models in 2026: The Architecture Decision That Saves 40x Chinese AI Models Are 40x Cheaper Than GPT-4o — Here's the Proof ERC-8004 Agent Validation: Trustless Reputation for DeFi Bots Claude Managed Agents Outcomes: Auto-Grading Agent Work 5 URL Encoding Bugs That Silently Break Your App Which AI Tool Wins? Wrong Question. API Contract-Driven Development (Build Reliable Systems Without Guesswork) I built 'Ask Your Life' — a personal Coral agent that answers questions about your money & deadlines with SQL 5G RedCap for embedded IoT: useful 5G without full 5G complexity Building a Live Odds Dashboard in React (without the re-render storm) How to Build Token-Efficient Web Scraping Pipelines for AI Agents Using n8n PyLadies Dublin June Meetup The Dangerous Myth of the "10x Developer" (And Who You Actually Want) I Hardened a Rust Media Upload API with Magic Bytes, Atomic Quotas, and Race Condition Fixes (Part 3) The Moment We Realized the Language Was the Constraint in the Veltrix Treasure Hunt Engine ABAC and CASL with NestJS What If AI Fact-Checked Your Meetings in Real Time? Inside Meeting-Time AI Skills Don't Wrap the LLM. Make Its Failure Modes Unreachable. Building Autonomous DeFi Agents on Arbitrum: From Events to Execution The One Cache That Broke Our Treasure Hunt Engine Why your AI chat reconnects but your session doesn't Why I Built Tenurr: A Private Career Ledger and Document Vault for Engineers (And Solved "Career Amnesia") Rate Limiting in C# — Don't Let Your API Get Hammered I audited the 12 fastest-growing new GitHub repos for fake stars. Here's the base rate. I Stopped Treating AI Agents Like Toys After Hermes Agent Started Running My Entire Week SVG Keyframe Animation in Pure CSS (No Library) The Hidden Cost of Fake Invoices: $127,000 Lost Per Incident The Stream class in Dart Kubernetes HPA Scale to Zero Without KEDA: Native Autoscaling for Idle Workloads Building a Gaming Content Platform with Game Pages and News Articles Can Quantum Computing Change AI? A Deep Dive Into Quantum Machine Learning My PC setup as a Linux user Why Your Chart Library Is the Bottleneck You Never Suspected by Andrew Burnett-Thompson, CEO & Founder of SciChart i touched AWS and stuff didn't break (mostly) Using Google's New AI Command-Line Assistant: Antigravity CLI (agy) and YOLO's No-Confirmation Mode GCP: Upgrading a LINE Bot with Vertex AI ADK Tools for Smart Business Cards and Backup Search My Journey into Web3 Auditing Securing AI Generated Code: You Ship It, You Own It Optimizing Browser Fingerprint Spoofing and Session Validation in Automated Scrapers I Scanned a Vulnerable Kubernetes Cluster with 9 Engines — The AI Filter Caught Everything When the Treasure Hunt Engine ate my weekend How to Choose the Right AI Course in Mumbai Building an Interactive Tier List with Next.js — NTE Tier List Case Study Website Accessibility Audit: The Complete Guide (WCAG 2.2) GitHub has had 257 incidents in 12 months. Here's what that means for your CI pipelines The Moment We Realized the Default Config Was a Lie Grafana Pricing Teardown 2026 Infisical Pricing Teardown 2026 Langfuse Pricing Teardown 2026 Metabase Pricing Teardown 2026 n8n Pricing Teardown 2026 Novu Pricing Teardown 2026 Plane Pricing Teardown 2026 Temporal Pricing Teardown 2026 Python 101 a Comprehensive Guide ToolJet Pricing Teardown 2026 Dev.to is such a fantastic platform for developers, writers, and tech enthusiasts to share knowledge and learn from each other. I really appreciate how the community encourages creativity, collaboration, and continuous learning through insightful articles Twenty CRM Pricing Teardown 2026 Ever Wondered What Actually Happens When You Click “Send” on an Email? Automating MongoDB Auditlogs Cleanup & Restore Workflow with S3 Backup Best Java Web Scraping Libraries The padlock doesn't mean what you think it means I built a simple pytest plugin for test observability (need your help 😅) Laravel AI SDK Silently Kills Your Horizon Queue (And How to Fix It in 4 Config Changes) The Day We Hardcoded 42 in the Treasure Hunt Engine Today we are launching on Product Hunt! I built FreeLedger to end the freelance finance nightmare Fintech Devs May Get Fed Master Accounts Karpathy Joined Anthropic to Train Claude Using Claude Just released my new Flutter package: smart_player_kit The Day the Treasure Hunt Engine Decided to Lie to Us About Latency Django Session Cookie vs localStorage JWT Security Comparison The Day Our Treasure Hunt Engine Blew Up at 3 AM How I Built 8 Free Dev Tools as a Solo Maker — Lessons Learned
OpenBrief Review: Local-First Video AI Summarizer 2026
Andrew · 2026-05-27 · via DEV Community

Originally published on andrew.ooo — visit the original for any updates, code snippets that aged out, or follow-up posts.

TL;DR

OpenBrief is an open-source desktop app that does what most people actually want from a "video AI tool": paste a link, get a transcript, get a grounded summary, and chat with the content — without uploading anything to a SaaS. It hit 88 points on Show HN on May 26, 2026, and it's basically a polished GUI for yt-dlp + Whisper with an LLM layer wired in for summaries and Q&A.

Key facts:

  • Tauri v2 desktop app — macOS, Windows, and Linux from a single Rust/React codebase
  • Bundled yt-dlp — paste a YouTube/Vimeo/arbitrary URL, it downloads locally
  • On-device transcription — Whisper (via whisper.cpp and transcribe-rs) runs against the audio on your machine
  • Bring-your-own LLM — OpenAI GPT, Anthropic Claude, Google Gemini, or OpenRouter (DeepSeek) for summaries and chat
  • Grounded summaries — markdown briefs with timestamped takeaways tied back to transcript spans
  • Chat with media — Q&A against the transcript using the LLM of your choice
  • Text-to-speech — listen to the generated brief (Supertonic 3 / Qwen3-TTS on the roadmap)
  • AGPL-3.0 licensed, monorepo organized as a pnpm/Turborepo workspace
  • Honest limitation: LLM summaries are still cloud-by-default — local Gemma 4 / Qwen3-ASR support is on the roadmap but not shipped yet

If you've ever wanted a private alternative to NotebookLM or Read.ai for your own video backlog, OpenBrief is the most complete open-source attempt to date.

Why "Local-First" Video AI Suddenly Matters

For three years, the default flow for "summarize this video" has been: upload to a SaaS, get a summary, hope they don't train on it. Read.ai, Fireflies, Otter, NotebookLM — they all do good work, but every minute of audio you feed them is a minute on someone else's GPU.

For a lot of use cases that's fine. For others — legal depositions, internal all-hands recordings, anything under NDA, anything you don't want to count toward a per-minute pricing tier — it isn't. The market quietly noticed:

  • yt-dlp keeps gaining stars (currently 100K+) because people want their videos as files, not stream-only assets
  • whisper.cpp made on-device transcription genuinely usable on a MacBook
  • Local LLMs (Llama 3, Qwen 3, Gemma 3, DeepSeek V4) crossed the "good enough for summaries" line in 2025

What was missing was a polished desktop app that stitched these together with a sane UX. OpenBrief is the first Show HN this year that does it without feeling like a research demo.

What Makes OpenBrief Different

OpenBrief isn't doing any one thing nobody else has done. The interesting part is how it composes them.

1. Tauri v2 instead of Electron

Most "local AI" desktop apps reach for Electron because the team already knows React. OpenBrief uses Tauri v2, which means:

  • Rust backend, system webview frontend — installer is ~10MB instead of ~150MB
  • Native filesystem access for the helper sidecar (yt-dlp, ffmpeg) without IPC gymnastics
  • Lower memory footprint while a 90-minute Whisper transcription is running

The src-tauri/ directory exposes Rust commands that the React renderer calls. The helper sidecar — a separate binary that wraps yt-dlp and ffmpeg — is bundled at build time via pnpm setup:dev-sidecars. This is the right shape for a desktop tool that needs to shell out to native binaries; it avoids the "user has to install yt-dlp themselves" trap that kills adoption of a lot of similar projects.

2. Grounded summaries, not free-floating ones

A "grounded summary" in OpenBrief means each bullet in the generated brief is tied to a timestamp span in the original transcript. Click a takeaway → jump to the spot in the audio/video. This is the same idea NotebookLM popularized with its citation-numbered answers, and it matters for one reason: summaries hallucinate, transcripts don't. When the summary says "the speaker proposed a 40% price cut at 12:34," you can verify that's actually what they said.

The implementation is straightforward — the prompt template forces the LLM to emit JSON with { text, start_ts, end_ts } for each point, and the UI renders them as clickable chips. If you've built RAG over transcripts before, you've written approximately this code; OpenBrief just ships it as the default.

3. Pluggable model layer

The README's "Model Support" table is honest about what's shipped vs. roadmap:

Model type Supported today Roadmap
Speech-to-text Whisper Parakeet, Qwen3-ASR
Text-to-speech (placeholder) Supertonic 3, Qwen3-TTS
LLM OpenAI, Anthropic, Gemini, OpenRouter (DeepSeek) Local Gemma 4
Video embeddings Frame/clip semantic search

The architectural win is that all four slots are service abstractions, not hardcoded. You wire your key into settings, the rest of the app doesn't care. Adding a local LLM later is "implement the LLM service interface against Ollama" — not "rewrite the summary pipeline."

Install and First Run

OpenBrief is currently distributed as a source build — the Tauri team hasn't pushed signed installers yet. You need:

# Prerequisites
# - Node.js ^22.21.0
# - pnpm 11.0.9
# - Rust + Cargo
# - Tauri v2 platform prerequisites for your OS
#   (Xcode on macOS, MSVC on Windows, webkit2gtk on Linux)

git clone https://github.com/tantara/openbrief
cd openbrief/client
pnpm install

# If pnpm flags ignored native build scripts on first install:
pnpm approve-builds  # approve the listed packages
pnpm install         # rerun

# Build and run the desktop app
cd apps/tauri
pnpm setup:dev-sidecars     # builds the yt-dlp helper sidecar
pnpm prepare:media-assets   # downloads Whisper model + ffmpeg
pnpm dev                    # launches the desktop window

Enter fullscreen mode Exit fullscreen mode

First-run experience:

  1. App opens to an empty library.
  2. Paste a YouTube URL into the import bar.
  3. The helper sidecar shells out to yt-dlp, downloads the audio (or video), and stores it in your local app data directory.
  4. Whisper kicks off in a background worker. On an M2 MacBook Air, a 30-minute video transcribes in ~3 minutes against whisper-base. The bundled model size and the underlying whisper.cpp build determine real-world speed.
  5. Once the transcript exists, the Summarize button calls your configured LLM with the grounded-summary prompt.
  6. The right-hand pane shows the brief with clickable timestamps; the chat box at the bottom lets you ask follow-up questions against the transcript.

Configuring API keys

Settings → Models. Each provider gets its own key field:

  • OpenAI: sk-... — used for gpt-4o, gpt-4o-mini, or whatever model id you set
  • Anthropic: sk-ant-... — Claude models for summary + chat
  • Google: Gemini API key
  • OpenRouter: sk-or-... — routes to DeepSeek and 100+ others, cheap for bulk summarization

Keys are stored in the platform secure store (Keychain on macOS, Credential Manager on Windows, libsecret on Linux), not in plaintext config.

Real Use Cases

A few patterns that justify the install:

Conference talk backlog. I have ~40 saved YouTube talks I'll "watch later." OpenBrief into a library, summarize each, scan the briefs, watch the three that earn it. This is the use case the Show HN comments kept coming back to.

Podcast research. Drop a 90-minute episode in, ask: "Did they mention any specific tools or products?" The chat-with-transcript flow surfaces names you'd never catch on a normal listen.

Internal recordings. All-hands, customer interviews, sales calls — anything you can't or shouldn't upload to a SaaS. The transcript stays on disk, the summary is generated by your own API key (which you can route to a self-hosted LLM via OpenRouter or an OpenAI-compatible proxy).

Course / lecture notes. Long-form educational content compresses well into timestamped briefs. The grounded format makes it easy to re-watch the parts you actually need.

What the Show HN Thread Said

Sentiment in the HN thread was warm but pointed:

  • "Basically a GUI for yt-dlp with AI on top" — the maintainer's own characterization, and the thread mostly agreed that's a feature, not a bug. Wrapping a powerful CLI tool in a clean desktop UX has product value.
  • Local LLM gap — the most common request was support for Ollama / llama.cpp for the summary step. Right now BYO-key works fine but the "local-first" claim has an asterisk while the LLM call still goes to a cloud API by default.
  • Comparisons to NotebookLM — several commenters framed it as "the open-source NotebookLM I've been waiting for," which is roughly accurate for video/audio specifically. NotebookLM still has the edge on multi-source synthesis and audio overviews.
  • Why Tauri over Electron — a few people appreciated the small installer size; Tauri v2 is becoming the default pick for new local AI tools (Jan, LM Studio's recent versions, Bolt, OpenBrief).

Who Should Use This (and Who Shouldn't)

Use OpenBrief if:

  • You watch / produce a lot of long-form video or audio content
  • You want transcripts and summaries to stay on your machine
  • You already pay for at least one LLM API and want to extract more value from that key
  • You're comfortable with a source build for now (signed installers coming)

Skip it (or wait) if:

  • You want fully local end-to-end — local Gemma 4 / Qwen3-ASR are on the roadmap but not shipped
  • You need team collaboration features — this is a single-user desktop app today
  • You need real-time meeting transcription — OpenBrief is post-hoc, not live (use Whisper-Live or Fireflies for that)
  • You're on a low-spec machine — Whisper transcription is CPU/GPU-heavy; a 90-minute video takes meaningful time on older hardware

Comparison with Alternatives

Tool Local-first Open source Video download Grounded summary Chat Cost
OpenBrief ✅ (transcript) ✅ AGPL-3.0 ✅ yt-dlp Free + your LLM API
NotebookLM Free (Google account)
Read.ai $0.04/min uploads
Otter.ai $8.33/mo+
MacWhisper Limited $20–60 one-time
Whisper.cpp + custom Manual DIY DIY Free + dev time

The honest read: NotebookLM is still better for multi-source research with audio overviews. OpenBrief is better when privacy or per-minute cost is the constraint, or when you want one place for your personal video library.

Architecture: How a Single Import Actually Flows

For developers considering forking or contributing, here's the request lifecycle when you paste a URL:

[React UI: paste URL]
       │
       ▼  Tauri command: import_url
[Rust core] ──► [Helper sidecar: yt-dlp]
       │              │
       │              └─► writes .mp4/.m4a to app data dir
       │
       ▼  Tauri command: start_transcription
[Whisper worker (whisper.cpp via transcribe-rs)]
       │
       └─► writes transcript.json with { segments: [{start, end, text}] }
              │
              ▼  React reads transcript, renders timeline
              │
              ▼  User clicks "Summarize"
       [LLM service: OpenAI/Anthropic/Gemini/OpenRouter]
              │  prompt enforces grounded JSON schema
              ▼
       [Summary stored with timestamp links → rendered as clickable chips]

Enter fullscreen mode Exit fullscreen mode

Every step is replaceable. The LLMService interface in packages/api/ is a few dozen lines; bolting on Ollama is straightforward, and the GitHub Issues already have a PR draft from a contributor doing exactly that.

Limitations Worth Knowing About

  1. "Local-first" still routes summaries to cloud LLMs by default. Until local Gemma 4 ships, the transcript stays local but the summary prompt (which includes the transcript) goes to whichever API key you configured. If that's a dealbreaker, run a local OpenAI-compatible endpoint like LiteLLM or Ollama's OpenAI-compatible mode and point OpenBrief at http://localhost:11434.
  2. No signed installers yet. Source build only on May 2026. On macOS that means xcode-select --install is a prerequisite.
  3. Whisper model is fixed at install time. Switching from base to large-v3 for higher accuracy requires re-running pnpm prepare:media-assets.
  4. AGPL-3.0 — fine for personal use and internal tools; if you want to fork it into a SaaS, the AGPL terms apply to network use. Read them before commercial deployment.
  5. TTS is a placeholder. The "listen back" feature in the README requires Supertonic 3 / Qwen3-TTS support that isn't merged yet.

FAQ

Is OpenBrief free?

Yes — the app itself is open source under AGPL-3.0 and there's no subscription. You pay for the LLM API calls (your OpenAI / Anthropic / Gemini / OpenRouter key), which run a few cents per long video at current gpt-4o-mini or DeepSeek pricing.

Does OpenBrief work fully offline?

Transcription does — Whisper runs on your machine. Summaries and chat don't today, because they call your configured cloud LLM. Local LLM support (Gemma 4) is on the roadmap. As a workaround, point the OpenAI provider field at an Ollama or LiteLLM endpoint running on localhost.

How does OpenBrief compare to NotebookLM?

NotebookLM is stronger for multi-source synthesis (PDF + video + notes in one notebook) and ships an audio-overview generator that OpenBrief doesn't have yet. OpenBrief is stronger for privacy (everything stays on disk except the LLM call, which goes through your key), format flexibility (any video URL yt-dlp supports), and library management (it's built around a long-running personal collection rather than per-notebook sessions).

Can I use OpenBrief with a self-hosted LLM?

Yes — any OpenAI-compatible endpoint works. Configure the OpenAI provider with your local URL (e.g., http://localhost:11434/v1 for Ollama or LiteLLM) and a placeholder API key. End-to-end local pipelines work today with this trick; the official "local LLM" UI option is still on the roadmap.

What videos can it download?

Anything yt-dlp supports — that's YouTube, Vimeo, Twitch VODs, X/Twitter videos, podcasts hosted on most major platforms, and ~1,800 sites total. You can also import local .mp4, .m4a, .mp3, .wav files directly without downloading.

How fast is Whisper transcription?

Depends on the model and hardware. On an M2 MacBook Air with whisper-base, expect ~10x realtime (a 30-minute video transcribes in ~3 minutes). Larger models (large-v3) are slower but more accurate; M-series with Metal acceleration is the sweet spot. CPU-only on older Intel hardware can be 1–2x realtime.

Is there a Linux or Windows build?

The repo says all three platforms are supported, but since signed installers aren't published yet, Linux and Windows users build from source the same way macOS users do. The Tauri v2 prerequisites differ per platform — check the Tauri docs for your distro.

Bottom Line

OpenBrief is the first respectable open-source attempt at a personal video AI library — the kind of tool that sits between yt-dlp (too low-level) and NotebookLM (too cloud). It's not finished — local LLM support and signed installers are the two big gaps — but the architecture is right, the maintainer is active in the Show HN thread, and the AGPL license keeps it honest.

If you want to try it: clone the repo, run pnpm dev:tauri, and feed it one of your "watch later" videos. If the grounded-summary UX clicks, you'll probably end up importing your whole backlog within a day.

Repo: github.com/tantara/openbrief
Show HN: news.ycombinator.com/item?id=48272393