惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
G
GRAHAM CLULEY
P
Privacy & Cybersecurity Law Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
宝玉的分享
宝玉的分享
P
Proofpoint News Feed
H
Help Net Security
V
Visual Studio Blog
阮一峰的网络日志
阮一峰的网络日志
C
Cisco Blogs
人人都是产品经理
人人都是产品经理
Know Your Adversary
Know Your Adversary
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
Recorded Future
Recorded Future
I
Intezer
罗磊的独立博客
T
The Exploit Database - CXSecurity.com
Blog — PlanetScale
Blog — PlanetScale
Malwarebytes
Malwarebytes
Spread Privacy
Spread Privacy
T
Tor Project blog
V
Vulnerabilities – Threatpost
云风的 BLOG
云风的 BLOG
腾讯CDC
B
Blog RSS Feed
Stack Overflow Blog
Stack Overflow Blog
F
Future of Privacy Forum
MyScale Blog
MyScale Blog
Latest news
Latest news
IT之家
IT之家
MongoDB | Blog
MongoDB | Blog
The Hacker News
The Hacker News
S
Securelist
博客园 - 【当耐特】
C
CXSECURITY Database RSS Feed - CXSecurity.com
T
Threat Research - Cisco Blogs
Jina AI
Jina AI
Cisco Talos Blog
Cisco Talos Blog
B
Blog
博客园 - 三生石上(FineUI控件)
Last Week in AI
Last Week in AI
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
M
MIT News - Artificial intelligence
V
V2EX
D
Darknet – Hacking Tools, Hacker News & Cyber Security
The Cloudflare Blog
The GitHub Blog
The GitHub Blog
博客园 - 聂微东
F
Full Disclosure
C
CERT Recently Published Vulnerability Notes

DEV Community

Microservices Didn't Fail. People Did 400+ Remote Companies Using React in 2026 Gizmo Guard - Safeguard Bot (Powered by Gemma4) Gizmo Guard - Safeguard Bot (Powered by Gemma4) Grafana 'No Data' after migration: 7 reconcilers we had to kill first CrimsonOS: Building a Mobile OS from the Firmware Up I’ve Been Building a Python Game Engine Counting tokens is dumb. So we built a free metric for AI proficiency. Best Free AI Tools for Developers in 2026 Selling Without Stripe in a Country That Stripe Can't Reach: When Compliance Becomes a Technical Problem How I built a fallback loop to save my recommendation engine Solana's Biggest Consensus Overhaul Is Live for Testing. Here's What Builders Need to do right now. Your agent keeps using that word ... OpenSparrow v2.3 – visual admin panel, zero dependencies, now with ERD and M2M support Why AI Engineering Is Becoming More Like Distributed Systems Engineering How I Cut My LLM Costs by 90% Without Changing My App Logic Security Is Important. Automate It I killed my SaaS after 17 days and rebuilt it into something else GitHub Actions for HIPAA-compliant deployments How to Stop Your LLM Agent From Looping Itself Into Oblivion Apache Kafka for Beginners: Building Real-Time Streaming Systems with Python Dating the Crawler AI-Assisted Frontend Reviews Using Gemma 4 Building Secure Multi-Agent Systems: My Takeaways from Google I/O 2026 The Most Underrated Announcement from Google I/O 2026 Was Buried in a 90-Second Demo How to Fix CUDA Out of Memory Errors in Stable Diffusion WebUI My Experience Building My First Token And Having it Exist On-Chain. African Creators Deserve Better: How I Built a Payment Gateway for Every Corner of the Continent React CRUD basics Should Websites Allow AI Search Crawlers? Chunking Strategies for AI Code Review on Large Repos Beyond the Prompt: How to Build Stateful AI Agents with Persistent Memory and Self-Learning Loops What 10 University Visits in Cameroon Taught Me About Building AI for the Real World, and Why Gemma 4 Was the Answer The Universal Remote for AI: A Deep Dive into the Model Context Protocol (MCP) AgentGuard 0.3.0 — macOS menu bar app, Telegram rollback, and more Antigravity CLI: A Hands-On Guide to Google's Terminal Coding Agent Shopify Functions vs Shopify Scripts: A Migration Walkthrough What Actually Survives a Chicago-Area Winter on Your Deck Rethinking Geo-Blocking and Stripe's Failures in Global Access: A Cautionary Tale of Misoptimization I Built a Free Brat Generator - Here's What I Learned About Next.js Performance published Found a Second Layer to a GitHub Follow Botnet? AI Daily Digest: May 22, 2026 — Agentic Workflows, Coding Agents & Embodied AI How I Secured Internal Microservice Calls Without Passing JWTs Stop Mixing Them Up: SLI vs SLO vs SLA Explained Rebuilding My Engineering Mind Building a Music Production Ecosystem Instead of Just Releasing Plugins The Vonage Dev Discussion: How AI is transforming software development I Gave Our Enterprise AI a Memory. It Started Citing Last Quarter's Incidents. 𝐓𝐡𝐞 𝐂𝐨𝐦𝐦𝐮𝐧𝐢𝐜𝐚𝐭𝐢𝐨𝐧 𝐒𝐭𝐲𝐥𝐞 𝐂𝐫𝐢𝐬𝐢𝐬 Hermes Agent in the Wild: How I Turned It Into an AI Ops Employee Navigating the Hazy Jungle of Global E-commerce: How We Built a Reliable System for Digital Creators in Tanzania The Cost of Cross-Platform Development: Native Module Integration AI-Native Apps Will Swallow the Web I switched my Gemma 4 model three times in 72 hours. Here's the decision tree I wish I'd had. Inside #100DaysofSolana: A Guided Path into Web3 I Built and Shipped TinyHab: an ADHD-Friendly Habit Tracker for iOS I'm an ECE Student Who Vibe Codes Hardware Projects — Here's What Google I/O 2026 Actually Changed for Me From Fragmented Pipelines to Coherent Intelligence — Why Gemma 4 Actually Changes How I Work Our AI Inference Bill Dropped 65% After We Stopped Treating Every Query the Same Why P95 Latency Is the Only Metric That Matters at 3 AM Recycling made easy: a Polish recycling assistant powered by Gemma 4 The Complete Guide to Running a Midnight Node: Setup, Sync & Monitoring De CSRF a RCE: una visita web cuesta una shell en OpenYak Why We Built a Faster Wiki Building a Browser-Based Inkarnate Alternative for D&D Battle Maps Apache Kafka How to Build a FinTech Platform as a Solo Developer (By Any Means Necessary) Your LLM Logs Deserve Better — Send Claude Code Events to Bronto I built a free tool to track subscriptions and stop getting surprised by charges Building the TEYZIX CORE Internship Portal — My Full-Stack Development Journey PocketCFO: a private personal-finance brain that runs entirely in your browser Go Idioms I Wish I Knew Earlier Hey how are you guys I'm newbie web developer , learning wordpress+elementor Right now I don't know what to make I don't know what to write or use what color can you tell me about it ? Google I/O 2026 Blew My Mind — Here's What It Means for the Family App I'm Building 5 Things I Learned in My First Month as a Dev Intern EU AI Sovereignty Belongs in the Workflow Layer Why AI Coding Agents Need Business Context, Not Just Code Context How I Built 9 Claude AI Features into a Production SaaS Expo SDK 56 HashiCorp built an MCP server for writing Terraform. I built one for reviewing it Why Enterprise AI Agent Deployments Keep Failing Date Shear: A New Term for a Common Programming Pain Point Compass v1.1.0 · we shipped a memory plugin that catches its own consumption drift Zod Validation: Type-Safe APIs & Forms in TypeScript (Complete Guide) GitHub Actions CI/CD: Build a Complete Node.js Pipeline (2026) MCP in 2026: The numbers behind the ecosystem explosion working with an ai model mirror Learnt new things Four Metrics That Actually Tell You Whether Your Enterprise RAG Is Working Beyond the Stateless Prompt: Building an Auditable Product Intelligence Pipeline with Cascadeflow and Hindsight Most Creators Are Building in Pieces. I’m Building the Entire System. The Hidden Privacy Problem in Every AI App CVE-2026-26007: Subgroup Confinement Attack in pyca/cryptography The One Thing I See in Every Developer Who Gets Unstuck AI Memory Governance for Legal Tech: How Contract AI Agents Handle Privileged Data Two tables, zero migrations, full LINQ — a .NET data engine that's been running our production for 3 months Join the GitHub Finish-Up-A-Thon Challenge: $3,000 Prize Pool! I Replaced a $50/Month OCR API with Gemma 4’s Native Vision (And You Can Too) Building a Data-Driven Medical Image Enhancement Pipeline with Differential Evolution 🔥🩻 Why I Like Small Software
I Replaced ChatGPT With Gemma 4 In My Product. It Felt Like The Same Radio Show With A Different Host.
Kirill · 2026-05-22 · via DEV Community

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

Most “read later” links quietly die in browser tabs. At some point I realized I wasn’t actually trying to consume more content anymore. I was trying to reduce the cost of deciding what deserved my attention in the first place.

That realization eventually turned into TLDR Radio — a Telegram bot that converts long-form articles and discussion threads into short audio briefings you can consume while walking, commuting, cooking, or doing literally anything except staring at another glowing rectangle.

But while building it, I accidentally discovered something much more interesting than “AI summaries”. I swapped the underlying model, and almost nothing important broke. That realization stayed in my head much longer than any benchmark chart.

What I Built

TLDR Radio is an audio-first article triage system. You send a link. The bot:

  • fetches the article
  • extracts readable content
  • optionally pulls discussion context
  • generates a structured summary
  • converts it into audio
  • sends the result back through Telegram

The original problem was surprisingly simple. My browser had basically become a graveyard of tabs I was never going to read anyway. And the real issue wasn’t lack of time. It was decision fatigue. Choosing what deserved attention started feeling more exhausting than the actual reading itself. So I stopped treating summarization as “compression”. I started treating it as attention routing.

The core UX decision was intentional from the beginning: I did not want another AI chat interface. I wanted passive consumption.

The product only started feeling genuinely useful once I could:

  • listen while walking
  • listen while driving
  • listen while cooking
  • listen while cleaning
  • stay away from the screen entirely

That constraint ended up influencing almost every architectural decision. The system itself looks less like a chatbot and more like a media-processing pipeline.

High-level flow:
High-level flow


Demo

Landing: https://tldr-radio.humifylab.com
Telegram Bot: https://t.me/TldrRadioBot

How to use: send one or two links and get a short audio summary.

[UX demo: from link to detailed audio summary]

Conversation in Telegram, audio list and lock screen[Conversation in Telegram, audio list and lock screen]

Each audio summary has a message with a caption, tags, a few first sentences of the summary, and sources. You can see the difference between Gemma and ChatGPT by comparing those messages yourself. For the rest of the article, Gemma is on the left.

Gemma on the left and ChatGPT on the right[Gemma on the left and ChatGPT on the right]

One thing I really like is pulling in discussion context from places like Hacker News and Reddit. An article is just one perspective. The comment threads usually surface the real signal way faster than the article itself. There's also an option to go deeper and get a more detailed summary, which works really well for long HN threads.

Gemma on the left and ChatGPT on the right[Gemma on the left and ChatGPT on the right]


Code

One thing I wanted very deliberately was separating:

  • webhook latency
  • durable job execution
  • asynchronous processing
  • execution snapshots

The architecture is heavily queue-oriented. The webhook itself stays lightweight and returns quickly. Long-running work happens asynchronously in workers.

Architecture diagram[Architecture diagram]

The stack currently includes:

  • ASP.NET Core Minimal API
  • PostgreSQL
  • OpenTelemetry
  • LLMs providers
  • Telegram Bot API
  • TTS providers
  • Fly.io deployment

The LLM is only one component inside the pipeline, not the entire product.

One small feature to mention is procedural-generated images as covers. For each summary mp3 ID3 tags are written, including "Album" cover. How do you like these?

Procedural-generated images as covers[Procedural-generated images as covers]

The actual TLDR Radio repository is currently private. But during development I extracted part of the infrastructure into an open-source production-oriented Telegram bot starter for .NET:

https://github.com/lemesevkirill/telegram-bot-starter-dotnet

It contains the asynchronous webhook/worker architecture that heavily influenced TLDR Radio itself.


How I Used Gemma 4

Originally, TLDR Radio used ChatGPT-based summarization. That felt like the obvious choice. Then the Gemma 4 challenge appeared and I started wondering: What actually happens if I swap the model without changing anything else?

For the core reasoning engine of TLDR Radio, I selected the Gemma 4 31B Instruct model, deploying it via OpenRouter's free tier. Within the Gemma 4 ecosystem, developers often choose between the high-throughput Mixture-of-Experts (MoE) models (like the 26B variant) and dense architectures. I intentionally chose the 31B Dense model for a specific architectural reason: script-writing and role-preservation.

While MoE models are incredibly cost-efficient because they activate fewer parameters per token, dense models utilize their entire parameter weight (all 30.7B parameters) for every single token generated. For an audio-first product like TLDR Radio, this full-scale dense processing is critical. It delivers more cohesive narrative structures, better flow, and firmly holds the "radio host personality" across complex, multi-layered summaries without breaking character.

Using OpenRouter allowed me to plug this 31B dense powerhouse into my .NET pipeline instantly, gaining a massive 256K context window and native multilingual support without managing complex local infrastructure.

Honestly, I expected the quality to collapse. That’s not what happened. This became the most interesting part of the entire experiment.

Gemma on the left and ChatGPT on the right[Gemma on the left and ChatGPT on the right]

I intentionally kept:

  • the same prompts
  • the same orchestration
  • the same summary structure
  • the same Telegram UX
  • the same audio generation flow

The only thing that changed was the model. And the result did not feel like smart AI vs dumb AI or high-quality vs low-quality. It felt more like swapping podcast hosts.

Gemma on the left and ChatGPT on the right[Gemma on the left and ChatGPT on the right]

ChatGPT often sounded patient and explanatory. Gemma frequently sounded denser and more compressed, almost like:

“here’s the essence, let’s move”

The factual quality was often surprisingly close for this workflow.
What changed more noticeably was:

  • pacing
  • sentence density
  • narration rhythm
  • listening feel
  • emotional texture

That was the moment where the whole thing stopped feeling like “model evaluation”. It started feeling more like media production for the same show with different hosts. And that realization stayed in my head much longer than expected. Because I originally assumed TLDR Radio was basically a model experiment. Smarter model equals better product. Simple. Then I started swapping models and something uncomfortable happened: The model quietly stopped feeling like the whole product.

Gemma on the left and ChatGPT on the right[Gemma on the left and ChatGPT on the right]


Real-World Observations

One thing that became obvious very quickly: Operational reliability matters enormously in audio products.

The free Gemma endpoints through OpenRouter were heavily throttled during peak usage. The paid endpoint was dramatically more stable. Which mirrors a broader AI product lesson: Raw intelligence matters less if the operational experience becomes unreliable.

As long as the endpoint is stable, Gemma is totally fine on the pipeline side. You can do everything with Gemma that you do with ChatGPT - latency, limits, context, all the technical details work.

Another interesting observation: I expected prompt portability to be much worse. Instead, both models handled the orchestration surprisingly well. That made the models feel far more interchangeable than I originally expected. Multilingual behavior also changed the feel of the product in interesting ways. Not just translation quality. Personality. Different combinations of model / language / TTS provider started producing noticeably different listening experiences.

Again: less like swapping engines, more like swapping hosts.


Final Thought

Building TLDR Radio changed how I think about AI products. I expected swapping models to feel like replacing the engine. Instead, it felt more like replacing the host of the same radio show.

Gemma didn’t replace GPT in this project. It changed the pacing, tone, and listening feel of the experience. And that turned out to be much more interesting than a benchmark comparison.

The biggest surprise wasn’t realizing that open models got good. It was realizing how quickly the model itself stopped feeling like the whole product.

While building TLDR Radio, I ended up thinking about something larger: What happens when intelligence itself becomes infrastructure? I wrote a more philosophical version of that realization here:

https://futurehangover.substack.com/p/nobody-cares-about-your-frontier

And if you want to try the bot itself:

https://t.me/TldrRadioBot