惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Google DeepMind News
Google DeepMind News
F
Fortinet All Blogs
阮一峰的网络日志
阮一峰的网络日志
Apple Machine Learning Research
Apple Machine Learning Research
爱范儿
爱范儿
WordPress大学
WordPress大学
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
J
Java Code Geeks
罗磊的独立博客
S
SegmentFault 最新的问题
V
V2EX
V
Visual Studio Blog
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
美团技术团队
博客园 - 三生石上(FineUI控件)
Stack Overflow Blog
Stack Overflow Blog
Y
Y Combinator Blog
MyScale Blog
MyScale Blog
D
Docker
Google DeepMind News
Google DeepMind News
Blog — PlanetScale
Blog — PlanetScale
M
Microsoft Research Blog - Microsoft Research
Martin Fowler
Martin Fowler
S
Secure Thoughts
B
Blog
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
www.infosecurity-magazine.com
www.infosecurity-magazine.com
Recent Announcements
Recent Announcements
MongoDB | Blog
MongoDB | Blog
C
Cisco Blogs
C
CERT Recently Published Vulnerability Notes
T
True Tiger Recordings
GbyAI
GbyAI
P
Proofpoint News Feed
P
Privacy International News Feed
Jina AI
Jina AI
The Cloudflare Blog
I
Intezer
AWS News Blog
AWS News Blog
Hacker News - Newest:
Hacker News - Newest: "LLM"
S
Security Archives - TechRepublic
NISL@THU
NISL@THU
The Register - Security
The Register - Security
Recent Commits to openclaw:main
Recent Commits to openclaw:main
P
Palo Alto Networks Blog
S
Schneier on Security
L
LINUX DO - 热门话题
C
CXSECURITY Database RSS Feed - CXSecurity.com
Security Latest
Security Latest
C
Cybersecurity and Infrastructure Security Agency CISA

DEV Community

When the Cleanup Code Becomes the Project Rockpack 8.0 - A React Scaffolder Built for the Age of AI-Assisted Development Mismanaging the Treasure Hunt Engine in Hytale Servers Will Get You Killed Why Hardcoded Automations Fail AI Agents Stop Calling It an AI Assistant. It’s Already Managing Your Company Why I built a post-quantum signing API (and why JWT is on borrowed time) Weekend Thought: Frontend Build Tools Suffer From Work Amnesia A 10-Line Playwright Trick That Saved Me Hours on Every Sephora Run AI Is Changing Engineering Culture More Than We Realize Everyone Was Focused on Gemini, But Infinite Scaler Was the Real Twister "Gemma 4 Analyzed My Bank Statements – Apparently I 'Have a Problem' with Coffee and Late-Night Apps" #css #webdev #beginners #codenewbie The Hidden Layer Every AI Developer Must Learn AlphaEvolve: Google DeepMind's Gemini-Powered Evolutionary Coding Agent RDS Reserved Instance Pricing: Every Engine, Every Rule, Real Dollar Savings How To Build An AI-Powered MVP Without Burning Your Startup Budget In 2026 Reading a Psychrometric Chart Without Getting Lost LMR-BENCH: Can LLM Agents Reproduce NLP Research Code? (EMNLP 2025) How to turn text into colors (without AI) Building Real-Time Apps in Node.js with Rivalis: WebSockets, Rooms, Actors, and a Binary Wire This Week In React #282 : Security, Fate, TanStack, Redux, Jotai | Hermes-node, Expo, Rozenite, Harness | TC39, Bun, pnpm, npm, Yarn, Node AI Copilot vs AI Agent Architecture - What's Actually Different (And Why It Matters) Smart Contract Security: NEAR's Futures Surge and AI Token Risks Database Maintenance: Tracing Production Incidents to Their Root Cause Stop juggling AI SDKs in PHP — meet Prisma Google Quietly Changed What “Apps” Mean at I/O 2026 The Infrastructure Team Is the Real Single Point of Failure Building SQLite from Scratch: 740 Lines of C++23 to Understand Every Byte of a .db File The 4 Levels of Hermes Agent Scaling Framework: From One Hermes Agent to a Fully Automated Team Your AI Has a Memory. It Just Doesn’t Know What to Remember. Claprec: Engineering Tradeoffs - Limited time vs. Perfection (6/6) Building a Daily Google News API Monitor in Python Building RookDuel Avikal: From Chess Steganography to Post-Quantum Archival Security Google I/O e IA: o que realmente muda na vida do dev? Color Contrast Failures: The Number One Accessibility Issue and How to Fix It # I Watched 15 Hours of Hermes Agent Videos So You Don't Have To Cómo solucionar el bucle infinito en useEffect con objetos y arrays en React The First Agent-Centric Cloud Security Platform — And Why We Didn't Build It That Way On Purpose Most Treasure Hunts Engines on Hytale Servers Are Built to Fail - Lessons from a Burned Database GhostScan v3.0 — From Closed-Source EXE to Open-Source Pentest Framework De hojas de cálculo a IA: construyendo una plataforma SRM moderna When is AI fine in education? Python Tools for Managing API Rate Limits in Data Pipelines How to Implement Exponential Backoff for Rate-Limited APIs in Python "My Web Chat Wasn't a Real Channel. That Broke My Agent Pipeline" next-advanced-sitemap v1.0.7 — safer URL ingestion & automatic trimming for Next.js sitemap generation I keep seeing people build an AI lead processing agent when they really need a 6-step rules engine AI Powered Student Learning Assistant Using Gemma 4 How I Built a Drop-In Proxy to Slash My OpenAI Bills by 20%+ Automatically Building a Sarcastic AI English Tutor with Persona-as-Code and Gemini Audio Input for Pronunciation Correction Five Years Later, I Finally Have 96GB VRAM — What It Actually Unlocks for Agent Loops Turning a 1-Line Idea Into a 40-Second Short with a 10-Beat Local Video Pipeline Running LTX-2.3 Alongside TTS on a Single 96GB GPU with a Cold-Start Architecture Cutting LTX-2 22B Peak VRAM by 40% with fp8_cast — and Why optimum-quanto Was a Trap HiDream Skeleton Mode: Prompt Beats OpenPose Ref — 8 Patterns Benchmarked Replicating a Language-Learning Comedy Short with Claude Code — Gemini as a Multimodal Sub-Agent HiDream-O1-Image 3–8x Faster: Benchmarking Steps, CFG, and Resolution AWS Savings Plan Buying Strategy: How to Layer, Size, and Time Commitments application.properties I built a macro tracker powered by AI + attitude Solace: A Global Mental Health First Responder Built with Gemma 4 Why Blocking Prompt Injection Is Wrong — and What to Do Instead The AI code tools Dutch developers actually use in 2026 (field notes) Automatic Error Recovery in AI Agent Networks You Are Not Choosing Building a Cinematic Adaptive Learning Intelligence with Gemma 4, Gemini, and OpenAI(Powered by Gemma 4) CLAUDE.md for Angular: 13 Rules That Make AI Write Idiomatic, Production-Ready Components I tested 7 vector databases for my RAG stack in 2026, here's the one nobody is talking about (yet) Claude agreed with a false fact I gave it. Confidently. That broke my workflow Google's "Budget" Model Just Beat Its Own Flagship. Here's What That Actually Means for Developers. How I built a monitoring SaaS for Joomla, WordPress & PrestaShop agencies Shifting from Passive Dashboards to Automated Remediation: A Guide to Next-Generation FinOps and CloudZero Alternatives Automating CSV WooCommerce Imports Without Plugins Why Wobbly Plugs and Overheating Outlets Are More Dangerous Than You Think (UL 498 Explained) Building an AI Model Evaluation Pipeline on AWS for Audio Content Generation Your Side Project Is Not a Business Neurodiversity and the two layers of cognition GitHub Internal Repositories Breached: Source Code and Internal Data Allegedly Exfiltrated in 2026 Supply Chain Attack Stop drowning in files: auto-organize your Google Drive with n8n (free workflow JSON) Secure Firmware Updates with a Secure Element: Building Trust Into the Bootloader I Thought Domain-Driven Design Was a Waste of Time. I Was Wrong. AI Content Is Getting Tagged Like Livestock — And That's Actually Good ESP32 Into a Speech-to-Text Device Why Simple Audio Transcription Fails in Healthcare: The Need for Clinical Reasoning Engines The 114KB Span Attribute That Hid Our LCP Data How to Scale AI Development Beyond Prototype Speed Agent Execution Environments: Cloud Sandbox vs Local GUI vs Hybrid AI code review checklist that actually catches problems What’s the best tech stack for AI app development? Arc 1 Recap: Keypairs, Wallets, and Solana Fundamentals How Wearables Are Changing Human Decision-Making (Without Us Realizing It) The Perils of Premature Optimisation in Distributed Treasure Hunts Why Engineers Wear Hoodies While Social Media Sells Perfection Stop Treating setTimeout(fn, 0) Like Magic Save any webhook data to a database automatically with n8n — free workflow JSON Translating an entire multilingual site shouldn't mean re-prompting an LLM for every file I built a Vite plugin that uses AI to author Playwright tests, then gets out of the way Project: Restaurant Delivery CRUD Three weeks after I said CLAUDE.md writes itself, it added 4 more rules without me Trois semaines après avoir dit que mon CLAUDE.md s'écrivait tout seul, il a ajouté 4 règles sans moi
From a Phone in a "Cave" to Global Open Source: Why Google’s Gemma Models are a Lifeline for Budget Developers
Mohammed Tha · 2026-05-22 · via DEV Community

This is a submission for the Gemma 4 Challenge: Write About Gemma 4

Before you dive into reading this blog, I want to share one thing with you straight from the heart. I didn’t just write this blog directly using an LLM or any AI tool. I sat down and drafted every single bit of this story in my notepad first. Then, I put it into AI and said,
Hey, look, don’t add extra artificial content. Just help me organize my thoughts and put this info correctly so I can share my real experience.

What's Covered in This Blog

The Superpower of True Offline Coding

A lot of people ask me, Why do you care so much about running models offline when the cloud exists?

Because I remember my school days. I didn’t have a flashy, high-end laptop. I had a smartphone. I used Termux, Acode, and Anwriter to write code directly on my tiny screen. I still remember the absolute thrill of building my very first Tic-Tac-Toe game using pure HTML, CSS, and JavaScript, entirely offline.

Back then, the absolute biggest roadblock to studying and building things was documentation. If you wanted to learn or look up an error code, you needed an internet network to scroll through endless pages. If your network failed, your learning stopped.

But when you code or study completely offline with an AI tutor, a superpower unlocks: The noise disappears, and the barriers vanish.
Here is exactly what my brain looks like when the Wi-Fi is on versus when I cut the cord:

Image generated using Google Gemini

Now, that entire network problem is completely solved. Anyone can build anything and learn easily without spending a single penny on expensive internet plans or premium data subscriptions.

Historically, the problem with offline AI was hardware. Running a capable LLM required an expensive machine with massive amounts of VRAM. If you didn’t have the cash for a high-end gaming GPU, you were locked out.

Google just shattered that barrier. By releasing the Gemma model family under an open, commercially permissive Apache 2.0 license, they didn't just give us a powerful model; they gave every single student around the globe access to frontier-level AI on regular, everyday devices. When I spin up a Gemma model to help me generate Go code on an older, struggling laptop and watch it handle the logic flawlessly, I genuinely feel like Tony Stark.

"TONY STARK BUILT THIS IN A CAVE! WITH A BOX OF SCRAPS!" > —— Me screaming at my old laptop when the local Go code compiles perfectly. 😂

What Makes Gemma So Mind Blowing?

Google engineered these lightweight, open-weights models specifically to bring massive reasoning capabilities straight to accessible hardware.

In plain English? It means the model doesn't choke your device's RAM. Instead of needing a massive corporate data center to process complex logic, it uses ultra-efficient token processing and smart memory layouts. This drastically shrinks the hardware footprint, allowing you to feed it prompts without crashing your device or causing your phone to overheat like a hot potato.

Whether you are running a lightweight version on a smartphone or a larger variant on a laptop, you are getting incredible coding and debugging help completely locally.

Showcasing the Setup: Gemma Running Natively on My Phone!

To show you that this isn't just theory I actually live this setup. Check out this video of my actual screen while using it:

Seeing text stream into a mobile terminal screen like that when you are completely disconnected from the outside world is an unmatched feeling. It makes you realize that the barriers to education and software engineering are completely tearing down.

Just to be transparent with you guys on the dates: what you are seeing in that video above is my older mobile setup running the Gemma 2B model. I originally took this screen recording on May 16, 2026, after using the model heavily in Termux, and I just uploaded the clip to YouTube on May 20, 2026, to share it here. I wanted to show you this video because it proves just how smooth and massive the performance is even on a small phone. It makes you think—if a lightweight 2B model can do all this, how crazy is the new Gemma 4 going to be?

Recreating the Magic: Step-by-Step Native Mobile Setup

Want to turn your phone into an offline powerhouse running a model like Gemma 2B? Here are the actual commands to set up Termux and compile llama.cpp directly on Android:

Note: These images reflect the exact workflow from my mobile device. If the upstream repository receives new updates down the line, simply check their latest branch logs and run your build!

Step 1: Install Required Packages & System Headers

pkg update && pkg upgrade -y
pkg install -y git cmake clang make python ndk-sysroot wget

Enter fullscreen mode Exit fullscreen mode

This installs all the essential tools required to compile llama.cpp natively inside Termux.

Step 2: Hitting the Nasty spawn.h Error

While compiling, I encountered a spawn.h error on Termux during the build process.

To fix this issue, I rolled back to a stable build tag and rebuilt the project.

# Roll back to a stable release build tag to bypass the spawn.h error
git checkout b4833

# Clear the old broken build artifacts
rm -rf build

# Reconfigure and trigger the compilation process using 4 threads
cmake -B build
cmake --build build -j4

Enter fullscreen mode Exit fullscreen mode

This successfully compiled the project without errors on Android.

Step 3: Download and Run Your Offline Assistant

# Create a models directory inside llama.cpp
mkdir -p models
cd models

Enter fullscreen mode Exit fullscreen mode

Download your preferred GGUF model and place it inside the models folder.

Example model used:
gemma-2-2b-it-Q4_K_M.gguf

Step 4: Run the Model

./build/bin/llama-server -m models/gemma-2-2b-it-Q4_K_M.gguf -c 512 -t 2 -ngl 0

Enter fullscreen mode Exit fullscreen mode

  • m → Path to the model
  • c 512 → Context size
  • t 2 → Number of CPU threads
  • ngl 0 → Disable GPU layers (recommended for mobile)

Once launched, the model runs completely offline on your Android device.
Below is a screenshot of the model running in my Termux terminal.

My New Setup: Pure Focus Mode (Minus the Distractions)

Right now, I am so eager to test Gemma 4 on my phone next, but for serious coding work, I've deployed it on my laptop setup instead.

I have a dual-boot machine running Windows and Ubuntu Linux, and for serious focus sessions, I always boot straight into Ubuntu.

And look, the beauty of llama.cpp is that you can host a local server and run the model directly inside a clean, beautiful browser interface on your local machine. It is absolutely superb. I get to pull up my project files, ask my local model questions, and have zero internet tabs open to distract me. Bye-bye internet, hello focus mode!

Step-by-Step Guide: Compiling Gemma 4 on Ubuntu Linux for High Performance

Step 1: Update System and Install Core Build Tools

sudo apt update && sudo apt install -y git build-essential cmake

Enter fullscreen mode Exit fullscreen mode

Step 2: Download and Compile llama.cpp

git clone https://github.com/ggml-org/llama.cpp.git

cd llama.cpp

cmake -B build
# Build project (e.g., -j4 for 4 cores, or -j$(nproc) for all cores)
cmake --build build -j

Enter fullscreen mode Exit fullscreen mode

Step 3: Create Directories and Download Gemma 4 GGUF

# Create the models directory inside llama.cpp if it doesn't exist
mkdir -p models 

# Download your preferred Gemma 4 GGUF flavor from Hugging Face into the models folder

Enter fullscreen mode Exit fullscreen mode

Step 4: Launch Gemma 4 (Choose Web UI or Terminal)

*Option A: Launch the Local *

./build/bin/llama-server -m models/gemma-4-E4B-it-Q3_K_S.gguf -c 4096 --host 127.0.0.1 --port 8080

Enter fullscreen mode Exit fullscreen mode

Once that server is running, you just open your browser, head to http://127.0.0.1:8080, and you have a stunning interface running 100% locally from your machine!

Here is exactly what it looks like when I boot into Ubuntu and run the Web UI interface. Look at how seamlessly it breaks down complex topics like Deep-First Search (DFS) and Breadth-First Search (BFS):

Option B: Launch Interactive Chat in Terminal

./build/bin/llama-cli -m models/gemma-4-E4B-it-Q3_K_S.gguf -p "Hi" -env

Enter fullscreen mode Exit fullscreen mode

Here is a live screenshot of the terminal variant spinning up on my desktop. You can see htop running on the right side, showing how lightweight and light on resources the execution is on my CPU:

Looking at this laptop setup makes me feel incredibly grateful. I have to give a massive shoutout to Google. A while back, the high-quality technical information and support I got from the Gemma ecosystem actually helped me write high-value blogs on Dev.to. Those blogs gained traction, helped me clear technical hurdles, and ultimately allowed me to make the money I needed to finally step up from just a phone and get this laptop.

Final Thoughts: Thank You, Google

From tinkering with basic text editors in my school days to deploying advanced Go routines on an old, hardware-challenged laptop today, local execution has completely shaped my trajectory as a developer.

AI shouldn't just be a luxury for those who can afford massive monthly cloud subscriptions or elite hardware. By putting open-weights, highly compressed powerhouses like the Gemma family directly into our hands, Google has leveled the playing field for student developers everywhere who are suffering through hardware or internet constraints.

Thank you, Google, for making things easy for me from my childhood all the way to right now. You always clear the path for devs who are trying to learn and grow.

Now, go clone the repository, download the weights, shut off your internet, and go build something awesome in your own cave!

Before I close this blog completely, I want to say one last thing. If anything I wrote here caused any misunderstanding, or if my blog made any person feel sad or unhappy, I am really, truly sorry. That was never my intention. I just wanted to share the happiness and excitement of a kid coding from his phone and moving up to a laptop. Thank you so much for reading, and good luck with your own builds!