惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

N
News and Events Feed by Topic
Malwarebytes
Malwarebytes
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
C
Cybersecurity and Infrastructure Security Agency CISA
F
Future of Privacy Forum
C
Cisco Blogs
T
The Exploit Database - CXSecurity.com
A
Arctic Wolf
S
Securelist
K
Kaspersky official blog
S
Schneier on Security
T
ThreatConnect
T
Tenable Blog
Spread Privacy
Spread Privacy
T
True Tiger Recordings
AWS News Blog
AWS News Blog
F
Fox-IT International blog
量子位
T
Threatpost
V
Vulnerabilities – Threatpost
C
CERT Recently Published Vulnerability Notes
Cisco Talos Blog
Cisco Talos Blog
GbyAI
GbyAI
宝玉的分享
宝玉的分享
腾讯CDC
G
Google Developers Blog
aimingoo的专栏
aimingoo的专栏
Cyberwarzone
Cyberwarzone
有赞技术团队
有赞技术团队
S
SegmentFault 最新的问题
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
V
Visual Studio Blog
U
Unit 42
雷峰网
雷峰网
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
Simon Willison's Weblog
Simon Willison's Weblog
O
OpenAI News
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
The GitHub Blog
The GitHub Blog
The Register - Security
The Register - Security
MyScale Blog
MyScale Blog
小众软件
小众软件
A
About on SuperTechFans
Last Week in AI
Last Week in AI
Y
Y Combinator Blog
博客园 - 三生石上(FineUI控件)
美团技术团队
Google Online Security Blog
Google Online Security Blog
P
Proofpoint News Feed
MongoDB | Blog
MongoDB | Blog

DEV Community

AI-Discovered Vulnerabilities Need A Triage Queue, Not A Panic Channel AI Agent Workboards Need Audit Controls Before They Need More Agents Demystifying DevRel: What It Actually Is (And Why Should You Become One?) Gemma 4 GenAI Coach - GenAI Concepts Made Easy with an Interactive Playground QuietPulse - Mood Tracker Principal Components in TypeScript (Part 3) The pgAudit Attribution Gap: Why Role-Level Logging Fails GDPR and How to Close It Gemma 4 CAD Orchestrator I built a local Postgres triage co-pilot because HIPAA says I can't paste plans into ChatGPT or Claude Live Holographic Editor In Fractal Time Everbench: A document management system with Local Intelligence Instanton in Fractal Time The Hidden Features of Claude How I Built an AI News Brief with Next.js, Supabase, Vercel, and GPT-4o-mini How We Built a Multi-Agent AI Documentation System (And What We Learned) I got tired of writing post-mortems — so I built RCAi for SREs MIA: A Futuristic AI Desktop Assistant Built with Voice, Gestures, and Controlled Chaos Best Programming Language for Backend Web Development: PHP vs Python PayPal Alternatives for Indian Businesses: Best Payment Gateways for International Card Payments (2026) Gemma 4 Made Me Rethink Local AI: Not Just Text, But Images Too Clean Architecture in .NET Explained (The Dependency Rule) I Compiled Rust to WebAssembly and Made My JavaScript 6 Faster Outlook.com Is the Final Boss of 'Just Send an Email' Conditional Statements and Control Flow in Python Insults & Cutlasses, Local LLM Sword Fighting on Melee Island Production Lab: ECS Fargate + Prometheus + Grafana + Loki + Alloy + Node Exporter How 12 AI agent frameworks handle human approval (most badly) The Four-Index Reality: Why AI Search Isn't One Thing I Scanned 1 Million AI Services. Here's What Worries Me More Than the Vulnerabilities Managing multiple docker hub accounts using docker-use System Design Interview: Decentralized Web Crawler Metric Cardinality: High or Low? 4 Steps to Making the Right Choice 로컬 LLM 셋업 가이드 (v23) GEO vs SEO in 2026 — What Google's May Guidance Changed Cursor Review 2026 — Honest 'Not For Me' Take From a VSCode User Hello from rikuq — a practitioner blog for solo AI SaaS founders Why DevOps Engineers Need Practical Tutorials, Not Just Theory AI Agents in CI/CD: Give Them Context, Not Production Authority Now I See Why Translators Are Panicking Over AI—Should Coders Panic Too? Why I Track HRV Every Morning (And How It Actually Changes My Day) Diffusion Language Models: How NVIDIA's Nemotron-Labs DLM Is Killing Token-by-Token Generation Chatbots GPT pour le support client : ce que les équipes françaises ont réellement besoin de savoir I Hit the 1,232-Byte Wall So You Don't Have To Google Just Rebuilt the Search Box (Again) — But This Time It's Different Aether: A local Android assistant built with Gemma 4 BoxAgnts Introduction (1) — Out of the Box mkdev: trusted HTTPS for localhost, mapped by name Just one question, one answer. Why Java Still Rules the Programming World in 2026 Four Architectures for Letting Claude Edit Elementor (and Why We Shipped Clone-and-Mutate) yard-yaml 0.1.1: safer UTF-8 handling for YAML documentation I Built a Mac App That Keeps Your Clipboard in Sync Across All Your Android Devices Stop Using UUIDs: Why B2B SaaS Needs ULIDs in Laravel 🐘 I'm a non-technical founder who built a Slack approval tool. Here's what actually broke first. Open-Sourcing Our Game AI Stack — SDKs, Templates, and CLI Tools for NPC Dialogue I Built an AI System That Makes 1,000 Decisions a Day. Here's Where I Drew the Line. Lets Encrypt DNS Challenge with Traefik and AWS Route 53 Building an agent-ready website: how to make your site readable for ChatGPT, Perplexity and autonomous agents A productivity tool with GitHub as your cloud database How We Built Dynamic NPC Dialogue with LLMs — Lessons from Early Access cmux: The Native macOS Terminal Built for Running AI Coding Agents in Parallel Deep Atlantic Storage: Rewriting in Rust How I Built a Bulk Image Optimizer with $0 Server Costs Using Vanilla JS and Canvas API Humans and Machines read differently, I think I have a fix? Claude Code Deleted 92 Images Without Asking. This Happens More Than You Think. Method Calling Stack in Java I Built Schedule Sensei & Pushed It to GitHub – Here's What's Inside (And I Need Your Help 👀) OIC: From a Working Toast Watcher to a General "Watch It for Me" Agent Memory is two-thirds of what an AI chip costs to build The XState persistence problem is five years old. Here is what we built to finally solve it. i added MCP support to my SaaS in an afternoon. here's the whole thing. Framework: Link Building ☁️ Importing existing S3 buckets into Terraform state made easy with terraform import existing s3 bucket I Built a Token System on Solana (Without Any Backend Code) 터미널 AI 에이전트 구축 (v21) I Built an AI 3D Model Generator — Here's How I Handle Meshes in the Browser 🛡️ PromptGuard: I Built a Local AI Privacy Firewall That Sanitizes Your Prompts Before They Leave Your Machine PostgreSQL WAL Bloat: Why Automatic Management Is Often Insufficient? Seven PRs Before Lunch: Parallel Claude Code Tabs Plus Audit-Before-Bump Deployment using all three Kubernetes probes Qwen 3.6 Has Four Tiers. Here's How to Route Without Burning Cash. RAG 시스템 실전 구축 (v21) How I handle my errors in PHP The Blind Spot in Treasure Hunt Engine Configuration: Long-Term Server Health Run NVIDIA NIM on Your Own GPU — Same API, Different Endpoint Webflow SEO Implementation 로컬 LLM 셋업 가이드 (v21) How Logs Travel From Your EKS Pod to Datadog 𝗦𝘁𝗼𝗽 𝗖𝗿𝗮𝗺𝗺𝗶𝗻𝗴 𝗙𝗼𝗿 𝗘𝘅𝗮𝗺𝘀, 𝗦𝘁𝗮𝗿𝘁 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗥𝗲𝗮𝗹 𝗦𝗸𝗶𝗹𝗹𝘀 How to Use EXPLAIN ANALYZE in PostgreSQL: A Visual Guide gRPC Performance: tonic (Rust) vs grpc-go Benchmarked at Scale Hack The Box (HTB): Cap Machine (Full Walkthrough) Visual Search Optimization studygemma: AI study buddy for CS students Architectural Tradeoffs in Webhook Idempotency and SaaS API Versioning One Open Source Project a Day (No. 75): Understand Anything - The AI Engine That Turns Any Codebase Into an Explorable Knowledge Graph From mock-only-works to real-world-works: 48 hours of reCAPTCHA debugging I built a free music tool AI Talking Avatar Pipelines Broke Our Ad CTR by 3.7% 800G to 400G Breakout: How to Scale 400G Networks with 800G Ports
Your AI, Your Device, Your Data - Introducing Aide
Swapnil · 2026-05-25 · via DEV Community

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

The way we use AI on our phones is changing. People want more from their devices, but the main way we reach all these breakthroughs in AI is still a chatbot window. And we don't just want assistance, we want assistance that actually knows us. The AI should understand who you are, get the help you really need, and step in right where you need it. The catch is that for a long time you had to choose: you could have real personalization, or you could have real privacy, but not both. Anything that knows you that well usually ends up living in someone else's cloud.

Gemma 4 changes that. It's small enough to run on the phone in your pocket, but still capable enough to reason, see, and hold a real conversation, which means an assistant can finally know you without sending your life off to a server. That's the whole idea behind Aide: a personal, on-device assistant powered by Gemma 4, where the intelligence stays with you and you stay in control.

What I Built

Aide application interface

Aide is a private-first Android app that puts a frontier-class AI model on the two surfaces you already touch every day: the system keyboard and the assistant button. Instead of asking you to open yet another chatbot, it brings Gemma 4 to where you already type, talk, and tap. Everything runs on-device by default through LiteRT-LM, and the cloud is strictly opt-in. The same loaded model backs all three surfaces, so chat, keyboard, and assistant stay in sync.

A real Android  raw `InputMethodService` endraw  with a transform bar above the keys (rephrase, simplify, fix grammar, summarize, tone shifts, key points, bullets, table view, three-reply suggestions) and a magic button for one-shot Custom Instructions.

The keyboard is a real Android input method with a transform bar sitting above the keys. You can rephrase, simplify, fix grammar, summarize, change tone, pull out key points, or turn a mess of text into bullets or a table, and a magic button runs your own one-shot custom instruction. Every transform is just a row in a local task table, so you can edit the prompt, reorder them, or add your own with a {{text}} template. The best translation model on device is the Gemma 4 which can translate across 37 languages paired with 20+ on-device speech models, so it can listen in one language and write back in another without the data ever leaving the device.

Roughly 23 Piper, Kokoro, MeloTTS, and Matcha voices across 10+ languages for output; Zipformer, Whisper, Moonshine, Parakeet, SenseVoice, and GigaAM for input. Auto / Sherpa / System is a per-surface choice.

The assistant button opens a turn-based voice loop (speech to text, then VAD, then Gemma 4, then text to speech) that stays local by default. The built-in voice recognition and TTS on phones are fast but flat and robotic, so Aide ships a real choice of voices instead: roughly 23 Piper, Kokoro, MeloTTS, and Matcha voices for output and a stack of Zipformer, Whisper, Moonshine, Parakeet, SenseVoice, and GigaAM models for input. The whole voice toolkit runs on Sherpa-ONNX, which gives Aide one fast, on-device runtime for STT, TTS, and VAD across all of those models, so none of your audio has to leave the phone to be heard or spoken back.

Long-press the assistant button for a turn-based voice loop: STT → VAD → Gemma 4 → TTS, on-device by default.

Under all of that, Gemma 4's native function calling drives a single tool dispatcher for alarms, calendar, contacts, phone, clipboard, files, web search, calculator, and time, with per-category permission toggles and a confirmation gate in front of anything destructive. Chat is fully multimodal, reading images as a Gemma 4 turn and exposing the model's reasoning through a tappable thinking chip. Pick a local weight (E2B or E4B) for daily use, or point at an Ollama endpoint when you want the heavier 26B or 31B models, swappable per chat and mid-session. The whole point is trust: an assistant that actually knows you, without shipping your life, your memories, and your conversations off to someone else's server.

Gemma 4's native function calling drives a single dispatcher for alarms, calendar, contacts, phone, clipboard, filesystem, web search, calculator, and time. Per-category toggles in settings; destructive actions route through a confirmation gate.

And here is where the real power is: all of this works offline. One-time setup downloads the Gemma 4 weight and the voice models you want, and after that everything keeps running with the network off. No account, no sign-in, nothing to phone home to. Few apps this feature-rich stay fully on-device, and that was the bar Aide set for itself. Turn on airplane mode and the assistant is still right there.

Demo

Video walkthrough (Please excuse the poor video quality, I am not much of a video editor)

Code

Code Repository: https://github.com/swaptr/aide
The APK file is available under the Releases section.

How I Used Gemma 4

Gemma 4 is the right fit because of what an on-device personal assistant actually demands: it has to see, reason, and call tools, all while running on a phone instead of a datacenter. Gemma 4 is natively multimodal, reasons well, and carries a 128K context window, and it does all of that at a size that still fits in your pocket. That combination is rare.

For day-to-day use, Aide runs the E2B and E4B weights fully on-device through LiteRT-LM. These small variants are built for mobile and edge, and they are quick enough to back the keyboard transforms, the voice loop, and chat without a network round-trip. The same multimodal model reads an image in chat, drives native function calling for the tool dispatcher, and exposes its reasoning trace through the thinking chip. For most daily interactions, E2B and E4B carry the whole experience locally.

Pick a local Gemma 4 weight (E2B / E4B) or point at an Ollama endpoint for heavier weights. Selection is per-chat, swappable mid-session.

When you want more reasoning power, Aide hands off to an optional Ollama endpoint, swappable per chat and even mid-session. You can point it at a self-hosted Ollama server for the heavier Gemma 4 26B and 31B Dense weights and keep everything inside your own infrastructure, or use Ollama Cloud when you want frontier-grade output. Either way the choice is yours and the default stays local. That is the whole point of Aide: AI used the way it should be, one that knows you without shipping your life, your memories, and your conversations to someone else's server.

Next, I want to bring some of the Gemini Live experience to Aide: sharing the screen with Gemma 4, drafting artifacts from what it sees, and then referencing those artifacts back in chat and pulling them into your writing straight from the keyboard.

Acknowledgements

Special thanks to the following projects and services that made this application possible:

  • Google AI, for running this challenge and for the work on Gemma, especially for prioritizing multimodal and translation capabilities in the Gemma family of models, which is exactly what made an offline assistant like this realistic.
  • DEV (dev.to), for hosting the hackathon and the community around it.
  • The Ollama team, for making models, including the Gemma weights, easy to self-host and serve.
  • The Sherpa-ONNX project team, whose on-device runtime powers all of Aide's speech-to-text, text-to-speech, and voice activity detection.
  • Google for Android and the broader edge tooling, including LiteRT-LM.