Synaptic: A Local-First AI Dev Companion That Remembers How You Think
Adedeji Olam · 2026-05-23 · via DEV Community

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Three weeks into learning Rust, I had Copilot open and Stack Overflow in the next tab. I was making progress — or so I thought. I'd ask Copilot how to handle an error, paste the answer in, the compiler would go green, and I'd move on.

Six days in, I realised I had no idea what I was doing.

I could produce Rust. I couldn't think in Rust. Copilot was making me a faster copy-paster. It wasn't making me a developer.

What I needed wasn't more autocomplete. I needed something to stop me before I typed and ask: "Do you actually understand what you're about to do?"

That question became Synaptic.

Synaptic is a local-first AI dev companion powered entirely by Gemma 4. It watches your entire development environment — files, terminal, errors, shell history — and builds a persistent model of how you specifically think and solve problems. When you're stuck or learning a new language, it surfaces your own past solutions rather than generic documentation pulled from the internet.

The centrepiece is the Socratic gate.

When you open a code file, Synaptic reads your last few hours of activity, identifies what concept is most at stake in that file, and streams a targeted question into a HUD overlay within seconds:

"You've been writing JavaScript closures recently. Before you edit this Rust file: how are you thinking about ownership and when values go out of scope?"

The question isn't generic. It's built from your actual history. Gemma 4 evaluates your answer and either lets you through or asks a sharper follow-up targeting exactly what you glossed over. Over time, your explanations get sharper — because the bar doesn't lower.

Beyond the Socratic gate:

  • Ambient memory pipeline: every file save and terminal command is compressed by Gemma 4 into a structured memory (summary, concepts, significance score, verbatim error text) every 3 seconds, stored locally in SQLite and indexed for semantic search
  • Vision error pipeline: on macOS, when a terminal error fires, Gemma 4 reads a screenshot of your actual screen to extract the full stack trace, not the truncated shell history
  • Four query modes: Translate (JS patterns to Rust), Explain (grounded in your own history), Map Concept (to what you already know), Find Solution (have I solved this before?)
  • Habit mismatch detection: runs continuously, warns when you apply patterns from your old language that will break in your new one
  • Stuck detection: watches for compound signals (repeated errors, thrashing, excessive app switches) and auto-surfaces the HUD with relevant context before you ask
  • Electron HUD overlay: always-on-top, appears uninvited when it has something worth saying

Everything runs on your machine. No API keys required. No data leaves without your permission.
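The stuck detection described in the list above can be pictured as a small rule over a recent activity window. This is a minimal sketch, not Synaptic's actual implementation: the `ActivityWindow` fields, the threshold values, and the "at least two signals" rule are all assumptions chosen for illustration.

```typescript
// Hypothetical counters collected over a recent time window (e.g. the last few minutes).
interface ActivityWindow {
  repeatedErrors: number; // times the same error text was seen again
  fileThrash: number;     // rapid back-and-forth switches between the same files
  appSwitches: number;    // switches between editor, browser, terminal, ...
}

// Assumed thresholds; real values would need tuning against actual usage.
const THRESHOLDS: ActivityWindow = {
  repeatedErrors: 3,
  fileThrash: 6,
  appSwitches: 10,
};

// "Compound" is read here as: at least two independent signals are over threshold.
function isStuck(w: ActivityWindow, t: ActivityWindow = THRESHOLDS): boolean {
  const signals = [
    w.repeatedErrors >= t.repeatedErrors,
    w.fileThrash >= t.fileThrash,
    w.appSwitches >= t.appSwitches,
  ];
  return signals.filter(Boolean).length >= 2;
}
```

Requiring two signals rather than one is a design choice in this sketch: a single repeated error is normal during debugging, while a repeated error combined with heavy app switching is a more plausible sign that the developer has stopped making progress.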

Demo

Quick start to try it yourself:

```bash
git clone https://github.com/cybort360/synaptic
cd synaptic && npm install
ollama pull gemma4:e4b
ollama pull nomic-embed-text
cp synaptic.config.example.json synaptic.config.json
# Edit synaptic.config.json and add a directory to watchPaths
npm run launch
```


Open a .rs, .ts, .py, or .go file in a watched directory. The HUD will appear within seconds with a question grounded in your recent activity.

Code

GitHub: https://github.com/cybort360/synaptic

The full pipeline in one view:

```text
Files / Terminal / Shell history
    ↓
Observer (chokidar + history polling + stuck detector)
    ↓
Archivist
  ├─ Compressor: gemma4:e4b (every 3s, with vision on errors)
  ├─ Embedder: nomic-embed-text (semantic search vectors)
  └─ SQLite (local, persistent, yours)
    ↓
Connector
  ├─ Semantic search over history
  ├─ Prompt builder (4 modes)
  └─ Reasoner: gemma4:e4b (streaming)
    ↓
Socratic Engine
  ├─ Fires on file open for recognised code files
  ├─ Streams question word-by-word to HUD
  └─ Evaluates answer, asks follow-up or passes
    ↓
Dashboard (http://localhost:3777) + HUD Overlay (Electron)
```
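To make the Observer to Archivist hand-off in the diagram concrete, here is a rough sketch of a 3-second batch loop. It is an illustration under assumptions, not code from the repository: `ObservedEvent`, `Memory`, `EventBatcher`, and the `compress` and `store` callbacks are hypothetical names, with the `Memory` fields modelled on the structured memory described earlier (summary, concepts, significance score, verbatim error text).

```typescript
// Hypothetical shape of a raw event coming from the Observer.
interface ObservedEvent {
  kind: "file_save" | "terminal_command";
  detail: string;
  at: number; // epoch milliseconds
}

// Hypothetical shape of a compressed memory, following the fields named in the post.
interface Memory {
  summary: string;
  concepts: string[];
  significance: number;
  errorText: string | null;
}

class EventBatcher {
  private pending: ObservedEvent[] = [];

  push(event: ObservedEvent): void {
    this.pending.push(event);
  }

  // Hands over everything collected so far and starts a fresh batch.
  drain(): ObservedEvent[] {
    const batch = this.pending;
    this.pending = [];
    return batch;
  }
}

// Runs `compress` at most once per interval and never overlaps two compressions.
// If a compression is still running, events simply stay queued for the next tick.
function startBatchLoop(
  batcher: EventBatcher,
  compress: (batch: ObservedEvent[]) => Promise<Memory>,
  store: (memory: Memory) => void,
  intervalMs = 3000,
): ReturnType<typeof setInterval> {
  let busy = false;
  return setInterval(async () => {
    if (busy) return;
    const batch = batcher.drain();
    if (batch.length === 0) return;
    busy = true;
    try {
      store(await compress(batch));
    } catch (err) {
      console.error("compression failed", err);
    } finally {
      busy = false;
    }
  }, intervalMs);
}
```

In this sketch `compress` would wrap the call to gemma4:e4b and `store` would write to SQLite; both are left as callbacks so the batching logic stays independent of the model and the database.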


Stack: Node.js 20, TypeScript, Express, WebSocket, Electron, chokidar, sql.js (SQLite), Ollama. No frameworks. No bundler. Around 5,700 lines of original code.

How I Used Gemma 4

I used gemma4:e4b — the 4B effective parameter model — and the choice was deliberate at every level.

Why E4B specifically

Speed is the constraint. Synaptic compresses every file save and terminal command into a structured memory on a 3-second batch cycle. That cycle is a constraint, not a goal — if compression falls behind, the memories become stale. The Socratic question about your Rust ownership file would be grounded in something you did yesterday rather than what you were doing five minutes ago.

I tested gemma4:26b for this. The batches piled up. The tool became a liability. At 4B effective parameters, gemma4:e4b compresses an event in under 2 seconds on a MacBook. The 3-second cycle stays clean. Memories are always fresh.
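The arithmetic behind "the batches piled up" can be checked with a tiny simulation. This is a sketch under simplifying assumptions: one batch arrives every cycle, a single worker compresses them one at a time, and compression time is constant. The 2-second figure comes from the paragraph above, and the 30-second figure is the one this post quotes for a 27B model.

```typescript
// Returns how many batches are still waiting or in flight after `elapsedSec`,
// assuming one batch arrives every `cycleSec` and each takes `compressSec` to compress.
function backlogAfter(elapsedSec: number, cycleSec: number, compressSec: number): number {
  const arrived = Math.floor(elapsedSec / cycleSec);
  let workerFreeAt = 0;
  let completed = 0;
  for (let i = 1; i <= arrived; i++) {
    const arrivalTime = i * cycleSec;
    // A batch starts when it has arrived and the worker has finished the previous one.
    const start = Math.max(arrivalTime, workerFreeAt);
    const finish = start + compressSec;
    workerFreeAt = finish;
    if (finish <= elapsedSec) completed++;
  }
  return arrived - completed;
}

// After one minute on a 3-second cycle:
const fastModelBacklog = backlogAfter(60, 3, 2);  // at most the batch currently in flight
const slowModelBacklog = backlogAfter(60, 3, 30); // almost every batch is still waiting
```

With a 2-second compression time the queue never grows beyond the batch in flight. With a 30-second compression time only one of the twenty batches finishes within the first minute, and the gap keeps widening for as long as events keep arriving.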

Multimodal was the unlock. When you hit a terminal error, your shell history often truncates it. The important part — the line number, the variable name, the exact constraint violated — is three screens down.

Gemma 4 is natively multimodal. When Synaptic detects a terminal error, it captures a screenshot of your screen and sends it to Gemma before compression. Gemma reads the actual stack trace off your screen, not the truncated history. This is only possible with a vision-capable model. A text-only 4B model cannot do it. A vision-capable model too large to run locally cannot do it either. gemma4:e4b is the exact intersection: small enough to run constantly, fast enough for real-time use, and genuinely multimodal.
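As a rough illustration of the screenshot step, the sketch below builds a request for Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` array for multimodal models. The helper names, the prompt wording, and the idea of passing the truncated history alongside the image are assumptions for the example; how Synaptic actually captures the screenshot and phrases the prompt is not shown in this post.

```typescript
// Request body for Ollama's POST /api/generate endpoint (non-streaming).
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  images: string[]; // base64-encoded image data, one entry per image
  stream: boolean;
}

// Hypothetical helper: pairs the truncated shell history with a screenshot.
function buildVisionRequest(
  screenshotBase64: string,
  truncatedHistory: string,
): OllamaGenerateRequest {
  return {
    model: "gemma4:e4b",
    prompt:
      "The terminal history below is truncated. Read the screenshot and return " +
      "the full error text verbatim, including file names and line numbers.\n\n" +
      truncatedHistory,
    images: [screenshotBase64],
    stream: false,
  };
}

// Sends the request to a local Ollama server (default port 11434) and
// returns the model's text output.
async function extractErrorFromScreen(
  req: OllamaGenerateRequest,
  baseUrl = "http://localhost:11434",
): Promise<string> {
  const res = await fetch(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}
```

Keeping the request builder separate from the network call makes the prompt and payload easy to inspect without a running model.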

The Socratic gate depends on first-token latency. The gate fires when you open a code file. The HUD slides up. The question starts streaming in word by word while Gemma is still generating. With gemma4:e4b, the first token arrives in under 3 seconds on an M-series MacBook. The question types itself in as the developer reads it. With a larger model, the developer is already ten lines deep before the question finishes generating, and the gate becomes noise.
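Word-by-word streaming relies on the fact that Ollama, when `stream` is enabled, sends newline-delimited JSON objects, each carrying a fragment of text in a `response` field. The sketch below shows one way to turn raw network chunks into displayable tokens, including the case where a JSON line is split across two chunks. The class name and the way tokens would be forwarded to the HUD are assumptions, not Synaptic's code.

```typescript
// One line of Ollama's streaming output for /api/generate.
interface StreamLine {
  response?: string;
  done?: boolean;
}

class NdjsonTokenParser {
  private buffer = "";

  // Feed one raw text chunk; returns the tokens completed by this chunk.
  feed(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is either "" or an incomplete line: keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    const tokens: string[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      const parsed = JSON.parse(line) as StreamLine;
      if (parsed.response) tokens.push(parsed.response);
    }
    return tokens;
  }
}
```

Each token returned by `feed` could be pushed to the HUD immediately, for example over the WebSocket the stack already includes, so the first words appear as soon as the first complete line arrives rather than when generation ends.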

What breaks with a different model

| Alternative | What fails |
| --- | --- |
| A 27B local model | Compression batches pile up. The 3-second cycle becomes 30+ seconds and memories fall behind real activity |
| A non-multimodal 4B | Vision pipeline silently degrades. Errors are compressed without reading the actual screen output |
| A cloud-only model | The entire privacy guarantee breaks. Code, errors, and history leave your machine |
| No local model at all | Socratic gate cannot fire on every file open. Latency makes it unusable as a real-time feature |

The three tiers Gemma 4 powers

| Task | When | Why Gemma 4 |
| --- | --- | --- |
| Event compression | Every 3 seconds | Speed. Needs to complete before the next event arrives. |
| Vision error analysis | On terminal errors (macOS) | Multimodal. Reads the actual stack trace off the screen. |
| Reasoning + Socratic evaluation | On every query and file open | Quality. Generates personalised questions and evaluates answers. |

A separate nomic-embed-text model handles embeddings for semantic search — it outperforms a generalist 4B model at this specific task, so I kept it.
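Semantic search over stored memories typically comes down to comparing embedding vectors. The sketch below ranks memories by cosine similarity to a query vector. It assumes the vectors have already been produced by the embedding model and loaded from storage; the `StoredMemory` shape and function names are hypothetical, and the brute-force scan is a simplification that may differ from how Synaptic indexes its data.

```typescript
// Hypothetical stored record: an id plus the embedding of its summary.
interface StoredMemory {
  id: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as matching nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k memories most similar to the query, best match first.
function topK(query: number[], memories: StoredMemory[], k: number): StoredMemory[] {
  return memories
    .map((m) => ({ memory: m, score: cosineSimilarity(query, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.memory);
}
```

A linear scan like this is usually adequate for the number of memories a single developer generates locally; a dedicated vector index only becomes worthwhile at much larger scales.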

What Gemma 4 made possible that wasn't before

The core thesis of Synaptic is that local AI is now capable enough to be the primary intelligence of a real product — not just a demo. gemma4:e4b can read screenshots, generate coherent structured analysis, evaluate the quality of a developer's reasoning, and do all of this on a consumer laptop in real time.

That's new. Six months ago you had to choose between capable and local. With Gemma 4 you don't.