Find Your Way
Afreen Hossain · 2026-05-24 · via DEV Community

This is a submission for the Gemma 4 Challenge: Build with Gemma 4.

What I Built

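Find Your Way is an edge-ready, high-efficiency urban transit co-pilot and real-time eco-impact dashboard built with Gemma 4. Its focus is on optimized backend architecture, low-latency pipeline performance, and offline resilience.

By applying production-grade backend engineering principles such as semantic Redis caching, upstream rate-limit protection, and decoupled micro-pipelines, the system bridges the gap between a raw physical ticket and instant, intelligent transit routing. A commuter uploads a photo of a physical ticket and, in under two seconds, receives a tailored itinerary together with a visual chart of greenhouse-gas offsets.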

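The Core Problem

Most AI-driven transit and navigation tools suffer from three fundamental bottlenecks:

  1. Unacceptable latency: running raw images through heavy multimodal vision models takes more than 8 seconds, which is unworkable for someone hurrying through a crowded metro gate. Uploading the same ticket again later triggers another API call, adding latency and wasting compute.
  2. Connectivity and availability drops: cloud APIs fail completely deep inside underground tunnels or when API rate-limit congestion is heavy.
  3. Redundant computation: re-running expensive LLM queries for regular commuters on the same route increases both latency and API cost.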

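The Solution: An Optimized Backend Architecture

To address these problems, we designed Find Your Way around a modular, decoupled backend built for very fast responses:

  • One-command containerized deployment: the entire backend ecosystem (FastAPI, the EasyOCR pipeline, and the Redis cache) is fully Dockerized. A single command (docker compose up --build) starts the full stack on any machine, making deployment fast and reproducible.
  • Decoupled ingestion pipeline: instead of relying on a cloud vision API, a local high-speed EasyOCR pipeline extracts the raw ticket text, which a regex-based station tokenizer then parses, in about 1.6 seconds.
  • High-performance Redis caching: for commuters on the same route, results are stored in memory under a canonical source:destination key. The system retrieves the precomputed route insights in under 1 millisecond, so a warm-cache API response completes in under two seconds and bypasses LLM latency entirely.
  • Dual-mode edge fallback & API protection: if the cloud API hits a rate limit or a latency spike, the backend automatically switches within 500 ms to a local Ollama instance running Gemma 4 E2B on edge hardware, so transit guidance is never interrupted.
  • Low-memory configuration optimization: with custom Ollama runtime parameters, the local Gemma 4 E2B model runs on ordinary development laptops and other low-memory machines, which makes offline edge deployment practical.
  • Live eco-analytics dashboard: the React Native front end consumes fast, clean JSON payloads to render interactive, SVG-based eco-footprint metrics (trees saved, carbon reduced), along with upcoming metro timetables and carriage-crowding heat maps.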
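The caching bullet above amounts to a cache-aside lookup keyed on a normalized source:destination pair. The following is a minimal sketch of that idea, not the project's actual code: the names `route_key`, `get_route`, and `InMemoryStore`, the normalization rules, and the one-hour TTL are all assumptions made for illustration. The in-memory store only mimics the `get`/`set(..., ex=...)` subset of the redis-py client so the sketch runs without a Redis server.

```python
import json
from typing import Callable, Optional, Tuple


class InMemoryStore:
    """Dict-backed stand-in exposing the get/set subset of a redis-py client."""

    def __init__(self) -> None:
        self._data: dict = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str, ex: Optional[int] = None) -> bool:
        # The TTL (ex) is accepted for interface parity but ignored here.
        self._data[key] = value
        return True


def route_key(source: str, destination: str) -> str:
    """Build a canonical source:destination key (assumed: lowercase, single spaces)."""
    def norm(name: str) -> str:
        return " ".join(name.lower().split())
    return f"{norm(source)}:{norm(destination)}"


def get_route(
    store,
    source: str,
    destination: str,
    generate: Callable[[str, str], dict],
    ttl_seconds: int = 3600,  # hypothetical TTL, not taken from the post
) -> Tuple[dict, bool]:
    """Return (route, cache_hit). The model is only called on a cache miss."""
    key = route_key(source, destination)
    cached = store.get(key)
    if cached is not None:
        return json.loads(cached), True
    route = generate(source, destination)  # expensive LLM call on a miss
    store.set(key, json.dumps(route), ex=ttl_seconds)
    return route, False
```

With a real server, a `redis.Redis(...)` client could be passed as `store` in place of `InMemoryStore`, since only `get` and `set` are used.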

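Demo

Find Your Way

AI-Powered Multimodal Public Transit Co-Pilot & Eco-Impact Dashboard

Gemma 4 Challenge License: MIT

Find Your Way is a public-transit co-pilot that handles ticket OCR, route guidance, caching, and eco-impact comparison. It uses Google's Gemma 4 models, a Redis-backed cache, and a real-time dashboard.

  • Scan images of physical tickets.
  • Extract and normalize station details with OCR preprocessing.
  • Generate route guidance with the help of Gemma 4.
  • Compare carbon impact and travel cost against other transport options.
  • Fall back to a local Ollama instance when online API access is unavailable.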
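The "extract and normalize station details" step can be illustrated with a small regex pass over OCR output. The post does not show the ticket layout or the project's extractor, so this sketch assumes a hypothetical ticket with "From ..." and "To ..." lines, and the function names are invented for the example. The input is a list of text lines, the shape EasyOCR returns from `readtext(..., detail=0)`.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical ticket layout: one line starting with "From", one with "To".
# \b stops "To" from matching inside words such as "Total".
_FROM_RE = re.compile(r"^\s*from\b\s*[:\-]?\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE)
_TO_RE = re.compile(r"^\s*to\b\s*[:\-]?\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE)


def normalize_station(raw: str) -> str:
    """Drop punctuation noise, collapse whitespace, and title-case the name."""
    cleaned = re.sub(r"[^A-Za-z0-9 ]+", " ", raw)
    return " ".join(cleaned.split()).title()


def extract_stations(ocr_lines: List[str]) -> Optional[Tuple[str, str]]:
    """Return (source, destination), or None if either line is missing."""
    text = "\n".join(ocr_lines)
    src = _FROM_RE.search(text)
    dst = _TO_RE.search(text)
    if src is None or dst is None:
        return None
    return normalize_station(src.group(1)), normalize_station(dst.group(1))
```

Normalizing here matters for the cache: two scans of the same ticket with slightly different OCR spacing or casing should yield the same station pair.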
graph TD
    A[Physical Ticket Upload] -->|react-native-image-picker| B[OCR Extraction Pipeline]
    B -->|FastAPI Preprocessing| C[Station Extractor / Normalization]
    C -->|Check Redis Cache| D{Cache Hit?}
    D -->|Yes: sub-second| E[Render UI Response]
    D -->|No: Cache Miss| F[Gemma 4 31B Dense Primary]
    F -->|API Congestion/Failure| G[Gemma 4 e2b Local Fallback]
    F -->|Generate Route Markdown| H[Store in Redis Cache]
    G -->|Generate Route Markdown| H
    H --> E

Quick Start

  • Prerequisites: Docker, Docker Compose, and/or…

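How I Used Gemma 4

The Architecture We Chose & Why

To power our real-time commuting co-pilot, we designed a hybrid-intelligence routing system built on two main Gemma 4 models:

  1. Primary brain: Gemma 4 31B Dense (google/gemma-4-31b-it:free)
    • Role: high-throughput cloud inference engine.
    • Why it fits: planning an itinerary requires nuanced, multi-step reasoning. Gemma 4 31B Dense is good at reading the raw JSON database, understanding how the rail lines connect, and formatting the instructions as clean, concise markdown that commuters can read on the go.
  2. Fallback: Gemma 4 E2B (gemma4:e2b)
    • Role: resilient edge deployment via Ollama.
    • Why it fits: the E2B model is well suited to local or offline deployment on constrained hardware, such as station ticket kiosks or local nodes. It reasons efficiently with a very small memory footprint, so even in an underground station where the public network connection drops, the local fallback takes over within 500 ms and still guides commuters to their destination.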
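One way to read the 500 ms switchover is as a latency budget on the primary call: if the cloud model has not answered in time, or reports that it is unavailable, the request goes to the local model. The sketch below makes that assumption explicit; `route_with_fallback`, `UpstreamUnavailable`, and the callables passed in are hypothetical names, and the post does not say how the project actually detects congestion.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

ModelCall = Callable[[str], Awaitable[str]]


class UpstreamUnavailable(Exception):
    """Hypothetical error for rate limits or outages reported by the cloud API."""


async def route_with_fallback(
    prompt: str,
    primary: ModelCall,
    fallback: ModelCall,
    budget_seconds: float = 0.5,  # assumed reading of the 500 ms figure
) -> Tuple[str, str]:
    """Return (answer, which_model). Falls back on timeout or upstream error."""
    try:
        answer = await asyncio.wait_for(primary(prompt), timeout=budget_seconds)
        return answer, "primary"
    except (asyncio.TimeoutError, UpstreamUnavailable):
        # The primary call is cancelled by wait_for on timeout; use the local model.
        return await fallback(prompt), "fallback"
```

In a FastAPI handler, `primary` would wrap the cloud request for Gemma 4 31B Dense and `fallback` the local Ollama request for Gemma 4 E2B, and the returned label could be logged to track how often the edge path is used.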

Orchestration Flow

Pipeline

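Engineering Challenges & Aha Moments

Challenge 1: The Local Memory Barrier (8 GB RAM Limit)

  • Problem: during local testing, the system often froze and then crashed with out-of-memory errors. Ollama struggled to run alongside Docker, VS Code, and a large number of open browser tabs.
  • Solution: at first we had to close background applications and tabs just to keep Ollama alive. Recognizing that this was a poor user experience, we researched and applied a low-memory configuration (gemma4-lowmem).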
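The post does not list the parameters behind gemma4-lowmem. As a rough illustration of the kind of tuning involved, the sketch below builds a request for Ollama's local `/api/generate` endpoint with a reduced context window and a capped output length. The values of `num_ctx` and `num_predict` are assumptions chosen for the example, not the project's settings, and `build_lowmem_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_lowmem_request(
    prompt: str,
    model: str = "gemma4:e2b",
    num_ctx: int = 2048,      # assumed: smaller context window to reduce memory use
    num_predict: int = 256,   # assumed: cap on generated tokens
) -> dict:
    """Build a non-streaming generate request with memory-conscious options."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "num_predict": num_predict},
    }


def generate(prompt: str, timeout: float = 30.0) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    body = json.dumps(build_lowmem_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

The same options can be baked into a named model through an Ollama Modelfile, which is likely how a tag such as gemma4-lowmem would be created.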

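Challenge 2: Docker Container Networking & Redis Configuration

  • Problem: when orchestrating the FastAPI backend and the Redis cache with Docker Compose, connections failed repeatedly because the backend could not reach the Redis container at startup.
  • Solution: we had initially configured Redis at localhost / 127.0.0.1, which does not work because each Docker container runs in isolation with its own loopback interface. We changed the Redis host to the Compose service name redis (REDIS_HOST=redis), removed the password setting, and updated the Docker configuration so that the backend connects to Redis correctly.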
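On the application side, the fix comes down to reading the Redis host from the environment instead of hard-coding localhost. A minimal sketch, assuming the backend builds a connection URL at startup: `REDIS_HOST` comes from the post, while `REDIS_PORT`, `REDIS_DB`, and the helper name `redis_url` are assumptions added for the example.

```python
import os
from typing import Mapping, Optional


def redis_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a Redis URL from environment variables.

    Defaults to the Compose service name "redis", which Docker's internal DNS
    resolves to the Redis container on the shared Compose network.
    """
    env = os.environ if env is None else env
    host = env.get("REDIS_HOST", "redis")
    port = int(env.get("REDIS_PORT", "6379"))  # assumed variable name
    db = int(env.get("REDIS_DB", "0"))         # assumed variable name
    return f"redis://{host}:{port}/{db}"
```

The resulting URL can be handed to redis-py via `redis.Redis.from_url(redis_url())`. When running the backend outside Docker, setting `REDIS_HOST=localhost` restores the original behavior without code changes.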

App Screenshots

Landing Page

Landing Page

Analytics

Congestion

Gemma Response

Submitted by Afreen Hossain