No More Manual Test Writing: How I Used Gemma 4 to Turn a GitHub Repo Into a Full Test Suite 🎯
Arjun Sharma · 2026-05-24 · via DEV Community

This is a submission for the Gemma 4 Challenge: Build with Gemma 4


What I Built

I built Scriptless.ai — an AI-powered testing workspace that lets you go from "I have a GitHub repo" to "I have running browser tests with failure analysis" without writing a single line of test code yourself.

Here's the honest backstory: every team I've seen treats testing as something you do after you're already exhausted from writing the feature. Test coverage slips, QA bottlenecks pile up, and debugging a CI failure at 2 AM means staring at a red checkbox with zero context. Scriptless.ai is my attempt to flip that dynamic entirely.

The core loop is this:

  1. Connect your GitHub repo — OAuth in, pick a repository, done.
  2. Generate test cases with Gemma 4 — The model reads your actual source files and file tree, then produces structured, route-specific test cases (UI, auth, form, API, edge-case, integration) tailored to your exact codebase. Not generic Lorem Ipsum tests — real ones that know your routes and files.
  3. Execute in a real cloud browser — Each test case gets turned into a Playwright script and run in a live Browserbase session. Real browser, real network, real DOM.
  4. Understand failures visually — When a test fails, Gemma 4's vision capability analyzes a screenshot of the page at the moment of failure, tells you what it sees, what likely went wrong, and what to fix. No more blind log archaeology.
  5. Control everything with your voice — A Speechmatics-powered voice command layer lets you say "run failed tests" or "show only passing" and the UI responds. Hands-free QA.

It's a full-stack product: Next.js frontend, Neon Postgres + Drizzle for persistence, Clerk for auth, Stripe scaffold for monetization, and Vercel for deployment. Credits are tracked per user, deducted on generation and execution, and the billing infrastructure is wired up and ready for subscription plans.
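To make the credit accounting concrete, here is a minimal sketch of a check-then-deduct step. It is not the code from the repo: the CREDIT_COST prices, the BillableAction names, and the deductCredits helper are all hypothetical, and a real version would operate on the user's database row rather than a number in memory.

```typescript
// Hypothetical sketch: prices and names are assumptions, not Scriptless.ai's values.
type BillableAction = "generate" | "execute";

const CREDIT_COST: Record<BillableAction, number> = {
  generate: 5, // assumed price of one test-generation request
  execute: 1, // assumed price of one browser run
};

type DeductionResult =
  | { ok: true; remaining: number }
  | { ok: false; reason: string };

// Refuse the action up front if the balance cannot cover it.
function deductCredits(balance: number, action: BillableAction): DeductionResult {
  const cost = CREDIT_COST[action];
  if (balance < cost) {
    return { ok: false, reason: `need ${cost} credits, have ${balance}` };
  }
  return { ok: true, remaining: balance - cost };
}
```

Against Postgres, the same check would usually be expressed as one conditional UPDATE so that two concurrent runs cannot both spend the last credit.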


Demo

🚀 Live app: https://scriptless-ai.vercel.app/

Sign in with Clerk → authorize GitHub → connect a repo → hit Generate Tests and watch Gemma 4 do its thing.

What to try first:

  • Connect any public Next.js or React repo you own
  • Click Generate Test Cases and see how the model structures tests specific to your file tree
  • Run a test — the Browserbase session spins up a real Chrome instance
  • If it fails, scroll to the Vision Analysis section to see Gemma 4's diagnosis


Code

📦 GitHub: https://github.com/Arjunhg/scriptless-ai

Key files worth exploring:

| File | What it does |
| --- | --- |
| lib/featherless/client.ts | Featherless client config, model aliases pointing to gemma-4-31B-it |
| lib/featherless/generateTests.ts | Calls Gemma 4 with tool-calling to produce structured test cases |
| lib/featherless/analyzeScreenshot.ts | Sends failure screenshots to Gemma 4 vision for diagnosis |
| lib/featherless/prompts/testGeneration.ts | The system prompt that turns Gemma into a QA engineer |
| app/api/generate-test-cases/route.ts | API route that reads GitHub files → calls Gemma → saves to DB |
| app/api/test-cases/run/route.ts | Execution pipeline: script gen → Browserbase → vision fallback |
| hooks/useVoiceCommands.ts | Speechmatics real-time pipeline + utterance finalization logic |
| lib/speechmatics/commandParser.ts | Token-based NLP command parser (stemming, stopwords, synonyms) |
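
The last row packs a lot into one line, so here is a rough sketch of what a token-based parser with stopwords, stemming, and synonyms can look like. This is an illustration, not the contents of commandParser.ts: the word lists, the crude suffix-stripping stem function, and the Command shape are all assumptions.

```typescript
// Illustrative only: word lists and types are assumptions, not the repo's parser.
type Verb = "run" | "show";
type Filter = "all" | "failed" | "passed";
type Command = { kind: Verb; filter: Filter } | { kind: "unknown" };

const STOPWORDS = new Set(["the", "a", "an", "only", "please", "me", "my"]);

// Maps stemmed tokens onto the small canonical vocabulary the UI understands.
const SYNONYMS = new Map<string, Verb | Filter>([
  ["run", "run"], ["execute", "run"], ["start", "run"], ["rerun", "run"],
  ["show", "show"], ["display", "show"], ["list", "show"],
  ["fail", "failed"], ["broken", "failed"], ["red", "failed"],
  ["pass", "passed"], ["green", "passed"],
  ["all", "all"], ["every", "all"],
]);

// Very crude suffix stripping: "failed" and "failing" both become "fail".
function stem(token: string): string {
  if (token.length > 5 && token.endsWith("ing")) return token.slice(0, -3);
  if (token.length > 4 && token.endsWith("ed")) return token.slice(0, -2);
  if (token.length > 3 && token.endsWith("s") && !token.endsWith("ss")) {
    return token.slice(0, -1);
  }
  return token;
}

function isVerb(token: Verb | Filter): token is Verb {
  return token === "run" || token === "show";
}

function parseCommand(utterance: string): Command {
  const canonical = utterance
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ") // drop punctuation and digits
    .split(/\s+/)
    .filter((t) => t.length > 0 && !STOPWORDS.has(t))
    .map((t) => SYNONYMS.get(stem(t)))
    .filter((t): t is Verb | Filter => t !== undefined);

  const verb = canonical.find(isVerb);
  if (verb === undefined) return { kind: "unknown" };
  // No explicit filter word means "act on everything".
  const filter = canonical.find((t): t is Filter => !isVerb(t)) ?? "all";
  return { kind: verb, filter };
}
```

With these word lists, "run failed tests" and "Please execute the broken ones!" both resolve to a run command filtered to failed tests, while an utterance with no known verb falls through to unknown.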

How I Used Gemma 4

I'm using google/gemma-4-31B-it (the 31B Dense instruction-tuned variant), accessed via the Featherless AI OpenAI-compatible API. Gemma 4 powers two distinct use cases in this app:

1. 🧠 Test Case Generation (Text + Tool Calling)

When a user clicks "Generate Test Cases," the backend fetches their repo's file tree and a filtered set of source files from GitHub, then sends everything to Gemma 4 with a structured system prompt and a tool call definition for submit_test_cases.

// lib/featherless/client.ts
export const FEATHERLESS_TEXT_MODEL = "google/gemma-4-31B-it";

// lib/featherless/generateTests.ts
const response = await featherlessClient.chat.completions.create({
  model: FEATHERLESS_TEXT_MODEL,
  messages: [...messages], // system prompt plus the repo's file tree and source files
  tools: [TEST_CASE_TOOL_DEFINITION],
  // Force a submit_test_cases call so the reply arrives as structured arguments, not free text
  tool_choice: { type: "function", function: { name: "submit_test_cases" } },
});


The model returns a JSON payload with 5–8 test cases, each with a title, description, type (ui/auth/form/api/integration/edge-case), priority, targetRoute, targetFiles, and expectedResult. Because the prompt includes the actual file tree, the model can ground its output in your app's real structure, which makes it far less likely to hallucinate routes or files that don't exist.
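
For readers who have not written a tool definition before, here is a sketch of what one for submit_test_cases could look like in the OpenAI-compatible function-tool format. The field names come from the list above, but everything else is an assumption: the testCases wrapper key, the priority values, the descriptions, and the decision to express the 5–8 range as minItems/maxItems are mine, not copied from the repo.

```typescript
// Sketch only: the real TEST_CASE_TOOL_DEFINITION in the repo may differ.
const TEST_CASE_TYPES = ["ui", "auth", "form", "api", "integration", "edge-case"];
const PRIORITIES = ["low", "medium", "high"]; // assumed values

const TEST_CASE_FIELDS = [
  "title",
  "description",
  "type",
  "priority",
  "targetRoute",
  "targetFiles",
  "expectedResult",
];

const TEST_CASE_TOOL_DEFINITION = {
  type: "function",
  function: {
    name: "submit_test_cases",
    description: "Submit the generated test cases for this repository.",
    // The parameters object is plain JSON Schema.
    parameters: {
      type: "object",
      properties: {
        testCases: {
          type: "array",
          minItems: 5,
          maxItems: 8,
          items: {
            type: "object",
            properties: {
              title: { type: "string" },
              description: { type: "string" },
              type: { type: "string", enum: TEST_CASE_TYPES },
              priority: { type: "string", enum: PRIORITIES },
              targetRoute: { type: "string" },
              targetFiles: { type: "array", items: { type: "string" } },
              expectedResult: { type: "string" },
            },
            required: TEST_CASE_FIELDS,
            additionalProperties: false,
          },
        },
      },
      required: ["testCases"],
    },
  },
};
```

Listing every field as required and closing the object with additionalProperties: false gives the model less room to improvise, at the cost of rejecting otherwise useful extra fields.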

I chose 31B over the smaller variants because structured output fidelity matters here. A smaller model tends to drift from the tool-call schema or produce partial JSON, especially on larger repos with complex file trees. In my experience, the 31B model keeps its output on-schema even with multi-thousand-token inputs.
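
Whichever model you pick, it is worth checking the tool-call arguments before saving them, since schema drift and partial JSON are exactly the failure modes described above. The following is a defensive-parsing sketch under stated assumptions: parseToolArguments, the testCases wrapper key, and the knownFiles set are hypothetical names, and the repo may validate differently or not at all.

```typescript
// Hypothetical validator: names and payload shape are assumptions for illustration.
const VALID_TYPES = new Set(["ui", "auth", "form", "api", "integration", "edge-case"]);
const STRING_FIELDS = ["title", "description", "priority", "targetRoute", "expectedResult"];

interface GeneratedTestCase {
  title: string;
  description: string;
  type: string;
  priority: string;
  targetRoute: string;
  targetFiles: string[];
  expectedResult: string;
}

type ParseResult =
  | { ok: true; testCases: GeneratedTestCase[] }
  | { ok: false; error: string };

function isTestCase(value: unknown): value is GeneratedTestCase {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const type = record["type"];
  const files = record["targetFiles"];
  return (
    STRING_FIELDS.every((key) => typeof record[key] === "string") &&
    typeof type === "string" &&
    VALID_TYPES.has(type) &&
    Array.isArray(files) &&
    files.every((file) => typeof file === "string")
  );
}

// rawArguments is the JSON string carried by the tool call; knownFiles is the repo's file tree.
function parseToolArguments(rawArguments: string, knownFiles: Set<string>): ParseResult {
  let payload: unknown;
  try {
    payload = JSON.parse(rawArguments);
  } catch {
    return { ok: false, error: "arguments are not valid JSON" }; // e.g. truncated output
  }
  if (typeof payload !== "object" || payload === null) {
    return { ok: false, error: "arguments are not a JSON object" };
  }
  const cases = (payload as Record<string, unknown>)["testCases"];
  if (!Array.isArray(cases) || !cases.every(isTestCase)) {
    return { ok: false, error: "testCases is missing or off-schema" };
  }
  const testCases: GeneratedTestCase[] = cases;
  // Reject any test case that points at a file the repo does not contain.
  for (const testCase of testCases) {
    const missing = testCase.targetFiles.find((file) => !knownFiles.has(file));
    if (missing !== undefined) {
      return { ok: false, error: `unknown target file: ${missing}` };
    }
  }
  return { ok: true, testCases };
}
```

A failed result is a natural place to retry the generation call once before surfacing an error to the user.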

2. 👁️ Failure Vision Analysis (Multimodal)

When a test run fails, the Playwright script captures a screenshot at the point of failure. That screenshot gets passed to Gemma 4's vision capability:

// lib/featherless/analyzeScreenshot.ts
const response = await featherlessClient.chat.completions.create({
  model, // google/gemma-4-31B-it
  messages: [{
    role: "user",
    content: [
      { type: "text", text: `You are a QA engineer analyzing a browser test failure screenshot.\n\nTest case: ${testDescription}\n\nDescribe what is visible on the page, what likely went wrong, and suggest concrete next steps...` },
      { type: "image_url", image_url: { url: screenshotUrl } },
    ],
  }],
  max_tokens: 512,
});


The result is a 3–5 sentence diagnosis rendered directly in the test result card. Instead of "FAILED — element not found," you get something like: "The page shows a 404 error. The route /dashboard/analytics does not exist yet. The test references a navigation action that was likely removed in a recent commit. Suggest updating the target route or adding a redirect."
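
One detail the snippet above leaves open is where screenshotUrl comes from. A common option with OpenAI-compatible vision endpoints is to inline the PNG as a base64 data URL, so the image never needs public hosting. The sketch below assumes that approach and assumes the endpoint accepts data URLs; toBase64 and buildVisionMessage are hypothetical helpers, and the encoder is written by hand only to keep the example dependency-free (in Node you would normally reach for Buffer).

```typescript
// Sketch under assumptions: helper names are hypothetical, and data-URL support
// on the target endpoint should be verified before relying on this.
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode raw bytes as standard base64 with "=" padding.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i] ?? 0;
    const b1 = bytes[i + 1] ?? 0;
    const b2 = bytes[i + 2] ?? 0;
    const triple = (b0 << 16) | (b1 << 8) | b2;
    out += BASE64_ALPHABET.charAt((triple >> 18) & 63);
    out += BASE64_ALPHABET.charAt((triple >> 12) & 63);
    out += i + 1 < bytes.length ? BASE64_ALPHABET.charAt((triple >> 6) & 63) : "=";
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(triple & 63) : "=";
  }
  return out;
}

type VisionContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Build the user message: instructions first, then the failure screenshot.
function buildVisionMessage(testDescription: string, png: Uint8Array) {
  const content: VisionContentPart[] = [
    {
      type: "text",
      text: `You are a QA engineer analyzing a browser test failure screenshot.\n\nTest case: ${testDescription}`,
    },
    { type: "image_url", image_url: { url: `data:image/png;base64,${toBase64(png)}` } },
  ];
  return { role: "user" as const, content };
}
```

The returned object has the same shape as the single entry in the messages array above, so it can be passed straight through.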

This is the part that genuinely surprised me during development — Gemma 4's vision analysis is sharp. It doesn't just describe the page, it reasons about why a test targeting that page would fail.

Why 31B Dense?

  • Tool-calling reliability: Smaller models frequently drift from the structured output schema on larger repos with complex file trees. 31B stays consistent.
  • Long-context reasoning: Test generation prompts can be 3,000–5,000 tokens with file content. The 31B model handles this gracefully.
  • Vision quality on UI screenshots: Browser screenshots have dense UI elements. The 31B vision model correctly identifies components, error states, and layout issues that smaller models tend to miss or describe too generically.
  • Two-in-one: Using the same model family for both text and vision keeps the integration simple and the behavior predictable across both tasks.

Built solo during the Gemma 4 Challenge. I designed and shipped all of the infrastructure, prompting, voice pipeline, and UI myself.