I made Claude Code refuse to write code unless the ticket scores 80/100
Lex cano · 2026-05-23 · via DEV Community


I've been using Claude Code daily on real projects for several months. It's excellent. It's also a brilliant improviser — and that's the problem.

The failure mode

The recurring pattern looked like this:

  • I'd write a vague ticket. Claude Code would happily start coding.
  • It would ship something close to what I asked, but not quite.
  • It would touch files I hadn't anticipated.
  • Security and UI review lived in the same head as the implementation — so nothing caught the obvious.
  • Each session forgot what the last one learned. The same bugs reappeared in different forms.

The model wasn't wrong. The workflow was wrong. I was treating a senior pair programmer like a junior who'd guess if I was unclear. The result was a slow drift toward "vibe-coded" software — superficially impressive demos that broke when they met real users.

So I stopped treating each ticket as a prompt and started treating it as a spec. The methodology that came out is now open-source. It's called Forgekeel, and the core idea is a quality gate I named KERNEL.

KERNEL: a scored gate before any code

The premise: no code is written until the ticket passes a gate.

The gate scores every ticket against 6 orthogonal dimensions, 100 points total:

Dimension    Weight   Question
Clarity      20       Is the objective unambiguous in one sentence?
Scope        20       Are inclusions and exclusions explicit?
Context      15       Enough context (files, deps, prior decisions) to execute without asking?
Risk         15       Risks identified and mitigated? (auth, DB, migrations, deletes, PII)
Validation   15       Acceptance criteria verifiable and reproducible?
Priority     15       Does this advance an active goal or unblock a critical path?

The score decides what happens:

  • < 60 → reject. Return to me with concrete flags per dimension.
  • 60–79 → conditional. The architect subagent (Opus, read-only) drafts a plan before any write action.
  • ≥ 80 → execute. The builder subagent (Sonnet) proceeds directly.

The 6 dimensions are orthogonal, so a high score on one cannot compensate for a low score on another. A ticket with Clarity 20 and Risk 5 can still reach a high total, but the low Risk score forces an architect review before anything touches the DB or auth.
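To make the routing concrete, here is a minimal sketch of how the gate decision could be expressed. The weights and the 60/80 thresholds are the ones listed above. The `RISK_FLOOR` value and every name in the block (`decideGate`, `totalScore`, `Scores`) are assumptions made for illustration, since the rubric as described here does not say how low Risk must be to force a review.

```typescript
// Sketch of the KERNEL gate decision. Weights and thresholds come from the
// article; RISK_FLOOR and all identifiers are hypothetical.
type Dimension = "clarity" | "scope" | "context" | "risk" | "validation" | "priority";
type Scores = Record<Dimension, number>;
type Verdict = "reject" | "conditional" | "execute";

const WEIGHTS: Scores = {
  clarity: 20,
  scope: 20,
  context: 15,
  risk: 15,
  validation: 15,
  priority: 15,
};

// Assumption: a Risk score below this value always triggers an architect review.
const RISK_FLOOR = 8;

// Sum the six dimensions, rejecting any score outside its weight.
function totalScore(scores: Scores): number {
  let total = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    const value = scores[dim];
    if (value < 0 || value > WEIGHTS[dim]) {
      throw new RangeError(`${dim} must be between 0 and ${WEIGHTS[dim]}, got ${value}`);
    }
    total += value;
  }
  return total;
}

function decideGate(scores: Scores): Verdict {
  const total = totalScore(scores);
  if (total < 60) return "reject";
  // A high total cannot buy back a low Risk score: plan first, then build.
  if (total < 80 || scores.risk < RISK_FLOOR) return "conditional";
  return "execute";
}

// Clarity 20 with Risk 5: the total is 85, but the verdict is downgraded.
const lopsided: Scores = {
  clarity: 20,
  scope: 20,
  context: 15,
  risk: 5,
  validation: 12,
  priority: 13,
};
console.log(totalScore(lopsided), decideGate(lopsided)); // 85 conditional
```

Keeping the Risk check separate from the total is what stops a strong Clarity score from masking a weak Risk score.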

The score is not a vanity metric. The line that ended up at the top of the rubric file:

A ticket that scores 95 and breaks production is worse than one that scored 65, was flagged, and got a plan first.

A concrete example

Here's a ticket I rejected on myself: "Add password reset email."

KERNEL pass:

Clarity      15 / 20   "Reset" undefined — link only, or full flow + template?
Scope        10 / 20   No mention of expiry, rate limits, or email provider
Context       8 / 15   Doesn't say which auth provider or where templates live
Risk          5 / 15   Auth flow + email = high risk, not addressed
Validation    8 / 15   "Make it work" isn't testable
Priority     12 / 15   Active KPI
                       ─────
                       58 / 100   → REJECT


Forgekeel refused to touch code. I rewrote the ticket as I would have if a junior asked me to spec it properly:

Clarity      19 / 20   Add `/reset-password` flow: email link with 30-min expiry token
Scope        18 / 20   This ticket: link email only. Excluded: SMS, recovery codes
Context      14 / 15   Supabase Auth `resetPasswordForEmail`. Template in /emails/
Risk         13 / 15   Rate-limit 3/h per email. No token leak in logs. RLS audit
Validation   13 / 15   Cypress: request reset, click link, set new password, login
Priority     13 / 15   Unblocks login retention KPI
                       ─────
                       90 / 100   → EXECUTE


Same problem, two tickets, two outcomes. The first version would have shipped something. Probably without rate limits, without auditing whether the token could leak into logs, with "manual testing" as the validation step. The second version forces every gap shut before the IDE opens.
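The arithmetic behind the two scorecards is easy to check mechanically. The sketch below totals each one and lists the dimensions that would deserve a flag on rejection. The scores are copied from the scorecards above. The `MIN_RATIO` cut-off of 60% of a dimension's weight and the names `Line`, `total`, and `weakDimensions` are assumptions for illustration, not part of the published rubric.

```typescript
// Sketch: total a KERNEL scorecard and list the dimensions worth flagging.
// MIN_RATIO is a hypothetical cut-off; the scores come from the article.
interface Line {
  dimension: string;
  score: number;
  max: number;
}

const MIN_RATIO = 0.6;

function total(card: Line[]): number {
  return card.reduce((sum, line) => sum + line.score, 0);
}

// A dimension is flagged when it earns less than minRatio of its weight.
function weakDimensions(card: Line[], minRatio: number = MIN_RATIO): string[] {
  return card
    .filter((line) => line.score / line.max < minRatio)
    .map((line) => `${line.dimension} ${line.score}/${line.max}`);
}

const vagueTicket: Line[] = [
  { dimension: "Clarity", score: 15, max: 20 },
  { dimension: "Scope", score: 10, max: 20 },
  { dimension: "Context", score: 8, max: 15 },
  { dimension: "Risk", score: 5, max: 15 },
  { dimension: "Validation", score: 8, max: 15 },
  { dimension: "Priority", score: 12, max: 15 },
];

const specTicket: Line[] = [
  { dimension: "Clarity", score: 19, max: 20 },
  { dimension: "Scope", score: 18, max: 20 },
  { dimension: "Context", score: 14, max: 15 },
  { dimension: "Risk", score: 13, max: 15 },
  { dimension: "Validation", score: 13, max: 15 },
  { dimension: "Priority", score: 13, max: 15 },
];

// First ticket: total 58, with Scope, Context, Risk, and Validation flagged.
console.log(total(vagueTicket), weakDimensions(vagueTicket));
// Second ticket: total 90, nothing flagged.
console.log(total(specTicket), weakDimensions(specTicket));
```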

This is what KERNEL really is: the difference between debugging in your editor and debugging in production.

Around the gate

KERNEL is the most distinctive piece, but the methodology has more:

  • 7 specialized subagents with enforced read-only vs write roles. architect, security-auditor, and ui-reviewer never modify files — their job is to think, not type. builder, tester, and designer can write, but only after the read-only review.
  • A per-project constitution declaring stack, non-negotiable principles, design tokens, allowed/forbidden MCPs. Every agent reads it before acting. Without it, /design-iterate refuses to run because the designer would hallucinate values.
  • A learnings loop. Every closed ticket appends "what went wrong / how it was resolved / what not to repeat" to a learnings.md file. By ticket 20, the project has a written history of bugs it should never re-introduce. Future sessions read it as part of context.
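The learnings loop in the last bullet can be sketched as an append-only log. Only the `learnings.md` file name and the three questions come from the description above. The entry layout, the `Learning` shape, and the function names are assumptions for illustration and may not match how Forgekeel actually writes the file.

```typescript
import { appendFileSync, existsSync, readFileSync } from "node:fs";

// Sketch of the learnings loop. The entry layout is an assumption; only the
// file name and the three questions are taken from the article.
interface Learning {
  ticket: string;
  wentWrong: string;
  resolution: string;
  doNotRepeat: string;
}

// Render one closed ticket as a short block of text.
function formatLearning(entry: Learning, closedOn: string): string {
  return [
    `## ${entry.ticket} (${closedOn})`,
    `- What went wrong: ${entry.wentWrong}`,
    `- How it was resolved: ${entry.resolution}`,
    `- What not to repeat: ${entry.doNotRepeat}`,
    "",
  ].join("\n");
}

// Append-only: closing a ticket adds to the history and rewrites nothing.
function recordLearning(path: string, entry: Learning, closedOn: string): void {
  appendFileSync(path, formatLearning(entry, closedOn) + "\n", "utf8");
}

// Later sessions load the whole file as context; no file means no history yet.
function loadLearnings(path: string): string {
  return existsSync(path) ? readFileSync(path, "utf8") : "";
}
```

A session would call `recordLearning("learnings.md", entry, date)` when a ticket closes and `loadLearnings("learnings.md")` when the next one starts.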

Full breakdown in the README.

Why it's stack-locked

Forgekeel is opinionated and stack-locked to Next.js + Supabase + TypeScript + Tailwind v4 + shadcn/ui + pnpm.

That's deliberate. The agents reference specific patterns (Server Actions, RLS on every table, Tailwind @theme tokens). If I made them stack-agnostic, the methodology would become advice instead of execution. Adapting to a different stack means editing the agents directly — there's no abstraction layer planned.

What I'm looking for

Forgekeel is MIT-licensed and at v0.1.0. It was used internally on real projects before this release.

Repo: https://github.com/forgekeel/forgekeel
npm: https://www.npmjs.com/package/forgekeel

I'd genuinely value feedback on:

  • The KERNEL rubric — would you weight the dimensions differently? Blind spots in the 6?
  • Anyone running similar structured workflows on Claude Code — what worked, what didn't?
  • The subagent setup — what's missing for your daily loop?

If KERNEL stops one ticket from shipping broken to your users, my week was worth it.