Your AI Sucks at Math. Fix It With One Command.
Chenrui Hu · 2026-05-31 · via DEV Community


You've seen this before.

You ask your AI agent: "Find ∫ x·e^x dx"

It confidently replies: e^x + C, complete with a plausible-looking derivation. You nod. Then you check — the correct answer is (x−1)·e^x + C. It was wrong by a mile, and you almost shipped it.
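For reference, the correct result follows from a single integration by parts (taking u = x and dv = e^x dx), and differentiating both candidates shows which one actually recovers the integrand:

```latex
\begin{aligned}
\int x e^{x}\,dx &= x e^{x} - \int e^{x}\,dx = x e^{x} - e^{x} + C = (x-1)e^{x} + C,\\
\frac{d}{dx}\Big[(x-1)e^{x}\Big] &= e^{x} + (x-1)e^{x} = x e^{x},\\
\frac{d}{dx}\Big[e^{x}\Big] &= e^{x} \neq x e^{x} \quad (\text{for } x \neq 1).
\end{aligned}
```

Note that the wrong answer happens to agree with the integrand at x = 1, which is why a single spot check is not enough.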

This is the fundamental problem with AI math today: LLMs can talk, but they can't verify their own work. They sound convincing while being catastrophically wrong. And the more complex the problem, the more convincing the hallucination.

Math.skill changes that. It's an open-source mathematical reasoning skill for AI agents — install it, and your agent stops guessing and starts verifying.


What Makes It Different

| | Typical AI Math Plugin | Math.skill |
|---|---|---|
| Workflow | Prompt → LLM → answer | Prompt → 7-step pipeline → ≥2 verifications → answer |
| Verification | None | Answer blocked if verification fails |
| Open problems | Might hallucinate a "solution" | Honestly says "this is unsolved" |
| Error recovery | No mechanism | Auto-backtrack, fix, recompute, re-verify |

The core differentiator: a verification engine that runs at least 2 of 11 independent checks on every answer. No answer leaves the pipeline unverified. Period.
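The "blocked if verification fails" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `release_if_verified` and the two checks are hypothetical names made up for this example, not Math.skill's actual interface.

```python
from typing import Callable, Optional, Sequence

Check = Callable[[float], bool]


def release_if_verified(
    answer: float, checks: Sequence[Check], minimum: int = 2
) -> Optional[float]:
    """Return the answer only if at least `minimum` checks ran and all passed."""
    if len(checks) < minimum:
        raise ValueError(f"need at least {minimum} independent checks")
    if all(check(answer) for check in checks):
        return answer
    return None  # blocked: the answer never leaves the gate


# Toy problem (an assumption for this sketch): solve x^2 = 9 with x > 0.
checks = [
    lambda x: abs(x * x - 9) < 1e-9,  # back-substitution
    lambda x: x > 0,                  # domain check
]

print(release_if_verified(3.0, checks))   # passes both checks
print(release_if_verified(-3.0, checks))  # fails the domain check, so it is blocked
```

The gate refuses to run with fewer than two checks, which mirrors the "at least 2" rule described above.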


The 7-Step Pipeline

Every problem flows through this:

| Step | What Happens | Why It Matters |
|---|---|---|
| 1. Parse | Extract conditions, goals, variables, implicit domain constraints | Catches misread problems before they waste your time |
| 2. Model | Build formal representation: equation, function, matrix, probability space, etc. | Prevents building the wrong mathematical structure |
| 3. Select | Choose the optimal method from 30+ strategies | Avoids brute-forcing when elegance exists |
| 4. Solve | Step-by-step with mathematical justification at every transformation | Full traceability; nothing hidden |
| 5. Verify | Apply ≥2 of 11 independent verification methods | The differentiator: catches what LLMs miss |
| 6. Correct | If verification fails: backtrack to last known-good step, fix, recompute, re-verify | No "doubling down" on wrong answers |
| 7. Deliver | Exact answer (not approximate), domain conditions, verification summary | You know it's right, and you know why |
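To make steps 5 and 6 concrete, here is a rough Python sketch of a verify-then-correct loop on the inequality -2x < 6. Everything in it is an assumption for illustration: the function names are hypothetical, and "correction" is simplified to trying the next candidate rather than backtracking through a real solution trace.

```python
from typing import Callable, List, Optional, Tuple

Predicate = Callable[[float], bool]

SAMPLES = [-10.0, -4.0, -2.0, 0.0, 5.0]


def original(x: float) -> bool:
    """The problem as posed: -2x < 6."""
    return -2 * x < 6


def agrees_on_samples(candidate: Predicate) -> bool:
    """Numerical sampling: candidate and original must agree at every sample."""
    return all(candidate(x) == original(x) for x in SAMPLES)


def excludes_boundary(candidate: Predicate) -> bool:
    """Boundary analysis: x = -3 gives equality, so a strict solution excludes it."""
    return not original(-3.0) and not candidate(-3.0)


def solve_verify_correct(attempts: List[Tuple[str, Predicate]]) -> Optional[str]:
    """Return the first candidate that passes both checks, or None if all fail."""
    for label, candidate in attempts:
        results = [agrees_on_samples(candidate), excludes_boundary(candidate)]
        if all(results):
            return label
    return None


attempts = [
    ("x < -3", lambda x: x < -3),  # forgot to flip the sign when dividing by -2
    ("x > -3", lambda x: x > -3),  # corrected after the first attempt fails
]

print(solve_verify_correct(attempts))  # → x > -3
```

The first attempt fails numerical sampling (at x = -10 it claims a solution where the original inequality is false), so the loop moves on instead of delivering it.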

The Verification Engine: 11 Independent Methods

This is the heart of Math.skill. Each method catches a different class of errors:

| ID | Method | What It Catches |
|---|---|---|
| A | Back-substitution | Extraneous roots, sign errors (plug the answer back in) |
| B | Domain check | Division by zero, negative radicands, log(0), arcsin(2) |
| C | Boundary analysis | Missed interval endpoints, parameter edge cases |
| D | Reverse derivation | Irreversible step errors (work backwards from the answer) |
| E | Numerical sampling | Coefficient drift, off-by-factor errors (test with specific values) |
| F | Dimensional analysis | Unit mismatches, P > 1, variance < 0 |
| G | Limits & special cases | Degenerate behavior as parameters approach 0 or ∞ |
| H | Cross-validation | Solve with a completely different independent method |
| I | Counterexample search | Disprove false universal claims by construction |
| J | Formal logic check | ∀∃ order errors, necessary vs. sufficient, circular reasoning |
| K | Computational consistency | det(A−λI) = 0, total probability = 1, trace = sum of eigenvalues |
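As a small illustration of methods A and B, consider the equation √(2x + 3) = x. Squaring both sides gives x² − 2x − 3 = 0, which has two roots, but only one survives back-substitution. The sketch below is an assumed example, not code from Math.skill; the helper names are made up.

```python
import math
from typing import List


def candidates_after_squaring() -> List[float]:
    """Roots of x^2 - 2x - 3 = 0, obtained by squaring sqrt(2x + 3) = x."""
    a, b, c = 1.0, -2.0, -3.0
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


def in_domain(x: float) -> bool:
    """Method B: the radicand 2x + 3 must be non-negative."""
    return 2 * x + 3 >= 0


def back_substitute(x: float) -> bool:
    """Method A: plug x into the original, unsquared equation."""
    return math.isclose(math.sqrt(2 * x + 3), x, abs_tol=1e-9)


candidates = candidates_after_squaring()
verified = [x for x in candidates if in_domain(x) and back_substitute(x)]

print(candidates)  # → [-1.0, 3.0]
print(verified)    # → [3.0]
```

Both roots pass the domain check, so it is the back-substitution that exposes x = −1 as extraneous: √1 = 1, not −1. That is one reason to run more than one kind of check.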

At least two methods per problem. The engine selects which ones based on the problem type. You don't have to think about it — it just works.
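Applied to the opening example, two independent checks might look like the following. This is a sketch using only Python's standard library; the function names, sample points, and tolerances are assumptions chosen for illustration, and the checks are numerical stand-ins for reverse derivation and cross-validation rather than Math.skill's own implementation.

```python
import math
from typing import Callable, Sequence

Func = Callable[[float], float]


def integrand(x: float) -> float:
    return x * math.exp(x)


def derivative_matches(F: Func, samples: Sequence[float],
                       h: float = 1e-5, tol: float = 1e-6) -> bool:
    """Check F'(x) against the integrand using a central difference."""
    return all(
        abs((F(x + h) - F(x - h)) / (2 * h) - integrand(x)) < tol
        for x in samples
    )


def simpson(f: Func, a: float, b: float, n: int = 200) -> float:
    """Composite Simpson's rule with an even number of subintervals n."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def definite_integral_matches(F: Func, a: float = 0.0, b: float = 1.0,
                              tol: float = 1e-6) -> bool:
    """Check F(b) - F(a) against a numerical integral of the integrand."""
    return abs((F(b) - F(a)) - simpson(integrand, a, b)) < tol


def wrong(x: float) -> float:
    return math.exp(x)


def right(x: float) -> float:
    return (x - 1) * math.exp(x)


samples = [-1.0, 0.0, 0.5, 2.0]
for name, F in [("e^x", wrong), ("(x-1)e^x", right)]:
    print(name, derivative_matches(F, samples), definite_integral_matches(F))
```

The candidate e^x fails both checks, while (x−1)·e^x passes both. The sample points deliberately avoid relying on x = 1 alone, where the wrong answer's derivative happens to match the integrand.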


34 Math Categories. One Skill.

Math.skill covers everything from arithmetic to abstract algebra. Each category has its own verification protocol and common-error checklist:

Arithmetic · Algebra · Equations/Inequalities · Functions
Geometry · Trigonometry · Sequences · Combinatorics
Probability/Statistics · Limits · Differentiation · Integration
Multivariable Calculus · Linear Algebra · ODEs
Complex Analysis · Real Analysis · Abstract Algebra
Topology · Number Theory · Discrete Math · Optimization
Mathematical Modeling · Proofs · Counterexamples
Solution Checking · Problem Generation · Research-Level Problems


This is not one-size-fits-all. Each category gets targeted handling.


It Won't Lie About Unsolved Problems

Ask it to "prove the Riemann Hypothesis" and you won't get a hallucinated Fields Medal-worthy breakthrough. You'll get:

"This is a known open problem. Here's what I can provide: partial results, known bounds, and why this remains unsolved."

Honesty is the baseline. If a problem is open, it says so. If it can only give partial results, it clearly labels what's proven vs. conjectured.


Preemptive Error Prevention: 8 Guard Categories

The most common AI math failures are blocked before they happen:

  • Algebra: Check division by zero before dividing. Verify roots after squaring. Re-expand after factoring.
  • Inequalities: Sign reversal on multiply-by-negative. Case analysis for variable expressions.
  • Functions: Find domain first. Distinguish critical points from extrema. Check non-differentiable points.
  • Probability: Reject P ∉ [0,1]. Reject negative variance. Verify total probability = 1.
  • Calculus: Verify L'Hôpital conditions. State Taylor remainder order. Always add +C. Check improper integral convergence.
  • Linear Algebra: Check matrix dimensions. Verify Av = λv. Verify A = PDP⁻¹.
  • Geometry: Don't rely on visual intuition. State theorem conditions explicitly. Explain auxiliary constructions.
  • Abstract Math: Verify all definition components. Check quantifier order (∀ε∃δ ≠ ∃δ∀ε). Verify well-definedness.
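A couple of these guards are easy to picture as code. The sketch below covers the probability and linear algebra items only; the function names are hypothetical and the tolerances are arbitrary choices for this example, not values taken from Math.skill.

```python
import math
from typing import Sequence


def check_distribution(probs: Sequence[float]) -> None:
    """Reject P outside [0, 1] and distributions that do not sum to 1."""
    if any(p < 0 or p > 1 for p in probs):
        raise ValueError("probability outside [0, 1]")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("total probability is not 1")


def variance(values: Sequence[float], probs: Sequence[float]) -> float:
    """Variance of a discrete random variable, guarded before and after."""
    check_distribution(probs)
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    if var < 0:
        raise ValueError("negative variance")
    return var


def is_eigenpair(A: Sequence[Sequence[float]], v: Sequence[float],
                 lam: float, tol: float = 1e-9) -> bool:
    """Verify Av = lam * v for a nonzero vector v."""
    if all(x == 0 for x in v):
        return False  # the zero vector is never an eigenvector
    Av = [sum(a * x for a, x in zip(row, v)) for row in A]
    return all(abs(av - lam * x) <= tol for av, x in zip(Av, v))


A = [[2.0, 1.0], [1.0, 2.0]]
print(is_eigenpair(A, [1.0, 1.0], 3.0))   # → True
print(is_eigenpair(A, [1.0, 1.0], 2.0))   # → False
print(variance([0.0, 1.0], [0.5, 0.5]))   # → 0.25
```

Running the guard before the computation means an invalid distribution raises an error instead of producing a plausible-looking number.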

One Command to Install

npx skills add Wholiver/Math.Skill


That's it. No config. No API keys. No dependencies to wrestle with.

Works with: Claude Code · GitHub Copilot · Cursor · Windsurf · Codex · OpenCode — any AI agent that supports skills.sh.

MIT Licensed. Free to use. Free to modify. Free to ship with your product.


Who Is This For?

  • Students — homework help with verified solutions. Learn the how and the why, not just the answer.
  • Teachers — generate well-posed problems with full solutions. Check student answers against verified references.
  • Researchers — quickly validate intermediate derivations. Catch errors before they propagate into your paper.
  • Developers — if your AI coding agent touches math, stop it from hallucinating incorrect calculations.
  • Everyone who's been burned by AI math — you know the feeling. This is the antidote.

The Bottom Line

Your AI agent is brilliant at many things. Math isn't one of them — unless you give it the right tools.

Math.skill gives your agent what it's missing: a mathematician's discipline. Parse, model, solve, verify, correct, deliver. Every time. No exceptions.

"One question. A verified answer."

npx skills add Wholiver/Math.Skill


GitHub → Wholiver/Math.Skill