I open-sourced a World Cup 2026 prediction model — and tested it honestly
Jerry Chen · 2026-05-31 · via DEV Community

Every World Cup, "supercomputer predicts the winner" headlines show up everywhere — and almost none of them let you see how the sausage is made. I wanted a forecast I could actually read, run, and argue with. So I built one for the 2026 World Cup, and I open-sourced the whole thing:

👉 github.com/Hicruben/world-cup-2026-prediction-model (MIT)

No machine-learning black box, no scraped bookmaker odds — just three classic, transparent pieces. And, more importantly, an honest, reproducible test of how good it actually is.

The model in three layers

1. Team strength (Elo). Every nation gets an Elo rating, seeded from long-run strength and then calibrated on hundreds of recent real internationals. Wins over strong sides in important games move a rating more than friendlies; recent form outweighs old form.
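
To make the rating mechanics concrete, here is a minimal sketch of an Elo update with a match-importance weight. It is not the repo's `elo.mjs`: the function names, the K-factor of 20, and the `importance` multiplier are assumptions for illustration, and the 400-point scale is just the conventional Elo choice.

```javascript
// Expected score of side A against side B on the conventional 400-point Elo scale.
function expectedScore(ratingA, ratingB) {
  return 1 / (1 + 10 ** ((ratingB - ratingA) / 400));
}

// One rating update. scoreA is 1 for a win, 0.5 for a draw, 0 for a loss.
// k and importance are hypothetical knobs: a friendly might use importance 1,
// a tournament match something larger.
function updateElo(ratingA, ratingB, scoreA, { k = 20, importance = 1 } = {}) {
  const delta = k * importance * (scoreA - expectedScore(ratingA, ratingB));
  // The update is zero-sum: what A gains, B loses.
  return { ratingA: ratingA + delta, ratingB: ratingB - delta };
}

// An underdog win moves the rating more than a favourite's win.
const upset = updateElo(1900, 2000, 1);
const routine = updateElo(2000, 1900, 1);
console.log(upset.ratingA - 1900 > routine.ratingA - 2000); // true
```

Because updates are applied match by match in date order, later results sit on top of earlier ones, which is one simple way recent form can come to outweigh old form.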

2. Each match (Dixon-Coles bivariate Poisson). The two ratings become expected goals, which feed a Dixon-Coles model to produce win/draw/loss probabilities. Dixon-Coles (1997) fixes a well-known flaw of the plain independent-Poisson model: it under-counts the low-scoring draws (0-0, 1-1) that are so common in football.

import { matchProb } from "./elo.mjs";

// Elo 2056 vs Elo 1951, neutral venue
const p = matchProb(2056, 1951);
// → { winA: 0.45, draw: 0.26, winB: 0.29, expectedGoalsA: 1.6, expectedGoalsB: 1.2 }
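
For a feel of what the Dixon-Coles correction does, here is a minimal sketch that starts from two expected-goal values and sums a corrected score grid into win/draw/loss probabilities. It is not the repo's `matchProb`: the function names, the illustrative `rho = -0.1`, and the 10-goal cutoff are assumptions, and how ratings turn into expected goals is left out entirely.

```javascript
// Poisson probability of exactly k goals given an expected-goals rate.
function poissonPmf(k, rate) {
  let p = Math.exp(-rate);
  for (let i = 1; i <= k; i++) p *= rate / i;
  return p;
}

// Dixon-Coles low-score adjustment. Only 0-0, 0-1, 1-0 and 1-1 are touched;
// x is side A's goals, y is side B's goals.
function tau(x, y, lambda, mu, rho) {
  if (x === 0 && y === 0) return 1 - lambda * mu * rho;
  if (x === 0 && y === 1) return 1 + lambda * rho;
  if (x === 1 && y === 0) return 1 + mu * rho;
  if (x === 1 && y === 1) return 1 - rho;
  return 1;
}

// Sum the corrected score grid into three outcome probabilities.
// rho and maxGoals are illustrative defaults, not fitted values.
function dixonColesOutcome(lambda, mu, rho = -0.1, maxGoals = 10) {
  let winA = 0;
  let draw = 0;
  let winB = 0;
  for (let x = 0; x <= maxGoals; x++) {
    for (let y = 0; y <= maxGoals; y++) {
      const p = tau(x, y, lambda, mu, rho) * poissonPmf(x, lambda) * poissonPmf(y, mu);
      if (x > y) winA += p;
      else if (x === y) draw += p;
      else winB += p;
    }
  }
  // Renormalise to absorb the small mass lost by truncating at maxGoals.
  const total = winA + draw + winB;
  return { winA: winA / total, draw: draw / total, winB: winB / total };
}

// Same expected goals as the example above, with and without the correction.
console.log(dixonColesOutcome(1.6, 1.2, 0));
console.log(dixonColesOutcome(1.6, 1.2, -0.1));
```

With a negative `rho`, probability shifts from 1-0 and 0-1 into 0-0 and 1-1, so the draw share rises relative to independent Poisson.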

3. The tournament (Monte Carlo). Play all 104 matches through the real bracket 10,000 times. Count how often each team reaches each round → championship and advancement probabilities.
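
The counting idea can be sketched on a toy bracket. This is a deliberately reduced version, not the repo's simulator: it plays a 4-team single-elimination bracket only (no group stage, no best-third ranking), decides each tie directly from an Elo-style win probability, and uses a small seeded generator so runs repeat. The team names, ratings, and every function name are hypothetical.

```javascript
// Small seeded PRNG (mulberry32) so the simulation is reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Assumed knockout win probability: plain Elo logistic, draws ignored.
function winProb(ratingA, ratingB) {
  return 1 / (1 + 10 ** ((ratingB - ratingA) / 400));
}

// Play one bracket to the end. Adjacent entries meet; length must be a power of two.
function playKnockout(bracket, ratings, random) {
  let round = bracket;
  while (round.length > 1) {
    const next = [];
    for (let i = 0; i < round.length; i += 2) {
      const a = round[i];
      const b = round[i + 1];
      next.push(random() < winProb(ratings[a], ratings[b]) ? a : b);
    }
    round = next;
  }
  return round[0];
}

// Repeat the bracket many times and report each team's share of titles.
function simulateTitles(bracket, ratings, runs = 10000, seed = 1) {
  const random = mulberry32(seed);
  const titles = Object.fromEntries(bracket.map((team) => [team, 0]));
  for (let i = 0; i < runs; i++) titles[playKnockout(bracket, ratings, random)] += 1;
  return Object.fromEntries(Object.entries(titles).map(([team, n]) => [team, n / runs]));
}

const toyRatings = { A: 2064, B: 1994, C: 1900, D: 1800 }; // made-up values
console.log(simulateTitles(["A", "D", "B", "C"], toyRatings));
```

Tracking advancement to each round works the same way: keep one counter per team per round instead of one per title.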

There's a tiny CLI to poke at it:

$ node predict.mjs brazil argentina

  brazil (Elo 1994)  vs  argentina (Elo 2064)   [neutral]
  brazil           win   26.7%  ████████
  draw                   28.3%  █████████
  argentina        win   45.0%  █████████████

The part I actually care about: is it any good?

Anyone can spit out percentages. The hard question is whether they mean anything. So I tested it the honest way — walk-forward, out-of-sample. The script steps through 920 real internationals (Oct 2023 → May 2026) in date order, predicts each match using only data available before kickoff, then reveals the result and updates the ratings. No hindsight, no curve-fitting. One command reproduces it:

$ node backtest.mjs

=== Walk-forward backtest — 770 of 920 matches ===
MODEL
  Accuracy (top pick):   61.0%
  Favourite acc (p≥50%): 66.8%
  Brier (3-way, ↓):      0.536
BASELINES (same matches)
  Always pick home:      48.6%
  Coin-flip (uniform):   Brier 0.667

So: ~61% correct on a three-way (win/draw/loss) outcome, versus 49% for "always pick home" and ~33% for a uniform random guess among the three outcomes. When the model had a clear favourite (p ≥ 50%), it was right about two times in three. The Brier score (0.54 vs 0.67 for a uniform forecast; lower is better) says the probabilities carry real information, not just the top pick.
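
The walk-forward loop and the three-way Brier score are short enough to sketch. This is an illustration of the procedure described above, not the contents of `backtest.mjs`: the match shape (`date`, `home`, `away`, `result`), the `model.predict` / `model.update` interface, and all function names are assumptions.

```javascript
const OUTCOMES = ["home", "draw", "away"];

// Three-way Brier score for one match: squared error summed over the outcomes.
function brier3(probs, result) {
  return OUTCOMES.reduce((sum, key) => {
    const actual = key === result ? 1 : 0;
    return sum + (probs[key] - actual) ** 2;
  }, 0);
}

// The outcome with the highest forecast probability (ties go to the earlier key).
function topPick(probs) {
  return OUTCOMES.reduce((best, key) => (probs[key] > probs[best] ? key : best));
}

// Walk forward in date order: predict first, score, and only then reveal the result.
function walkForward(matches, model) {
  const ordered = [...matches].sort((a, b) => (a.date < b.date ? -1 : a.date > b.date ? 1 : 0));
  let correct = 0;
  let brierSum = 0;
  for (const match of ordered) {
    const probs = model.predict(match.home, match.away); // sees past matches only
    if (topPick(probs) === match.result) correct += 1;
    brierSum += brier3(probs, match.result);
    model.update(match); // the result becomes available to later predictions
  }
  return {
    matches: ordered.length,
    accuracy: correct / ordered.length,
    brier: brierSum / ordered.length,
  };
}

// Baseline that always forecasts one third for each outcome.
const uniformModel = {
  predict: () => ({ home: 1 / 3, draw: 1 / 3, away: 1 / 3 }),
  update: () => {},
};
```

Under this definition a uniform forecast scores (2/3)² + 2·(1/3)² = 2/3 ≈ 0.667 on every match, which is the coin-flip baseline shown in the backtest output.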

What I learned (and what I won't claim)

  • It is not state-of-the-art, and it does not beat the betting market. A 61% hit rate also means ~2 in 5 matches surprise it, by design. Draws are genuinely the hardest thing to predict, and a tournament whose champion plays only 8 games is dominated by variance.
  • Transparent baselines are underrated. No deep learning, ~300 lines of plain Node, zero dependencies — and it still lands in the same ballpark as far fancier models for tournament-level questions.
  • Calibration > accuracy. Getting the probabilities shaped right matters more than the headline hit rate, especially for a bracket simulation.
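
One common way to look at calibration, as opposed to hit rate, is a reliability table: group forecasts by predicted probability and compare each group's average prediction with how often the event actually happened. The sketch below assumes a hypothetical input of `{ p, hit }` pairs (for example, the probability given to the top pick and whether it came in); the function name and the five-bin default are illustrative, and the repo may do none of this.

```javascript
// Bucket forecasts into equal-width probability bins and compare
// mean predicted probability with observed frequency in each bin.
function calibrationTable(forecasts, binCount = 5) {
  const bins = Array.from({ length: binCount }, () => ({ n: 0, sumP: 0, hits: 0 }));
  for (const { p, hit } of forecasts) {
    // p = 1 would land one past the end, so clamp it into the last bin.
    const index = Math.min(binCount - 1, Math.floor(p * binCount));
    bins[index].n += 1;
    bins[index].sumP += p;
    bins[index].hits += hit ? 1 : 0;
  }
  return bins.map((bin, i) => ({
    from: i / binCount,
    to: (i + 1) / binCount,
    n: bin.n,
    meanPredicted: bin.n ? bin.sumP / bin.n : null, // null marks an empty bin
    observed: bin.n ? bin.hits / bin.n : null,
  }));
}
```

A well-calibrated model has `meanPredicted` close to `observed` in every populated bin; a model can post a decent hit rate and still be far off here.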

Try it / see it live

Clone it and run the backtest yourself (Node 18+, no deps):

git clone https://github.com/Hicruben/world-cup-2026-prediction-model.git
cd world-cup-2026-prediction-model
node backtest.mjs      # reproduce the numbers
node predict.mjs spain germany

The full 48-team tournament simulator (10k sims, live title odds, an interactive bracket) runs the same engine at cup26matches.com, and there's a plain-English write-up of the methodology and the backtest here.

I'd genuinely love feedback on the modelling — the Dixon-Coles ρ, the home-field handling, the best-third tiebreaks. Tear it apart in the comments or open an issue. ⭐ the repo if it's useful!