Making Equation (2.2) of the OpenAI Erdős Result Executable
Kwansub Yun · 2026-05-26 · via DEV Community

Why a proved theorem still needs reproducible claim custody

On May 20, 2026, OpenAI announced that an internal reasoning model had produced a counterexample to the Erdős planar unit-distance conjecture.

The problem is easy to state: given $n$ points in the plane, how many pairs of points can be exactly distance $1$ apart?

For nearly eighty years, the prevailing expectation was that square-grid-type constructions were essentially optimal up to a slowly growing exponent. OpenAI’s announcement changed that. Its internal reasoning model produced an infinite family of examples giving a polynomial improvement, and the proof was checked and written up in mathematical form by external mathematicians.

In this article, “the remarks paper” refers to the companion PDF by Alon, Bloom, Gowers, Litt, Sawin, Shankar, Tsimerman, Wang, and Matchett Wood, linked from OpenAI’s announcement.

The proof-level result belongs to those authors and the source papers.

My focus here is narrower: equation (2.2) in that remarks paper, and whether its explicit numerical value can be reproduced as executable code.

This is not about proving the theorem again. It is about what happens after a theorem contains a fragile numerical claim.


The proof is not the artifact

A mathematical proof and a software artifact do different jobs.

The proof establishes the theorem. It gives the definitions, the argument, the dependencies, and the mathematical reason why the result holds.

A software artifact should not pretend to replace that.

But some claims inside a mathematical paper have a finite, numerical, or computationally checkable surface. Those claims can be preserved differently. They can be run. They can be tested. They can fail when precision is wrong.

That is the narrow role of an executable reproduction artifact: not proof replacement, not automated peer review, and not authority over the theorem, but a reproducible object for the part of the claim that can be computed.


The specific target: equation (2.2)

In the OpenAI Erdős result, one checkable surface is equation (2.2) of the remarks paper.

For the explicit choice

math1

the remarks paper gives an explicit numerical lower bound on the exponent excess above the classical Erdős exponent:

$$\approx 6.24 \times 10^{-38}$$

These parameters are taken directly from the remarks paper without modification. The artifact does not derive the multiquadratic choice; it reproduces the finite numerical calculation built from that choice.

This is not the later stronger explicit bound associated with Sawin’s separate preprint. It is not $\delta \approx 0.014$. It is the numerical value appearing in equation (2.2) of the remarks paper.

That narrowness is important. It is exactly what makes the claim suitable for executable reproduction.


Where the numerical fragility comes from

The numerical fragility comes from the exact form of equation (2.2), not from a large computation.

Immediately after the published expression, the parameters are:

$$u = \frac{K}{r^2}, \qquad v = \frac{r}{2}$$

and

$$\delta = 101^{-2K}$$

With the paper’s definitions of $u, v$, and $\delta$ substituted into equation (2.2), the exponent excess reduces to:

$$\frac{\log\left(\dfrac{K\pi}{18r^3}\right)}{\log 36 + 4K\log 101}$$

The constant $36$ is not introduced by the implementation. It is already present in the remarks paper’s equation (2.2), both in the numerator term $u\pi/(36v)$ and in the denominator term $\log(36/\delta^2)$.

After substituting $u = K/r^2, v = r/2$, and $\delta = 101^{-2K}$, the numerator simplifies to $\log(K\pi / 18r^3)$, while the denominator becomes $\log 36 + 4K \log 101$.
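Spelled out, the substitution is a two-line calculation. It uses only the definitions of $u$, $v$, and $\delta$ quoted above and is a consistency check on the stated simplification, not a derivation of equation (2.2) itself:

$$\frac{u\pi}{36v} = \frac{(K/r^2)\,\pi}{36\,(r/2)} = \frac{K\pi}{18r^3}$$

$$\log\frac{36}{\delta^2} = \log 36 - 2\log\delta = \log 36 - 2\log\left(101^{-2K}\right) = \log 36 + 4K\log 101$$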

Here the $101$ comes from the finite prime in $S = \{101, \infty\}$.

In other words, this artifact does not derive the constant $36$ from first principles; it reproduces the published equation with the stated substitutions.

The precision problem is in the numerator:

$$\log\left(\frac{K\pi}{18r^3}\right)$$

Because $K$ is the ceiling of $18r^3 / \pi$, the ratio $K\pi / 18r^3$ is only barely larger than $1$.

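More precisely, since $K = \lceil 18r^3/\pi \rceil$ and $18r^3/\pi$ is not an integer:

$$\frac{K\pi}{18r^3} = 1 + \varepsilon, \qquad 0 < \varepsilon < \frac{\pi}{18r^3}$$

For $r = 510510$,

$$\frac{\pi}{18r^3} \approx 1.31 \times 10^{-18}$$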

So the numerator is effectively $\log(1 + \varepsilon)$ with $\varepsilon$ at the $10^{-18}$ scale.

IEEE 754 double precision has machine epsilon around $2.2 \times 10^{-16}$. A naive float64 computation therefore cannot reliably distinguish the near-one ratio from $1$. The ratio rounds to $1$, leading to $\log(1) = 0.$
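The absorption can be seen in isolation with the standard library alone. In this minimal sketch, `eps = 1e-18` is an illustrative stand-in for a value at the scale discussed above, not the exact quantity from the paper:

```python
import math
import sys

eps = 1e-18  # illustrative value at the 1e-18 scale (assumption, not the paper's exact number)

print(sys.float_info.epsilon)  # 2.220446049250313e-16
print(1.0 + eps == 1.0)        # True: eps is far below half an ulp of 1.0 and is absorbed
print(math.log(1.0 + eps))     # 0.0
print(math.log1p(eps))         # 1e-18
```

`math.log1p` keeps the small value only if it is already known accurately. Here it would have to come from the difference between $K\pi$ and $18r^3$, which float64 cannot resolve either, so switching the logarithm function alone does not rescue the computation.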

The exponent excess disappears before the computation reaches the value stated in the paper.

This is not a flaw in the mathematics. It is a precision failure in the numerical evaluation of a valid expression. That is the reason the artifact evaluates equation (2.2) using mpmath at 200-bit precision.
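The contrast can be sketched without the repository. The block below is not the repository's code: it swaps in the standard-library `decimal` module at 70 significant digits (roughly 232 bits) instead of mpmath, and the function names `decimal_pi`, `excess_high_precision`, and `excess_float64` are hypothetical. It evaluates the reduced expression $\log(K\pi/18r^3) / (\log 36 + 4K\log 101)$ for $r = 510510$ both ways:

```python
import math
from decimal import Decimal, ROUND_CEILING, getcontext

getcontext().prec = 70  # about 232 bits; a stand-in for the article's 200-bit mpmath setting

R = 510510
A = 18 * R**3  # exact integer 18 r^3


def decimal_pi() -> Decimal:
    """Pi to the current context precision (series recipe from the decimal docs)."""
    getcontext().prec += 2  # guard digits for intermediate steps
    three = Decimal(3)
    lasts, t, s, n, na, d, da = 0, three, 3, 1, 0, 0, 24
    while s != lasts:
        lasts = s
        n, na = n + na, na + 8
        d, da = d + da, da + 32
        t = (t * n) / d
        s += t
    getcontext().prec -= 2
    return +s  # unary plus rounds back to the working precision


def excess_high_precision() -> Decimal:
    """Reduced form of eq. (2.2) evaluated with 70-digit decimals."""
    pi = decimal_pi()
    k = int((Decimal(A) / pi).to_integral_value(rounding=ROUND_CEILING))
    numerator = (Decimal(k) * pi / Decimal(A)).ln()
    denominator = Decimal(36).ln() + 4 * k * Decimal(101).ln()
    return numerator / denominator


def excess_float64() -> float:
    """The same expression in plain float64, for comparison only."""
    k = math.ceil(A / math.pi)
    numerator = math.log(k * math.pi / A)
    denominator = math.log(36) + 4 * k * math.log(101)
    return numerator / denominator


print(f"{float(excess_high_precision()):.4e}")  # the article reports 6.2391e-38
print(excess_float64())  # dominated by rounding error; the article reports a collapse to zero
```

Whatever float64 returns here, it cannot be the published value: the nearest doubles to $1$ are about $10^{-16}$ apart, while the true offset is at the $10^{-18}$ scale.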

A PDF can state the value. A verifier can expose when the value disappears.


What we built

last

We built:

https://github.com/Flamehaven-Labs/openai-erdos-eq22-reproduction

The purpose is deliberately narrow: reproduce the finite, explicitly checkable numerical surface of equation (2.2) in the OpenAI Erdős unit-distance disproof remarks.

The package evaluates the expression using mpmath at 200-bit precision and returns:

6.2391e-38

This matches the published three-significant-figure value $\approx 6.24 \times 10^{-38}$ to within a relative error of about $1.4 \times 10^{-4}$.
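The relative-error figure is simple arithmetic on the two printed numbers. A quick check, using the rounded values exactly as they appear in the text:

```python
computed = 6.2391e-38   # value printed by the package, as quoted above
published = 6.24e-38    # value quoted from the remarks paper

rel_err = abs(computed - published) / published
print(f"{rel_err:.1e}")  # 1.4e-04
```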

The repository includes 60 unit tests, 21 verifier checks, a frozen per-source-file SHA-256 manifest, GitHub Actions CI across Ubuntu and Windows, Python 3.11 / 3.12 verification, and a frozen-report mode that prints a verdict without mutating tracked evidence.
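To make the manifest idea concrete, here is a minimal sketch of a per-file SHA-256 manifest and a drift check. It illustrates the general technique only; the function names `sha256_of`, `build_manifest`, and `drifted_files` and the `*.py` pattern are assumptions, not the repository's actual layout or API:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of one file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path, pattern: str = "*.py") -> dict[str, str]:
    """Map each matching file (path relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob(pattern))
    }


def drifted_files(root: Path, frozen: dict[str, str], pattern: str = "*.py") -> list[str]:
    """Files that were added, removed, or changed relative to a frozen manifest."""
    current = build_manifest(root, pattern)
    return sorted(
        name
        for name in frozen.keys() | current.keys()
        if frozen.get(name) != current.get(name)
    )
```

A frozen-report mode in this style would call `drifted_files` and print the result without rewriting the stored manifest.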

The basic reproduction path is:

git clone https://github.com/Flamehaven-Labs/openai-erdos-eq22-reproduction
cd openai-erdos-eq22-reproduction
pip install -e ".[dev]"
python -m erdos_ant.verify

Expected output includes:

Verdict: PASS
Checks: 21/21 passed
eq (2.2) exponent excess: 6.2391e-38

This is not a large system. That is part of the point. A small claim with a clear boundary is easier to inspect than a broad claim that blurs proof, computation, and interpretation.


From reproduction to custody

This repository was not built as a one-off reaction to an OpenAI announcement. We are not announcing a grand framework here; we are showing the discipline in miniature.

For us, the work is part of a longer routine: take a mathematical or technical claim, isolate the checkable surface, pin the environment, and make drift visible.

That is intentionally plain work.

Read the source.

Extract the claim.

Reproduce the computation.

Record the boundary.

Let the verifier fail if the result disappears.
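The last step can be sketched in a few lines. This is a hypothetical miniature, not the repository's verifier: the names `Check`, `run_checks`, and `report`, the two checks, and the tolerance are all assumptions chosen to mirror the verdict format shown earlier:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    passed: bool


def run_checks(value: float, published: float, rel_tol: float) -> list[Check]:
    """Two illustrative checks: the value survived, and it matches the source."""
    return [
        Check("value is positive (did not collapse to zero)", value > 0.0),
        Check("value matches published figure",
              abs(value - published) <= rel_tol * published),
    ]


def report(checks: list[Check]) -> str:
    """Summarize checks as a verdict line plus a pass count."""
    ok = sum(c.passed for c in checks)
    verdict = "PASS" if ok == len(checks) else "FAIL"
    return f"Verdict: {verdict}\nChecks: {ok}/{len(checks)} passed"


print(report(run_checks(6.2391e-38, 6.24e-38, rel_tol=1e-3)))
print(report(run_checks(0.0, 6.24e-38, rel_tol=1e-3)))  # a collapsed value fails both checks
```

The design choice is that a collapsed value produces an explicit FAIL rather than a silently wrong number.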

To execute this routine reliably, the scope must be uncomfortably narrow. This repository intentionally leaves the proof of Theorem 1.1, the construction of the infinite tower, and Sawin’s separate $\delta \approx 0.014$ preprint to their respective sources. It does not pretend to be peer review.

This is not just a disclaimer. It is the point of the artifact.

A sharp, restricted boundary is exactly what makes a claim inspectable, repeatable, and challengeable. This is what I mean here by claim custody.

It addresses a technical governance question, but not in the policy sense: what exactly is being trusted, from which source, and what makes the claim fail if the implementation changes?

A PDF can state the value. A verifier can expose when the value disappears.

We claim no authority over the broader theorem. We simply maintain a reproducible boundary around the fragile numerical claim inside it.


Closing

The theorem was proved in the mathematical papers.

This repository asks a smaller question: can the numerical value in equation (2.2) survive execution?

In float64, it does not. The exponent excess collapses to zero.

At 200-bit precision, with the source parameters pinned and the verifier running under CI, the artifact recovers:

6.2391e-38

matching the published value to $1.4 \times 10^{-4}$ relative error.

That is the point.

Not a new theorem. Not a proof replacement.

A reproducible claim surface for one precision-sensitive number in a major AI-assisted mathematical result.

Repository:

https://github.com/Flamehaven-Labs/openai-erdos-eq22-reproduction

Paper / Zenodo:

https://doi.org/10.5281/zenodo.20383217