How to Brier-grade your own ML option-pricing forecasts in 40 lines of Python
connerlambde · 2026-05-27 · via DEV Community

If you ship a probabilistic forecast, the single highest-value habit you can build is logging your forecasts so you can grade them later. Sabermetrics figured this out forty years ago. Weather forecasting has done it for a century. Most ML model owners still do not do it.

This post walks through a 40-line Python recipe that logs an ML option-pricing model's per-contract probability-ITM forecast to a CSV, so you can compute the Brier loss after the option expires. The recipe is part of a small open-source cookbook for the Helium MCP REST surface — an MCP server that also exposes its tools as plain HTTPS GETs, which makes it convenient as a teaching substrate even if you do not use MCP.

You will not need an API key, a signup, or a Python SDK.

What we are doing

For every option contract we care about, we want one row that records:

  • The contract identifier (symbol, strike, expiration, type)
  • The model's predicted fair value
  • The model's probability the contract finishes in the money
  • The model's data date
  • (Filled in later) the market mark at the same timestamp
  • (Filled in at expiration) the realized underlying price
  • (Computed) whether the contract was actually ITM
  • (Computed) the Brier loss for the probability forecast

When we Brier-grade later, we get one number per contract. Average across many contracts and we have a directly comparable calibration score — exactly the discipline a baseball win-probability model or a weather precipitation forecast gets graded on.
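As a small illustration of that averaging step, here is a sketch using made-up forecasts and outcomes (none of these numbers come from the Helium endpoint). The per-contract loss is the squared gap between the forecast probability and the 0/1 outcome, and the always-0.5 forecaster is included as a reference point.

```python
def brier_loss(prob_itm: float, realized_itm: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome."""
    return (prob_itm - realized_itm) ** 2


# Hypothetical (forecast, outcome) pairs, for illustration only.
forecasts = [(0.42, 1), (0.42, 0), (0.90, 1), (0.10, 0)]

# One loss per contract, then a plain average across contracts.
losses = [brier_loss(p, y) for p, y in forecasts]
mean_loss = sum(losses) / len(losses)

# Reference point: a forecaster that always says 0.5 scores 0.25
# regardless of the outcome, so a useful model should land below that.
coin_flip_loss = brier_loss(0.5, 1)

print(f"per-contract losses: {[round(x, 4) for x in losses]}")
print(f"mean Brier loss:     {mean_loss:.4f}")
print(f"always-0.5 baseline: {coin_flip_loss:.2f}")
```

Lower is better: 0 is a perfect forecast and 1 is a confident forecast that was wrong.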

The endpoint

The Helium server exposes its option-pricing tool at this URL:

GET https://heliumtrades.com/mcp_option_price/
    ?symbol=AAPL&strike=310&expiration=2026-06-26&option_type=call

Plain GET, query parameters in and JSON out, no auth header, with a free tier of 50 calls per IP per day. A live call returns:

{
  "symbol": "AAPL",
  "strike": 310.0,
  "expiration": "2026-06-26",
  "option_type": "call",
  "predicted_price": 6.53,
  "prob_itm": 0.42,
  "options_data_date": "2026-05-26"
}

Two of those fields are forecasts about the future: predicted_price (the model's fair value) and prob_itm (the model's probability the option finishes ITM at expiration). The expiration date in the request is the fixed resolution date. That gives us a clean falsifiable target.
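Before logging a response, it can help to check that it has the expected shape. The sketch below is one way to do that, assuming the payload always carries the seven fields shown in the sample response. The `Forecast` class and `parse_forecast` helper are hypothetical names introduced here, not part of the Helium cookbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    """Hypothetical container for one response from the endpoint."""
    symbol: str
    strike: float
    expiration: str
    option_type: str
    predicted_price: float
    prob_itm: float
    data_date: str


def parse_forecast(payload: dict) -> Forecast:
    """Turn the JSON payload into a Forecast, raising ValueError on bad data."""
    required = ("symbol", "strike", "expiration", "option_type",
                "predicted_price", "prob_itm", "options_data_date")
    missing = [k for k in required if payload.get(k) is None]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    prob = float(payload["prob_itm"])
    # A probability outside [0, 1] cannot be Brier-graded meaningfully.
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"prob_itm out of range: {prob}")
    return Forecast(
        symbol=payload["symbol"],
        strike=float(payload["strike"]),
        expiration=payload["expiration"],
        option_type=payload["option_type"].lower(),
        predicted_price=float(payload["predicted_price"]),
        prob_itm=prob,
        data_date=payload["options_data_date"],
    )


# The sample response shown above.
sample = {
    "symbol": "AAPL", "strike": 310.0, "expiration": "2026-06-26",
    "option_type": "call", "predicted_price": 6.53, "prob_itm": 0.42,
    "options_data_date": "2026-05-26",
}
forecast = parse_forecast(sample)
print(forecast)
```

Failing loudly at logging time is cheaper than discovering a blank `prob_itm` cell weeks later, when the contract has expired and the forecast can no longer be recovered.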

The recipe

"""Log Helium's ML option-price + prob_itm forecasts to a CSV so you can
Brier-grade them at expiration.
"""
import csv
import sys
from datetime import datetime, timezone
from pathlib import Path

import requests

ENDPOINT = "https://heliumtrades.com/mcp_option_price/"
LOG_FILE = Path("calibration_log.csv")


def main(symbol, strike, expiration, option_type):
    # Query the endpoint; requests URL-encodes the parameters.
    params = {
        "symbol": symbol, "strike": strike,
        "expiration": expiration, "option_type": option_type,
    }
    resp = requests.get(ENDPOINT, params=params, timeout=30)
    resp.raise_for_status()
    data = resp.json()

    # Write the header row only when the log file is first created.
    is_new = not LOG_FILE.exists()
    with LOG_FILE.open("a", newline="") as f:
        w = csv.writer(f)
        if is_new:
            w.writerow([
                "timestamp", "symbol", "strike", "expiration", "option_type",
                "helium_predicted_price", "helium_prob_itm", "helium_data_date",
                "market_mark", "realized_underlying_price", "realized_itm",
                "brier_loss",
            ])
        w.writerow([
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated
            # as of Python 3.12.
            datetime.now(timezone.utc).isoformat(timespec="seconds"),
            symbol, strike, expiration, option_type,
            data.get("predicted_price"), data.get("prob_itm"),
            data.get("options_data_date"),
            "", "", "", "",  # filled in later: mark, realized price, ITM, Brier
        ])
    # Use .get() here too, so a missing field cannot crash after the row is written.
    print(f"Logged {symbol} ${strike} {option_type.upper()} {expiration}: "
          f"predicted={data.get('predicted_price')} prob_itm={data.get('prob_itm')}")


if __name__ == "__main__":
    main(sys.argv[1], float(sys.argv[2]), sys.argv[3], sys.argv[4])

Save as track.py, then:

pip install requests
python track.py AAPL 310 2026-06-26 call
python track.py AAPL 295 2026-06-26 put
python track.py NVDA 220 2026-07-17 call
# repeat for any contracts you want to grade later

The script appends one row per contract to calibration_log.csv. Rerun it once a day for the same contracts to capture how the forecast evolves over time.
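If the script is rerun daily for the same contracts, the log ends up with several rows per contract, and grading all of them counts the same outcome several times. One option, sketched below, is to keep a single row per contract before grading. The `latest_per_contract` helper is a hypothetical name, and the sample rows are invented for illustration.

```python
import csv
import io


def latest_per_contract(rows):
    """Keep the most recent row for each (symbol, strike, expiration, type)."""
    latest = {}
    for row in rows:
        key = (row["symbol"], float(row["strike"]),
               row["expiration"], row["option_type"])
        # ISO-8601 timestamps in a consistent format sort correctly as strings.
        if key not in latest or row["timestamp"] > latest[key]["timestamp"]:
            latest[key] = row
    return list(latest.values())


# Hypothetical log contents, trimmed to the columns used here.
sample_csv = """timestamp,symbol,strike,expiration,option_type,helium_prob_itm
2026-05-27T03:00:00+00:00,AAPL,310.0,2026-06-26,call,0.42
2026-05-28T03:00:00+00:00,AAPL,310.0,2026-06-26,call,0.45
2026-05-27T03:01:00+00:00,AAPL,295.0,2026-06-26,put,0.30
"""
picked = latest_per_contract(csv.DictReader(io.StringIO(sample_csv)))
for row in picked:
    print(row["symbol"], row["strike"], row["option_type"], row["helium_prob_itm"])
```

Keeping the latest row grades the forecast made closest to expiration, which is the easiest one to get right. Keeping the earliest row instead is a stricter test. Either choice is defensible as long as it is fixed before the outcomes are known.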

Grading the forecast after expiration

At expiration, fill in the realized underlying price and compute Brier loss. For a single contract the Brier loss for the prob_itm forecast is:

brier_loss = (prob_itm - realized_itm) ** 2

where realized_itm is 1 if the contract finished in the money and 0 otherwise. Score every contract you logged, average the losses, and you have a calibration number you can compare across models, weeks, or strike regimes.
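Filling in the realized price by hand works for a few rows. For a longer log, a small helper can write it into every row for a given symbol and expiration. This is a sketch that assumes the column names written by track.py; `fill_realized` is a hypothetical helper, and you supply the closing price from whatever data source you trust.

```python
import csv
from pathlib import Path


def fill_realized(log_path, symbol, expiration, realized_price):
    """Write the realized underlying price into every matching row.

    Returns the number of rows updated.
    """
    path = Path(log_path)
    with path.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    updated = 0
    for row in rows:
        if row["symbol"] == symbol and row["expiration"] == expiration:
            row["realized_underlying_price"] = realized_price
            updated += 1

    # Rewrite the whole file; the log is small enough that this is simplest.
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return updated
```

A call such as `fill_realized("calibration_log.csv", "AAPL", "2026-06-26", 312.40)` (the price here is a placeholder) would update both the call and the put logged for that expiration, since they share the same underlying.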

A quick scorer:

import pandas as pd

df = pd.read_csv("calibration_log.csv")

def realized_itm(row):
    # ITM means strictly past the strike; an exact at-the-money close counts as 0.
    s = float(row["realized_underlying_price"])
    k = float(row["strike"])
    if row["option_type"] == "call":
        return 1 if s > k else 0
    return 1 if s < k else 0

# read_csv loads empty cells as NaN, so keep only rows with a realized price.
resolved = df[df["realized_underlying_price"].notna()].copy()
resolved["realized_itm"] = resolved.apply(realized_itm, axis=1)
resolved["brier_loss"] = (
    resolved["helium_prob_itm"].astype(float) - resolved["realized_itm"]
) ** 2

print(f"Contracts graded: {len(resolved)}")
print(f"Mean Brier loss: {resolved['brier_loss'].mean():.4f}")
print("Calibration histogram:")
# Realized ITM frequency per forecast bin; include_lowest keeps prob_itm == 0.
bins = pd.cut(resolved["helium_prob_itm"].astype(float),
              [0, 0.25, 0.5, 0.75, 1.0], include_lowest=True)
print(resolved.groupby(bins, observed=False)["realized_itm"].mean())

The calibration histogram is the part most people skip. A model with mean Brier loss of 0.18 can still be wildly miscalibrated in specific probability bins (overconfident at extreme ends, say). The histogram tells you where it is miscalibrated.
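A per-bin realized frequency is only as trustworthy as the number of contracts behind it, so it helps to print the count and the mean forecast next to it. Below is a dependency-free sketch using the same bin edges as the scorer. `reliability_table` is a hypothetical helper and the graded pairs are invented for illustration.

```python
from collections import defaultdict


def reliability_table(pairs, edges=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Group (prob, outcome) pairs into bins and summarize each bin.

    Returns a list of (low, high, count, mean_forecast, realized_frequency).
    Bins are (low, high], except the first, which also includes its low edge.
    Empty bins are omitted.
    """
    buckets = defaultdict(list)
    for prob, outcome in pairs:
        for i in range(len(edges) - 1):
            low, high = edges[i], edges[i + 1]
            if (low < prob <= high) or (i == 0 and prob == low):
                buckets[i].append((prob, outcome))
                break
    table = []
    for i in sorted(buckets):
        items = buckets[i]
        n = len(items)
        mean_p = sum(p for p, _ in items) / n
        freq = sum(y for _, y in items) / n
        table.append((edges[i], edges[i + 1], n, mean_p, freq))
    return table


# Hypothetical graded forecasts, for illustration only.
pairs = [(0.10, 0), (0.20, 0), (0.20, 1), (0.60, 1),
         (0.70, 0), (0.90, 1), (0.95, 1)]
table = reliability_table(pairs)
for low, high, n, mean_p, freq in table:
    print(f"({low:.2f}, {high:.2f}]  n={n}  forecast={mean_p:.2f}  realized={freq:.2f}")
```

In a well-calibrated model the `forecast` and `realized` columns track each other within each bin. A bin with only two or three contracts says very little either way.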

Why this is useful

Most quant content compares predicted prices to current prices and stops there. That comparison cannot distinguish "the model is right and the market is wrong" from the reverse, and neither reading can be tested until expiration. Probability-ITM, on the other hand, has an unambiguous resolution: at expiration the underlying either closes in the money for the contract (above the strike for a call, below it for a put) or it does not.

So prob_itm is the friendliest output to grade. If you want to spend an hour playing with calibration intuition, log forecasts for 50 contracts across a few different expirations, wait for them to resolve, and run the scorer.
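With around 50 contracts, the mean Brier loss still carries noticeable sampling noise, so it is worth printing a standard error next to it. The sketch below uses the usual sample-standard-deviation over root-n formula. `mean_and_stderr` is a hypothetical helper and the losses are made up.

```python
import math
import statistics


def mean_and_stderr(losses):
    """Return the mean loss and its standard error (sample std / sqrt(n))."""
    n = len(losses)
    mean = statistics.fmean(losses)
    # The sample standard deviation needs at least two observations.
    stderr = statistics.stdev(losses) / math.sqrt(n) if n > 1 else float("nan")
    return mean, stderr


# Hypothetical per-contract Brier losses, for illustration only.
example_losses = [0.3364, 0.1764, 0.01, 0.01]
mean, stderr = mean_and_stderr(example_losses)
print(f"mean Brier loss: {mean:.4f} +/- {stderr:.4f}")
```

This treats the contracts as independent. Contracts on the same underlying and expiration resolve off the same closing price, so the true uncertainty is likely larger than this figure suggests.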

Other recipes in the cookbook

The same pattern — one endpoint, one short script, real output — works for the other tools the Helium server exposes:

  • News-bias dashboard: pull every tracked source's bias profile and rank by overall credibility, fearful bias, emotionality_score, or any other dimension
  • Balanced-news synthesis: pull multi-source synthesis on any topic with probability-weighted falsifiable outcomes already baked in
  • Source credibility ranking: top-N and bottom-N sources by credibility, with their emotionality and prescriptiveness alongside
  • Ticker forecast explorer: pull HTML-stripped bull/bear narrative cases for a watchlist
  • Top-strategies explorer: pull the daily short-vol and long-vol candidate lists

All six recipes are in the open-source cookbook here:

➡️ github.com/connerlambden/helium-mcp-cookbook

The cookbook is MIT-licensed. Fork it, modify it, write your own recipes. PRs welcome.

If you want MCP instead of REST

The same ten tools are also exposed as a remote MCP server. If you would rather call them from inside Claude Desktop, Cursor, or any MCP-aware client, the config is:

{
  "mcpServers": {
    "helium": {
      "command": "npx",
      "args": ["mcp-remote", "https://heliumtrades.com/mcp"]
    }
  }
}

After a client restart your LLM can call the same tools by name. The Helium repo is at github.com/connerlambden/helium-mcp.

Closing thought

If your model emits probabilities, you should grade them. The friction-free version is a 40-line script and a CSV. The day you put that habit in place is the day your forecasts start improving — not because the model changes, but because you finally have a feedback signal to learn from.