GitHub - tchauffi/ChessTransformer
tchauffi · 2026-06-18 · via Hacker News: Show HN

A transformer chess engine trained only on human games — no self-play, no reinforcement learning — reaching ~2100 Elo against Stockfish on a single consumer GPU.

The model predicts moves directly from board positions, then plays via an AlphaZero-style MCTS (policy priors + value head). A compiled alpha-beta engine is available as an alternative.

Play it in your browser — live demo on Hugging Face Spaces (no install).

Challenge it on Lichess: ChessTransformerBot runs the Rust engine (rust/ct-bot) live and accepts standard, rated, blitz/rapid games.

Demo

ChessTransformer v2.1 (White, MCTS @ 800 sims) checkmating Stockfish Level 8 — the finishing sequence:

[Animation: ChessTransformer checkmates Stockfish Level 8]

Watch the full game (MP4) · 🤗 Play live on Hugging Face Spaces

Highlights

  • 11.7M-parameter transformer (Pos2MoveV2), trained from scratch purely by predicting human moves.
  • ~2100 Elo vs Stockfish — MLE estimate over a 140-game gauntlet (skills 0–12).
  • AlphaZero-style MCTS / PUCT search using the policy head for priors and the value head for leaf scores.
  • 2.3× faster inference via torch.compile + CUDA graphs — lossless (identical moves).
  • Runs on one GPU. No self-play, no RL, no cloud.

How it works

Model — Pos2MoveV2

| Component | Details |
|---|---|
| Parameters | 11.7M |
| Attention | Grouped Query Attention (8 heads, 4 KV groups) + QK-norm |
| Position bias | Learnable chess-geometry relative bias (8 relation categories: file, rank, diagonal, knight-reach, king-adjacent, nearby, far, global) |
| Policy head | AlphaZero-style 64×73 action planes |
| Value head | Board state → scalar in (−1, 1) |
| Training | Muon + AdamW mixed optimizer, BF16, stochastic depth |

Search

Two engines share the same network; both run a torch.compile / CUDA-graph forward (~2.3× faster, lossless).

  • MCTS / PUCT (Pos2MoveV2MctsBot, default) — policy head → priors, value head → leaf scores, most-visited move chosen. Batched-leaf evaluation with virtual loss amortizes the GPU→CPU sync (~8× faster than single-leaf). Tuned for exploitation: first-play-urgency (fpu=0.2) and c_puct=1.0. Tree reuse re-roots the retained subtree under the moves played, giving deeper search at the same per-move cost. Default 800 sims/move.
  • Alpha-beta (Pos2MoveV2Bot) — iterative-deepening negamax with quiescence search, policy-prior move ordering, and a Zobrist transposition table.
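
The PUCT selection rule described in the MCTS bullet can be sketched as follows. This is a minimal illustration, not the repository's Pos2MoveV2MctsBot: the Node class, the function names, and in particular the reading of first-play-urgency as "mean Q of the visited siblings minus fpu" are assumptions. Batched leaves, virtual loss, and tree reuse are left out.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical search node; field names are illustrative."""
    prior: float                      # policy-head probability of the move leading here
    visits: int = 0
    value_sum: float = 0.0            # backed-up values, from the parent's point of view
    children: dict = field(default_factory=dict)   # move -> Node

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(parent: Node, c_puct: float = 1.0, fpu: float = 0.2):
    """Return the (move, child) pair that maximizes Q + U."""
    children = parent.children.values()
    total_visits = sum(c.visits for c in children)
    mean_q = sum(c.value_sum for c in children) / total_visits if total_visits else 0.0
    # Assumed FPU rule: unvisited moves start slightly below the current average,
    # which biases the search toward exploiting moves that already look good.
    unvisited_q = mean_q - fpu
    sqrt_total = math.sqrt(total_visits + 1)

    def score(child: Node) -> float:
        q = child.q() if child.visits else unvisited_q
        u = c_puct * child.prior * sqrt_total / (1 + child.visits)
        return q + u

    return max(parent.children.items(), key=lambda item: score(item[1]))


def most_visited_move(root: Node):
    """Final move choice: the child with the highest visit count."""
    return max(root.children.items(), key=lambda item: item[1].visits)[0]


if __name__ == "__main__":
    root = Node(prior=1.0, children={"e2e4": Node(0.6), "d2d4": Node(0.4)})
    move, _ = select_child(root)
    print(move)   # with no visits yet, the higher prior wins: e2e4
```

A larger fpu or a smaller c_puct both push the search toward exploitation, which matches the tuning direction the README describes.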

Results

MCTS @ 800 sims (c_puct=1.0, fpu=0.2, tree reuse), model v2.1, 20 games/level vs Stockfish (scripts/tune_vs_stockfish.py):

| Stockfish skill | Approx. Elo | Score |
|---|---|---|
| 0–6 | ≤ 1500 | 100% |
| 8 | ~1700 | 90% |
| 10 | ~1900 | 85% |
| 12 | ~2100 | 35% |

MLE estimate: ~2100 Elo. Elo is fit by maximum likelihood over all games rather than averaging per-level estimates (which is biased low — saturated easy levels cap at a low value and drag the mean down).
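
A single-rating maximum-likelihood fit can be sketched in a few lines. Under the standard logistic Elo model, the log-likelihood is maximized where the total expected score equals the total actual score, so a bisection on the rating is enough. The function names and the bisection bounds are assumptions, and this is not necessarily how the repository's scripts compute the estimate. The example input uses only the three table rows with a stated Elo; the saturated 0–6 levels are left out because the table gives only an upper bound for them.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Logistic Elo expectation of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def fit_elo(results, lo: float = 0.0, hi: float = 4000.0, tol: float = 1e-6) -> float:
    """Fit one rating to all games at once.

    results: iterable of (opponent_elo, games_played, points_scored),
             with a draw counted as half a point.
    The MLE satisfies sum(games * expected) == sum(points); the left side
    grows monotonically with the rating, so bisection finds the root.
    If every game was won, the estimate is unbounded and `hi` is returned.
    """
    results = list(results)
    total_points = sum(points for _, _, points in results)

    def surplus(rating: float) -> float:
        expected = sum(n * expected_score(rating, opp) for opp, n, _ in results)
        return expected - total_points

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if surplus(mid) < 0:
            lo = mid          # model predicts too few points: rating is higher
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # 20 games per level: 90% -> 18 points, 85% -> 17 points, 35% -> 7 points.
    gauntlet = [(1700, 20, 18.0), (1900, 20, 17.0), (2100, 20, 7.0)]
    print(round(fit_elo(gauntlet)))
```

Because every game contributes to one equation, a level where the engine scores 100% simply adds evidence that the rating is well above that level instead of producing a capped per-level estimate.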

Strength scales with search

More MCTS simulations per move means stronger play: the same human-trained network gains roughly 850 Elo going from 25 to 800 sims, with no retraining. Each point is an MLE Elo fit over a Stockfish gauntlet (scripts/sims_scaling.py):

[Figure: Elo vs MCTS simulations]

| MCTS sims | 25 | 50 | 100 | 200 | 400 | 800 |
|---|---|---|---|---|---|---|
| Est. Elo | 1327 | 1436 | 1691 | 1819 | 1977 | 2175 |
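
Since each column doubles the simulation budget, the table can be read as Elo gained per doubling. The short calculation below uses only the numbers in the table; the variable names are illustrative.

```python
# Elo gain per doubling of the MCTS budget, computed from the table above.
sims = [25, 50, 100, 200, 400, 800]
elo = [1327, 1436, 1691, 1819, 1977, 2175]

gains = [after - before for before, after in zip(elo, elo[1:])]
total_gain = elo[-1] - elo[0]
mean_gain = total_gain / len(gains)

print(gains)        # [109, 255, 128, 158, 198]
print(total_gain)   # 848
print(mean_gain)    # 169.6
```

The gain per doubling is uneven but does not taper off within the measured range.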

What moved the needle (inference-side, no retraining)

| Change | Effect |
|---|---|
| MCTS / PUCT engine (new default) | beat the alpha-beta engine ~82% head-to-head |
| torch.compile + CUDA graphs forward | ~2.3× faster (lossless) |
| Batched-leaf MCTS (virtual loss) | ~8× faster per sim; amortizes the GPU→CPU sync |
| Search tuning: FPU (fpu=0.2), c_puct=1.0, 800 sims | ~+280 Elo over the untuned MCTS@400 baseline (~1793) |
| Tree reuse across moves | re-roots the retained subtree; deeper search at the same per-move cost |
| MLE Elo estimator | per-level averaging was biased low; fit a single Elo over all games |

Tried and rejected: Stockfish policy distillation — no gain even at 200k labels (the policy is near the 11.7M model's capacity ceiling).

Quick Start

Docker (recommended)

docker compose up --build

Model weights are baked into the backend image, so no volume mounts are needed. For GPU support, deploy.resources.reservations is already set in docker-compose.yml; the host only needs the NVIDIA Container Toolkit.

Local development

uv sync                            # install deps
uv run python backend/api.py       # start backend (port 5001)

cd frontend && npm install && npm run dev   # start frontend (port 3000)

Open http://localhost:3000 and start playing.

Environment variables:

| Variable | Default | Description |
|---|---|---|
| MODEL_PATH | data/models/pos2move_v2.1 | Path to a checkpoint directory |
| ENGINE | mcts | Search engine: mcts or alphabeta |
| MCTS_SIMS | 800 | MCTS simulations per move (when ENGINE=mcts) |
| ALLOWED_ORIGINS | * | Comma-separated CORS origins |
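
For reference, a backend could read these variables roughly as follows. This is a sketch under assumptions: the Settings class and load_settings function are hypothetical names, and backend/api.py may parse and validate its configuration differently. Only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    model_path: str
    engine: str
    mcts_sims: int
    allowed_origins: tuple


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    engine = env.get("ENGINE", "mcts")
    if engine not in ("mcts", "alphabeta"):
        raise ValueError(f"ENGINE must be 'mcts' or 'alphabeta', got {engine!r}")
    # ALLOWED_ORIGINS is comma-separated; surrounding whitespace is ignored.
    origins = tuple(
        origin.strip()
        for origin in env.get("ALLOWED_ORIGINS", "*").split(",")
        if origin.strip()
    )
    return Settings(
        model_path=env.get("MODEL_PATH", "data/models/pos2move_v2.1"),
        engine=engine,
        mcts_sims=int(env.get("MCTS_SIMS", "800")),
        allowed_origins=origins,
    )


if __name__ == "__main__":
    print(load_settings({"ENGINE": "alphabeta", "MCTS_SIMS": "200"}))
```

Passing a plain dict instead of os.environ keeps the parsing easy to test without touching the process environment.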

Training

1. Build the dataset. scripts/build_db.py downloads elite games from database.nikonoel.fr and converts them to HDF5 in one step (bullet/blitz excluded by default).

uv run scripts/build_db.py                      # last 12 months (default)
uv run scripts/build_db.py --from 2024-01 --to 2024-12   # date range
uv run scripts/build_db.py --last 6             # last 6 months
uv run scripts/build_db.py --all                # everything available
uv run scripts/build_db.py --skip-download      # re-convert existing PGNs

Output goes to data/elite_db.h5; raw PGNs are cleaned up unless --keep-raw is passed.

2. Train.

uv run src/chesstransformer/trainers/pos2move_v2_trainer.py

3. Evaluate.

# Tune & benchmark search budget vs Stockfish (alpha-beta depths + MCTS sims)
uv run scripts/tune_vs_stockfish.py data/models/pos2move_v2.1 --games 8 --skills 0 2 4 6 8

# Deterministic engine-vs-engine A/B (MCTS vs alpha-beta, model A vs B, ...)
uv run scripts/engine_match.py --a-mcts --a-sims 400 --b-quiescence 4 --b-depth 3

# Inference speed + lossless-regression guard
uv run scripts/bench_inference.py --depth 3 --save-golden golden.json
uv run scripts/bench_inference.py --depth 3 --check golden.json

# Render a gameplay clip vs Stockfish
uv run scripts/render_game_clip.py --skills 8 10 --sims 800 --out clip.mp4
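
The idea behind a lossless-regression guard such as the --save-golden / --check pair above can be illustrated with a small sketch: record the move chosen for each test position once, then verify that an optimized build still picks exactly the same moves. The file format, function names, and position keys below are assumptions; scripts/bench_inference.py may store and compare its golden data differently.

```python
import json
from pathlib import Path


def save_golden(path: str, moves: dict) -> None:
    """Store a {position: chosen_move} mapping as the reference output."""
    Path(path).write_text(json.dumps(moves, indent=2, sort_keys=True))


def check_golden(path: str, moves: dict) -> list:
    """Return the positions whose chosen move differs from the reference.

    An empty list means the new build is lossless on the recorded positions.
    """
    golden = json.loads(Path(path).read_text())
    return [pos for pos, move in golden.items() if moves.get(pos) != move]


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        golden_file = str(Path(tmp) / "golden.json")
        # Position keys would normally be FEN strings; "startpos" is a placeholder.
        save_golden(golden_file, {"startpos": "e2e4"})
        print(check_golden(golden_file, {"startpos": "e2e4"}))   # []
        print(check_golden(golden_file, {"startpos": "d2d4"}))   # ['startpos']
```

Comparing chosen moves, not raw logits, is what makes a "lossless" claim meaningful for play: small floating-point differences are tolerated as long as no decision changes.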

Project layout

Directory tree
ChessTransformer/
├── backend/
│   ├── api.py                        # FastAPI server (move, evaluate, validate endpoints)
│   └── Dockerfile
├── frontend/                         # Next.js web app (human vs bot)
│   └── app/components/ChessGame.tsx  # Main game component
├── data/
│   └── models/
│       ├── pos2move_v2.1/            # Bundled model weights (default)
│       └── pos2move_v2/              # Previous weights (fallback)
├── scripts/
│   ├── build_db.py                   # Download elite games and build HDF5 database
│   ├── tune_vs_stockfish.py          # Sweep alpha-beta depth / MCTS sims vs Stockfish
│   ├── elo_gauntlet.py               # Elo estimation vs Stockfish (alpha-beta)
│   ├── engine_match.py               # Deterministic engine-vs-engine A/B
│   ├── bench_inference.py            # Inference speed + lossless-regression guard
│   ├── render_game_clip.py           # Render a bot-vs-Stockfish game to MP4
│   ├── export_onnx.py                # ONNX export for TensorRT
│   ├── quantize_onnx.py              # INT8 quantization
│   ├── dataset_sanity_check.py       # Dataset distribution analysis
│   └── compress_pgn_to_zst.py        # PGN compression utility
├── src/chesstransformer/
│   ├── bots/
│   │   ├── pos2move_v2_mcts_bot.py   # MCTS / PUCT bot (default)
│   │   ├── pos2move_v2_bot.py        # Alpha-beta bot (with quiescence + compile)
│   │   └── random_bot.py
│   ├── models/
│   │   ├── transformer/pos2move_v2.py  # Model architecture
│   │   └── tokenizer/
│   │       ├── alphazero_move_encoder.py  # 64×73 action planes
│   │       ├── position_tokenizer.py
│   │       └── move_tokenizer.py
│   ├── datasets/
│   │   ├── h5_lichess_dataset.py     # HDF5 dataset with phase-weighted sampling
│   │   └── dataset_h5_convertor.py
│   ├── optimizer.py                  # AdamW + Muon combined optimizer
│   └── trainers/
│       └── pos2move_v2_trainer.py
├── docker-compose.yml
├── pyproject.toml
└── uv.lock

Development extras:

uv sync --group dev         # linting / formatting
uv sync --group optimized   # ONNX / TensorRT export