GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index
Artificial Analysis · 2026-06-17 · via Hacker News: Front Page

Z ai’s GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index, scoring 51, and it sits on the Pareto frontier of Intelligence vs Cost per Task

GLM-5.2 is the same size as GLM-5.1 (744B total / 40B active parameters) but scores 11 points higher on the Intelligence Index v4.1, placing ahead of MiniMax-M3 (44) and DeepSeek V4 Pro (max, 44). On the first-party API it is priced in line with GLM-5.1 at $1.4/$4.4/$0.26 per 1M input/output/cache hit tokens

Key results:

GLM-5.2 is the leading open weights model on the Intelligence Index v4.1. At 51, it leads MiniMax-M3 (44), DeepSeek V4 Pro (max, 44) and Kimi K2.6 (43)

Improvements across most evaluations, particularly scientific reasoning: GLM-5.2 gains over GLM-5.1 on most evaluations, led by scientific reasoning on CritPt (+16 points to 21%) and HLE (+12 points to 40%), alongside AA-LCR (+9 points to 71%), tau3 banking (+15 points to 27%) and SciCode (+7 points to 50%). TerminalBench v2.1 also improves (+16 points to 78%) and GPQA Diamond gains 3 points to 89%

Leading open weights model on GDPval-AA v2 and competitive with proprietary models: GLM-5.2 scores 1524 on GDPval-AA v2, ahead of MiniMax-M3 (1418) and DeepSeek V4 Pro (max, 1328). This impressive result places GLM-5.2 in line with proprietary models including GPT-5.5 (xhigh reasoning). GDPval-AA v2 builds on the original GDPval-AA by baselining Elo to human performance at 1000, introducing a rotating panel of frontier-model judges, and raising the turn limit from 100 to 250 for longer-horizon agent trajectories

GLM-5.2 uses more output tokens per task than other leading open weights models: the model uses 43k output tokens per Intelligence Index task, up from GLM-5.1 (26k) and above MiniMax-M3 (24k), Kimi K2.6 (35k) and DeepSeek V4 Pro (max, 37k)

On the Intelligence vs. Cost per Task Pareto Frontier: GLM-5.2 is on the Pareto frontier of the Intelligence vs Cost per Task chart, with the lowest cost per task among models at its intelligence level. At ~$0.46 per task it costs more than GLM-5.1 ($0.25), Kimi K2.6 ($0.31), MiniMax-M3 ($0.18) and DeepSeek V4 Pro (max, $0.05), all of which score lower on the Intelligence Index
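To make the Pareto claim concrete, here is a minimal sketch that checks which of the open weights models named in this post are undominated on cost per task and Intelligence Index score. It uses only figures quoted above; GLM-5.1's score of 40 is inferred from the "11 points higher" statement rather than quoted directly, and the real chart also includes proprietary models that are not listed here, so this is an illustration of the concept rather than a reproduction of the chart.

```python
# Cost per task (USD) and Intelligence Index v4.1 score, as quoted in this post.
# GLM-5.1's score of 40 is inferred from "11 points higher", not quoted directly.
models = {
    "GLM-5.2": (0.46, 51),
    "GLM-5.1": (0.25, 40),
    "Kimi K2.6": (0.31, 43),
    "MiniMax-M3": (0.18, 44),
    "DeepSeek V4 Pro (max)": (0.05, 44),
}


def pareto_frontier(points):
    """Return the names of models that no other model dominates.

    A model is dominated when another model is at least as cheap and at
    least as intelligent, and strictly better on one of the two axes.
    """
    frontier = []
    for name, (cost, score) in points.items():
        dominated = any(
            other_cost <= cost
            and other_score >= score
            and (other_cost < cost or other_score > score)
            for other, (other_cost, other_score) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier


frontier = pareto_frontier(models)
print(frontier)
```

Within this subset, GLM-5.2 stays on the frontier despite being the most expensive model, because nothing cheaper matches its score.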

Additional Model Details:

License: MIT

Size: 744B total parameters, 40B active parameters, equivalent to GLM-5.1

Context window: 1M tokens, up from 200K on GLM-5.1

Pricing: $1.4/$4.4/$0.26 per 1M input/output/cache hit tokens

Availability: Alongside Z ai's first-party API, GLM-5.2 is available across third-party providers including DeepInfra, Novita, Nebius, Parasail, Siliconflow, GMI Cloud, Baseten, and Fireworks

GLM-5.2 leads all open weights models on GDPval-AA v2, our primary metric for real-world agentic performance. At 1524 it places ahead of MiniMax-M3 (1418) and DeepSeek V4 Pro (max, 1328), and is effectively level with GPT-5.5 (xhigh, 1514). We visually inspected GLM-5.2's outputs across a range of GDPval-AA tasks. We have attached a selection below.
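As a rough guide to what these rating gaps mean, the sketch below converts Elo differences into expected head-to-head scores. It assumes the conventional logistic Elo formula with a 400-point scale; the post does not state which scale GDPval-AA v2 uses, so the function name, the scale parameter, and the resulting probabilities are illustrative only.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the conventional Elo formula.

    The 400-point scale is an assumption; GDPval-AA v2 may use another.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# GDPval-AA v2 ratings quoted in this post.
glm_52 = 1524
rivals = {
    "GPT-5.5 (xhigh)": 1514,
    "MiniMax-M3": 1418,
    "DeepSeek V4 Pro (max)": 1328,
}

for name, rating in rivals.items():
    print(f"GLM-5.2 vs {name}: {elo_expected_score(glm_52, rating):.3f}")
```

Under that assumption, a 10-point gap is close to a coin flip, which fits the "effectively level" description, while the gaps to the other open weights models are considerably larger.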

GLM-5.2 scores 4 on the AA-Omniscience Index, up from GLM-5.1 (2). The gain comes from both higher accuracy (25.1% vs 24.2%) and a lower hallucination rate (28.1% vs 29.4%), with attempt rate flat at 47%.
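The accuracy and hallucination figures can be related to the index with a short calculation. This sketch assumes two definitions that are not stated in this post: that the hallucination rate is the share of non-correct responses that were wrong answers rather than abstentions, and that the index is the percentage of correct answers minus the percentage of incorrect ones. The function name is hypothetical.

```python
def omniscience_index(accuracy_pct: float, hallucination_rate_pct: float) -> float:
    """Estimate the index from accuracy and hallucination rate.

    Assumed definitions (not stated in the post):
      incorrect = (100 - accuracy) * hallucination_rate
      index     = accuracy - incorrect
    """
    non_correct_pct = 100.0 - accuracy_pct
    incorrect_pct = non_correct_pct * hallucination_rate_pct / 100.0
    return accuracy_pct - incorrect_pct


glm_52_index = omniscience_index(25.1, 28.1)
glm_51_index = omniscience_index(24.2, 29.4)
print(round(glm_52_index), round(glm_51_index))
```

Under these assumptions the rounded results agree with the reported scores of 4 and 2.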

GLM-5.2 uses 43k output tokens per Intelligence Index task, of which 37k is reasoning. This is up from GLM-5.1 (26k) and higher than open weights peers MiniMax-M3 (24k) and Kimi K2.6 (35k), placing it among the less token-efficient open weights models at its intelligence level. GLM-5.2 sits outside the most attractive quadrant of the Intelligence vs Output Tokens chart.
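The effect of higher token usage on cost can be estimated from the quoted output price. The sketch below multiplies output tokens per task by the first-party output price of $4.4 per 1M tokens. It covers output tokens only, since the post does not give input or cache-hit token counts, and it assumes the cost per task figures are based on first-party pricing, which the post does not confirm.

```python
OUTPUT_PRICE_PER_MILLION_USD = 4.4  # first-party output price quoted in this post


def output_token_cost(output_tokens: int) -> float:
    """Cost in USD of the output tokens alone, at the quoted first-party price."""
    return output_tokens * OUTPUT_PRICE_PER_MILLION_USD / 1_000_000


glm_52_output_cost = output_token_cost(43_000)  # GLM-5.2: 43k output tokens per task
glm_51_output_cost = output_token_cost(26_000)  # GLM-5.1: 26k, priced in line with GLM-5.2
print(f"GLM-5.2: ${glm_52_output_cost:.3f}  GLM-5.1: ${glm_51_output_cost:.3f}")
```

On this estimate, output tokens account for roughly $0.19 of GLM-5.2's ~$0.46 per task; the remainder would come from input and cached tokens, which cannot be derived from the figures given here.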

Breakdown of the individual evaluations in the Artificial Analysis Intelligence Index v4.1.

Compare GLM-5.2 with other leading models at: https://artificialanalysis.ai/models/glm-5-2