惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

H
Help Net Security
Scott Helme
Scott Helme
爱范儿
爱范儿
WordPress大学
WordPress大学
博客园 - 三生石上(FineUI控件)
阮一峰的网络日志
阮一峰的网络日志
博客园 - Franky
V
V2EX
腾讯CDC
博客园_首页
博客园 - 司徒正美
酷 壳 – CoolShell
酷 壳 – CoolShell
T
Tailwind CSS Blog
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
小众软件
小众软件
J
Java Code Geeks
大猫的无限游戏
大猫的无限游戏
月光博客
月光博客
Microsoft Azure Blog
Microsoft Azure Blog
B
Blog
雷峰网
雷峰网
Stack Overflow Blog
Stack Overflow Blog
IT之家
IT之家
罗磊的独立博客
Recorded Future
Recorded Future
博客园 - 聂微东
O
OpenAI News
S
Secure Thoughts
Hacker News: Ask HN
Hacker News: Ask HN
S
Schneier on Security
Hacker News - Newest:
Hacker News - Newest: "LLM"
Y
Y Combinator Blog
C
Cyber Attacks, Cyber Crime and Cyber Security
Project Zero
Project Zero
宝玉的分享
宝玉的分享
K
Kaspersky official blog
N
Netflix TechBlog - Medium
T
The Exploit Database - CXSecurity.com
Google Online Security Blog
Google Online Security Blog
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
Webroot Blog
Webroot Blog
云风的 BLOG
云风的 BLOG
Simon Willison's Weblog
Simon Willison's Weblog
C
Check Point Blog
D
Darknet – Hacking Tools, Hacker News & Cyber Security
L
LINUX DO - 热门话题
美团技术团队
L
Lohrmann on Cybersecurity

DEV Community

Authentication Security Deep Dive: From Brute Force to Salted Hashing (With Java Examples) Why AI Systems Don’t Fail — They Drift Spilling beans for how i learn for exam😁"Reinforcement Learning Cheat Sheet" I Replaced Chrome with Safari for AI Browser Automation. Here's What Broke (and What Finally Worked) How Python Borrows Other People's Work The $40 Architecture: Processing 1 Billion API Requests with 99.99% Uptime Vibe Coding: A Workflow Guide (From Zero to SaaS) Most webhook security guides protect the wrong side. The scary part is delivery. Headless CMS for TanStack Start: Build a Blog with Cosmic EU Age Verification App "Hacked in 2 Minutes" — What Actually Happened Comfy Cloud’s delete function does not actually remove files Running AI Models on GPU Cloud Servers: A Beginner Guide Event-driven media intelligence with AWS Step Functions and Bedrock I scored 500 AI prompts across 8 quality dimensions — here's what broke How to Call Google Gemini API from Next.js (Free Tier, No Backend Needed) The Portal Protocol: Reclaiming Human Connection in the Age of AI How to Fix Your Team's Scattered Knowledge Problem With a Self-Hosted Forum Intro to tc Cloud Functors: A Graph-First Mental Model for the Modern Cloud Designing Multi-Tenant Backends With Both Ownership and Team Access I Built a Neumorphic CSS Library with 77+ Components — Here's What I Learned PostgreSQL Performance Optimization: Why Connection Pooling Is Critical at Scale Cómo construí un SaaS multi-rubro para gestionar expensas en Argentina con FastAPI + Vue 3 🚀 I Built an Ethical Hacking Scanner Tool – Open Source Project I Replaced /usage and /context in Claude Code With a Single Statusline A Pythonic Way to Handle Emails (IMAP/SMTP) with Auto-Discovery and AI-Ready Design I Collected 8.9 Million Polymarket Price Points — Here's What I Found About How Markets Really Move EcoTrack AI — Carbon Footprint Tracker & Dashboard Everyone's Using AI. No One Agrees How. 5 self-hosted ebook managers worth trying in 2026 Building Your First AI Agent with LangChain: From Chatbot to Autonomous Assistant Common SOC 2 Failures (Real World) Stop Vibe-Checking Your AI App: A Practical Guide to Evals How to Use SonarQube and SonarScanner Locally to Level Up Your Code Quality Your Next To-Do App Is Dead — I Replaced Mine with an OpenClaw AI Sign a Nostr event in 60 lines of Python using coincurve — no nostr-sdk, no nbxplorer, no rust toolchain ITGC Audit Explained Like You’re in Big 4 Patch Tuesday abril 2026: Microsoft parcha 163 vulnerabilidades y un zero-day en SharePoint Stop scraping everything: a better way to track competitor price changes Listing on MCPize + the Official MCP Registry while routing payments OUTSIDE the marketplace — how I kept 100% of my x402 revenue Building an AI-Powered Risk Intelligence System Using Serverless Architecture Why We Ripped Function Overloading Out of Our AI Toolchain Testing AI-Generated Code: How to Actually Know If It Works SaaS Churn Is Killing Your Business. Here Is What to Do About It (Without a Support Team) The Speed of AI Is No Longer Linear - And Self-Improving Models Are Why How to Implement RBAC for MCP Tools: A Practical Guide for Engineering Teams From Standard Quote to Persuasive Proposal: AI Automation for Arborists I built a CLI that scaffolds complete multi-tenant SaaS apps Axios CVE-2025–62718: The Silent SSRF Bug That Could Be Hiding in Your Node.js App Right Now The dashboard that ended our friendship Data Pipelines Explained Simply (and How to Build Them with Python) The Hidden Cost of AI Systems Nobody Talks About. undefined vs undeclared, and how typeof behaves Switching from file-based jobs to NATS/Kafka in Rust without changing code io_uring Adventures: Rust Servers That Love Syscalls Why Agentic AI is Killing the Traditional Database The POUR principles of web accessibility for developers and designers Quantum Neural Network 3D — A Deep Dive into Interactive WebGL Visualization How To Install Caveman In Codex On macOS And Windows Automation Pipeline Reliability: Why Your Workflow Breaks When Nobody Is Watching I Built an 'Open World' AI Coding Agent — It Works From ANY Folder From Freelancing to Product: A Tech Service Company's SaaS Transformation China's AI Giants: Adding Tencent Hunyuan & ByteDance Doubao to AI University (74 Providers) On the Vibe Coders and Their Lies clerk: Auto-Summarize Your Claude Code Sessions AI Weekly — 2026/04/10–04/17 | The Model Lockdown Is Here, but the Toolchain Is the Real Battleground AI 週報 — 2026/04/10–2026/04/17 模型封鎖潮來了,但工具鏈才是真戰場 Maybe this is how Open-Source apps are born... 🚀 Fine-Tune LLMs with LoRA and QLoRA: 2026 Guide tRPC v11 + Next.js App Router: End-to-End Type Safety Without the Boilerplate ShadCN UI in 2026: Why I Stopped Installing Component Libraries and Started Owning My Components SaaS Billing in React Server Components: Stripe + Supabase Without a Single `useEffect` Join our DEV Weekend Challenge — $1,000 in Prizes Across TEN winners! Submissions Due April 20 at 6:59 AM UTC. Implementing FSRS Spaced Repetition in Flutter + Supabase — Adding Memory Science to an AI Learning App "I Texted My Localhost From the Train — Claude Code Fixed the Bug Before I Got Home" I Built a Sales Prep AI and It Went Deeper Than Expected Design to Code #2: One JSON, Eleven Outputs Solving the 100M-Row Problem: A Summary Table Pattern for High-Volume Push Notification Logs Flutter Web With Wasm: What Actually Changes For Developers I Built 50 Royalty-Free Soundtracks for My Side Project in a Weekend Using AI Music Generation The Vibe Coding Security Checklist: 7 Things to Check Before You Ship Stop Letting Googlebot Guess Fix Your React App's SEO Right Desconstruindo o Streaming do LinkedIn: Como Criar um Engine de Extração de Vídeo de Alta Performance com HLS e FFmpeg (EDA Part-1) EDA (Exploratory Data Analysis) Explained With Real Life — Why Looking at Your Data Is the Most Important Step in Machine Learning Brand Relationship Management at Scale: Our 4-Touch Outreach System for 200+ Brands Why String.fromEnvironment() Might Return an Empty String in Dart JGuardrails 1.0.0 — Hardening Java LLM Apps Against Jailbreaks, Toxicity, and Prompt Injection Plan and Schedule a Full Week of Threads Content From One Claude Conversation Coding Cat Oran Ep3, Five Tables Changed Everything Updated: BFF Pattern I'm done watching freelancers get buried by 200 proposals. So I'm building the alternative. This is my first post BFS Algorithm in Java Step by Step Tutorial with Examples Tracking LLM Pricing Monthly: An Open Dataset for 22 AI Models How We Measure Content ROI on a Comparison Site: Revenue Attribution Without Perfect Data Introducing Nova AI Ops: The AI-Native Operating System for SRE Teams I built a free desktop video downloader for Windows — Grabbit How Talkie OCR Helps Vision-Impaired & Dyslexic Users Read the World Around Them VRCFaceTracking安装和iPhone面捕配置教程,有bug Even CrowdStrike Can't See Your Agents The Automation Gold Rush: What n8n Workflows and Claude Are Opening Up for Developers Right Now
Searching Billions in Seconds: How HNSW Solved the Scale Problem
LinceMathew · 2026-04-27 · via DEV Community

We have a massive problem with how computers find things. If you have a few hundred photos on your phone, you can find one instantly, but trying to find one specific item out of a billion creates a massive technical strain.

Most systems rely on "Linear Search", which is like looking through every single page of a ten-million-page book to find one word.

This "one-by-one" approach makes real-time tools like chatbots or movie recommendations grind to a halt as the data grows. Furthermore, modern data like images or text is "high-dimensional," which breaks traditional filing systems and makes them no faster than checking every item manually.

To fix this, researchers Yu. A. Malkov and D. A. Yashunin changed the rules of the game. They realized that we don't always need a "perfect" match if it takes an hour to find; we often just need a "good enough" match found in a millisecond.

In this article, we are diving into their research paper: "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs."

We will explore how they used a multi-layered "highway system" to make searching a billion items feel as fast as searching a hundred.

Two Methods for Fast Searching

To solve the slowness problem, the researchers combined two concepts from computer science. The first is the "Small World" phenomenon.

You’ve likely heard of "six degrees of separation", the idea that you are connected to anyone on Earth through a short chain of acquaintances.

HNSW treats data the same way. It builds a map where every data point is a "node," and similar points are connected like friends. By jumping from "friend to friend" toward your target, you can navigate a massive dataset in just a few steps.

six-degrees-of-separation

The second concept is Hierachy, which is inspired by a structure called a Skip List.

Imagine a high-rise building where the elevators only stop at every 10th floor. To get to floor 42, you take the express elevator to floor 40 and then walk down to 42. HNSW creates "express layers" at the top with only a few data points and long-range connections.

These top layers allow the search to "skip" across the entire database to the right neighborhood instantly, before dropping down to the ground floor to find the exact result.

skip-list

The Multi-Layer Highway System

To bridge the gap between speed and accuracy, the researchers introduced a multi-layer structure called a Hierarchical Navigable Small World (HNSW) graph.

You can think of this as a "highway system" for data. Instead of one giant, flat network where everything is connected, HNSW organizes data into layers.

hnsw

Top Layers (Express Lanes)

These layers contain only a few data points with long-distance connections. Much like an express highway between major cities, they allow the search to "skip" across the entire dataset to reach the right neighborhood in just a few jumps.

express-highway

Bottom Layers (Local Streets)

As the search moves down through the hierarchy, the layers become denser with shorter connections. Once the search reaches the ground layer—which contains every single piece of data—it uses these "local streets" to fine-tune the results and find the exact match.

local-streets

This hierarchy is inspired by a data structure called a Skip List, but instead of simple linked lists, it uses complex proximity graphs.

proximity-graph

By separating links by their "distance scales," the algorithm ensures that the search time scales logarithmically. This means that even if the dataset grows from thousands to billions, the number of steps required to find an answer stays manageable and fast.

Balancing Speed and Accuracy

To make this system work, there are a few settings that determine how the graph is built and searched.

  • Connections per point (M): This is the number of "links" each piece of data creates with its neighbors. If you give each point more connections, the map becomes more detailed and accurate, but it also takes up more memory.

  • The Build Effort (efConstruction): This determines how much work the system does when first creating the index. A higher setting means the "highways" and "local streets" are better connected, making future searches much more reliable.

  • The Search Depth (ef): This is a setting you use during the actual search. It tells the computer how many neighbors to check before it decides it has found the best match. You can turn this up at any time to get better results without having to rebuild your entire database.

Building an HNSW Index

To see how this works, we can use a popular Python library called hnswlib. This is the same code used in the original research to prove how fast the system is. The following example shows how to set up a database and perform a search in just a few lines.

Import the Tools

We start by importing hnswlib to build the graph and numpy to handle the coordinates of our "meaning map."

import hnswlib
import numpy as np

Enter fullscreen mode Exit fullscreen mode

Create the "Meaning Map"

Let's give our items coordinates based on two features: Sweetness and Size. Fruits will have high sweetness and low size, while furniture will have low sweetness and high size.

data_labels = ["Apple", "Banana", "Cherry", "Chair", "Table", "Sofa"]
data_vectors = np.array([
    [0.9, 0.2], # Apple 
    [0.8, 0.3], # Banana
    [0.9, 0.1], # Cherry
    [0.1, 0.7], # Chair
    [0.1, 0.9], # Table
    [0.2, 0.9]  # Sofa 
], dtype=np.float32)

dim = 2 # Only 2 features: Sweetness and Size
num_elements = len(data_vectors)

Enter fullscreen mode Exit fullscreen mode

The data now looks like a list of coordinates. For example, Apple is at [0.9, 0.2].

Choose the Measuring Tape

We initialize the index. We use 'l2' (Euclidean distance) to measure similarity. On our map, this is just a straight line between two points; the shorter the line, the more similar the items are.

p = hnswlib.Index(space='l2', dim=dim) 

Enter fullscreen mode Exit fullscreen mode

Build the Highway System

We set the rules for the HNSW structure. M sets the number of connections for each point, and ef_construction tells the system how hard to work to find the best neighbors when first building the "highways."

p.init_index(max_elements=num_elements, ef_construction=100, M=16)

Enter fullscreen mode Exit fullscreen mode

Index the Data

This step takes our fruits and furniture and organizes them into layers. The "High-Sweetness" items (fruits) will naturally end up in one neighborhood, while the furniture ends up in another.

p.add_items(data_vectors)

Enter fullscreen mode Exit fullscreen mode

Perform the Search

Now we search for an "Orange". We haven't told the system what an orange is, but we give it the coordinates 0.85, 0.25. We tell the system to find the 3 closest neighbors.

query_vector = np.array([[0.85, 0.25]], dtype=np.float32)
p.set_ef(10)
labels, distances = p.knn_query(query_vector, k=3)

Enter fullscreen mode Exit fullscreen mode

The system returns the IDs for Apple, Banana, and Cherry. Even though "Orange" wasn't in our database, HNSW used the highway system to skip past the "Furniture" neighborhood and find the items that lived in the same "Sweet and Small" neighborhood.

hnsw-graph

here is the layered flow diagram.

hnsw-layered-flow

In this diagram you can see there are three layers in the graph. The top layer has only a few nodes and the bottom layer has all the nodes. The search starts from the top layer and moves down to the bottom layer.

Realworld Applications of HNSW

The reason this research paper is so influential is that it solved the "scale" problem for some of the most popular technology today. Because HNSW can find a "neighborhood" of similar ideas in milliseconds, it is used in:

AI Chatbots & RAG

When you ask an AI a question, it uses HNSW to search through millions of documents to find the specific paragraphs that contain the answer.

Recommendation Engines

Apps like Spotify or Netflix use this logic to find songs or movies that are "mathematically close" to what you just finished watching.

Image Search

Tools like Google Lens compare the "fingerprint" of your photo against billions of others to find a match instantly.

Fraud Detection

Banks use it to see if a new transaction looks "similar" to known patterns of theft or if it fits your normal "neighborhood" of spending.

Conclusion

Before this paper, we had a difficult choice: we could have a search that was perfectly accurate but incredibly slow, or a search that was fast but broke down as it grew.

Malkov and Yashunin’s HNSW changed that by introducing the "highway" system for data. By accepting a tiny bit of approximation, they gave us a way to search through billions of items in the blink of an eye.