GitHub - Anush008/fastembed-rs: Rust library for generating vector embeddings, reranking locally!
thoughtfully · 2026-06-15 · via Hacker News: Show HN

Rust library for generating vector embeddings, reranking locally!

Features

Not looking for Rust?

Supported Models

Text Embedding

Quantized versions are also available for several of the text embedding models (append Q to the model enum variant, e.g., EmbeddingModel::BGESmallENV15Q). EmbeddingGemma additionally ships a 4-bit build as EmbeddingModel::EmbeddingGemma300MQ4.

Sparse Text Embedding

Image Embedding

Reranking

✊ Support

To support the library, please donate to our primary upstream dependency, ort - The Rust wrapper for the ONNX runtime.

Installation

Run the following in your project directory:

cargo add fastembed

Or add the following line to your Cargo.toml:

[dependencies]
fastembed = "5"

Text Embeddings

use fastembed::{TextEmbedding, TextInitOptions, EmbeddingModel};

// With default options
let mut model = TextEmbedding::try_new(Default::default())?;

// With custom options
let mut model = TextEmbedding::try_new(
    TextInitOptions::new(EmbeddingModel::AllMiniLML6V2).with_show_download_progress(true).with_intra_threads(4),
)?;

let documents = vec![
    "passage: Hello, World!",
    "query: Hello, World!",
    "passage: This is an example passage.",
    // You can leave out the prefix but it's recommended
    "fastembed-rs is licensed under Apache 2.0"
];

// Generate embeddings with the default batch size, 256
let embeddings = model.embed(documents, None)?;

println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 4
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 384
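
The example above recommends prefixing each input with "query: " or "passage: ". A minimal sketch of a helper that applies those prefixes consistently follows. `Role` and `with_prefix` are hypothetical names for illustration and are not part of fastembed; whether a given model expects these exact prefixes depends on the model.

```rust
/// Role a text plays in a retrieval setup (hypothetical type for this sketch).
#[derive(Clone, Copy)]
enum Role {
    Query,
    Passage,
}

/// Prepend "query: " or "passage: " unless the text already starts with it.
/// Hypothetical helper, not a fastembed API.
fn with_prefix(role: Role, text: &str) -> String {
    let prefix = match role {
        Role::Query => "query: ",
        Role::Passage => "passage: ",
    };
    if text.starts_with(prefix) {
        text.to_string()
    } else {
        format!("{prefix}{text}")
    }
}

fn main() {
    let documents = vec![
        with_prefix(Role::Passage, "Hello, World!"),
        with_prefix(Role::Query, "Hello, World!"),
        // Already prefixed, so it is left unchanged.
        with_prefix(Role::Passage, "passage: This is an example passage."),
    ];
    for doc in &documents {
        println!("{doc}");
    }
}
```

The resulting list of strings can then be handed to the embedding call in place of the hand-written literals.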

Sparse Text Embeddings

use fastembed::{SparseEmbedding, SparseInitOptions, SparseModel, SparseTextEmbedding};

// With default options
let mut model = SparseTextEmbedding::try_new(Default::default())?;

// With custom options
let mut model = SparseTextEmbedding::try_new(
    SparseInitOptions::new(SparseModel::SPLADEPPV1).with_show_download_progress(true),
)?;

let documents = vec![
    "passage: Hello, World!",
    "query: Hello, World!",
    "passage: This is an example passage.",
    "fastembed-rs is licensed under Apache 2.0"
];

// Generate embeddings with the default batch size, 256
let embeddings: Vec<SparseEmbedding> = model.embed(documents, None)?;
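
Sparse embeddings are typically compared with a dot product over the token indices two vectors share. The sketch below shows that computation on a local stand-in type. `SparseVec` and `sparse_dot` are hypothetical, and the `values` field name is an assumption made for this example rather than a statement about the crate's `SparseEmbedding` type.

```rust
use std::collections::HashMap;

/// Local stand-in for a sparse vector: parallel token indices and weights.
/// The layout is an assumption for this sketch.
struct SparseVec {
    indices: Vec<usize>,
    values: Vec<f32>,
}

/// Dot product restricted to the indices present in both vectors.
fn sparse_dot(a: &SparseVec, b: &SparseVec) -> f32 {
    // Index the first vector's weights by token index.
    let weights: HashMap<usize, f32> = a
        .indices
        .iter()
        .copied()
        .zip(a.values.iter().copied())
        .collect();
    // Multiply matching entries of the second vector and sum them.
    b.indices
        .iter()
        .zip(&b.values)
        .filter_map(|(i, v)| weights.get(i).map(|w| w * v))
        .sum()
}

fn main() {
    // Made-up indices and weights, not real model output.
    let query = SparseVec { indices: vec![3, 17, 42], values: vec![0.5, 1.0, 2.0] };
    let doc = SparseVec { indices: vec![17, 42, 99], values: vec![2.0, 0.5, 3.0] };
    println!("score = {}", sparse_dot(&query, &doc));
}
```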

Image Embeddings

use fastembed::{ImageEmbedding, ImageInitOptions, ImageEmbeddingModel};

// With default options
let mut model = ImageEmbedding::try_new(Default::default())?;

// With custom options
let mut model = ImageEmbedding::try_new(
    ImageInitOptions::new(ImageEmbeddingModel::ClipVitB32).with_show_download_progress(true),
)?;

let images = vec!["assets/image_0.png", "assets/image_1.png"];

// Generate embeddings with the default batch size, 256
let embeddings = model.embed(images, None)?;

println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 2
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 512

Candidates Reranking

use fastembed::{TextRerank, RerankInitOptions, RerankerModel};

// With default options
let mut model = TextRerank::try_new(Default::default())?;

// With custom options
let mut model = TextRerank::try_new(
    RerankInitOptions::new(RerankerModel::BGERerankerBase).with_show_download_progress(true),
)?;

let documents = vec![
    "hi",
    "The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear, is a bear species endemic to China.",
    "panda is animal",
    "i dont know",
    "kind of mammal",
];

// Rerank with the default batch size (256) and return document contents
let results = model.rerank("what is panda?", documents, true, None)?;
println!("Rerank result: {:?}", results);
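
A common follow-up step is to keep only candidates above a score threshold and map them back to the original texts. The sketch below does this on a local stand-in type. `Ranked` and `select` are hypothetical names, the field names are assumptions for illustration, and the scores are made up rather than real model output.

```rust
/// Local stand-in for one rerank result: position in the input list plus a score.
/// Field names are assumptions for this sketch, not the crate's definition.
#[derive(Clone, Copy)]
struct Ranked {
    index: usize,
    score: f32,
}

/// Return the documents scoring at least `min_score`, best first.
fn select<'a>(documents: &[&'a str], ranked: &[Ranked], min_score: f32) -> Vec<&'a str> {
    let mut kept: Vec<Ranked> = ranked
        .iter()
        .copied()
        .filter(|r| r.score >= min_score)
        .collect();
    // Sort descending by score; total_cmp gives a total order on f32.
    kept.sort_by(|a, b| b.score.total_cmp(&a.score));
    // Skip any index that falls outside the document list.
    kept.iter()
        .filter_map(|r| documents.get(r.index).copied())
        .collect()
}

fn main() {
    let documents = ["hi", "panda is animal", "kind of mammal"];
    // Illustrative scores only.
    let ranked = [
        Ranked { index: 0, score: -4.0 },
        Ranked { index: 2, score: 1.5 },
        Ranked { index: 1, score: 7.0 },
    ];
    for doc in select(&documents, &ranked, 0.0) {
        println!("{doc}");
    }
}
```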

Locally Available Models

Alternatively, local model files can be used for inference via the try_new_from_user_defined(...) methods of the respective structs.

Similarity Search

Helpers in the similarity module score and rank the vectors embed returns, so a quick in-memory search needs no extra crate:

use fastembed::similarity::{cosine_similarity, top_k};

// `embeddings` is the Vec<Embedding> from model.embed(...)
let query = &embeddings[0];

// Score two vectors directly ([-1.0, 1.0], higher = closer)
let score = cosine_similarity(query, &embeddings[1]);

// Or rank the corpus: (index, score) pairs, best first
let hits = top_k(query, &embeddings, 5);
println!("Closest: {:?}", hits);

For larger corpora or persistence, push the vectors to a vector search engine (e.g. Qdrant) and query there.
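
For reference, the scoring and ranking described in this section can be written in plain Rust with no dependencies. The sketch below is independent of the crate; `cosine` and `rank` are hypothetical names chosen for this example and the vectors are toy data.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// (index, score) pairs for the `k` corpus vectors closest to `query`, best first.
fn rank(query: &[f32], corpus: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = corpus
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .collect();
    // Sort descending by score, then keep the first k entries.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

fn main() {
    // Toy 2-dimensional vectors standing in for real embeddings.
    let corpus = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]];
    let query = [1.0, 0.0];
    println!("{:?}", rank(&query, &corpus, 2));
}
```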

Qwen3 Embeddings

Qwen3 embedding models are available behind the qwen3 feature flag (candle backend).

[dependencies]
fastembed = { version = "5", features = ["qwen3"] }

use candle_core::{DType, Device};
use fastembed::Qwen3TextEmbedding;

let device = Device::Cpu;
let model = Qwen3TextEmbedding::from_hf(
    "Qwen/Qwen3-Embedding-0.6B",
    &device,
    DType::F32,
    512,
)?;

// Text-only usage with the Qwen3-VL embedding checkpoint is also supported:
// let model = Qwen3TextEmbedding::from_hf("Qwen/Qwen3-VL-Embedding-2B", &device, DType::F32, 512)?;

let embeddings = model.embed(&["query: ...", "passage: ..."])?;
println!("Embeddings length: {}", embeddings.len());

For multimodal text/image usage with Qwen/Qwen3-VL-Embedding-2B:

use candle_core::{DType, Device};
use fastembed::Qwen3VLEmbedding;

let device = Device::Cpu;
let model = Qwen3VLEmbedding::from_hf(
    "Qwen/Qwen3-VL-Embedding-2B",
    &device,
    DType::F32,
    2048,
)?;

let image_embeddings = model.embed_images(&["tests/assets/image_0.png", "tests/assets/image_1.png"])?;
let text_embeddings = model.embed_texts(&["query: blue cat", "query: red cat"])?;

println!("Image embeddings: {}", image_embeddings.len());
println!("Text embeddings: {}", text_embeddings.len());

Nomic Embed Text v2 MoE

The nomic-embed-text-v2-moe model is available behind the nomic-v2-moe feature flag (candle backend). It is the first general-purpose MoE embedding model and supports 100+ languages.

[dependencies]
fastembed = { version = "5", features = ["nomic-v2-moe"] }

use candle_core::{DType, Device};
use fastembed::NomicV2MoeTextEmbedding;

let device = Device::Cpu;
let model = NomicV2MoeTextEmbedding::from_hf(
    "nomic-ai/nomic-embed-text-v2-moe",
    &device,
    DType::F32,
    512,
)?;

let embeddings = model.embed(&["search_query: ...", "search_document: ..."])?;
println!("Embeddings length: {}", embeddings.len());

BGE-M3 Joint Embeddings

The BGE-M3 model produces dense, sparse, and ColBERT embeddings in a single forward pass.

use fastembed::{Bgem3Embedding, Bgem3InitOptions, Bgem3Model};

// With default options
let mut model = Bgem3Embedding::try_new(Default::default())?;

// With custom options (supporting custom max length up to 8192 tokens)
let mut model = Bgem3Embedding::try_new(
    Bgem3InitOptions::new(Bgem3Model::BGEM3Q)
        .with_max_length(1024)
        .with_show_download_progress(true),
)?;

let documents = vec![
    "Hello, World!",
    "This is an example passage.",
    "fastembed-rs is licensed under Apache 2.0",
    "i dont know"
];

// Generate all three representations in a single forward pass
let output = model.embed(documents, None)?;

println!("Dense dimension: {}", output.dense[0].len()); // -> Dense dimension: 1024

let sparse_emb = &output.sparse[0];
println!("Sparse non-zero tokens: {}", sparse_emb.indices.len());

println!("ColBERT token count: {}", output.colbert[0].len());

Note

The default quantized model (BGEM3Q) is optimized for CPUs; passing a GPU execution provider (such as CUDA) will fail. For GPU inference or other custom requirements, export your own model (FP32, FP16, or INT8) with the ONNX export script from the Hugging Face repository gpahal/bge-m3-onnx-int8 and load it via try_new_from_path.
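
ColBERT-style output holds one vector per token, and such representations are usually compared with late-interaction (MaxSim) scoring. The sketch below shows that scoring on plain nested vectors. `max_sim` is a hypothetical helper, and treating each ColBERT embedding as a list of per-token `Vec<f32>` is an assumption for this example.

```rust
/// Late-interaction (MaxSim) score: for each query token vector, take the
/// largest dot product against any document token vector, then sum the maxima.
/// Returns 0.0 if either side has no tokens.
fn max_sim(query_tokens: &[Vec<f32>], doc_tokens: &[Vec<f32>]) -> f32 {
    if query_tokens.is_empty() || doc_tokens.is_empty() {
        return 0.0;
    }
    query_tokens
        .iter()
        .map(|q| {
            doc_tokens
                .iter()
                .map(|d| q.iter().zip(d).map(|(x, y)| x * y).sum::<f32>())
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

fn main() {
    // Toy 2-dimensional token vectors, not real model output.
    let query = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let doc = vec![vec![1.0, 0.0], vec![0.5, 0.5]];
    println!("MaxSim score: {}", max_sim(&query, &doc));
}
```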

Model cache

Models download on first use and load from cache afterwards (no network needed at runtime once cached).

  • FASTEMBED_CACHE_DIR — cache location (default: .fastembed_cache). Equivalent to TextInitOptions::with_cache_dir.
  • HF_HOME — if set, takes precedence over the above.
  • HF_ENDPOINT — Hugging Face mirror base URL, for restricted networks.
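
The precedence listed above can be sketched as a small resolver. `resolve_cache_dir` is a hypothetical function written for illustration, not the crate's implementation; in particular, the source does not say whether a subdirectory is appended under HF_HOME, so the sketch returns the value unchanged.

```rust
use std::path::PathBuf;

/// Pick the cache directory following the order described above:
/// HF_HOME first, then FASTEMBED_CACHE_DIR, then the default `.fastembed_cache`.
/// The lookup is injected so the logic can be exercised without touching the
/// real process environment.
fn resolve_cache_dir(lookup: impl Fn(&str) -> Option<String>) -> PathBuf {
    lookup("HF_HOME")
        .or_else(|| lookup("FASTEMBED_CACHE_DIR"))
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(".fastembed_cache"))
}

fn main() {
    // Read from the actual environment of the running process.
    let dir = resolve_cache_dir(|key| std::env::var(key).ok());
    println!("cache dir: {}", dir.display());
}
```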

DirectML (Windows)

To run models on a GPU via DirectML on Windows, enable the directml feature:

[dependencies]
fastembed = { version = "5", features = ["directml"] }

Then pass a DirectML execution provider when initializing a model:

use fastembed::{TextEmbedding, TextInitOptions, EmbeddingModel};
use ort::ep::DirectML;

let model = TextEmbedding::try_new(
    TextInitOptions::new(EmbeddingModel::AllMiniLML6V2)
        .with_execution_providers(vec![DirectML::default().into()]),
)?;

When DirectML is detected, fastembed automatically disables memory pattern optimization and parallel execution on the ONNX Runtime session, as required by the DirectML execution provider.

LICENSE

Apache 2.0