GitHub - tdortman/cuSBF: High-Performance GPU Super Bloom Filter
tdortman · 2026-05-28 · via Hacker News: Show HN

Documentation

Overview

cuSBF is a high-performance GPU implementation of the Super Bloom filter, optimized for high-throughput batch k-mer insertion and query on nucleotide (DNA) and protein sequences (or any other sequence type as long as a valid alphabet is provided).

It exploits the streaming nature of sequence-derived k-mers: consecutive k-mers that share the same minimizer are grouped into super-k-mers, and all k-mers of a super-k-mer are assigned to the same 256-bit memory shard. This amortizes random memory accesses across consecutive k-mer queries, reducing memory-bandwidth pressure. The findere scheme further reduces false positives (see the False Positive Rate table) by inserting overlapping s-mers and reporting a k-mer as present only when a full run of consecutive s-mers matches.
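The grouping step can be illustrated on the CPU. The following is a minimal sketch, not cuSBF code: `SuperKmer`, `minimizer_of`, `split_into_super_kmers`, and `shard_index` are hypothetical names, the minimizer is taken as the lexicographically smallest m-mer (an assumption; implementations usually rank m-mers by a hash), and `std::hash` stands in for whatever shard hash the library uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// One run of consecutive k-mers that share the same minimizer.
struct SuperKmer {
    std::size_t first_kmer;  // index of the first k-mer in the run
    std::size_t kmer_count;  // number of consecutive k-mers in the run
    std::string minimizer;   // the shared m-mer
};

// Assumption: the minimizer is the lexicographically smallest m-mer
// inside the k-mer (leftmost on ties).
inline std::string_view minimizer_of(std::string_view kmer, std::size_t m) {
    assert(m >= 1 && m <= kmer.size());
    std::string_view best = kmer.substr(0, m);
    for (std::size_t i = 1; i + m <= kmer.size(); ++i) {
        const std::string_view candidate = kmer.substr(i, m);
        if (candidate < best) best = candidate;
    }
    return best;
}

// Walk the k-mers in order and start a new run whenever the minimizer changes.
inline std::vector<SuperKmer> split_into_super_kmers(std::string_view seq,
                                                     std::size_t k,
                                                     std::size_t m) {
    assert(m >= 1 && m <= k);
    std::vector<SuperKmer> runs;
    for (std::size_t i = 0; i + k <= seq.size(); ++i) {
        const std::string_view mini = minimizer_of(seq.substr(i, k), m);
        if (!runs.empty() && runs.back().minimizer == mini) {
            ++runs.back().kmer_count;
        } else {
            runs.push_back({i, 1, std::string(mini)});
        }
    }
    return runs;
}

// Hypothetical shard choice: every k-mer of a run maps to the same shard
// because the index depends only on the shared minimizer.
inline std::size_t shard_index(std::string_view minimizer, std::size_t shard_count) {
    assert(shard_count > 0);
    return std::hash<std::string_view>{}(minimizer) % shard_count;
}
```

For the sequence GATTACA with k = 4 and m = 2, this yields two runs: GATT and ATTA share the minimizer AT, while TTAC and TACA share AC. Each run touches a single shard, which is the access pattern the paragraph above describes.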

Features

  • CUDA-accelerated batch k-mer insert and query from sequences
  • Configurable k-mer length, minimizer width, s-mer width, and hash function count
  • Minimizer-based shard selection for cache-efficient streaming queries
  • Findere false-positive reduction via overlapping s-mer membership
  • Header-only library design
  • FASTA/FASTQ stream and file support
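The findere rule listed above can be sketched as follows. This is an illustration under stated assumptions, not the library's kernel: an exact `std::unordered_set` stands in for the Bloom filter, `insert_smers` and `query_kmers` are hypothetical names, and a k-mer is assumed to be covered by k - s + 1 overlapping s-mers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>
#include <vector>

// Exact set standing in for the Bloom filter. A real filter may also
// report individual s-mers that were never inserted.
using SmerSet = std::unordered_set<std::string>;

// Insert every overlapping s-mer of the sequence.
inline void insert_smers(SmerSet& set, std::string_view seq, std::size_t s) {
    assert(s >= 1);
    for (std::size_t i = 0; i + s <= seq.size(); ++i) {
        set.emplace(seq.substr(i, s));
    }
}

// One flag per k-mer of `query`: true only if all k - s + 1 of its
// s-mers are found, i.e. the k-mer ends a long enough run of matches.
inline std::vector<bool> query_kmers(const SmerSet& set, std::string_view query,
                                     std::size_t k, std::size_t s) {
    assert(s >= 1 && s <= k);
    std::vector<bool> hits;
    const std::size_t needed = k - s + 1;  // s-mers covering one k-mer
    std::size_t run = 0;                   // current streak of matching s-mers
    for (std::size_t i = 0; i + s <= query.size(); ++i) {
        const bool found = set.count(std::string(query.substr(i, s))) > 0;
        run = found ? run + 1 : 0;
        // The s-mer at i is the last s-mer of the k-mer starting at i - (needed - 1).
        if (i + 1 >= needed) hits.push_back(run >= needed);
    }
    return hits;
}
```

After inserting the s-mers of ACGTACGT with s = 3, querying TACGTT with k = 4 reports TACG and ACGT as present and CGTT as absent, because its final s-mer GTT was never inserted. A single spurious s-mer match is not enough to produce a k-mer hit, which is where the false-positive reduction comes from.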

Performance

Benchmarks use Config<31, 28, 16, 4> on an NVIDIA RTX PRO 6000 Blackwell GPU. CPU Super Bloom runs on an Intel Xeon W9-3595X with 120 threads.

Compared against the baselines named in the tables below: Super Bloom (CPU), GBBF, Cuckoo-GPU, TCF, and GQF.

Smaller Filter (C. elegans, ~100M k-mers)

| Comparison | Insert | Query |
|---|---|---|
| cuSBF vs Super Bloom | 92× faster | 234× faster |
| cuSBF vs GBBF | 9.1× faster | 7.7× faster |
| cuSBF vs Cuckoo-GPU | 80× faster | 8.0× faster |
| cuSBF vs TCF | 12× faster | 52× faster |
| cuSBF vs GQF | 69× faster | 13× faster |

Large Filter (CHM13, ~3.1B k-mers)

| Comparison | Insert | Query |
|---|---|---|
| cuSBF vs Super Bloom | 59× faster | 165× faster |
| cuSBF vs GBBF | 8.2× faster | 7.6× faster |
| cuSBF vs Cuckoo-GPU | 3427× faster | 7.8× faster |
| cuSBF vs TCF | 12× faster | 67× faster |
| cuSBF vs GQF | 42× faster | 11× faster |

False Positive Rate

| Bits/k-mer | cuSBF s=28 | cuSBF s=30 | cuSBF s=31 | GBBF |
|---|---|---|---|---|
| 21.4 | 0.848% | 0.951% | 1.593% | 3.069% |
| 85.7 | 0.091% | 0.107% | 0.210% | 0.126% |
| 342.6 | 0.0095% | 0.0114% | 0.0264% | 0.0273% |

Requirements

  • Linux (x86_64 or aarch64) with an NVIDIA GPU and driver
  • CUDA Toolkit >= 13.1
  • GCC or Clang host compiler (C++20)
  • Meson and Ninja
  • NVIDIA GPU with compute capability 8.0+ (Ampere, Lovelace, Hopper, Blackwell)

Platform support

cuSBF is developed and tested on Linux only.

  • WSL2 on Windows is a reasonable dev environment (see the NVIDIA docs).
  • Native Windows and macOS are not supported or tested. The build uses Linux-specific FASTX paths (for example mmap) and host tooling assumptions (GCC/Clang, GNU statement expressions in CUSBF_TRY/CUSBF_UNWRAP).

Building

meson setup build
ninja -C build

When this repo is the root Meson project, benchmarks, tests, and examples build by default. As a subproject they are skipped unless you force them on.

| Option | Type | Default | Description |
|---|---|---|---|
| benchmarks | feature | auto | Google Benchmark binaries |
| tests | feature | auto | GoogleTest suite |
| examples | feature | auto | Example CLI |
| param_sweep | feature | disabled | Parameter-sweep binaries (large, see below) |
| param_sweep_alphabet | combo | dna | dna or protein when param_sweep is enabled |
| large_fastx_tests | feature | disabled | Large generated FASTX test (CUSBF_LARGE_FASTX_* env vars) |

Each feature option accepts auto, enabled, or disabled:

  • auto — on for a standalone checkout, off when cuSBF is a subproject
  • enabled / disabled — override regardless of project layout

Important

Enabling param_sweep builds many binaries (208 for the DNA alphabet). Leave it disabled unless you need that sweep.

# Default standalone build
meson setup build

# Faster configure: library + examples only
meson setup build -Dbenchmarks=disabled -Dtests=disabled

# Subproject consumer forcing tests on
meson setup build -Dtests=enabled

# Parameter sweep
meson setup build -Dparam_sweep=enabled
meson setup build -Dparam_sweep=enabled -Dparam_sweep_alphabet=protein

Usage

Fallible APIs return cusbf::Result<T>, a thin wrapper over cuda::std::expected<T, Error>. To signal failure, use return Err(error), which yields a cuda::std::unexpected<Error> and lets Result<T> be deduced. To signal success from a Result<void> function, use return Ok() or return {}; for a Result<T> that carries a value, returning the value is enough. Two helper macros unwrap results:

| Macro | On failure | Use when |
|---|---|---|
| CUSBF_TRY(expr) | Copies the error, then returns cuda::std::unexpected<Error>(...) from the enclosing function | The caller returns Result (library glue, examples/cusbf-main) |
| CUSBF_UNWRAP(expr) | throw std::runtime_error(message()) | Tests, main, or other code that does not return Result |

Both work as statements or in initializers (auto x = CUSBF_UNWRAP(...)). For full control (typed errors, exit codes), use if (!result) instead.
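To show how a try-style macro can work both as a statement and in an initializer, here is a minimal sketch built on GNU statement expressions (GCC/Clang only). It is not the cuSBF implementation: `DemoError`, `DemoResult`, `DEMO_TRY`, `DEMO_UNWRAP`, `parse_digit`, and `sum_digits` are hypothetical stand-ins, and a small `std::optional`-based class replaces cuda::std::expected so the example compiles without CUDA.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; these are not the cuSBF types.
struct DemoError {
    std::string message;
};

template <typename T>
class DemoResult {
public:
    DemoResult(T value) : value_(std::move(value)) {}        // success
    DemoResult(DemoError error) : error_(std::move(error)) {}  // failure
    explicit operator bool() const { return value_.has_value(); }
    T& value() { return *value_; }
    const DemoError& error() const { return error_; }

private:
    std::optional<T> value_;
    DemoError error_;
};

// On failure: copy the error and return it from the enclosing function.
// On success: the statement expression evaluates to the unwrapped value.
#define DEMO_TRY(expr)                                   \
    ({                                                   \
        auto demo_result_ = (expr);                      \
        if (!demo_result_) return demo_result_.error();  \
        std::move(demo_result_.value());                 \
    })

// On failure: throw instead of returning, for callers that do not return a result.
#define DEMO_UNWRAP(expr)                                                \
    ({                                                                   \
        auto demo_result_ = (expr);                                      \
        if (!demo_result_)                                               \
            throw std::runtime_error(demo_result_.error().message);      \
        std::move(demo_result_.value());                                 \
    })

inline DemoResult<int> parse_digit(char c) {
    if (c < '0' || c > '9') return DemoError{"not a digit"};
    return c - '0';
}

// Propagates the first failure to its caller without exceptions.
inline DemoResult<int> sum_digits(char a, char b) {
    const int x = DEMO_TRY(parse_digit(a));
    const int y = DEMO_TRY(parse_digit(b));
    return x + y;
}
```

Calling sum_digits('2', '5') produces a successful result holding 7, while sum_digits('2', 'x') returns the "not a digit" error from the second DEMO_TRY without evaluating the addition. Wrapping the same failing call in DEMO_UNWRAP throws std::runtime_error instead.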

Quick example (CUSBF_UNWRAP)

#include <cusbf/filter.cuh>

using Config = cusbf::Config<31, 28, 16, 4>;

int main() {
    cusbf::filter<Config> filter(1 << 24);

    CUSBF_UNWRAP(filter.insert_sequence("ACGTACGTACGTACGTACGTACGTACGTACGT"));
    const auto hits = CUSBF_UNWRAP(filter.contains_sequence("ACGTACGTACGTACGTACGTACGTACGTACGT"));

    CUSBF_UNWRAP(filter.insert_fastx_file("reference.fasta"));
    const auto summary = CUSBF_UNWRAP(filter.query_fastx_file("queries.fastq"));

    (void)hits;
    (void)summary;
    return 0;
}

Propagating errors (CUSBF_TRY)

When the caller already returns Result, use CUSBF_TRY so failures propagate without exceptions:

[[nodiscard]] cusbf::Result<void> run(cusbf::filter<Config>& filter) {
    CUSBF_TRY(filter.insert_fastx_file("reference.fasta"));
    const auto summary = CUSBF_TRY(filter.query_fastx_file("queries.fastq"));
    (void)summary;
    return cusbf::Ok();
}

Async device APIs, record batches, and streaming FASTX callbacks follow the same pattern. filter.load_factor() and filter.filter_bits() are synchronous and do not return Result.

Inspecting errors

if (const auto result = filter.query_fastx_file("queries.fastq"); !result) {
    const cusbf::Error& err = result.error();
    std::cerr << err.message() << '\n';
    if (const cusbf::FastxParseError* parse = err.as_fastx_parse()) {
        // parse->location.file / .line / .column
    }
    return 1;
}

CUSBF_CUDA_TRY wraps CUDA runtime calls into Result<void>; CUSBF_CUDA_CALL / CUSBF_CUDA_ABORT are for throw/abort paths only.

Configuration Options

The Config template accepts the following parameters:

| Parameter | Description | Default |
|---|---|---|
| K | k-mer length (max depends on alphabet) | - |
| S | s-mer width for findere Bloom hash seed (1-K) | - |
| M | Minimizer width for shard selection (1-K) | - |
| HashCount | Number of independent Bloom hash functions (4, 8, 12, 16) | 4 |
| CudaBlockSize | CUDA threads per block | 256 |
| Alphabet | Symbol encoding (DNA or protein) | DnaAlphabet |
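The ranges in the table lend themselves to compile-time checks. The sketch below mirrors them in a hypothetical `DemoConfig` template; it is not the cuSBF Config, it omits the Alphabet parameter and the alphabet-dependent upper bound on K, and the `smers_per_kmer` member assumes a k-mer is covered by K - S + 1 overlapping s-mers.

```cpp
#include <cstddef>

// Hypothetical mirror of the documented parameter ranges; not cusbf::Config.
template <std::size_t K, std::size_t S, std::size_t M,
          std::size_t HashCount = 4, std::size_t CudaBlockSize = 256>
struct DemoConfig {
    static_assert(K >= 1, "K must be positive");
    static_assert(S >= 1 && S <= K, "S must lie in 1..K");
    static_assert(M >= 1 && M <= K, "M must lie in 1..K");
    static_assert(HashCount == 4 || HashCount == 8 ||
                      HashCount == 12 || HashCount == 16,
                  "HashCount must be 4, 8, 12, or 16");
    static_assert(CudaBlockSize >= 1, "CudaBlockSize must be positive");

    static constexpr std::size_t k = K;
    static constexpr std::size_t s = S;
    static constexpr std::size_t m = M;
    static constexpr std::size_t hash_count = HashCount;
    static constexpr std::size_t cuda_block_size = CudaBlockSize;

    // Assumption: number of overlapping s-mers that cover one k-mer.
    static constexpr std::size_t smers_per_kmer = K - S + 1;
};
```

With the benchmark parameters, DemoConfig<31, 28, 16, 4>::smers_per_kmer is 4, so under this assumption a k-mer hit requires four consecutive s-mer matches. An out-of-range choice such as DemoConfig<31, 32, 16> fails to compile with the corresponding message.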

Protein Alphabet Support

#include <cusbf/filter.cuh>

using ProteinConfig = cusbf::Config<12, 10, 6, 4, 256, cusbf::ProteinAlphabet>;

[[nodiscard]] cusbf::Result<void> run_protein() {
    cusbf::filter<ProteinConfig> filter(1 << 24);
    CUSBF_TRY(filter.insert_sequence("ACDEFGHIKLMNPQRSTVWY"));
    const auto hits = CUSBF_TRY(filter.contains_sequence("ACDEFGHIKLMNPQRSTVWY"));
    (void)hits;
    return cusbf::Ok();
}