GenDB — LLM-Powered Generative Query Engine
matt_d · 2026-06-18 · via Hacker News - Newest: "LLM"

The Next Generation of Query Processing

GenDB is a Generative Query Engine that uses LLM agents to generate instance-optimized query execution code, tailored to your specific data, workloads, and hardware.


3.2x faster than DuckDB on TPC-H

6.8x faster than DuckDB on SEC-EDGAR

462x faster than PostgreSQL on TPC-H

280x faster than PostgreSQL on SEC-EDGAR

What is GenDB?

Synthesized, Not Engineered

Five specialized LLM agents collaborate through a structured pipeline to generate optimized storage, indexes, and standalone native executables — all tailored to the specific data, workload, and hardware.

GenDB System Overview

Agent 1

Workload Analyzer

Profiles hardware, samples data, extracts workload characteristics

Agent 2

Storage Designer

Designs layouts with encoding, compression, indexes, and zone maps

Agent 3

Query Planner

Generates resource-aware execution plans adapted to data and hardware

Agent 4

Code Generator

Implements plans as optimized native code with SIMD and parallelism

Agent 5

Query Optimizer

Iteratively refines code using runtime profiling feedback
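
The hand-off between the five agents can be pictured as a chain of stages that each enrich a shared state, followed by a refinement loop that keeps a revision only when it measures faster. The following is a minimal sketch under stated assumptions, not GenDB's implementation: every name here (`PipelineState`, `run_pipeline`, `fake_measure`, the stage functions) is hypothetical, and plain Python stubs stand in for the LLM calls and the real profiler.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineState:
    """Artifacts passed from stage to stage (all fields are hypothetical)."""
    workload: Dict[str, object] = field(default_factory=dict)
    storage: Dict[str, object] = field(default_factory=dict)
    plan: List[str] = field(default_factory=list)
    code: str = ""
    history: List[float] = field(default_factory=list)  # measured runtimes, ms


def analyze_workload(state: PipelineState) -> PipelineState:
    # Stand-in for Agent 1: record facts about hardware, data, and queries.
    state.workload = {"cores": 8, "rows": 1_000_000}
    return state


def design_storage(state: PipelineState) -> PipelineState:
    # Stand-in for Agent 2: pick a layout based on the workload facts.
    state.storage = {"layout": "columnar", "zone_map_block_rows": 65_536}
    return state


def plan_query(state: PipelineState) -> PipelineState:
    # Stand-in for Agent 3: emit an ordered list of plan steps.
    state.plan = ["scan", "filter", "sum"]
    return state


def generate_code(state: PipelineState) -> PipelineState:
    # Stand-in for Agent 4: turn the plan into source text.
    state.code = "// plan: " + " -> ".join(state.plan)
    return state


def optimize(state: PipelineState,
             measure: Callable[[str], float],
             rounds: int) -> PipelineState:
    # Stand-in for Agent 5: propose a revision, keep it only if it is faster.
    best = measure(state.code)
    state.history.append(best)
    for i in range(1, rounds + 1):
        candidate = state.code + f" /* revision {i} */"
        elapsed = measure(candidate)
        state.history.append(elapsed)
        if elapsed < best:
            best, state.code = elapsed, candidate
    return state


def run_pipeline(measure: Callable[[str], float], rounds: int = 3) -> PipelineState:
    state = PipelineState()
    for stage in (analyze_workload, design_storage, plan_query, generate_code):
        state = stage(state)
    return optimize(state, measure, rounds)


def fake_measure(code: str) -> float:
    # Toy profiler: pretends each accepted revision makes the code faster.
    return 100.0 / (1 + code.count("revision"))


state = run_pipeline(fake_measure)
print(state.code)
print(state.history)
```

The point of the sketch is the control flow: the first four stages run once, while the last stage loops on measured feedback rather than on the model's own judgment of its output.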

Why GenDB?

A Third Option

Today, every new use case demands either a painful extension or an entirely new system:

Option 2 — Build a new system

DuckDB, Umbra, ClickHouse, Milvus, Pinecone, InfluxDB, Neo4j …
Each takes years of engineering effort and substantial funding to build.

Option 3 — Generate

Use LLMs to generate per-query execution code. No extension wrestling, no multi-year engineering. New techniques become reachable through prompt updates.

Performance

Instance-optimized code exploits exact data distributions, join selectivities, group cardinalities, and hardware characteristics. No general-purpose engine can match this.
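
As a small illustration of what exploiting group cardinality can mean, compare a generic aggregation that must handle arbitrary keys with a variant specialized for a known, small key range. This is a hedged sketch, not code produced by GenDB: the function names are hypothetical, and it assumes the group keys are integers in the range 0 to `num_groups - 1`, a fact a generator could only rely on after inspecting the data.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def group_sum_generic(keys: Sequence[int],
                      values: Sequence[float]) -> Dict[int, float]:
    """General-purpose path: a hash map, because any key may appear."""
    sums: Dict[int, float] = defaultdict(float)
    for k, v in zip(keys, values):
        sums[k] += v
    return dict(sums)


def make_group_sum_dense(num_groups: int):
    """Specialized path: assumes every key is in range(num_groups)."""
    def group_sum_dense(keys: Sequence[int],
                        values: Sequence[float]) -> List[float]:
        sums = [0.0] * num_groups  # direct indexing, no hashing or resizing
        for k, v in zip(keys, values):
            sums[k] += v
        return sums
    return group_sum_dense


keys = [0, 2, 1, 2, 0, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

generic = group_sum_generic(keys, values)
dense = make_group_sum_dense(4)(keys, values)
print(generic)
print(dense)
```

Both paths compute the same sums; the specialized one simply removes work that is only needed when the key domain is unknown. In native code the same idea replaces a hash table with a fixed-size array.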

Extensibility

Integrating new techniques requires prompting, not re-engineering. Semantic queries, GPU-native code — all reachable through prompt updates.

Leaderboard

Performance Rankings

Total query execution time across all queries. GenDB variants use different LLM backbone models. All systems run on identical hardware with full parallelism enabled.

TPC-H (SF10, ~10GB)

SEC-EDGAR (3yr, ~5GB)

# System Total Time vs. Best GenDB Relative

Model Comparison

Generation Cost & Speed

Different LLM backbone models offer different trade-offs between generated code quality, generation time, and cost. Ranked by average query execution time.

Language Comparison

C++ vs Optimized C++ vs Rust

We select the best-performing C++ binary for each TPC-H query from a GenDB run, then give Claude Code (Opus 4.6) 5 iterations to analyze, profile, and improve — first for optimized C++, then for a full Rust rewrite.

Original C++

GenDB-generated code with standard compilation.

241 ms

total (5 queries)

Optimized C++

Aggressive flags, madvise tuning, parallelized joins, thread optimization.

185 ms

total — 1.30x faster

Rust

Full rewrite with rayon, memmap2, unsafe bounds-check elimination.

283 ms

total; main_scan compute time competitive with C++

Query   Original C++   Optimized C++   Rust        Best
Q1      49.8 ms        39.2 ms         71.7 ms     Opt. C++
Q3      25.0 ms        26.0 ms         52.5 ms     Orig. C++
Q6      31.8 ms        35.5 ms         23.7 ms     Rust
Q9      85.4 ms        64.4 ms         101.9 ms    Opt. C++
Q18     49.2 ms        20.1 ms         32.8 ms     Opt. C++
Total   241.2 ms       185.2 ms        282.6 ms    Opt. C++ (1.30x)
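
The totals and the overall speedup can be re-derived from the per-query timings. The short script below only restates the numbers from the table; the variable names are illustrative.

```python
# Per-query timings in ms: (Original C++, Optimized C++, Rust), from the table.
timings_ms = {
    "Q1": (49.8, 39.2, 71.7),
    "Q3": (25.0, 26.0, 52.5),
    "Q6": (31.8, 35.5, 23.7),
    "Q9": (85.4, 64.4, 101.9),
    "Q18": (49.2, 20.1, 32.8),
}
variants = ("Original C++", "Optimized C++", "Rust")

# Column totals, rounded to the table's 0.1 ms precision.
totals = [round(sum(row[i] for row in timings_ms.values()), 1) for i in range(3)]

# Overall speedup of Optimized C++ over Original C++.
overall_speedup = totals[0] / totals[1]

# Fastest variant per query.
best = {q: variants[row.index(min(row))] for q, row in timings_ms.items()}

print(dict(zip(variants, totals)))
print(f"Optimized C++ vs Original C++: {overall_speedup:.2f}x")
print(best)
```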

Key findings: Optimized C++ achieves a 1.30x overall speedup, with Q18 showing the largest gain (2.44x) from parallelized join building. Rust wins on Q6 (a zone-map scan using get_unchecked) but carries ~30ms of per-query overhead from mmap page table setup, which penalizes short queries. The Rust main_scan compute times are competitive with C++, suggesting the overhead is structural rather than algorithmic.

We plan to add a dedicated Code Refiner agent to the pipeline, responsible for low-level, implementation-level optimizations, so that these gains are achieved automatically as part of the standard GenDB workflow.
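
For readers unfamiliar with the zone-map scan mentioned for Q6, the idea is to store a minimum and maximum per block of a column so that blocks which cannot satisfy a range predicate are skipped without being read. Below is a minimal sketch of that technique in Python, assuming a single integer filter column and a fixed block size; the function names are hypothetical and this is not the generated C++ or Rust code.

```python
from typing import List, Sequence, Tuple


def build_zone_map(keys: Sequence[int], block_size: int) -> List[Tuple[int, int]]:
    """Return (min, max) of the filter column for each consecutive block."""
    return [
        (min(keys[i:i + block_size]), max(keys[i:i + block_size]))
        for i in range(0, len(keys), block_size)
    ]


def zone_map_sum(keys: Sequence[int],
                 values: Sequence[int],
                 zone_map: Sequence[Tuple[int, int]],
                 block_size: int,
                 lo: int,
                 hi: int) -> Tuple[int, int]:
    """Sum values[i] where lo <= keys[i] <= hi.

    Returns (total, blocks_read) so the amount of skipped work is visible.
    """
    total = 0
    blocks_read = 0
    for b, (block_min, block_max) in enumerate(zone_map):
        if block_max < lo or block_min > hi:
            continue  # no row in this block can match, skip it entirely
        blocks_read += 1
        start = b * block_size
        for i in range(start, min(start + block_size, len(keys))):
            if lo <= keys[i] <= hi:
                total += values[i]
    return total, blocks_read


keys = list(range(100))            # sorted filter column, e.g. a date key
values = [2 * k for k in keys]     # column being aggregated
zone_map = build_zone_map(keys, block_size=10)

result = zone_map_sum(keys, values, zone_map, 10, lo=25, hi=34)
print(result)
```

On this toy input only two of the ten blocks overlap the predicate range, so eight are skipped. The benefit depends on how well the filter column is clustered; on randomly ordered data most blocks would overlap the range and little would be skipped.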

Roadmap

What’s Next

GenDB is under active development. Every step follows three principles:

Completed

OLAP Workloads

Multi-agent pipeline for analytical queries. Evaluated on TPC-H and SEC-EDGAR, outperforming DuckDB, Umbra, ClickHouse, MonetDB, and PostgreSQL.

In Progress

Self-Evolving Agent Memory

Agents learn from past runs, accumulate optimization experience, and improve generation quality over time — without retraining the underlying LLMs.

Planned

GPU-Native Code Generation

Generate CUDA and other GPU-accelerated code targeting libcudf, extending cost-efficient analytics from CPU to GPU.

Planned

Semantic Query Processing

Generate code for multimodal data — images, audio, text — with AI-powered operators, moving beyond SQL’s relational model.

Planned

… and more

Reusable operators across queries, query template generation, hybrid execution with traditional DBMS, and further cost reduction as LLMs become faster and cheaper.