GitHub - aymanhs/nanotdb: A tiny, append‑only time‑series database designed for long‑running sensor data on modest hardware.
aymanhs72 · 2026-05-15 · via Hacker News: Front Page

A small, embedded time-series database designed for resource-constrained hosts (Raspberry Pi, edge nodes, IoT gateways). No external dependencies at runtime. All data lives in plain files under a single root directory.


Architecture overview

Engine
 ├── "prod"    Database  → WAL (prod.wal) + Catalog (catalog.json) + partitioned .dat files
 ├── "sensors" Database  → WAL + Catalog + partitioned .dat files
 └── "internal"          → engine self-metrics (same layout, never exposed to users)

The Engine is the single entry point. It owns a collection of named databases and routes ingested samples to the right one based on the line-protocol prefix.

Each Database has three storage layers:

Layer       File                  Purpose
WAL         <db>.wal              Crash safety: records every sample before it enters the page
Catalog     catalog.json          Maps metric names ↔ compact MetricIDs + value types
Data files  data-<partition>.dat  Immutable compressed pages flushed from memory

Data flow

Ingest (AddLine)

AddLine("prod/room.temp 21.5 1715000000000000000")
  │
  ├─ parse line protocol  →  dbName="prod"  metric="room.temp"  ts=…  value=21.5
  ├─ getOrCreateDB        →  open or reuse prod Database
  ├─ WAL append           →  write compact record to prod.wal  (crash-safe)
  ├─ addToOpenDay         →  append to in-memory Page for today's bucket
  └─ if page full         →  compress + write page frame to data-<partition>.dat
                              reset WAL (replay no longer needed)

Timestamps must be monotonically non-decreasing per metric across the entire write stream. Out-of-order or stale samples are rejected.
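
The ordering rule can be pictured as a small per-metric guard. The following is only a sketch: the type orderGuard and its Accept method are hypothetical names chosen for illustration, not nanotdb's actual implementation.

```go
package main

import "fmt"

// orderGuard remembers the last accepted timestamp per metric.
// Hypothetical helper for illustration only.
type orderGuard struct {
	last map[string]int64 // metric name -> last accepted Unix-nanosecond timestamp
}

func newOrderGuard() *orderGuard {
	return &orderGuard{last: make(map[string]int64)}
}

// Accept reports whether ts may be appended for metric. Equal timestamps
// pass (non-decreasing); anything older than the last accepted one is rejected.
func (g *orderGuard) Accept(metric string, ts int64) bool {
	if prev, seen := g.last[metric]; seen && ts < prev {
		return false
	}
	g.last[metric] = ts
	return true
}

func main() {
	g := newOrderGuard()
	fmt.Println(g.Accept("room.temp", 100)) // true
	fmt.Println(g.Accept("room.temp", 100)) // true: equal timestamps are allowed
	fmt.Println(g.Accept("room.temp", 99))  // false: out of order
	fmt.Println(g.Accept("room.hum", 50))   // true: ordering is tracked per metric
}
```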

Replay (on engine open)

When a database is opened, the WAL is replayed into the in-memory page if the data file is behind. The catalog is used to resolve ValueTypes for metrics that omit them (compact format optimization). After a full replay the engine is ready to accept new writes.

Query (QueryRange)

QueryRange("prod", "room.temp", fromTS, toTS, stride, callback)
  │
  ├─ iterate UTC days in [fromTS, toTS]
  │    ├─ open data-<partition>.dat  →  scan page frame headers
  │    │    skip frames outside time window (no decompression)
  │    │    decompress + scan matching frames
  │    └─ check in-memory page for today's data
  └─ call callback for each sample (every Nth if stride > 1)

Line protocol

DB/metric.name value [ts]
  • DB — database name (created automatically on first write)
  • metric.name — arbitrary metric identifier (dot-separated namespaces such as room.temp work well)
  • value — integer (42, -7) or float (3.14, 1e-3). An integer literal always creates an int32 metric; a float literal creates a float32 metric. Type is fixed on first write; mixing types for the same metric is an error.
  • ts — Unix nanosecond timestamp (optional; defaults to time.Now())

Examples:

prod/room.temp 21.5 1715000000000000000
sensors/pressure.hpa 1013
internal/batch.size 256i

The i suffix forces integer interpretation for values that look like floats.
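
A minimal parser for this line format might look like the sketch below. The point struct and parseLine function are hypothetical names, not nanotdb's own. The sketch also simplifies two things: an i suffix is only accepted on values that already parse as integers, and an integer that overflows int32 is reported as an error.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// point is a hypothetical parsed sample, not a nanotdb type.
type point struct {
	DB, Metric string
	IsInt      bool
	Int        int32
	Float      float32
	TS         int64 // Unix nanoseconds
}

// parseLine parses "DB/metric.name value [ts]".
func parseLine(line string) (point, error) {
	var p point
	fields := strings.Fields(line)
	if len(fields) != 2 && len(fields) != 3 {
		return p, errors.New("want: DB/metric.name value [ts]")
	}
	db, metric, ok := strings.Cut(fields[0], "/")
	if !ok || db == "" || metric == "" {
		return p, errors.New("missing DB/ prefix or metric name")
	}
	p.DB, p.Metric = db, metric

	raw := fields[1]
	digits := strings.TrimSuffix(raw, "i") // "256i" -> "256"
	if n, err := strconv.ParseInt(digits, 10, 32); err == nil {
		// Integer literal (with or without the i suffix) -> int32.
		p.IsInt, p.Int = true, int32(n)
	} else if errors.Is(err, strconv.ErrRange) {
		return p, fmt.Errorf("integer %q overflows int32", raw)
	} else if f, err := strconv.ParseFloat(raw, 32); err == nil {
		// Anything else that parses as a float -> float32.
		p.Float = float32(f)
	} else {
		return p, fmt.Errorf("bad value %q", raw)
	}

	p.TS = time.Now().UnixNano() // default when ts is omitted
	if len(fields) == 3 {
		ts, err := strconv.ParseInt(fields[2], 10, 64)
		if err != nil {
			return p, fmt.Errorf("bad timestamp %q", fields[2])
		}
		p.TS = ts
	}
	return p, nil
}

func main() {
	for _, l := range []string{
		"prod/room.temp 21.5 1715000000000000000",
		"sensors/pressure.hpa 1013",
		"internal/batch.size 256i",
	} {
		p, err := parseLine(l)
		fmt.Println(p.DB, p.Metric, p.IsInt, err)
	}
}
```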


WAL format (compact v2)

Each record is a uvarint length prefix followed by a fixed-layout payload:

[uvarint: payload_len] [payload]

Payload layout:

Offset  Size  Field
  0      2    MetricID          uint16 LE
  2      3    TS delta          uint24 LE nanoseconds from baseline
  5      1    CompactTL flags   bit 7 = new baseline, bit 6 = new metric
  6      8    Baseline TS       int64 LE  (only when bit 7 set)
  —      var  name_len+name+vtype         (only when bit 6 set)
  —      4    Value             int32 or float32 LE, always present
  • Hot path (known metric, same baseline): 2 + 3 + 1 + 4 = 10 payload bytes plus a 1-byte uvarint length prefix = 11 bytes per sample.
  • A new baseline is emitted on the first record of each WAL and whenever the timestamp gap exceeds what the 24-bit delta can hold, 2²⁴ ns (about 16.8 ms). Typical sensor streams fit hundreds of seconds between baseline resets.
  • Known metrics (previously seen in the session) omit the name and value type; those fields are recovered from the catalog during replay.
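
To make the byte counts concrete, here is a sketch that encodes records following the payload table above for a known float32 metric. It is an illustration under assumptions, not the project's encoder: encodeFloat is a hypothetical name, the delta is assumed to be 0 on a record that carries a new baseline, and the new-metric fields (bit 6) are left out because their exact encoding is not specified here.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
	"time"
)

const (
	flagNewBaseline = 1 << 7    // bit 7 of the flags byte
	maxDelta        = 1<<24 - 1 // largest value a uint24 delta can hold
)

// encodeFloat appends one length-prefixed record to dst and returns the
// (possibly updated) baseline. first marks the first record of a WAL.
func encodeFloat(dst []byte, id uint16, ts, baseline int64, first bool, v float32) ([]byte, int64) {
	delta := ts - baseline
	var flags byte
	if first || delta < 0 || delta > maxDelta {
		// Assumption: a baseline record stores the full timestamp and a zero delta.
		baseline, delta, flags = ts, 0, flagNewBaseline
	}
	payload := make([]byte, 0, 18)
	payload = binary.LittleEndian.AppendUint16(payload, id)                        // MetricID
	payload = append(payload, byte(delta), byte(delta>>8), byte(delta>>16), flags) // uint24 LE + flags
	if flags&flagNewBaseline != 0 {
		payload = binary.LittleEndian.AppendUint64(payload, uint64(baseline))
	}
	payload = binary.LittleEndian.AppendUint32(payload, math.Float32bits(v)) // value
	dst = binary.AppendUvarint(dst, uint64(len(payload)))
	return append(dst, payload...), baseline
}

func main() {
	rec, base := encodeFloat(nil, 7, 1715000000000000000, 0, true, 21.5)
	fmt.Println(len(rec)) // 19: 1-byte prefix + 18-byte payload carrying the baseline

	rec, base = encodeFloat(nil, 7, base+5_000_000, base, false, 21.6)
	fmt.Println(len(rec)) // 11: hot path, 5 ms after the baseline

	fmt.Println(time.Duration(maxDelta+1), base) // 2^24 ns printed as a duration, then the baseline
}
```

Running it shows the 11-byte hot path only applies while samples stay within the 24-bit delta range of the current baseline.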

On-disk layout

<root>/
  engine.toml          — engine configuration (auto-created on first start)
  <db>/
    catalog.json       — metric registry: name → id + type
    manifest.toml      — per-database settings (retention, WAL, page limits)
    <db>.wal           — write-ahead log (single reusable file)
    data-<partition>.dat — compressed page frames for completed partitions

Data files are append-only sequences of page frames:

Frame = PageHeader(18 bytes) + compressed_len(uvarint) + S2-compressed payload + CRC32(4 bytes)

The payload is a flat array of interleaved (MetricID, Timestamp, Value) triples, sorted by timestamp. S2 compression typically achieves 3–4× on realistic sensor data.
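
Because each frame carries its own length, a reader can step from header to header without decompressing anything. The sketch below shows one way to write and walk such frames. Several details are assumptions made for illustration: the header is treated as 18 opaque bytes, the CRC32 is taken to be IEEE over the compressed payload only and stored little-endian, and the payload is passed in already compressed (S2 itself lives in a third-party package and is omitted). The function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

const headerSize = 18

// appendFrame appends header + uvarint(len) + payload + CRC32 to dst.
func appendFrame(dst []byte, header [headerSize]byte, payload []byte) []byte {
	dst = append(dst, header[:]...)
	dst = binary.AppendUvarint(dst, uint64(len(payload)))
	dst = append(dst, payload...)
	return binary.LittleEndian.AppendUint32(dst, crc32.ChecksumIEEE(payload))
}

// nextFrame splits the first frame off buf and verifies its checksum.
func nextFrame(buf []byte) (header, payload, rest []byte, err error) {
	if len(buf) < headerSize {
		return nil, nil, nil, errors.New("short header")
	}
	header, buf = buf[:headerSize], buf[headerSize:]
	n, w := binary.Uvarint(buf)
	if w <= 0 {
		return nil, nil, nil, errors.New("bad length prefix")
	}
	buf = buf[w:]
	if n > uint64(len(buf)) || uint64(len(buf))-n < 4 {
		return nil, nil, nil, errors.New("truncated frame")
	}
	payload, buf = buf[:n], buf[n:]
	if binary.LittleEndian.Uint32(buf) != crc32.ChecksumIEEE(payload) {
		return nil, nil, nil, errors.New("checksum mismatch")
	}
	return header, payload, buf[4:], nil
}

func main() {
	var h [headerSize]byte
	data := appendFrame(nil, h, []byte("first page"))
	data = appendFrame(data, h, []byte("second page"))

	for len(data) > 0 {
		_, payload, rest, err := nextFrame(data)
		if err != nil {
			fmt.Println("stop:", err)
			return
		}
		fmt.Printf("%d payload bytes\n", len(payload))
		data = rest
	}
}
```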


Configuration (engine.toml)

Created automatically at <root>/engine.toml on first start. Key settings:

Key                   Default            Effect
engine.listen         :8428              HTTP server address
wal.max_segment_size  67108864 (64 MiB)  WAL size before reset after a page flush
wal.fsync_policy      segment            segment = fsync on WAL reset; always = fsync every append
durability.profile    strict             strict / balanced / throughput (see below)
stats.enabled         true               Emit engine self-metrics to the internal database
stats.interval        30s                How often stats are flushed

Durability profiles:

Profile     Page file fsync  Catalog fsync
strict      yes              yes
balanced    yes              no
throughput  no               no

Per-database settings (retention, partitioning, WAL skip window, page flush thresholds, rollups) live in <db>/manifest.toml. Defaults for them can be set in engine.toml under [manifest_defaults].

Partition options in [retention]:

  • partition = "day" (default): data-YYYY-MM-DD.dat
  • partition = "month": data-YYYY-MM.dat
  • partition = "year": data-YYYY.dat
  • partition = "forever": data-forever.dat
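
As an illustration of the naming scheme, the sketch below maps a sample timestamp to its data file name. partitionFile is a hypothetical helper, and it assumes partitions are cut on UTC boundaries, in line with the UTC-day iteration described for queries.

```go
package main

import (
	"fmt"
	"time"
)

// partitionFile returns the data file name for a Unix-nanosecond timestamp
// under the given partition setting ("" is treated as the default, "day").
func partitionFile(partition string, tsNanos int64) (string, error) {
	t := time.Unix(0, tsNanos).UTC()
	switch partition {
	case "", "day":
		return "data-" + t.Format("2006-01-02") + ".dat", nil
	case "month":
		return "data-" + t.Format("2006-01") + ".dat", nil
	case "year":
		return "data-" + t.Format("2006") + ".dat", nil
	case "forever":
		return "data-forever.dat", nil
	}
	return "", fmt.Errorf("unknown partition %q", partition)
}

func main() {
	const ts = 1715000000000000000 // 2024-05-06 12:53:20 UTC
	for _, p := range []string{"day", "month", "year", "forever"} {
		name, _ := partitionFile(p, ts)
		fmt.Println(p, "->", name)
	}
}
```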

Rollups (manifest.toml)

Rollup jobs are defined in the source database manifest under [rollups].

Example:

[rollups]
enabled = true
checkpoint_file = "rollup.checkpoints.log"
default_grace = "5m"

[[rollups.jobs]]
id = "outside_temp_1h"
source_metric = "temp.out_dry"
interval = "1h"
aggregates = ["min", "max", "sum", "avg", "count"]
destination_db = "sensors_rollup_1h"
destination_metric_prefix = "temp.out_dry"

Rollup config reference:

Each entry lists the field, its scope and whether it is required, the valid values or default, and notes:

  • rollups.enabled (DB, optional): true | false (default false).
  • rollups.checkpoint_file (DB, optional): string (default rollup.checkpoints.log). Checkpoint log path, relative to the source DB directory.
  • rollups.default_grace (DB, optional): Go duration or empty. Used when a job's grace is omitted.
  • rollups.jobs[].id (Job, required): non-empty string. Unique per source DB for checkpoint tracking.
  • rollups.jobs[].source_metric (Job, required): non-empty string. Metric to read from the source DB.
  • rollups.jobs[].interval (Job, required): valid Go duration (>0). Rollup bucket size (for example 1h, 24h).
  • rollups.jobs[].aggregates (Job, optional): min | max | … (the example above uses min, max, sum, avg, count).
  • rollups.jobs[].destination_db (Job, required): non-empty string. Target DB receiving rollup samples.
  • rollups.jobs[].destination_metric_prefix (Job, optional): string (default source_metric). Output names are <prefix>.<agg>.
  • rollups.jobs[].grace (Job, optional): Go duration or empty. Overrides default_grace for this job.

Notes:

  • Checkpoints are stored in the source DB (default rollup.checkpoints.log).
  • Destination DBs can also define their own rollup jobs to create cascades (for example 1h -> 1d).
  • For low-frequency rollup outputs, use coarser partitions on destination DBs (for example month or year) to avoid many tiny day files.
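
What a single rollup job computes can be sketched as follows. This is not the project's rollup engine: the sample and bucket types and the rollup and outputName functions are hypothetical, buckets are assumed to be aligned to multiples of the interval since the Unix epoch, timestamps are assumed non-negative, and grace handling and checkpoints are not modeled.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

type sample struct {
	TS    int64 // Unix nanoseconds
	Value float64
}

// bucket holds the aggregates for one interval.
type bucket struct {
	Start         int64 // bucket start, Unix nanoseconds
	Min, Max, Sum float64
	Count         int
}

func (b bucket) Avg() float64 { return b.Sum / float64(b.Count) }

// rollup groups samples into interval-sized buckets, sorted by start time.
func rollup(samples []sample, interval time.Duration) []bucket {
	step := int64(interval)
	byStart := make(map[int64]*bucket)
	for _, s := range samples {
		start := s.TS - s.TS%step
		b, ok := byStart[start]
		if !ok {
			b = &bucket{Start: start, Min: s.Value, Max: s.Value}
			byStart[start] = b
		}
		if s.Value < b.Min {
			b.Min = s.Value
		}
		if s.Value > b.Max {
			b.Max = s.Value
		}
		b.Sum += s.Value
		b.Count++
	}
	out := make([]bucket, 0, len(byStart))
	for _, b := range byStart {
		out = append(out, *b)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Start < out[j].Start })
	return out
}

// outputName builds the destination metric name, <prefix>.<agg>.
func outputName(prefix, agg string) string { return prefix + "." + agg }

func main() {
	hour := int64(time.Hour)
	in := []sample{{0, 1}, {hour / 2, 3}, {hour + 1, 10}}
	for _, b := range rollup(in, time.Hour) {
		fmt.Println(outputName("temp.out_dry", "avg"), b.Start, b.Avg())
	}
}
```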

Binaries

nanotdb — server

nanotdb --config <path>      start server using given engine.toml
nanotdb --init --config <path>   write default engine.toml and exit

Exposes a small HTTP API compatible with the VictoriaMetrics instant/range query wire format (/api/v1/query, /api/v1/query_range, /api/v1/import/prometheus).

nanocli — offline CLI tool

Operates directly on the data directory without a running server.

nanocli inspect db  --root <dir> [--db <name>] [--json]  — overview of all/one database
nanocli inspect dat --root <dir>  --db <name>  [--json]  — page frame headers in .dat files
nanocli inspect wal --root <dir>  --db <name>  [--json]  — WAL record dump

nanocli import --root <dir> --in <file.lp>  [--json]     — bulk import line-protocol file
nanocli export --root <dir> --db <name> [--out <file.lp>] — export database to line protocol (stdout when --out is omitted)

nanocli query  --root <dir> --db <name> --metric <regex>
               [--start <time>] [--end <time>] [--format table|json]

Line-protocol (LP) files use timestamps of the form YYYY-MM-DD HH:MM:SS.nnnnnnnnn (UTC). Exports write this form; imports accept it as well as raw Unix nanoseconds.

--start / --end accept RFC3339 strings, YYYY-MM-DD [HH[:MM[:SS[.nnnnnnnnn]]]], or Unix timestamps (seconds or nanoseconds).

Rollup full-cycle check script

For deterministic end-to-end verification (generate LP -> import -> rollups -> export -> compare expected), run:

./scripts/rollup_full_cycle_check.sh

Optional arguments:

  • ./scripts/rollup_full_cycle_check.sh <root-dir> <duration-hours> <metrics> <cadence-seconds> <gap-metrics>
  • Defaults: root-dir=test-data/full-cycle-check, duration-hours=30, metrics=10, cadence-seconds=10, gap-metrics=2

Generated artifacts are placed in <root-dir>/work for easy discovery:

  • scenario_summary.json (duration, rates, counts, per-metric stats)
  • known_gaps.csv (deterministic missing windows for temp.gap_probeXX metrics)
  • SCENARIO.md (quick human-readable summary)

Engine API (embedding)

e, err := engine.OpenEngine("/data", 0) // 0 = default WAL segment size
if err != nil {
    log.Fatal(err)
}
defer e.Close() // deferred only after a successful open

// Ingest one line-protocol sample with an explicit nanosecond timestamp
err = e.AddLine("sensors/temp 22.1 " + strconv.FormatInt(time.Now().UnixNano(), 10))

// Range query over [fromTS, toTS]; a stride of 1 delivers every sample
err = e.QueryRange("sensors", "temp", fromTS, toTS, 1, func(s engine.Sample) error {
    fmt.Println(s.TS, s.Float32)
    return nil
})

// Last value (from in-memory catalog cache)
sample, ok, err := e.QueryLast("sensors", "temp")

// Bulk import / export
err = e.ImportFile("backup.lp")
err = e.ExportFile("sensors", "backup.lp")

Key types:

Type       Description
Engine     Top-level coordinator; safe for concurrent use
Database   One named DB with WAL + catalog + data files
Catalog    Metric name ↔ ID registry; persisted as JSON
Page       In-memory buffer of interleaved samples; flushed when full
WAL        Single-file write-ahead log with compact v2 encoding
Sample     Decoded data point from a query
Timestamp  int64 Unix nanoseconds
MetricID   uint16 per-database metric address