Building a tiny FUSE filesystem
2026-06-13 · via Hacker News

Lately I have been working on sandboxing, storage, and networking, and a lot of that work keeps coming back to files. That makes sense, since Unix has organized itself around "everything is a file" for over fifty years. Your terminal and random number generator are device files you can open and read (/dev/tty, /dev/urandom). Even network sockets, which are created with their own system call rather than opened by path, are read and written through the same interface afterwards.

For this post, I built a small filesystem with a real backing store, enough metadata to behave like a filesystem, and a few deliberate omissions so the code is still readable.

magicfs mounts at /magic, but it keeps its own local backing store next to it: names and inode numbers live in metadata.json, while file contents live as plain local files under blobs/. Calling that directory a blob store is a little grandiose, because the blobs are just files with allocated names like blob-000000000001. Still, keeping metadata separate from file contents lets the example cover name lookup, inode stability, write ordering, kernel caching, and what fsync() is asking the filesystem to do.

The full sample code is at github.com/shayonj/magicfs, and if you have Docker, you can run the filesystem with FUSE enabled.

Try it first

docker run -it --rm --device /dev/fuse --cap-add SYS_ADMIN shayonj/magicfs
$ ls /magic
hello.txt  notes.txt

$ cat /magic/hello.txt
Hello from a tiny FUSE filesystem.

$ echo "remember the milk" > /magic/notes.txt
$ cat /magic/notes.txt
remember the milk

Inside that shell, the mount point is the interface applications use, while the store directory is private state owned by the filesystem process, so the shell sees an ordinary directory even though the data behind it is a metadata file plus a couple of local blobs.

$ find /tmp/magicfs-store -type f
/tmp/magicfs-store/metadata.json
/tmp/magicfs-store/blobs/blob-000000000001
/tmp/magicfs-store/blobs/blob-000000000002

In the store directory, the metadata file stands in for a tiny inode table and a tiny directory tree, recording the name, inode number, size, mode bits, and blob IDs for each file. The mode is stored as a decimal integer, so 420 is octal 0644 (rw-r--r--).

{
  "next_inode": 4,
  "entries": {
    "hello.txt": {
      "ino": 2,
      "mode": 420,
      "size": 36,
      "blobs": [
        {
          "blob": "blob-000000000001",
          "offset": 0,
          "len": 36
        }
      ]
    },
    "notes.txt": {
      "ino": 3,
      "mode": 420,
      "size": 18,
      "blobs": [
        {
          "blob": "blob-000000000002",
          "offset": 0,
          "len": 18
        }
      ]
    }
  }
}

The path notes.txt is not where the bytes live. It is the name that gets you to inode 3, and the metadata for inode 3 points at a blob file under blobs/. Renaming notes.txt changes only the directory metadata, while rewriting it creates a new blob and updates the metadata pointer.
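
To make the name, inode, and blob relationship concrete, here is a minimal in-memory sketch of that metadata. The type and method names (Metadata, Entry, BlobRef, rename, repoint, sample) are hypothetical and are not taken from the magicfs source; only the field names mirror metadata.json.

```rust
use std::collections::BTreeMap;

/// One extent of file data: a byte range inside a blob file.
#[derive(Debug, Clone, PartialEq)]
struct BlobRef {
    blob: String,
    offset: u64,
    len: u64,
}

/// Per-file metadata, mirroring one value under "entries" in metadata.json.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    ino: u64,
    mode: u32, // 0o644 is stored as decimal 420 in the JSON
    size: u64,
    blobs: Vec<BlobRef>,
}

/// The whole metadata file: next free inode plus a name -> entry map.
#[derive(Debug)]
struct Metadata {
    next_inode: u64,
    entries: BTreeMap<String, Entry>,
}

impl Metadata {
    /// Rename moves the entry to a new key; inode and blobs are untouched.
    fn rename(&mut self, from: &str, to: &str) -> bool {
        match self.entries.remove(from) {
            Some(entry) => {
                self.entries.insert(to.to_string(), entry);
                true
            }
            None => false,
        }
    }

    /// A rewrite keeps the name and inode but points the entry at a new blob.
    fn repoint(&mut self, name: &str, blob: &str, len: u64) -> bool {
        match self.entries.get_mut(name) {
            Some(entry) => {
                entry.size = len;
                entry.blobs = vec![BlobRef { blob: blob.to_string(), offset: 0, len }];
                true
            }
            None => false,
        }
    }
}

/// Hypothetical helper: the notes.txt entry from the JSON above.
fn sample() -> Metadata {
    let mut md = Metadata { next_inode: 4, entries: BTreeMap::new() };
    md.entries.insert(
        "notes.txt".to_string(),
        Entry {
            ino: 3,
            mode: 0o644,
            size: 18,
            blobs: vec![BlobRef { blob: "blob-000000000002".to_string(), offset: 0, len: 18 }],
        },
    );
    md
}
```

In this sketch, rename never looks at blobs and repoint never changes ino, which is the separation the paragraph above describes.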

Filesystems as a request loop

When you run cat /magic/hello.txt, cat does not know that JSON metadata and blob files are involved. All it does is call open() and read(). The kernel resolves the path through the VFS, and each operation eventually lands on the filesystem mounted at /magic.

With FUSE, the code that answers those filesystem requests runs in userspace. The kernel driver sends request messages over /dev/fuse, the userspace process replies, and the application that made the system call keeps waiting until the kernel has an answer. The kernel FUSE documentation covers the protocol, and the fuser crate exposes the same operations as Rust trait methods.

The path for a read looks roughly like this:

flowchart LR
    A["cat /magic/hello.txt"] --> B["Linux VFS"]
    B --> C["FUSE kernel driver"]
    C --> D["magicfs userspace process"]
    D --> E["metadata.json + local blobs"]
    E --> D
    D --> C
    C --> B
    B --> A

In the request log, LOOKUP asks whether a name exists in a directory and which inode it maps to, GETATTR asks for the metadata associated with an inode, READ asks for bytes at an offset, and WRITE sends bytes at an offset. Later in the lifetime of an open file, FLUSH, FSYNC, and RELEASE show up, and they make the write path less like a simple callback that copies bytes.
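
As a rough illustration of the request loop, the sketch below models four of those requests as an enum and answers them from in-memory maps. This is not the fuser API: fuser delivers requests as trait method calls, and the Request, Reply, and ToyFs names here are hypothetical stand-ins.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for four FUSE request types.
#[derive(Debug)]
enum Request {
    Lookup { name: String },
    Getattr { ino: u64 },
    Read { ino: u64, offset: usize, size: usize },
    Write { ino: u64, offset: usize, data: Vec<u8> },
}

#[derive(Debug, PartialEq)]
enum Reply {
    Ino(u64),
    Size(u64),
    Data(Vec<u8>),
    Written(usize),
    NotFound, // a real filesystem would reply with ENOENT
}

#[derive(Default)]
struct ToyFs {
    names: HashMap<String, u64>,  // directory entries: name -> inode
    files: HashMap<u64, Vec<u8>>, // inode -> contents
}

impl ToyFs {
    /// Answer one request; the caller stays blocked until a reply exists.
    fn handle(&mut self, req: Request) -> Reply {
        match req {
            Request::Lookup { name } => match self.names.get(&name) {
                Some(ino) => Reply::Ino(*ino),
                None => Reply::NotFound,
            },
            Request::Getattr { ino } => match self.files.get(&ino) {
                Some(bytes) => Reply::Size(bytes.len() as u64),
                None => Reply::NotFound,
            },
            Request::Read { ino, offset, size } => match self.files.get(&ino) {
                Some(bytes) => {
                    // Clamp the requested range to the end of the file.
                    let start = offset.min(bytes.len());
                    let end = start.saturating_add(size).min(bytes.len());
                    Reply::Data(bytes[start..end].to_vec())
                }
                None => Reply::NotFound,
            },
            Request::Write { ino, offset, data } => match self.files.get_mut(&ino) {
                Some(bytes) => {
                    let end = offset + data.len();
                    if bytes.len() < end {
                        bytes.resize(end, 0);
                    }
                    bytes[offset..end].copy_from_slice(&data);
                    Reply::Written(data.len())
                }
                None => Reply::NotFound,
            },
        }
    }
}
```

Unlike magicfs, this toy applies WRITE directly to the stored bytes; it only shows the shape of the dispatch, not the staging behavior.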

Here is the log from writing notes.txt, trimmed to the requests involved in opening, truncating, writing, flushing, and releasing the file:

[magicfs] READDIR ino=1
[magicfs] LOOKUP notes.txt -> ino=3
[magicfs] OPEN notes.txt ino=3 flags=0x8001
[magicfs] SETATTR ino=3 size=0 staged=true
[magicfs] WRITE notes.txt ino=3 offset=0 len=18 staged=true
[magicfs] FLUSH notes.txt ino=3
[magicfs] COMMIT notes.txt ino=3 size=18 blobs=1
[magicfs] COMMIT metadata entries=2
[magicfs] RELEASE notes.txt ino=3 flags=0x8001 flush=true

In this log, ls triggers READDIR, while a direct cat /magic/hello.txt can walk the path without listing the directory first. Shell redirection with > opens the file for writing and truncation, so the kernel sends a size change (the SETATTR with size=0) before it sends the bytes. The WRITE handler only stages the new contents in memory, and the backing store does not change until the file is flushed or synced.

A filesystem usually has to answer a question about a name before it can answer anything about bytes, namely whether this name exists in this directory, and if it does, which file it refers to.

Linux mostly stops caring about filenames once path lookup is done, because internally it refers to files by inode number. On a disk filesystem, an inode is a record with metadata and pointers to data blocks, while a directory entry maps a name to an inode. That is why a rename can change a path without moving file data, and also why hard links can make the same inode appear under more than one name.

magicfs keeps the directory entry and inode metadata in metadata.json:

"notes.txt": {
  "ino": 3,
  "mode": 420,
  "size": 18,
  "blobs": [
    {
      "blob": "blob-000000000002",
      "offset": 0,
      "len": 18
    }
  ]
}

The LOOKUP notes.txt handler reads that map and returns inode 3, while the GETATTR handler turns the entry into a FileAttr, which is what makes stat and ls -l work. The root directory uses inode 1, which is the conventional root inode for FUSE filesystems.

The ordering problem shows up before the read and write handlers do anything with file contents. If a new blob reaches the backing store but metadata.json still points at the old blob, readers keep seeing the old file; if metadata.json points at a blob that never made it to disk, readers see a broken file. magicfs handles the simple case by writing the blob first and then replacing metadata. The metadata replacement follows the usual local-filesystem pattern: write a temporary file, sync it, rename it over metadata.json, and then sync the containing directory.

The temp-file-and-rename pattern avoids half-written JSON, but it is not a journal, and without a recovery pass or a transaction log, the filesystem cannot determine after a crash whether every in-flight metadata update had committed.
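
The temp-file-and-rename sequence can be sketched with the standard library alone. This is a minimal sketch assuming a Unix-like system, where a directory can be opened and fsynced like a file; the function name replace_atomically and the ".tmp" suffix are hypothetical, not what magicfs uses.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Replace `dir/name` with `contents` so readers never see a partial file.
fn replace_atomically(dir: &Path, name: &str, contents: &[u8]) -> io::Result<()> {
    let tmp_path = dir.join(format!("{}.tmp", name));
    let final_path = dir.join(name);

    // 1. Write the new contents to a temporary file in the same directory,
    //    so the later rename stays within one filesystem.
    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(contents)?;

    // 2. fsync the temporary file so its bytes are durable before the rename.
    tmp.sync_all()?;
    drop(tmp);

    // 3. rename(2) swaps the name in one step: readers see old or new.
    fs::rename(&tmp_path, &final_path)?;

    // 4. fsync the directory so the new directory entry is durable too.
    File::open(dir)?.sync_all()?;
    Ok(())
}
```

A call such as replace_atomically(store_dir, "metadata.json", json_bytes) would run after the blob write has been synced, which preserves the blob-first ordering described above.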

File contents as local blobs

For the data path, magicfs stores each committed file version as one immutable blob with an allocated ID. A more complete filesystem would split larger files into chunks and let metadata point at a list of chunks, but one blob per file keeps the code short.

For reads, metadata comes first, so given inode 3, the filesystem finds the entry for notes.txt, reads the blob ID from that entry, opens the corresponding file under blobs/, and returns the byte range the kernel requested.

inode 3
  -> metadata entry for notes.txt
  -> blob ID blob-000000000002
  -> blobs/blob-000000000002
  -> bytes returned to READ

For writes, the data moves in the other direction, but magicfs does not mutate the blob in place. When the kernel sends WRITE, the filesystem stages the new file contents in memory. Later, when FLUSH or FSYNC arrives, it writes a new blob and updates metadata to point at it.
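
The stage-then-commit flow can be sketched with an in-memory stand-in for the store. Everything here (Store, staged_mut, truncate, commit, read_committed) is hypothetical naming, and a HashMap replaces the blobs/ directory, so this illustrates the ordering rather than the magicfs implementation.

```rust
use std::collections::HashMap;

/// In-memory stand-in for blobs/ plus one metadata pointer per inode.
#[derive(Default)]
struct Store {
    blobs: HashMap<String, Vec<u8>>, // blob ID -> immutable contents
    current: HashMap<u64, String>,   // inode -> blob ID in metadata
    staged: HashMap<u64, Vec<u8>>,   // inode -> uncommitted contents
    next_blob: u64,
}

impl Store {
    /// Staged buffer for an inode, seeded from the committed blob on first use.
    fn staged_mut(&mut self, ino: u64) -> &mut Vec<u8> {
        if !self.staged.contains_key(&ino) {
            let base = self
                .current
                .get(&ino)
                .and_then(|id| self.blobs.get(id))
                .cloned()
                .unwrap_or_default();
            self.staged.insert(ino, base);
        }
        self.staged.get_mut(&ino).expect("just inserted")
    }

    /// SETATTR size=N: resize the staged copy only.
    fn truncate(&mut self, ino: u64, size: usize) {
        self.staged_mut(ino).resize(size, 0);
    }

    /// WRITE: copy bytes into the staged copy only.
    fn write(&mut self, ino: u64, offset: usize, data: &[u8]) -> usize {
        let buf = self.staged_mut(ino);
        let end = offset + data.len();
        if buf.len() < end {
            buf.resize(end, 0);
        }
        buf[offset..end].copy_from_slice(data);
        data.len()
    }

    /// FLUSH or FSYNC: store a new blob first, then repoint the metadata.
    fn commit(&mut self, ino: u64) -> Option<String> {
        let bytes = self.staged.remove(&ino)?;
        self.next_blob += 1;
        let id = format!("blob-{:012}", self.next_blob);
        self.blobs.insert(id.clone(), bytes);
        self.current.insert(ino, id.clone());
        Some(id)
    }

    /// What a reader sees: only committed contents.
    fn read_committed(&self, ino: u64) -> Option<Vec<u8>> {
        self.current.get(&ino).and_then(|id| self.blobs.get(id)).cloned()
    }
}
```

After a second commit for the same inode, the first blob is still present and unchanged in this sketch, which is also how orphaned blobs accumulate when nothing cleans them up.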

The example ends up with a small copy-on-write data path. Rewriting one byte of a large file should not require rewriting the whole file, so a more complete implementation would chunk the file, track dirty chunks, write only the changed chunks, and then commit a metadata update that points at the new chunk list. magicfs skips that complexity by assuming the files are small enough to rewrite as a unit.
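
magicfs does not do this, but the dirty-chunk bookkeeping a chunked design would need is small. The sketch below assumes fixed-size chunks; DirtyTracker and its methods are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Tracks which fixed-size chunks of a file have been written since the last commit.
struct DirtyTracker {
    chunk_size: u64,
    dirty: BTreeSet<u64>,
}

impl DirtyTracker {
    fn new(chunk_size: u64) -> Self {
        assert!(chunk_size > 0, "chunk size must be non-zero");
        DirtyTracker { chunk_size, dirty: BTreeSet::new() }
    }

    /// Mark every chunk overlapped by a write of `len` bytes at `offset`.
    fn record_write(&mut self, offset: u64, len: u64) {
        if len == 0 {
            return;
        }
        let first = offset / self.chunk_size;
        let last = (offset + len - 1) / self.chunk_size;
        self.dirty.extend(first..=last);
    }

    /// Return the dirty chunk indices in ascending order and reset the set,
    /// as a commit would after writing those chunks out.
    fn take_dirty(&mut self) -> Vec<u64> {
        std::mem::take(&mut self.dirty).into_iter().collect()
    }
}
```

With 4096-byte chunks, a one-byte write marks a single chunk, and a two-byte write at offset 4095 marks chunks 0 and 1 because it straddles the boundary.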

Write is not sync

A shell command like this looks simpler than the filesystem work behind it:

$ echo "remember the milk" > /magic/notes.txt

Inside magicfs, the work is closer to this:

OPEN notes.txt for writing
SETATTR notes.txt size=0
WRITE bytes at offset 0
FLUSH because a file descriptor is closing
write content blob
replace metadata.json
RELEASE the open file

On a normal Linux filesystem, write(2) usually means the kernel accepted the bytes into memory, not that the bytes necessarily reached stable storage. fsync(2) is the call an application uses when it wants the file data, along with the metadata needed to retrieve that data, flushed to the storage device, while fdatasync(2) is similar but can skip metadata that is not needed for a later read.

FUSE also calls the filesystem with FLUSH each time a file descriptor closes, and because file descriptors can be duplicated, one open file can see more than one flush. A filesystem can use flush to report delayed write errors, but flush does not mean the same thing as fsync. Release happens later still, when the kernel is done with the open file handle.

For the shell demo, magicfs commits staged bytes on both FLUSH and FSYNC, which makes echo hello > /magic/notes.txt behave the way a person expects, while the code still treats fsync as the explicit request for durable file data and metadata. A database that calls fsync is asking a more specific question than a shell that happened to close a redirected file. If the backing blob write fails after WRITE already returned success, the filesystem still has to decide where that error can be reported: either through a later fsync or through a close-time error from flush, although plenty of programs are not careful about checking close errors.
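
From the application side, the difference looks like this in Rust, where File::sync_all corresponds to fsync(2) and File::sync_data to fdatasync(2) on Linux. The function names are hypothetical, and the sketch is about the caller, not about magicfs internals.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `path` and return only after fsync(2) has succeeded.
fn write_and_fsync(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    // write(2): success means the kernel accepted the bytes, nothing more.
    file.write_all(bytes)?;
    // fsync(2): ask for file data and metadata to reach stable storage.
    // A delayed write error from the filesystem can surface here.
    file.sync_all()?;
    // Dropping `file` closes the descriptor. Rust's Drop for File ignores
    // close errors, so the sync above is where failures get checked.
    Ok(())
}

/// Same flow with fdatasync(2), which may skip metadata that is not
/// needed to read the data back.
fn write_and_fdatasync(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    file.write_all(bytes)?;
    file.sync_data()?;
    Ok(())
}
```

A program that skips the sync call and simply lets the File drop has no place to observe a commit failure that the filesystem could only report at close time.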

For metadata, replacing a file with rename is atomic for readers, but atomic replacement is not the same thing as durability after power loss. If you care that the new metadata.json survives a crash, you need to sync the new file contents and the directory entry that points at it. magicfs handles this for its local store by syncing the temporary metadata file before rename, then syncing the store directory after rename.

In code, those rules show up in the order of blob writes, metadata replacement, flush, and fsync, because the filesystem has to decide which bytes exist, which names point at them, and what an application is allowed to assume after a successful sync.

FUSE replies can include time-to-live values for names and attributes, and until those TTLs expire, the kernel can answer repeated lookups and getattr calls without asking the userspace process again, which matters because crossing from the kernel into a userspace filesystem on every stat would be expensive.

The same TTL also affects correctness. magicfs uses a one-second TTL, which is fine for a single-process demo, but if another process or another machine can update the same backing store, a reader may see an old file size or an old blob ID until the cache expires, unless the filesystem actively invalidates the kernel's cached state.
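
The staleness window can be modeled with a toy cache. This is a conceptual sketch only: it is not how the kernel implements its attribute cache, and AttrCache and its methods are hypothetical. Time is passed in explicitly so the behavior is easy to follow.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Toy attribute cache: remembers a file size per inode until a TTL expires.
struct AttrCache {
    ttl: Duration,
    entries: HashMap<u64, (u64, Instant)>, // inode -> (size, expiry time)
}

impl AttrCache {
    fn new(ttl: Duration) -> Self {
        AttrCache { ttl, entries: HashMap::new() }
    }

    /// Store the size the filesystem replied with at time `now`.
    fn insert(&mut self, ino: u64, size: u64, now: Instant) {
        self.entries.insert(ino, (size, now + self.ttl));
    }

    /// Return the cached size if the entry has not expired; None means
    /// the caller has to ask the userspace filesystem again.
    fn get(&self, ino: u64, now: Instant) -> Option<u64> {
        match self.entries.get(&ino) {
            Some((size, expires)) if now < *expires => Some(*size),
            _ => None,
        }
    }

    /// Explicit invalidation: drop the entry before its TTL runs out.
    fn invalidate(&mut self, ino: u64) {
        self.entries.remove(&ino);
    }
}
```

If the backing store changes 200 ms after an insert, get keeps returning the old size for the remaining 800 ms of a one-second TTL unless invalidate is called first.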

For file contents, magicfs opens files with FUSE direct I/O so reads come back to the userspace filesystem instead of being served from the page cache. That keeps the example easier to reason about, but it gives up caching and read-ahead that a real filesystem would probably want. The cache policy matters because it changes which file size, inode attributes, and file contents callers are able to observe.

Shortcomings I kept

The implementation only supports one directory, and each file is stored as one local blob, so rewriting a byte rewrites the whole file. There is no journal, no recovery scan, and no cleanup for orphaned blobs left behind by rewrites or unlinks. It also does not implement locking, mmap, extended attributes, a real permission model, sparse files, hard links, symlinks, or multi-client cache invalidation.

The filesystem also does not model the problems that show up when the backing layer is remote. Network failures, remote consistency rules, retries, and authentication all change when reads can succeed, when writes can be retried, and what fsync can honestly report. This example stays on local disk so the post can focus on filesystem calls.

A journal or transaction log would let recovery decide whether a metadata update committed, chunking would avoid rewriting whole files, a garbage collector would find blobs no metadata entry can reach, and better cache invalidation would keep multiple readers from seeing stale metadata for too long.
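
The garbage collection idea reduces to a set difference between the blobs on disk and the blobs that metadata still references. The sketch below takes both lists as plain slices; orphaned_blobs is a hypothetical helper, and a real collector would read the names from blobs/ and metadata.json.

```rust
use std::collections::BTreeSet;

/// Blob IDs present on disk that no metadata entry references, sorted.
fn orphaned_blobs(on_disk: &[&str], referenced: &[&str]) -> Vec<String> {
    // Everything metadata can still reach is live.
    let live: BTreeSet<&str> = referenced.iter().copied().collect();
    let mut orphans: Vec<String> = on_disk
        .iter()
        .filter(|id| !live.contains(**id))
        .map(|id| (*id).to_string())
        .collect();
    orphans.sort();
    orphans
}
```

Because blobs are written before metadata is replaced, a blob from an in-flight commit looks orphaned for a moment, so a collector along these lines would need to skip recent blobs or coordinate with the commit path before deleting anything.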

With FUSE, Linux asks the filesystem a fixed collection of questions, and the implementation can answer from whatever backing store it owns, which means the implementation still has to define lookup, write, flush, fsync, and rename when metadata and file contents are stored somewhere else.

I am working on these filesystem, sandboxing, and storage problems at Tines, along with plenty of adjacent systems work that gets deeper than a blog post can. If that sounds interesting, we are hiring.