c_std: A Leak-Free, Cross-Platform Standard Library for Modern C
amin tahmasebi · 2026-05-31 · via DEV Community

Bringing the comfort of the C++ STL and Python's standard library to C17 — without leaving C

A technical white paper.


Executive summary

C is still the substrate of the computing world — kernels, databases, language runtimes, embedded firmware, and the inner loops of nearly everything else. Yet the moment you step away from the kernel and try to write ordinary application code in C, you feel the gap: no growable vector, no hash map, no JSON parser, no string type that doesn't invite a buffer overflow. You either pull in a grab-bag of mismatched third-party libraries, each with its own conventions and failure modes, or you re-implement the same dynamic array for the hundredth time.

c_std is an attempt to close that gap deliberately and coherently. It is a single, consistent library — written in pure C17 — that reimplements a large slice of the C++ Standard Library (containers, algorithms, smart pointers) alongside many Python-style conveniences (json, regex, random, statistics, csv, config, even turtle graphics). It targets Windows and Linux from one source tree, compiles cleanly under -Wall -Wextra, and — this is the part I care about most — is verified leak-free under Valgrind, module by module, example by example.

This paper explains the design philosophy, the architecture, and the engineering discipline that makes a library like this trustworthy enough to build on.


1. The problem: C's missing middle

Every C programmer knows the two extremes. At the bottom, the language itself: pointers, malloc, memcpy, raw arrays. At the top, whatever the platform hands you — <windows.h> or POSIX, OpenSSL, a JSON library someone wrapped a decade ago. The middle — the layer the C++ STL and Python's batteries-included standard library occupy — is missing.

That missing middle has a real cost. It shows up as:

  • Re-invention. Teams write their own vector, their own string builder, their own linked list, each subtly different.
  • Inconsistency. One library returns 0 on success; another returns -1; a third sets errno; a fourth returns a pointer you must remember to free with its deallocator.
  • Safety hazards. Hand-rolled string and buffer code is where C's reputation for footguns is earned.

c_std's thesis is simple: a C developer should be able to reach for a Vector, a HashMap, a String, a JSON document, a TCP client, or a big integer with the same fluency a C++ or Python developer has — and those building blocks should share one set of conventions for construction, ownership, error reporting, and teardown.


2. Design philosophy

Six principles shaped the library. They are worth stating explicitly, because the value of a standard library is as much in its consistency as in its feature list.

2.1 One mental model for ownership. Every module follows the same lifecycle: a *_create() / *_init() constructor, explicit operations, and a matching *_deallocate() / *_destroy() destructor. Containers that own heap values take a deallocator callback and call it on erase and teardown — and the contract for who frees what is documented and, crucially, tested. Where a function transfers ownership to the caller (for example, list_erase returns the removed value for you to free), that is stated in the header and demonstrated in the README.
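To make the ownership contract concrete, here is a minimal sketch of a container that owns heap values through a deallocator callback. Every name in it (OwnedList, owned_list_create, owned_list_pop_front, and so on) is hypothetical and chosen for illustration; it is not c_std's actual API, only one way the create / operate / deallocate lifecycle and the "removed value is handed back to the caller" rule could look.

```c
#include <stdlib.h>

/* All names here are hypothetical; this is not c_std's actual API. */
typedef void (*ValueDeallocator)(void *value);

typedef struct OwnedNode {
    void             *value;
    struct OwnedNode *next;
} OwnedNode;

typedef struct {
    OwnedNode       *head;
    size_t           size;
    ValueDeallocator dealloc;  /* may be NULL when values are not owned */
} OwnedList;

/* Constructor: returns NULL on allocation failure. */
OwnedList *owned_list_create(ValueDeallocator dealloc) {
    OwnedList *list = malloc(sizeof *list);
    if (!list) return NULL;
    list->head = NULL;
    list->size = 0;
    list->dealloc = dealloc;
    return list;
}

/* On success (0) the list owns value; on failure (-1) the caller keeps it. */
int owned_list_push_front(OwnedList *list, void *value) {
    if (!list) return -1;
    OwnedNode *node = malloc(sizeof *node);
    if (!node) return -1;
    node->value = value;
    node->next = list->head;
    list->head = node;
    list->size++;
    return 0;
}

/* Ownership transfer: the removed value is returned and the caller frees it. */
void *owned_list_pop_front(OwnedList *list) {
    if (!list || !list->head) return NULL;
    OwnedNode *node = list->head;
    void *value = node->value;
    list->head = node->next;
    list->size--;
    free(node);
    return value;
}

/* Destructor: runs the deallocator on every value the list still owns. */
void owned_list_deallocate(OwnedList *list) {
    if (!list) return;
    OwnedNode *node = list->head;
    while (node) {
        OwnedNode *next = node->next;
        if (list->dealloc) list->dealloc(node->value);
        free(node);
        node = next;
    }
    free(list);
}
```

A caller would typically pass free (or a module's own destructor) as the deallocator, free whatever owned_list_pop_front returns, and rely on owned_list_deallocate for everything left in the list.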

2.2 Familiar APIs win. The container names map to their C++ analogues (vector, map, unordered_map → hashmap, unique_ptr → uniqueptr). The text and data utilities map to Python (random, statistics, json, regex). Familiarity is a feature: it shortens the distance between "I know what I want" and "I know what to type."

2.3 Portability is a first-class requirement, not an afterthought. Anything OS-specific lives behind a #if defined(_WIN32) … #else … #endif seam with a Win32 implementation and a POSIX implementation of the same public function. The caller never sees the difference. network, concurrent, sysinfo, dir, and crypto are all built this way.
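As an illustration of that seam, the sketch below puts a Win32 and a POSIX implementation behind one public function. The function name portable_sleep_ms is an assumption made for this example and is not taken from c_std; the platform calls it wraps (Sleep and nanosleep) are the standard ones.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L  /* expose nanosleep() under strict -std=c17 */
#endif

#if defined(_WIN32)
#include <windows.h>
#else
#include <errno.h>
#include <time.h>
#endif

/* Hypothetical public function: one declaration, two implementations.
   Returns 0 on success, -1 on failure. */
int portable_sleep_ms(unsigned int ms) {
#if defined(_WIN32)
    Sleep((DWORD)ms);
    return 0;
#else
    struct timespec req;
    req.tv_sec = (time_t)(ms / 1000u);
    req.tv_nsec = (long)(ms % 1000u) * 1000000L;
    /* nanosleep() writes the remaining time back into req when a signal
       interrupts it, so looping on EINTR completes the full delay. */
    while (nanosleep(&req, &req) == -1) {
        if (errno != EINTR) return -1;
    }
    return 0;
#endif
}
```

The caller writes portable_sleep_ms(250) on either platform and never sees which branch was compiled.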

2.4 Fail safe, never exit(). A library has no business terminating its host process. Constructors return NULL on allocation failure; operations return status codes; NULL inputs are tolerated rather than dereferenced. This sounds obvious and is routinely violated.
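A small sketch of what this principle can look like in code, using a hypothetical fixed-capacity IntStack (the type, the status enum, and the function names are illustrative assumptions, not c_std's API): the constructor reports failure by returning NULL, every operation returns a status code, and NULL arguments are rejected instead of dereferenced.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical status codes and type for illustration only. */
typedef enum {
    STACK_OK = 0,
    STACK_ERR_NULL,
    STACK_ERR_FULL,
    STACK_ERR_EMPTY
} StackStatus;

typedef struct {
    int   *items;
    size_t size;
    size_t capacity;
} IntStack;

/* Returns NULL on a zero or overflowing capacity, or on allocation failure. */
IntStack *intstack_create(size_t capacity) {
    if (capacity == 0 || capacity > SIZE_MAX / sizeof(int)) return NULL;
    IntStack *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->items = malloc(capacity * sizeof *s->items);
    if (!s->items) {
        free(s);
        return NULL;
    }
    s->size = 0;
    s->capacity = capacity;
    return s;
}

StackStatus intstack_push(IntStack *s, int value) {
    if (!s) return STACK_ERR_NULL;
    if (s->size == s->capacity) return STACK_ERR_FULL;
    s->items[s->size++] = value;
    return STACK_OK;
}

StackStatus intstack_pop(IntStack *s, int *out) {
    if (!s || !out) return STACK_ERR_NULL;
    if (s->size == 0) return STACK_ERR_EMPTY;
    *out = s->items[--s->size];
    return STACK_OK;
}

/* Safe to call with NULL, mirroring free(). */
void intstack_deallocate(IntStack *s) {
    if (!s) return;
    free(s->items);
    free(s);
}
```

Nothing here can terminate the host process: the worst outcome of any call is a status code the caller is free to handle or ignore.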

2.5 Document at the definition. Every public function carries a Doxygen contract directly above its implementation — purpose, parameters, return values, platform notes. The header is a clean index; the source is the source of truth.

2.6 No leaks. Ever. More on this below, because it is the principle that took the most work and yields the most trust.


3. Architecture at a glance

c_std is organized as ~40 independent modules, each in its own directory with a .c, a .h, a README.md with a full API reference and runnable examples, and a test suite. Grouped by domain:

  • Containers: vector, array, string, list, forward_list, deque, queue, stack, priority_queue, span, bitset, map (red-black tree), hashmap (a separately chained table, not open addressing, with automatic rehashing), tuple, variant, uniqueptr.
  • Algorithms & numerics: algorithm, sort, statistics, random, secrets, numbers, matrix, bigint (GMP), bigfloat (MPFR), evalexpr.
  • Text, data & I/O: fmt, encoding (Base16/32/64, URL), json, xml, csv, config (INI), regex, cli, log, file_io, dir.
  • Time & date: time, date (Gregorian + Persian calendars).
  • System & concurrency: sysinfo, concurrent (threads, mutexes, condition variables, a thread pool), serial_port.
  • Security: crypto (OpenSSL-backed), jwt (HS/RS/ES/PS families).
  • Networking & databases: network (TCP, UDP, a small HTTP server/client), database (PostgreSQL via libpq, built only when present).
  • Graphics: plot (line/scatter/bar/pie/histogram) and turtle (Python-style turtle graphics), both on raylib.
  • Testing: a small unittest framework.

The build is CMake-only, generator-agnostic (Ninja recommended), and works with GCC, Clang, and MSVC. You can build the whole library and all ~50 example programs at once, or compile a single module or example in isolation.


4. The two non-negotiables

A standard library earns trust by being boring in the right ways. Two properties matter more than any feature.

4.1 Memory safety, proven — not asserted

It is easy to claim a C library has no leaks. It is harder to mean it. The discipline here is mechanical and relentless:

  • Every module's test suite and every runnable README example is executed under valgrind --leak-check=full --show-leak-kinds=definite,indirect --error-exitcode=99. The target is uniform: 0 leaks, 0 errors.
  • Network and socket code is additionally run with --track-fds=yes, because a server that leaks file descriptors is just as dead as one that leaks memory — it simply takes longer to fall over.
  • The same code is compiled with -Wall -Wextra and is expected to be warning-clean on both GCC and MSVC.

This methodology is not decoration. It routinely surfaces the exact class of bug that C is infamous for. A representative example: a double-ended queue whose reallocation guard checked size == capacity but ignored the front-padding offset that grows from the middle of the first block. For most access patterns it worked; under the right mix of front/back insertions it wrote one slot past the block array — an out-of-bounds write that Valgrind flagged as an invalid write of size 8, which in turn corrupted a stored pointer and produced a "lost" allocation downstream. The fix was a single corrected predicate, but the point is that the test harness found it deterministically rather than leaving it to crash in production six months later.

The lesson I would underline for anyone building C infrastructure: leak-checking is not a final QA step, it is part of the inner development loop. Wire it into the way you run the code from day one, and the cost of staying clean approaches zero.

4.2 Genuine cross-platform parity

Cross-platform support in C is usually aspirational — "it should build on Linux too." c_std treats Windows and Linux as co-equal first-class targets, and the test suites run on both. Several modules are interesting case studies in doing this honestly:

  • regex auto-selects its backend at compile time via __has_include(<pcre.h>): PCRE where available, and a POSIX <regex.h> fallback otherwise, with a small translation shim so that common patterns (\d, \w, \s) behave consistently across the two engines. One API, two engines, same results.
  • network presents one socket API over Winsock and BSD sockets. Sockets are created dual-stack IPv6 so a single code path serves IPv4 and IPv6 — which, as it happens, is exactly the kind of detail that bites you: binding a dual-stack AF_INET6 socket to an IPv4 wildcard address fails, and the fix is to resolve the bind address as IPv6 with AI_V4MAPPED.
  • crypto leans on OpenSSL but guards legacy digests (MDC2) behind OPENSSL_NO_MDC2, because the distro you build on may have compiled them out. Portable code respects the configuration of its dependencies, not just their presence.
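The regex translation shim mentioned in the first bullet can be sketched roughly as follows. This is an assumption about the general technique, not c_std's implementation: translate_shorthand and shorthand_match are hypothetical names, only the POSIX <regex.h> path is shown (compile-time backend selection is omitted), and shorthands inside bracket expressions are deliberately left unhandled.

```c
#include <regex.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not c_std's shim: rewrites \d, \w and \s into POSIX
   bracket expressions. Returns a malloc'd string the caller must free, or
   NULL on failure. Shorthands inside [...] are not handled by this sketch. */
char *translate_shorthand(const char *pattern) {
    if (!pattern) return NULL;
    size_t n = strlen(pattern);
    /* Worst case: each 2-byte escape becomes the 12-byte "[A-Za-z0-9_]". */
    if (n > (SIZE_MAX - 1) / 6) return NULL;
    char *out = malloc(n * 6 + 1);
    if (!out) return NULL;
    size_t o = 0;
    for (size_t i = 0; i < n; ++i) {
        if (pattern[i] == '\\' && i + 1 < n) {
            const char *rep = NULL;
            switch (pattern[i + 1]) {
                case 'd': rep = "[0-9]"; break;
                case 'w': rep = "[A-Za-z0-9_]"; break;
                case 's': rep = "[[:space:]]"; break;
                default: break;
            }
            if (rep) {
                size_t len = strlen(rep);
                memcpy(out + o, rep, len);
                o += len;
                ++i;                  /* skip the shorthand letter */
                continue;
            }
            out[o++] = pattern[i++];  /* copy other escapes as a pair */
        }
        out[o++] = pattern[i];
    }
    out[o] = '\0';
    return out;
}

/* Returns 1 on match, 0 on no match, -1 on error. */
int shorthand_match(const char *pattern, const char *text) {
    if (!text) return -1;
    char *posix = translate_shorthand(pattern);
    if (!posix) return -1;
    regex_t re;
    int rc = regcomp(&re, posix, REG_EXTENDED | REG_NOSUB);
    free(posix);
    if (rc != 0) return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    if (rc == 0) return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

With this in place, a Perl-style pattern such as ^\d{3}-\w+$ is compiled by the POSIX engine as ^[0-9]{3}-[A-Za-z0-9_]+$, which is what lets the same pattern behave the same way on both backends.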

The recurring theme: the difference between "compiles on two platforms" and "works correctly on two platforms" is a long tail of small, specific, well-understood decisions. There is no shortcut; there is only the willingness to chase each one down.


5. Production-minded networking: a short case study

Because so much real-world C ends up talking to a network, the network stack received particular attention to the failure modes that actually take systems down in production.

Consider connect timeouts. A plain blocking connect() to an unreachable host can stall for the operating system's default — often minutes — and wedge a request thread the entire time. c_std provides tcp_connect_timeout(), which performs the connect in non-blocking mode, waits on select() for a caller-supplied budget, and confirms the result via SO_ERROR:

TcpSocket s;
tcp_socket_create(&s);

/* Never blocks longer than 2 seconds, on any platform. */
if (tcp_connect_timeout(s, "api.example.com", 443, 2000) == TCP_SUCCESS) {
    /* ... */
}

It is paired with readiness primitives (tcp_wait_readable, tcp_wait_writable), an exact-byte-count helper (tcp_bytes_available via FIONREAD), and tcp_get_socket_error for completing non-blocking operations. None of these allocate, so there is nothing to leak; all were validated with a stress harness that opened and closed thousands of sockets under Valgrind with descriptor tracking, plus a deliberate connect to a black-holed address to prove the timeout actually bounds the wait.
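For readers who want to see the mechanism rather than the API, the following is a POSIX-only sketch of the non-blocking connect, select(), SO_ERROR sequence described above. It is not c_std's tcp_connect_timeout: the function name connect_with_timeout and its signature are assumptions, the Winsock branch is omitted, the budget applies per resolved address rather than in total, and name resolution itself (getaddrinfo) is not bounded by the timeout.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose getaddrinfo() under strict -std=c17 */
#endif

#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only). Returns a connected, still non-blocking
   socket descriptor, or -1 on failure or timeout. */
int connect_with_timeout(const char *host, const char *port, int timeout_ms) {
    if (!host || !port || timeout_ms < 0) return -1;

    struct addrinfo hints;
    struct addrinfo *res = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0) return -1;

    int fd = -1;
    for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;

        /* select() cannot watch descriptors at or above FD_SETSIZE. */
        int flags = fcntl(fd, F_GETFL, 0);
        if (fd >= FD_SETSIZE || flags < 0 ||
            fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
            close(fd);
            fd = -1;
            continue;
        }

        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;

        if (errno == EINPROGRESS) {
            fd_set wfds;
            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            struct timeval tv;
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;

            /* Writable means the attempt finished; SO_ERROR says how. */
            int err = 0;
            socklen_t len = sizeof err;
            if (select(fd + 1, NULL, &wfds, NULL, &tv) == 1 &&
                getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 &&
                err == 0) {
                break;
            }
        }
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

The step that is easy to miss is the SO_ERROR check: select() reports the socket as writable for failed connects too, so writability alone does not mean the connection succeeded.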

The HTTP layer is similarly pragmatic. Header lookup is case-insensitive per RFC 7230 (a real client will send content-type in the wrong case eventually), and Content-Length parsing is strict and overflow-safe rather than a naive atoi. These are not glamorous features. They are the difference between a demo and something you can put in front of untrusted input.
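A minimal sketch of both ideas, assuming hypothetical helper names (header_name_equals and parse_content_length are not c_std functions) and a field value that has already had surrounding whitespace trimmed:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Case-insensitive ASCII comparison for header names; returns 1 if equal. */
int header_name_equals(const char *a, const char *b) {
    if (!a || !b) return 0;
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        ++a;
        ++b;
    }
    return *a == '\0' && *b == '\0';
}

/* Strict Content-Length parse: one or more ASCII digits and nothing else.
   Returns 0 and stores the value on success; returns -1 on empty input,
   signs, trailing junk, or a value that does not fit in uint64_t. */
int parse_content_length(const char *s, uint64_t *out) {
    if (!s || !out || *s == '\0') return -1;
    uint64_t value = 0;
    for (; *s != '\0'; ++s) {
        if (*s < '0' || *s > '9') return -1;
        uint64_t digit = (uint64_t)(*s - '0');
        /* Reject before multiplying so the arithmetic can never wrap. */
        if (value > (UINT64_MAX - digit) / 10) return -1;
        value = value * 10 + digit;
    }
    *out = value;
    return 0;
}
```

By contrast, atoi("12abc") returns 12 and atoi on an out-of-range value has undefined behavior, so neither malformed nor oversized lengths can be detected with it.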


6. The API in practice

A few snippets convey the intended ergonomics better than prose. Note the uniform shape: create, use, destroy.

/* A growable typed array */
Vector* v = vector_create(sizeof(int));
for (int i = 1; i <= 5; ++i) vector_push_back(v, &i);
fmt_printf("size=%zu\n", vector_size(v));
vector_deallocate(v);

/* JSON, parsed and pretty-printed */
JsonElement* root = json_parse("{\"lib\":\"c_std\",\"version\":1.0}");
json_print(root);
json_deallocate(root);

/* RAII-style ownership with a smart pointer + a custom deleter,
   so a unique_ptr can own another c_std object and free it correctly */

The graphics modules deserve a mention because they show the library reaching beyond "data structures." plot renders line, scatter, bar, pie, and histogram charts to a PNG; turtle provides Python-style turtle graphics. Both are built on raylib, and both are testable headlessly — plot_export_image() and turtle_save_image() produce real image files, which makes them as amenable to automated verification as any other module, despite being graphical.


7. Performance posture

c_std optimizes for predictable performance and correct asymptotics rather than micro-benchmark bragging rights:

  • vector grows geometrically with memory pooling to amortize reallocation.
  • hashmap offers O(1) average insert/lookup with automatic rehashing as the load factor rises.
  • map is a red-black tree with O(log n) ordered operations.
  • bigint/bigfloat delegate to GMP and MPFR — mature, heavily optimized numeric libraries — rather than reinventing arbitrary-precision arithmetic.

The guiding judgment is that a general-purpose standard library should pick the right data structure with the right complexity class and implement it cleanly; specialized hot paths are the application's job, not the library's.
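To show why geometric growth amortizes reallocation, here is a small instrumented sketch. IntVec and its functions are hypothetical and much simpler than a real vector module (no memory pooling, int elements only); the doubling factor and the initial capacity of 4 are assumptions picked for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable int array; reallocations is demo instrumentation. */
typedef struct {
    int   *data;
    size_t size;
    size_t capacity;
    size_t reallocations;
} IntVec;

/* Appends value; returns 0 on success, -1 on failure. */
int intvec_push(IntVec *v, int value) {
    if (!v) return -1;
    if (v->size == v->capacity) {
        /* Double the capacity (start at 4) instead of growing by one. */
        size_t cap = v->capacity ? v->capacity * 2 : 4;
        if (cap < v->capacity || cap > SIZE_MAX / sizeof *v->data) return -1;
        int *grown = realloc(v->data, cap * sizeof *grown);
        if (!grown) return -1;  /* the old buffer is still valid */
        v->data = grown;
        v->capacity = cap;
        v->reallocations++;
    }
    v->data[v->size++] = value;
    return 0;
}

void intvec_free(IntVec *v) {
    if (!v) return;
    free(v->data);
    v->data = NULL;
    v->size = 0;
    v->capacity = 0;
}
```

Starting from a zero-initialized IntVec, pushing 1000 values ends with a capacity of 1024 after 9 reallocations (4, 8, 16, ..., 1024), whereas growing by one slot per push would reallocate 1000 times.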


8. Limitations and honest caveats

A white paper that only lists strengths is marketing. In the interest of engineering honesty:

  • It is not the C++ STL. C has no templates, no RAII, no operator overloading. Type-generic containers use void* and element sizes; you trade compile-time type safety for flexibility. uniqueptr approximates RAII but cannot match scope-based destruction enforced by the language.
  • Graphics and database modules carry heavy dependencies (raylib; libpq). They are optional, and database is skipped automatically when libpq is absent — but they are not free.
  • macOS is plausible but not a primary CI target. The POSIX paths should largely work; treat it as best-effort until proven.
  • It is a large surface area. Forty-plus modules is a lot to keep uniformly excellent; the test-and-Valgrind discipline is what keeps that surface honest, but breadth always carries maintenance cost.
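The first caveat, void* plus an element size in place of templates, can be illustrated with a short sketch. GenericArray and its functions are hypothetical names, not part of c_std; the point is only to show where the type information goes missing.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size generic array: elements are opaque byte runs. */
typedef struct {
    size_t         elem_size;
    size_t         count;
    unsigned char *bytes;
} GenericArray;

GenericArray *generic_array_create(size_t elem_size, size_t count) {
    if (elem_size == 0 || count == 0 || count > SIZE_MAX / elem_size) return NULL;
    GenericArray *a = malloc(sizeof *a);
    if (!a) return NULL;
    a->bytes = calloc(count, elem_size);
    if (!a->bytes) {
        free(a);
        return NULL;
    }
    a->elem_size = elem_size;
    a->count = count;
    return a;
}

/* Copies elem_size bytes from elem; the compiler cannot check that elem
   really points at the element type the array was created for. */
int generic_array_set(GenericArray *a, size_t index, const void *elem) {
    if (!a || !elem || index >= a->count) return -1;
    memcpy(a->bytes + index * a->elem_size, elem, a->elem_size);
    return 0;
}

int generic_array_get(const GenericArray *a, size_t index, void *out) {
    if (!a || !out || index >= a->count) return -1;
    memcpy(out, a->bytes + index * a->elem_size, a->elem_size);
    return 0;
}

void generic_array_deallocate(GenericArray *a) {
    if (!a) return;
    free(a->bytes);
    free(a);
}
```

An array created with sizeof(double) will accept a pointer to an int in generic_array_set without any diagnostic and then read past that int, which is exactly the compile-time safety a C++ template would have provided.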

Knowing where a tool doesn't fit is part of using it well.


9. Roadmap

Natural next steps include extending automated CI to macOS and MSVC, expanding fuzz testing on the parsers (json, xml, csv, regex) beyond the current adversarial-but-curated suites, and continuing to file down the long tail of platform-specific behavior. The architecture (independent modules behind consistent contracts) is deliberately friendly to incremental, low-risk evolution.


10. Conclusion

C does not need to be a language where you start every project by rewriting a dynamic array. c_std is a bet that the missing middle can be filled coherently: one library, one set of conventions, two first-class platforms, and a hard, verified line on memory and descriptor leaks. The interesting work was rarely the feature list — it was the discipline. Leak-check in the inner loop. Treat the second platform as equal to the first. Document ownership and then test it. Respect your dependencies' configuration. Fail safe.

Those habits are transferable to any serious C codebase. If this paper leaves you with one thing, let it be that: the tools to make C trustworthy already exist. What they require is the decision to use them on every commit, not just before the release.

— Written from hands-on experience building and hardening the library across Windows (MSYS2/MinGW, MSVC) and Linux (GCC/Clang) toolchains.