GitHub - oraziorillo/microcrad: A re-implementation of Karpathy's micrograd in C
oraziorillo · 2026-06-17 · via Show HN

microcrad is a tiny scalar-valued automatic differentiation engine for C, with a small neural network implementation built on top of it. It is a re-implementation of Andrej Karpathy's micrograd in C, written for people who want to understand how backpropagation really works.

Like the Python original, microcrad operates on scalars, not tensors. Every number that takes part in a computation is a node in a graph, every operation records how it was produced, and a single backward pass walks the graph in reverse to compute the derivative of the output with respect to every input. There is no vectorization, no GPU, no clever tricks: just the chain rule applied one scalar at a time. On top of this engine sits a multi-layer perceptron, so you can build a network, run a forward pass, call backward, and do gradient descent, all in C.

This repository is first and foremost an educational implementation. It is meant to be read, experimented with, and tested. It is not a production autograd package, not a practical deep-learning framework, and not optimized for large datasets or numerical robustness.

The whole thing is built around two ideas:

  • A Value, which is a single node in the computation graph.
  • Reference counting, which is how microcrad knows when a Value is no longer part of any graph and can be freed.

Almost everything in the documentation below is a consequence of these two ideas, so it is worth keeping them in mind.

How Values work

The fundamental type is Value. A Value wraps a single double and, when it is the result of an operation, remembers the operands it was computed from:

typedef struct Value {
    uint32_t ref_count;   /* How many references point at this Value. */
    uint32_t n_prevs;     /* How many operands produced this Value. */
    double data;          /* The scalar this node holds. */
    double extra_data;    /* Operation parameter (e.g. the exponent in pow). */
    struct Value **prev;  /* The operands (previous nodes in the graph). */
    int32_t op_code;      /* Which operation produced this Value. */
    uint32_t magic;       /* Debug canary for some invalid or stale pointers. */
    double grad;          /* dLoss/dThisValue, filled in by backward. */
} Value;

A leaf Value (an input, a weight, a constant) has n_prevs == 0 and no operands. A Value produced by an operation such as addition has n_prevs > 0 and a prev array pointing at the operands it depends on. Because every operation links its result back to its operands, the set of all Values reachable through prev pointers forms a directed acyclic graph: the computation graph.

This is the simplest microcrad program that computes something and its gradient:

Value *a = value_create_leaf(2.0);
Value *b = value_create_leaf(3.0);
Value *c = value_mul(a, b);   /* c = a * b = 6 */

value_backward(c);            /* returns 0 on success, and fills every grad field */

printf("c    = %f\n", c->data);   /* 6.000000 */
printf("dc/da= %f\n", a->grad);   /* 3.000000  (== b) */
printf("dc/db= %f\n", b->grad);   /* 2.000000  (== a) */

value_release(c);             /* c freed; releases its hold on a and b */
value_release(a);             /* a freed (its other reference was yours) */
value_release(b);             /* b freed */

This small program already shows the essentials:

  • Values are heap allocated, here by value_create_leaf() (the general constructor, value_create(), is described below).
  • Operations such as value_mul() build new Values wired into the graph.
  • value_backward() computes the gradient of its argument with respect to every node it depends on.
  • Values are reference counted, and releasing the root of a graph releases the whole graph (more on this below).

Creating Values

Value *value_create(double data, int32_t n_prevs, Value **prev);
Value *value_create_leaf(double data);

value_create_leaf is the convenience constructor for a leaf node, that is, an input, a weight, a bias, or a constant:

Value *x = value_create_leaf(42.0);

The n_prevs/prev arguments exist because the operation functions (value_add and friends) use value_create internally to build result nodes. Most user code should call value_create_leaf and let the operations do the wiring.

A freshly created Value starts with a reference count of 1: the pointer returned to you is that one reference. It is your job to release it.

Operations

Value *value_add(Value *v1, Value *v2);   /* v1 + v2  */
Value *value_mul(Value *v1, Value *v2);   /* v1 * v2  */
Value *value_pow(Value *b, double e);     /* b ** e   */
Value *value_exp(Value *v);               /* e ** v   */
Value *value_log(Value *v);               /* ln(v)    */
Value *value_relu(Value *v);              /* max(0,v) */

Unless stated otherwise, these functions expect non-NULL pointers and correctly shaped inputs. This code aims to keep the learning path clear; it documents important preconditions, but it does not try to harden every call like a production-grade API would.

Each of these returns a new Value whose data is the result of the operation and whose prev array points at the operands. Crucially, each operation retains its operands: it bumps their reference count so that the result node keeps them alive for as long as it needs them for the backward pass.

This means a result node co-owns its operands. You still own the references you were holding before the call, and you are still responsible for releasing them:

Value *a = value_create_leaf(2.0);   /* a: ref_count 1 (yours) */
Value *b = value_create_leaf(3.0);   /* b: ref_count 1 (yours) */
Value *c = value_add(a, b);          /* a, b: ref_count 2; c: ref_count 1 */

/* ... use c ... */

value_release(c);   /* c freed; it releases its hold on a and b   */
value_release(a);   /* a freed (its other reference was yours)     */
value_release(b);   /* b freed                                     */

Note that value_pow takes a plain double exponent, not a Value: only constant exponents are supported, and the exponent is stored in extra_data.

The available op_codes are addition, multiplication, power, exponential, natural logarithm and ReLU. These are exactly the primitives needed to build a ReLU network with a mean-squared-error or negative-log-likelihood loss, which is what the examples do. Subtraction and division are not separate operations. Subtraction is addition with a negated operand (the toy example builds its (prediction - target) term this way). Division is multiplication by a reciprocal: either a constant precomputed with value_create, as the examples do for their loss scaling, or value_pow(x, -1.0) when the divisor is itself a node in the graph.
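For reference, the local derivatives that these six operations contribute to the chain rule are standard calculus results. The notation below is only a summary; how src/value.c spells each case is not shown here, and the value chosen for the ReLU derivative at exactly v = 0 is a common convention that is assumed, not confirmed.

```latex
\begin{aligned}
z = a + b &: \quad \frac{\partial z}{\partial a} = 1, \quad \frac{\partial z}{\partial b} = 1 \\
z = a \cdot b &: \quad \frac{\partial z}{\partial a} = b, \quad \frac{\partial z}{\partial b} = a \\
z = b^{e} \ (e \text{ constant}) &: \quad \frac{\partial z}{\partial b} = e \, b^{e-1} \\
z = \exp(v) &: \quad \frac{\partial z}{\partial v} = \exp(v) \\
z = \ln(v) &: \quad \frac{\partial z}{\partial v} = \frac{1}{v} \\
z = \max(0, v) &: \quad \frac{\partial z}{\partial v} =
  \begin{cases} 1 & v > 0 \\ 0 & \text{otherwise} \end{cases}
\end{aligned}
```

The composed operations follow from these rules. Subtraction is a - b = a + (-1) \cdot b, so the gradient reaching b is negated. Division is a / b = a \cdot b^{-1}, and the power rule gives \partial (a b^{-1}) / \partial b = -a \, b^{-2}.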

Backpropagation

int value_backward(Value *v);

value_backward computes the gradient of v with respect to every node it transitively depends on, storing each result in that node's grad field. It works in two steps, exactly like micrograd:

  1. It performs a depth-first topological sort of the graph rooted at v, so that every node appears after all the nodes it depends on. This uses the internal Vector and SimpleSet types (see below) to record the ordering and to avoid visiting a shared node twice.
  2. It seeds v->grad = 1 and walks the sorted list in reverse. For each node, it pushes that node's gradient onto its operands according to the local derivative of the operation that produced it (the chain rule).

It returns 0 on success and -1 on failure.

Precondition: v must be non-NULL and must point at a valid computation graph root. If you are training in a loop, you must also zero any gradients you do not want to accumulate before calling it.
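The two steps above can be sketched in a few dozen lines of standalone C. This is not microcrad's code: Node, make_node, topo_visit, backward and demo_backward are hypothetical names, nodes live on the stack, a visited flag stands in for SimpleSet, a fixed array stands in for Vector, and only addition and multiplication are handled. It also asserts on overflow where the real value_backward is documented to return -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for microcrad's Value (illustration only). */
enum { OP_LEAF, OP_ADD, OP_MUL };
enum { MAX_NODES = 64 };

typedef struct Node {
    double data;
    double grad;
    int op;
    int n_prevs;
    struct Node *prev[2];
    int visited;            /* replaces the SimpleSet membership test */
} Node;

static Node make_node(double data, int op, Node *a, Node *b)
{
    Node n;
    n.data = data;
    n.grad = 0.0;
    n.op = op;
    n.n_prevs = (a != NULL) + (b != NULL);
    n.prev[0] = a;
    n.prev[1] = b;
    n.visited = 0;
    return n;
}

static Node leaf(double x)        { return make_node(x, OP_LEAF, NULL, NULL); }
static Node add(Node *a, Node *b) { return make_node(a->data + b->data, OP_ADD, a, b); }
static Node mul(Node *a, Node *b) { return make_node(a->data * b->data, OP_MUL, a, b); }

/* Step 1: depth-first post-order, so a node lands after all its operands. */
static void topo_visit(Node *n, Node **order, size_t *count)
{
    if (n->visited) return;          /* shared node: already scheduled */
    n->visited = 1;
    for (int i = 0; i < n->n_prevs; i++)
        topo_visit(n->prev[i], order, count);
    assert(*count < MAX_NODES);
    order[(*count)++] = n;
}

/* Step 2: seed the root with 1 and apply the chain rule in reverse order. */
static void backward(Node *root)
{
    Node *order[MAX_NODES];
    size_t count = 0;
    topo_visit(root, order, &count);

    root->grad = 1.0;
    for (size_t i = count; i-- > 0; ) {
        Node *n = order[i];
        n->visited = 0;              /* reset so backward can run again */
        if (n->op == OP_ADD) {
            n->prev[0]->grad += n->grad;
            n->prev[1]->grad += n->grad;
        } else if (n->op == OP_MUL) {
            n->prev[0]->grad += n->prev[1]->data * n->grad;
            n->prev[1]->grad += n->prev[0]->data * n->grad;
        }
    }
}

typedef struct Grads { double y, dy_da, dy_db; } Grads;

/* y = a*b + a with a = 2, b = 3; a is shared by two paths. */
Grads demo_backward(void)
{
    Node a = leaf(2.0), b = leaf(3.0);
    Node ab = mul(&a, &b);
    Node y = add(&ab, &a);
    Grads g;
    backward(&y);
    g.y = y.data;
    g.dy_da = a.grad;   /* b + 1 */
    g.dy_db = b.grad;   /* a */
    return g;
}
```

Because a feeds both the product and the sum, its gradient is the total of the two paths, b + 1, which only works because each contribution is added with += and because the visited flag keeps a from being scheduled twice.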

Because gradients accumulate (+=) onto the operands, a Value that is used in more than one place in the graph correctly receives the sum of the gradients flowing back through each path. This is why value_backward does not reset gradients for you: if you are training in a loop, you must zero the grad fields yourself before each backward pass. Both training examples do exactly this:

for (size_t i = 0; i < parameters->size; i++)
    vector_get(parameters, i)->grad = 0.0;   /* zero the gradients   */

value_backward(loss);                        /* accumulate new ones  */

for (size_t i = 0; i < parameters->size; i++) {
    Value *p = vector_get(parameters, i);
    p->data -= learning_rate * p->grad;      /* gradient descent step */
}
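The effect of skipping the zeroing step is easy to see on a one-parameter problem. The sketch below uses plain doubles and a hand-written derivative instead of the engine, so accumulate_grad, fit_weight and stale_gradient are hypothetical helpers, not microcrad functions; the only thing borrowed from the text above is that gradients are added with += and must be cleared by the caller.

```c
#include <assert.h>
#include <math.h>

/* Adds d/dw of (w*x - y)^2 into *grad, mimicking the += used by backward. */
static void accumulate_grad(double w, double x, double y, double *grad)
{
    *grad += 2.0 * (w * x - y) * x;
}

/* Fits w so that w * 2 is close to 6, zeroing the gradient on every step. */
double fit_weight(int max_steps, double learning_rate)
{
    assert(max_steps >= 0);
    double w = 0.0;
    double grad = 0.0;
    for (int i = 0; i < max_steps; i++) {
        grad = 0.0;                          /* zero the gradient    */
        accumulate_grad(w, 2.0, 6.0, &grad); /* accumulate a new one */
        if (fabs(grad) < 1e-12) break;       /* converged            */
        w -= learning_rate * grad;           /* gradient descent step */
    }
    return w;
}

/* The same gradient computed twice with no zeroing in between. */
double stale_gradient(void)
{
    double grad = 0.0;
    accumulate_grad(0.0, 2.0, 6.0, &grad);   /* true gradient at w = 0 */
    accumulate_grad(0.0, 2.0, 6.0, &grad);   /* now twice that value   */
    return grad;
}
```

With the reset in place, fit_weight(100, 0.1) settles near w = 3. Without it, stale_gradient shows the second pass doubling the gradient, which in a training loop would silently scale every update by the number of passes so far.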

Memory management: reference counting

C has no garbage collector, and a computation graph is a tangle of shared pointers: the same weight Value can be an operand of thousands of result nodes, and the same intermediate result can feed several downstream operations. Freeing such a graph correctly by hand is error prone. microcrad solves this the same way many long-lived C programs do, with reference counting.

void value_retain(Value *v);    /* take a reference: ref_count++ */
void value_release(Value *v);   /* drop a reference: ref_count-- */

The rules are simple:

  • Every Value is born with a reference count of 1, owned by whoever called the function that created it.
  • value_retain records that someone new is holding the Value.
  • value_release records that a holder is done with it. When the count reaches zero, the Value is freed, and it releases its own operands first, which may in turn free them, and so on recursively down the graph.

The recursive release is the important part: you almost never free a graph node by node. You release the root of the graph (the loss, the output of a forward pass), and the reference counts cascade downward, freeing exactly the nodes that nothing else still holds. Weights, which are also held by the network structure, survive; pure intermediates, which were only held by the result you just released, are freed.
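The cascade can be illustrated with a stripped-down node that carries nothing but a count and its operands. This is a sketch under stated assumptions, not microcrad's implementation: RcNode, rc_create, rc_release and demo_release_root are hypothetical names, and the live_nodes counter exists only so the example can report what survived.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct RcNode {
    unsigned ref_count;
    unsigned n_prevs;
    struct RcNode **prev;
} RcNode;

static int live_nodes = 0;   /* illustration only: nodes not yet freed */

/* A new node is born with ref_count 1 and retains each of its operands. */
RcNode *rc_create(unsigned n_prevs, RcNode **prev)
{
    RcNode *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->ref_count = 1;
    n->n_prevs = n_prevs;
    n->prev = NULL;
    if (n_prevs > 0) {
        n->prev = malloc(n_prevs * sizeof *n->prev);
        if (n->prev == NULL) { free(n); return NULL; }
        for (unsigned i = 0; i < n_prevs; i++) {
            n->prev[i] = prev[i];
            prev[i]->ref_count++;        /* the result co-owns its operand */
        }
    }
    live_nodes++;
    return n;
}

/* Drops one reference; at zero, releases the operands and frees the node. */
void rc_release(RcNode *n)
{
    if (n == NULL) return;               /* safe on NULL */
    assert(n->ref_count > 0);
    if (--n->ref_count > 0) return;
    for (unsigned i = 0; i < n->n_prevs; i++)
        rc_release(n->prev[i]);          /* may cascade further down */
    free(n->prev);
    free(n);
    live_nodes--;
}

/* Builds loss <- hidden <- {weight, input}, releases the root, and
 * returns how many nodes are still alive before the weight is released. */
int demo_release_root(void)
{
    RcNode *weight = rc_create(0, NULL);
    RcNode *input = rc_create(0, NULL);
    assert(weight != NULL && input != NULL);
    RcNode *operands[2] = { weight, input };
    RcNode *hidden = rc_create(2, operands);
    assert(hidden != NULL);
    RcNode *loss = rc_create(1, &hidden);
    assert(loss != NULL);

    rc_release(hidden);   /* our reference only; loss still holds it  */
    rc_release(input);    /* our reference only; hidden still holds it */
    rc_release(loss);     /* frees loss, hidden and input in one call  */

    int survivors = live_nodes;   /* only the weight remains */
    rc_release(weight);
    return survivors;
}
```

Releasing loss frees three nodes in one call, while weight survives because the caller, playing the role of the network structure, still holds a reference to it.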

value_release is safe to call on NULL, so you do not need to guard against it. This makes cleanup paths in functions that may fail part way through much easier to write: you can release everything unconditionally.

value_release(maybe_null);   /* does nothing if maybe_null is NULL */

A note on the magic field

Every Value carries a magic marker set to a known constant when the node is created. value_retain and value_release check it, and value_release poisons it before recursively freeing the node. This is a debug canary, not a correctness guarantee: it can help catch some invalid or stale Value * usage while you are experimenting, but it is not a substitute for correct ownership reasoning. If you ever see microcrad complaining that a Value * is invalid or stale, you almost certainly have a reference-counting bug.
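A canary of this kind takes only a few lines. The constants and helper names below (CANARY_LIVE, CANARY_DEAD, Tagged, tagged_init, tagged_poison, tagged_looks_valid) are hypothetical, since the actual marker value and checks in microcrad are not shown here. The sketch poisons a stack object instead of freed heap memory because reading freed memory is undefined behavior in C, which is also why such a check can only ever be best effort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CANARY_LIVE 0xC0FFEE42u   /* hypothetical "this node is valid" marker */
#define CANARY_DEAD 0xDEADBEEFu   /* hypothetical "this node was released"    */

typedef struct Tagged {
    uint32_t magic;
    double data;
} Tagged;

void tagged_init(Tagged *t, double data)
{
    assert(t != NULL);
    t->magic = CANARY_LIVE;
    t->data = data;
}

/* What a release function would do just before freeing the node. */
void tagged_poison(Tagged *t)
{
    assert(t != NULL);
    t->magic = CANARY_DEAD;
}

/* Cheap sanity check to run at the top of retain and release. */
bool tagged_looks_valid(const Tagged *t)
{
    return t != NULL && t->magic == CANARY_LIVE;
}

/* Returns true if the check passes before poisoning and fails after it. */
bool canary_catches_stale_use(void)
{
    Tagged t;
    tagged_init(&t, 1.0);
    bool ok_before = tagged_looks_valid(&t);
    tagged_poison(&t);
    bool ok_after = tagged_looks_valid(&t);
    return ok_before && !ok_after;
}
```

The check catches a stale pointer only while the poisoned bytes are still in place; once the allocator reuses the memory, the marker may be overwritten with anything, including a value that happens to look valid.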

Advantages and disadvantages of this design

Reference counting buys correctness and composability, but at a cost.

Disadvantage #1: you must balance every reference. Each value_create, value_retain, and each operation's implicit retain of its operands has to be matched by a value_release. Forget one and you leak; do one too many and you free memory that is still in use. The training examples in examples/ are verbose precisely because they are scrupulous about this in their error paths; that verbosity is the price of leak-free C.

Disadvantage #2: operations co-own their operands, which can surprise you. After Value *c = value_add(a, b), the nodes a and b are kept alive by c even if you release your own references to them. This is what makes recursive release work, but it means you cannot reason about a single Value's lifetime in isolation; you have to think about the whole graph.

Advantage #1: graphs free themselves. Release the root and the entire subgraph that nothing else references disappears, in one call. There is no graph walk to write, no bookkeeping of which intermediates to free.

Advantage #2: sharing is free and correct. A weight used in ten thousand multiplications is just retained ten thousand times; it is freed neither too early nor too late. The same property is what lets value_backward accumulate gradients correctly across shared nodes.

The neural network implementation

On top of the autograd engine, microcrad provides the three pieces you need for a feed-forward network. Each is a thin structure whose parameters are Values, so a forward pass automatically builds a computation graph you can backpropagate through.

Neuron *neuron_create(uint32_t nin);
Value  *neuron_forward(Neuron *n, Value **x);

Layer  *layer_create(uint32_t nin, uint32_t nout);
Value **layer_forward(Layer *l, Value **x);

MLP    *mlp_create(uint32_t nin, uint32_t *nouts, uint32_t n_layers);
Value **mlp_forward(MLP *mlp, Value **x);

These forward functions assume the caller passes arrays of the correct length: neuron_forward expects n->nin inputs, layer_forward expects the width used to build the layer, and mlp_forward expects the width of the model's first layer.

A Neuron holds nin weight Values and a bias, all initialized to small random numbers. Its forward pass computes relu(w·x + b) and returns the single output Value. A Layer is an array of nout neurons sharing the same input, and its forward pass returns an array of nout output Values. An MLP chains several layers, feeding each layer's outputs into the next.
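Stripped of the graph bookkeeping, the arithmetic of a neuron and a layer looks like the sketch below. It works on plain doubles, so nothing here can be backpropagated through, and relu_scalar, neuron_output, layer_output and demo_layer_sum are hypothetical names chosen to avoid clashing with the real neuron_forward and layer_forward. The row-major weight layout is likewise an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

double relu_scalar(double v)
{
    return v > 0.0 ? v : 0.0;
}

/* relu(w . x + b) for one neuron with nin inputs. */
double neuron_output(const double *w, const double *x, size_t nin, double b)
{
    assert(w != NULL && x != NULL);
    double acc = b;
    for (size_t i = 0; i < nin; i++)
        acc += w[i] * x[i];
    return relu_scalar(acc);
}

/* nout neurons sharing the same input x; w holds nout rows of nin weights. */
void layer_output(const double *w, const double *b, const double *x,
                  size_t nin, size_t nout, double *out)
{
    assert(b != NULL && out != NULL);
    for (size_t j = 0; j < nout; j++)
        out[j] = neuron_output(w + j * nin, x, nin, b[j]);
}

/* Two neurons on the input (2, 1): one stays active, one is clipped to 0. */
double demo_layer_sum(void)
{
    const double w[] = {  0.5, -1.0,     /* neuron 0 */
                         -1.0, -1.0 };   /* neuron 1 */
    const double b[] = { 0.25, 0.25 };
    const double x[] = { 2.0, 1.0 };
    double out[2];
    layer_output(w, b, x, 2, 2, out);
    return out[0] + out[1];
}
```

Neuron 0 computes 0.25 + 1.0 - 1.0 = 0.25 and passes it through unchanged, while neuron 1 computes a negative pre-activation and outputs 0.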

Note that every neuron applies a ReLU, including those in the output layer. This keeps the engine minimal but it shapes what the network can represent (its outputs are always non-negative), which is why the toy example targets a function that is itself non-negative. It is a deliberate simplification, not an oversight.

To train, you need a flat list of every weight and bias in the network so you can zero gradients and apply the update in a single loop. Each level exposes one:

Vector *neuron_parameters(Neuron *n);
Vector *layer_parameters(Layer *l);
Vector *mlp_parameters(MLP *mlp);

mlp_parameters returns a Vector containing every trainable scalar in the network. This is the list you iterate over to do gradient descent, as shown in the backpropagation section above.

Putting it together

Here is the shape of a full training step, the same shape both examples use:

uint32_t nouts[] = {8, 1};
MLP *model = mlp_create(2, nouts, 2);     /* a 2 -> 8 -> 1 network      */
Vector *params = mlp_parameters(model);   /* flat list of all weights   */

/* forward: build the graph */
Value *inputs[] = { value_create_leaf(x1), value_create_leaf(x2) };
Value **out = mlp_forward(model, inputs);

/* ... build a loss Value from out[...] ... */

/* backward + update */
for (size_t i = 0; i < params->size; i++) vector_get(params, i)->grad = 0.0;
value_backward(loss);
for (size_t i = 0; i < params->size; i++) {
    Value *p = vector_get(params, i);
    p->data -= learning_rate * p->grad;
}

/* cleanup */
value_release(loss);
/* ... release out, inputs ... */
vector_free(params);
mlp_free(model);

The two examples in examples/ flesh this out with concrete training loops and data loading. Read train_on_toy_regression.c first: it is the smallest complete program in the repository that creates a model, builds a graph, backpropagates, updates parameters, and runs inference.

Supporting data structures

The engine relies on two small, self-contained data structures. You normally do not interact with them directly, but they are worth knowing about.

  • Vector (vector.h) is a dynamically growing array of Value pointers. It grows in fixed-size blocks, and it participates in reference counting: vector_append retains the Value it stores and vector_free releases every Value it holds. The parameter lists returned by *_parameters are Vectors.

  • SimpleSet (simpleset.h) is a minimal set keyed on pointer identity (a Value's memory address). It supports only insertion and membership tests, which is exactly what the topological sort in value_backward needs to avoid visiting a shared node twice.
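A set keyed on pointer identity with only insertion and membership can be very small. The version below is a guess at the general shape, not the contents of simpleset.h: PtrSet, ptrset_insert, ptrset_contains and demo_identity are hypothetical, the capacity is fixed, and lookup is a linear scan, which is adequate for tiny graphs but slow for large ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { PTRSET_CAPACITY = 128 };

typedef struct PtrSet {
    const void *items[PTRSET_CAPACITY];
    size_t size;
} PtrSet;

/* Membership is decided by address, never by the pointed-to contents. */
bool ptrset_contains(const PtrSet *s, const void *p)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->size; i++)
        if (s->items[i] == p)
            return true;
    return false;
}

/* Returns true if p was newly inserted, false if present or the set is full. */
bool ptrset_insert(PtrSet *s, const void *p)
{
    assert(s != NULL);
    if (ptrset_contains(s, p) || s->size == PTRSET_CAPACITY)
        return false;
    s->items[s->size++] = p;
    return true;
}

/* Two equal-valued doubles are distinct members; a repeat insert is rejected. */
int demo_identity(void)
{
    double a = 1.0, b = 1.0;
    PtrSet s = { { NULL }, 0 };
    bool first = ptrset_insert(&s, &a);
    bool second = ptrset_insert(&s, &b);    /* same value, different address */
    bool repeat = ptrset_insert(&s, &a);    /* same address: rejected        */
    assert(first && second && !repeat);
    return (int)s.size;
}
```

Keying on the address is what a topological sort needs: two nodes holding the same number are still different nodes, while one node reached along two paths must be visited only once.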

Building and testing

microcrad has no dependencies beyond a C compiler, the C standard library, and libm for the math functions. Everything is driven by the Makefile.

To build and run the full test suite:

The test/ directory contains one standalone suite per component (test_value, test_vector, test_set, test_neuron, test_layer, and test_mlp), and you can build and run any one of them on its own:

make test_value
make test_mlp

To build and run the examples:

make example_toy_regression   # tiny synthetic regression, no external data
make example_mnist            # downloads MNIST, then runs a conceptual demo

example_mnist will fetch the MNIST IDX files first via examples/mnist/download_data.sh. The toy regression example needs no data and is the fastest way to see the whole pipeline run end to end; treat it as the primary supported example.

make clean removes the build directory.

Where to go next

  • Read examples/toy_regression/train_on_toy_regression.c for the smallest complete training program.
  • Read examples/mnist/train_on_mnist.c only as a structural demonstration of wiring the engine to a real dataset. It is not a practical training recipe: the engine is scalar, the model is ReLU-only, and the example intentionally prioritizes explicit code over optimization or numerically careful modeling.
  • Read test/ for compact, executable documentation of how each function is meant to be called and what it guarantees.
  • Read src/value.c itself: it is short, and the comments walk through the forward operations and the backward rules one case at a time.

Credits

microcrad is a C re-implementation of Andrej Karpathy's micrograd. The autograd design, the scalar Value abstraction, and the topological-sort backward pass all follow the original; the reference-counted memory management and the C data structures are what this port adds in order to make those ideas work without a garbage collector.