惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
G
GRAHAM CLULEY
P
Privacy & Cybersecurity Law Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
宝玉的分享
宝玉的分享
P
Proofpoint News Feed
H
Help Net Security
V
Visual Studio Blog
阮一峰的网络日志
阮一峰的网络日志
C
Cisco Blogs
人人都是产品经理
人人都是产品经理
Know Your Adversary
Know Your Adversary
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
Recorded Future
Recorded Future
I
Intezer
罗磊的独立博客
T
The Exploit Database - CXSecurity.com
Blog — PlanetScale
Blog — PlanetScale
Malwarebytes
Malwarebytes
Spread Privacy
Spread Privacy
T
Tor Project blog
V
Vulnerabilities – Threatpost
云风的 BLOG
云风的 BLOG
腾讯CDC
B
Blog RSS Feed
Stack Overflow Blog
Stack Overflow Blog
F
Future of Privacy Forum
MyScale Blog
MyScale Blog
Latest news
Latest news
IT之家
IT之家
MongoDB | Blog
MongoDB | Blog
The Hacker News
The Hacker News
S
Securelist
博客园 - 【当耐特】
C
CXSECURITY Database RSS Feed - CXSecurity.com
T
Threat Research - Cisco Blogs
Jina AI
Jina AI
Cisco Talos Blog
Cisco Talos Blog
B
Blog
博客园 - 三生石上(FineUI控件)
Last Week in AI
Last Week in AI
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
M
MIT News - Artificial intelligence
V
V2EX
D
Darknet – Hacking Tools, Hacker News & Cyber Security
The Cloudflare Blog
The GitHub Blog
The GitHub Blog
博客园 - 聂微东
F
Full Disclosure
C
CERT Recently Published Vulnerability Notes

DEV Community

AI Memory Governance for Legal Tech: How Contract AI Agents Handle Privileged Data Join the GitHub Finish-Up-A-Thon Challenge: $3,000 Prize Pool! I Replaced a $50/Month OCR API with Gemma 4’s Native Vision (And You Can Too) Building a Data-Driven Medical Image Enhancement Pipeline with Differential Evolution 🔥🩻 Why I Like Small Software Beyond the Model: Why the Gemini Ecosystem and Google AI Studio Are Redefining Enterprise AI Architecture in 2026 Complete set of Claude Skills for Solo Developer I read 50 years of network science, then built a CRM that runs entirely in the browser The New AI Workflow Is Not “More Agents” How to Make Large Time-Series Charts Smooth in Vue.js + ApexCharts (and fix Zoom & Scroll behavior issues) I Built a Cross-Platform Port Intelligence Tool to Stop Accidental Process Kills During Local Dev AI is heading toward a wall, and most people still don’t see it... Python String Methods Explained Simply (Common Operations) Why We Built a Zero-Knowledge Clipboard Manager for Developers (And Dropped Native Mobile Apps) Add Your Own Component to Bombie in 5 Edits Why Your OSS Advocacy Strategy Probably Doesn't Fit Building an MCP server for a Swiss hosting provider (and what reverse-engineering its manager taught me) Does MCP Still Matter in the AI Ecosystem? Building a Smart LRU Cache in Java: When Machines Mimic Human Memory 🧠💻 A Beginner’s Guide to Redux in React Build a Real-Time Excalidraw-like Collaborative Canvas using Velt MCP and Antigravity🎉 Using Reddit to Validate SaaS Ideas Before Building How We Built an AI That Evolves Alongside a Creator Through Memory Building a Self-Hosted AI WhatsApp Agent for Structured Invoice Extraction Three Design Decisions That Shaped the Enterprise RAG Retrieval Pipeline How React's Virtual DOM Works Under the Hood Build a Dropbox Paper-Style Collaborative Editor with Next.js and Velt💥 Holy Typos, Batman! How I Built 'SpellJump' How to Test Frontend Error States Without Breaking Your Backend A .NET Dinosaur in Web3. Day 8 — Reading & Writing — WishList Chain Building AI Digital Employees with Markus: An Open-Source Platform for Agent Teams [Boost] The Auditor — High-Reasoning Synthesis and the Ethics of Governance Building 'Offline Brain': How I Wrote My First Custom Agent Skill for Android (Google I/O 2026) 📱🧠 Building a Superhuman-Style Collaborative Email Editor with Next.js and Velt🔥 I Built an On-Chain Marketplace Where AI Agents Solve GitHub Bounties for USDC Three Stripe subscription patterns I locked in before going live (with code) Six Ways AI Agents Communicate in 2026. I Benchmarked All of Them. Building AI Digital Employees with Markus: An Open-Source AI Workforce Platform I built a tool that detects broken security headers, missing robots.txt, and WP_DEBUG=true — then opens a PR to fix them automatically NIST Just Exposed the Age Estimation Number Vendors Don't Want You to See Authentication Looks Easy - Until You Build It for Real Users I Built a Free Stock Market Game You Can Play Right Now — No Login, No Download GitHub Agentic Workflows: Building Self-Healing CI for .NET Building a No-Code AI Agent for WooCommerce Order Analytics with Flowise & HPOS Your AI Coding Agent Has Been Flying Blind. Google I/O 2026 Just Fixed That I built a CLI that eliminates README reading forever Measuring AI Gateway Failover: 30 Days of Production Data The Folly of Global AI Platforms: Or How We Built a System That Actually Works in Cameroon Week 9 The 10-Minute Race: Scaling the "Cancel Order" Button to 100K+ Requests Per Second SQL Performance: Indexing, Query Tuning & Explain Plans (Developer Guide) Tutorial: This AI Now Tells You if a Meeting Could Be an Email Why I Got Tired of Class-Heavy UI Code and Started Building Around Attributes GitHub Is No Longer a Place for Serious Work Build an AI-Powered Developer Portal with Backstage and .NET Updates to developer experience on Setapp Node.Js Express CRUD template Lint Your Phishing Templates Like You Lint Your Code From Code to Cloud: 3 Labs for Deploying Your AI Agent I built Voice2Sub: a local AI subtitle generator for video and audio The OCR Rabbit Hole Built a 100k-Document RAG System by Hand. Hermes Read the Architecture in 47 Seconds. I tried monetizing my MCP server with x402 — production needs more than npm install Understanding Tracking Dimensions in Accounting Integrations I Ran My Local, NOT AI, AI Code Auditor on Its Own Source Code Agent Surface Map: Gemma 4 review before you install an MCP Stop Being Nice, Start Being Right": The Day My User Reconfigured My Reward Function Building a Database Performance Testing Tool With AI: The Honest Breakdown Hot To Run LLMs Locally Research blockchain with post-quantum Dilithium and custom zk-STARKs from scratch AI agents do not just need tool access. They need execution control. The CTO’s Blueprint for Governing Multi-Agent AI Systems in the Enterprise I audited our CMS and 86% of our articles were invisible. A Sanity gotcha. Upselling Explained Industry-Specific Tactics for EC Owners 2026 I Keep Hermes Agent's Self-Improvement OFF For the First 14 Days — Here's What Happens When I Don't I Built the Hermes + Claude Code Dual-Stack: Orchestrator Meets Coder — Here's the Full Architecture Stop Using .iterrows(). Here's What Actually Fast Looks Like I Built a SaaS to Stop the Awkward "Hey, Did You Get My Invoice?" Conversation I Renamed a Hot Postgres Table Without Dropping a Request How to Build a Self-Hosted AI Gateway With LiteLLM and Open WebUI What is a Webhook? A Complete Guide for Beginners Headless BI: How a Universal Semantic Layer Replaces Tool-Specific Models Beyond Translation: A Developer's Guide to App Localization (i18n & l10n) Aegis: Designing an Offline Ambient Co-Working Companion for High-Burnout Medical and STEM Grinds Local LLM Code Completion Showdown: Zed AI vs Continue vs Cursor (Honest 2026 Review) The Agentic Payment Protocol Wars Your No-Code AI Agent Has a Memory Problem The Agentic Payment Protocol Wars How to Bypass LinkedIn Commercial Use Limit in 2026 (Without Paying $150/mo) We built a statechart hosting platform where two actors in the same state can migrate to different versions — here's why that matters Playwright vs TWD: A Frontend Developer's Honest Comparison Claude Code's skillListingBudgetFraction: The Undocumented Setting Silently Killing Half Your Skills O GitHub pode mudar sua carreira mais do que você imagina Just redesigned and launched my developer portfolio 🚀 Would genuinely love some honest feedback from the dev community 👨‍💻 Data Virtualization and the Semantic Layer: Query Without Copying Launching opub: donated compute for open-source maintainers Four iteration rounds on a security scanner I run, all of them visible. Here is what the loop actually looks like. Why Good Abstractions Make Debugging Harder Found a Coordinated Inauthentic Network on GitHub: 24 Accounts, Fabricated History, and a Generator That Left Its PID in Three READMEs
Two tables, zero migrations, full LINQ — a .NET data engine that's been running our production for 3 months
rinat kozin · 2026-05-22 · via DEV Community

Today I'd like to slow down a bit and talk about redb.Core — the data engine at the heart of the RedBase ecosystem. The other pieces (redb.Route for pipelines, redb.Tsak for cluster runtime) lean on it, but this post is just about the database part.

I've been working on this project for several years. It started as an attempt to get rid of migrations and turned into what it is now — a typed object store for .NET over PostgreSQL and MSSQL.

It's not a weekend prototype. The free packages on NuGet are at version 2.0, there are 43 packages across the ecosystem, the architecture went through three rewrites, and as of this week it's been running 3 months on production at a 30-year national food distributor (more on that below).

This post is a technical walkthrough of redb.Core — what it is, how it differs from EF Core, what the generated SQL actually looks like, what the production workload looks like, and what's shipping next.


What redb.Core actually is

github.com/redbase-app/redb — Apache 2.0

RedBase stores typed C# objects in two tables (_objects + _values) over PostgreSQL or Microsoft SQL Server. Not JSON blobs. Not JSONB. Real typed columns with FK constraints — NUMERIC(38,18) for money, timestamptz for dates, uuid for GUIDs. Real B-tree indexes. ACID transactions.

The schema is your C# class:

[RedbScheme("Employee")]
public class EmployeeProps
{
    public string FirstName   { get; set; } = "";
    public string LastName    { get; set; } = "";
    public int    Age         { get; set; }
    public decimal Salary     { get; set; }
    public DateTime HireDate  { get; set; }
    public string[]? Skills   { get; set; }
    public Address? HomeAddress { get; set; }
    public Dictionary<int, decimal>? BonusByYear { get; set; }
}

Enter fullscreen mode Exit fullscreen mode

That attribute is the entire schema definition. Call SyncSchemeAsync<EmployeeProps>() once — done. Add a property next sprint — redeploy, call sync again. Old objects still work. No migration files. No DBA ticket. No 2am rollback story.

// Save — entire object graph, one call
await redb.SaveAsync(employee);

// Load — full graph, arrays, dicts, nested classes — all materialized
var e = await redb.LoadAsync<EmployeeProps>(id);

// Query — real LINQ, real SQL
var seniors = await redb.Query<EmployeeProps>()
    .Where(e => e.Salary > 100_000 && e.Age >= 35)
    .OrderByDescending(e => e.Salary)
    .Take(50)
    .ToListAsync();

Enter fullscreen mode Exit fullscreen mode

No DbContext. No Include chains. No Add-Migration. No mapper layer.


Object graphs in one call

This is the part that surprises people coming from EF. Props can contain other Props — single references, arrays, dictionaries — and the entire graph saves and loads as one operation:

[RedbScheme("Order")]
public class OrderProps
{
    public Customer Customer { get; set; }                          // nested class
    public Address ShippingAddress { get; set; }                    // nested class
    public Product[] Products { get; set; }                         // array of classes
    public RedbObject<PaymentProps>[] Payments { get; set; }        // array of full objects with own IDs
    public Dictionary<string, RedbObject<CouponProps>> Coupons { get; set; }  // dict of objects
    public Dictionary<(int Year, string Quarter), string> Reviews { get; set; } // tuple-key dict
}

await redb.SaveAsync(order);   // entire graph persisted, FK ordering handled
var loaded = await redb.LoadAsync<OrderProps>(id);
// loaded.Props.Customer.Address — ready
// loaded.Props.Payments[0].Props — ready (full RedbObject with own Id, DateCreate, etc.)
// loaded.Props.Coupons["SUMMER20"].Props — ready

Enter fullscreen mode Exit fullscreen mode

In EF Core this would be 28 tables, ~40 Include/ThenInclude calls, manual junction tables for the many-to-many, and INSERT ordering that breaks every time someone adds a non-nullable FK without a default.

In RedBase: one SaveAsync, one LoadAsync. The nested RedbObject instances are real first-class objects — they have their own IDs, their own timestamps, they can be queried independently, they participate in tree structures. They are not denormalised JSON glued to the parent.


What the LINQ actually compiles to

Where(e => e.Salary > 100_000 && e.Age >= 35) doesn't get serialized to JSON and re-parsed (that's the Free engine's path — covered later). In Pro, the C# expression tree is walked node-by-node and emitted as parameterized SQL. Roughly:

WITH pvt AS (
  SELECT v._id_object,
         (array_agg(v._Numeric) FILTER (WHERE v._id_structure = $1))[1] AS "Salary",
         (array_agg(v._Long)    FILTER (WHERE v._id_structure = $2))[1] AS "Age"
    FROM _values v
   WHERE v._id_structure = ANY($3::bigint[])
     AND v._id_object IN (SELECT o._id FROM _objects o WHERE o._id_scheme = $4)
   GROUP BY v._id_object
)
SELECT o.*
  FROM _objects o
  JOIN pvt ON pvt._id_object = o._id
 WHERE pvt."Salary" > $5
   AND pvt."Age"    >= $6
 ORDER BY pvt."Salary" DESC
 LIMIT 50;

Enter fullscreen mode Exit fullscreen mode

Parameterized. Plan-cached by PostgreSQL. One index scan on (_id_structure, _id_object), one aggregation pass, B-tree filter on flat columns. The number of filter fields doesn't change the shape of the query.

The C# → SQL compiler handles arithmetic (*, +, %), Math.*, String.Contains/StartsWith/Trim/ToLower, DateTime.Year/Month/..., nullable navigation (x.Address?.City) compiled to IS NOT NULL, the ternary operator compiled to CASE WHEN, StringComparison.OrdinalIgnoreCase compiled to native ILIKE, dictionary access dict["key"] compiled to pivot columns, and a few more edge cases that EF Core itself doesn't always handle.

You can preview the SQL of any query without executing it:

var sql = await query.ToSqlStringAsync();

Enter fullscreen mode Exit fullscreen mode

Like IQueryable.ToQueryString() in EF, but works for trees, GroupBy, window functions too.


Why bulk save is so fast — two tables, two streams

Something to understand before the change-tracking section: the storage layout is two tables, so SaveAsync of a batch is two bulk operations, not N round-trips.

On PostgreSQL the provider uses Npgsql's BeginBinaryImportAsync — native COPY protocol, binary format:

// from redb.Postgres/Data/NpgsqlBulkOperations.cs
await using var writer = await conn.BeginBinaryImportAsync(
    "COPY _objects (_id, _id_parent, _id_scheme, _name, ...) FROM STDIN (FORMAT BINARY)");
foreach (var obj in objectsList) {
    await writer.StartRowAsync();
    await writer.WriteAsync(obj.Id,        NpgsqlDbType.Bigint);
    await writer.WriteAsync(obj.IdScheme,  NpgsqlDbType.Bigint);
    // ... typed writes
}
await writer.CompleteAsync();

Enter fullscreen mode Exit fullscreen mode

For a batch save: one COPY stream writes the _objects rows, another writes the _values rows. Two streams, no per-row round-trips, no string-formatted INSERTs. MSSQL uses SqlBulkCopy for the same role.

This is why 1000 routes × ~40 fields = ~40,000 value rows save in tens of milliseconds inside the 200–300 ms budget. The bottleneck is the network round-trip and the COPY write, not the ORM machinery — there isn't any ORM machinery in the hot path.


Change tracking without DbContext (Pro)

DbContext keeps an in-memory snapshot of every entity you load — that's how it knows what changed. It's also why it isn't thread-safe and why the cache dies with the request.

RedBase Pro takes a different approach. With PropsSaveStrategy.ChangeTracking (Pro only; the free tier uses DeleteInsert), SaveAsync does this on a batch:

  1. One bulk SELECT of existing values for all object IDs being saved.
  2. Build two ValueTreeNode trees — one from your in-memory objects, one from the DB state.
  3. Structural diff — subtrees with matching hashes are skipped entirely (no value comparison, no child traversal). Inserts, updates, and deletes are computed per node.
  4. Three bulk operationsBulkInsertValuesAsync, BulkUpdateValuesAsync, BulkDeleteValuesAsync — each a single round-trip. Inserts go through COPY BINARY again.

Net effect: changing one field in a deeply nested object emits one UPDATE, not a full delete-and-reinsert of the entire props graph. And the comparison happens in C# on the application side — no DbContext lifetime to worry about, safe to run from a background Channel consumer or a Parallel.ForEach.

(In the production code from the previous section, the application also does its own obj.ComputeHash() check at the route level. That part is optional business-code — you could call SaveAsync on every route and the Pro tree-diff would still skip unchanged values internally. It's there as a coarse pre-filter so unchanged routes don't even enter the save pipeline at all, saving the diff work too.)


Production deployment — the numbers

The biggest deployment right now:

  • 30-year national HoReCa food distributor
  • ~150,000 orders/month, ~20,000 B2B customers, 600+ cities
  • 3-node cluster, 4 cores / 8 GB RAM / 50 GB SSD per node
  • ~550 daily internal users (operators, drivers, supervisors, dispatch, back-office)
  • 10–15% CPU under full load
  • Integrations: SAP, Kafka, RabbitMQ, GPS feeds, Mercury / EGAIS / government APIs

Three months in production, no data-layer incidents. Two projects in the company use it, the second one came after the first one proved stable.

Real workload, real timings

The hottest pipeline is the SAP monitoring sync. Every 60 seconds a SQL polling consumer calls a stored procedure on SAP S/4 (usp_TsUM_MonitoringReport_xml), gets ~1000 transportation orders back as XML, syncs reference dictionaries (drivers, vehicles, list items), bulk-loads existing routes from RedBase, hash-compares each one, and saves only the changed ones.

The whole loop fits in ~200–300 ms for ~1000 routes. The actual production code:

From("direct://tsum")
    .RouteId("tsum-processing")
    .ProcessWithRedb(async (redb, exchange, ct) =>
    {
        var orders = (List<TransportationOrder>)exchange.In.Body;

        var sw = Stopwatch.StartNew();

        // 1. Sync dictionaries (Drivers, Vehicles, ListItems) — only new/changed
        var dicts = await DictionarySyncService.SyncFromOrdersAsync(redb, orders, ct);
        if (dicts.DriversNew + dicts.DriversChanged + dicts.VehiclesNew
            + dicts.VehiclesChanged + dicts.ListItemsRelinked > 0)
            await RefDataCache.RefreshAsync(redb);
        var syncMs = sw.ElapsedMilliseconds;

        // 2. Bulk load existing routes by Code — one query, ~1000 codes
        var codes = orders.Select(o => o.Code).ToList();
        var existing = await redb.Query<TransportationRoute>()
            .WhereRedb(o => codes.Contains(o.ValueString!))
            .ToListAsync();
        var existingByCode = existing.ToDictionary(o => o.ValueString!, o => o);

        // 3. Merge: update if hash changed, insert if new, skip if unchanged
        var toSave = new List<IRedbObject>();
        var updatedCount = 0;
        var skippedCount = 0;
        foreach (var order in orders)
        {
            var routeProps = MapOrderToRouteProps(order, dicts);
            if (existingByCode.TryGetValue(order.Code, out var obj))
            {
                var hashBefore = obj.ComputeHash();
                EnrichRouteFromOrder(obj.Props!, routeProps);
                if (obj.ComputeHash() != hashBefore) { toSave.Add(obj); updatedCount++; }
                else skippedCount++;
            }
            else
            {
                toSave.Add(new RedbObject<TransportationRoute>
                {
                    name = $"Route {order.Code}",
                    value_string = order.Code,
                    Props = routeProps
                });
            }
        }

        // 4. One batched save — mixed inserts + updates
        if (toSave.Count > 0)
            await redb.SaveAsync(toSave);

        Logger.LogInformation(
            "[TSUM] orders={Orders} routes(+{Created} ~{Updated} ={Skipped}) " +
            "drivers(+{DN} ~{DC}) vehicles(+{VN} ~{VC}) " +
            "sync={Sync} query={Query} save={Save} total={Total}ms",
            orders.Count, toSave.Count - updatedCount, updatedCount, skippedCount,
            dicts.DriversNew, dicts.DriversChanged,
            dicts.VehiclesNew, dicts.VehiclesChanged,
            syncMs, queryMs, saveMs, sw.ElapsedMilliseconds);
    });

Enter fullscreen mode Exit fullscreen mode

TransportationRoute has ~40 fields and 12 RedbListItem references (Driver, Vehicle, CarMark, ShippingPoint, BusinessType, PlaceTo, PlaceFrom, LoadingZone, TransportStatus, Risk, DeliveryStatus, LoadStatus) plus 2 object references to AD users. Every single one is a foreign key in the database. None of them require a JOIN at query time — the materializer handles it.

Smaller queries (point-lookup of a single route, dictionary fetch, REST endpoints for the UI) run in 50–100 ms including HTTP overhead.

What's actually in the database

After running this in production, the storage looks like this:

  • ~1000 transportation routes/day, plus delivery points, transport snapshots, garage states, slice settings, slice snapshots, drivers, vehicles, yard places, AD user refs — about a dozen [RedbScheme] classes in active use
  • ~500,000 objects in _objects (routes, points, snapshots, dictionary items)
  • ~15,000,000 rows in _values (every typed property of every object)
  • All of that lives in 2 tables_objects and _values — plus the system tables for schemes, structures, lists, users, permissions

15 million typed value rows. Two tables. No 30-table schema. No 200 migration files. The query times above are on this dataset.

If you tried to model the same domain in EF Core flat tables, you'd end up with roughly: Routes, RoutePoints, TransportSnapshots, GarageStates, SliceSettings, SliceSnapshots, Drivers, Vehicles, CarMarks, ShippingPoints, BusinessTypes, YardPlaces, LoadingZones, TransportStatuses, DeliveryStatuses, Risks, TripRisks, AdUsers — plus junction tables and audit tables for each. 30+ tables, dozens of migrations, and every schema change is a deploy event.

In RedBase the schema change is git push. The next InitializeAsync() call adds the new structure rows. Done.

What would EF Core look like here?

I asked myself the same question before starting. Let's count what EF would need:

  • TransportationRoute entity → 1 table
  • 12 ListItem references → 12 lookup tables + 12 nullable FKs
  • 2 AD user references → 2 more FKs
  • The 1000-route batch update → 1000 entities tracked in DbContext for change detection
  • Hash-based skip of unchanged objects → doesn't exist in EF out of the box; you write it manually
  • The same logic across multiple processes/routes → DbContext is per-request, the change-tracker cache dies at the end of every batch

The realistic EF flow looks like this:

// Load existing — needs Include for every lookup or you get N+1
var existing = await db.Routes
    .Include(r => r.Driver)
    .Include(r => r.Vehicle)
    .Include(r => r.CarMark)
    .Include(r => r.ShippingPoint)
    .Include(r => r.BusinessType)
    .Include(r => r.PlaceTo)
    .Include(r => r.PlaceFrom)
    .Include(r => r.LoadingZone)
    .Include(r => r.TransportStatus)
    .Include(r => r.Risk)
    .Include(r => r.DeliveryStatus)
    .Include(r => r.LoadStatus)
    .Include(r => r.RiskSetBy)
    .Include(r => r.KcResponsible)
    .Where(r => codes.Contains(r.Code))
    .ToListAsync();

// Map, mutate, SaveChanges — 1000 tracked entities, full diff snapshot in RAM
// SaveChanges fires N UPDATE statements (one per row) in a transaction
await db.SaveChangesAsync();

Enter fullscreen mode Exit fullscreen mode

In practice on a workload like this:

  • The Include chain on 14 lookups produces a query that PostgreSQL/MSSQL can't always optimise cleanly (cartesian explosion risk; you split-query and pay multiple round trips).
  • SaveChanges on 1000 modified entities emits 1000 individual UPDATEs unless you reach for EFCore.BulkExtensions or similar third-party libraries. RedBase ships COPY BINARY for inserts and a batched UPDATE path for updates in the box.
  • The DbContext snapshot of 1000 tracked entities, each with 14 included lookups, is real memory pressure. AsNoTracking() is faster but you lose change detection — you have to re-implement it.
  • The 200–300 ms budget on a 4-core container is not realistic in this scenario without significant manual optimisation and probably a separate bulk-update path.

I'm not saying EF can't do it. I'm saying the equivalent EF implementation is more code, more moving parts, and significantly slower without third-party packages. RedBase ships a single SaveAsync(toSave) and a hash-based skip primitive in the box.

The hash comparison itself (obj.ComputeHash() != hashBefore) is application-level business code — it's optional; the Pro tree-diff inside SaveAsync would skip unchanged values anyway. But used like this it lets the application skip the ~95% of routes that didn't change between SAP polls before they even enter the save pipeline. Combined with COPY BINARY into two tables and the Pro tree-diff for the routes that do change, the whole loop fits the 200–300 ms budget. No equivalent set of built-in primitives exists in EF Core; you either snapshot manually, re-save everything, or pull in third-party bulk packages.


What EF Core does with complex objects (vs what RedBase does)

The E000 benchmark uses EmployeeProps — a realistic model with nested classes, arrays, dictionaries, and RedbObject references:

What classic ORM needs RedBase
~28 tables 2 tables
FK ordering on every INSERT Single SaveAsync
40+ Include/ThenInclude calls LoadAsync<T>(id) — one line
Migration file for every new field Add property, call SyncSchemeAsync, done
~5,000 INSERTs for 100 employees 1 BulkInsert via COPY protocol

For 100 complex employees: EF Core ≈ 4,000–6,000 INSERTs across 28 tables in FK order. RedBase: ~3,000 typed value rows, one COPY command.

And the test results are published — 525 automated tests across all editions and both databases. All green.


The query engine: free vs Pro

This is where it gets technically interesting.

Free edition compiles your LINQ lambda to a JSON facet format, then calls a plpgsql stored procedure that parses that JSON and generates SQL dynamically. For simple 1–2 field filters, PostgreSQL optimizes correlated EXISTS subqueries well. It works.

Pro edition walks the C# expression tree directly in C# (ExpressionToSqlCompiler), emits native parameterized SQL. Same plan every call — PostgreSQL can cache it. 3–10× faster on complex multi-field filters.

But here's what's happening right now: I'm porting the Pro PVT CTE query engine into the free tier. The plan is full parity on PVT — same CTE shape, same expression coverage, same projection capabilities. Free and Pro will speak the same query dialect.

The remaining differences between Free and Pro will be elsewhere:

  • Query plan caching. Pro builds parameterized SQL in C# (ExpressionToSqlCompiler) — same shape every call, PostgreSQL caches the plan once and reuses it. The free tier generates SQL inside plpgsql with literal values inlined (properly escaped to prevent injection), so each call can produce a slightly different plan. Same correctness, different plan-cache behavior.
  • Materialization. Pro has a parallel materializer (multiple values streams hydrated concurrently into the object graph). Free uses a simpler sequential path.
  • Save strategy. PropsSaveStrategy.ChangeTracking (tree-diff with hash-based subtree skip) is Pro-only. Free uses DeleteInsert — still bulk via COPY, but no per-field diff.

So Pro stays faster on hot paths and large object graphs, but the query language itself — what you can write in .Where, .Select, projections, grouping, having, FTS, regex — will be the same on both sides.


Show me the SQL it generates

Here's a real example — a projection query on EmployeeProps with 16 computed columns:
full names, salary calculations, date extractions, seniority classification via CASE, coalesce for nullable fields, explicit type cast. This is the free tier:

WITH _pvt_cte AS (
    SELECT
        v._id_object,
        (array_agg(v._Long)    FILTER (WHERE v._id_structure = 1000017))[1] AS "Age",
        (array_agg(v._Numeric) FILTER (WHERE v._id_structure = 1000020))[1] AS "Salary",
        (array_agg(v._DateTimeOffset) FILTER (WHERE v._id_structure = 1000018))[1] AS "HireDate",
        (array_agg(v._String)  FILTER (WHERE v._id_structure = 1000016))[1] AS "LastName",
        (array_agg(v._String)  FILTER (WHERE v._id_structure = 1000015))[1] AS "FirstName"
    FROM _values v
    WHERE v._id_structure = ANY(ARRAY[1000015,1000016,1000017,1000018,1000020]::bigint[])
      AND v._id_object IN (SELECT o._id FROM _objects o WHERE o._id_scheme = 1000014)
    GROUP BY v._id_object
)
SELECT
    o._id                                                          AS id,
    ("FirstName" || ' ' || "LastName")                            AS full_name,
    UPPER("FirstName")                                             AS upper_name,
    LENGTH("FirstName")                                            AS name_len,
    ("Salary" * 12)                                                AS yearly_x12,
    (("Salary" * 0.15) - ("Salary" * 0.013))                      AS bonus_after_tax,
    ("Salary" / ("Age" + 1))                                       AS avg_per_year,
    ("Age" % 10)                                                    AS age_mod10,
    ABS(("Age" - 35))                                              AS abs_diff,
    FLOOR(("Salary" / 2))                                          AS floor_half_salary,
    EXTRACT(YEAR  FROM "HireDate")                                 AS hire_year,
    EXTRACT(MONTH FROM "HireDate")                                 AS hire_month,
    EXTRACT(YEAR  FROM AGE("HireDate", '2026-05-20'))              AS tenure_years,
    CASE
        WHEN ("Age" >= 60) THEN 'senior'
        WHEN ("Age" >= 35) THEN 'mid'
        ELSE 'junior'
    END                                                            AS seniority,
    COALESCE("FirstName", '<unknown>')                             AS display_name,
    ("Age")::text                                                  AS age_as_text
FROM _pvt_cte
JOIN _objects o ON o._id = _pvt_cte._id_object
WHERE "Salary" > '0'::numeric
ORDER BY "Salary" DESC
LIMIT 3

Enter fullscreen mode Exit fullscreen mode

One _values scan. One GROUP BY. Sixteen computed output columns. Standard SQL — no magic, fully readable in pgAdmin, fully EXPLAIN ANALYZE-able.

This query was generated by pvt_build_projection_sql() — a plpgsql function that is shipping in the next free release.


When this fits and when it doesn't

Honest take.

Fits well: business apps with collections, nested structures, hierarchies, dictionaries, or schemas that change often. Apps where a business analyst can ask "add a field" and you want to ship the same day. Anything where you'd otherwise end up with 28 tables for one logical entity.

Doesn't fit: flat reporting databases with five fixed tables. Use Dapper there — it's honest and fast.

The argument I'd push back on is the common one — "for 80% of projects EF Core is enough". In my experience, the moment you have collections inside an entity, lookup data, dictionaries, tree structures, or polymorphic types, the migration tax compounds fast. RedBase removes that whole class of work. For the remaining 20% of cases — pure flat schemas — it's overkill, agreed.

And two more axes that usually get left out of that comparison:

  • Extensibility. Adding a field is a property in C# and one InitializeAsync() call — no migration script, no review, no deploy window. The same change in an EF stack is at minimum: model edit + Add-Migration + reviewed SQL + coordinated deploy. Multiply by every iteration in a year.
  • Cost of development. On the TsUM project the data-layer work — schemas, storage, queries, sync pipelines, change-tracking, bulk save — came in at roughly 128 person-hours. The estimate for the same scope on a classic EF + handwritten bulk-update + ASP.NET Identity + 30-table migration stack was on the order of 3000 person-hours. That's not a typo and it's not marketing — it's the gap between "add a property, redeploy" and "add a column, write migration, update DTOs, update mapper, update repository, write tests for all of it". The 80%/20% framing only holds if you count lines of code; once you count engineer-hours over a project lifetime it inverts.

What else is in the box

195+ working examples in the repo. The shortlist of things people usually don't expect to be built-in:

// Tree queries — recursive CTE underneath, no CTE to write yourself
var products = await redb.TreeQuery<Product>(londonHQ.Id, maxDepth: 10)
    .Where(p => p.InStock)
    .ToListAsync();

// Soft delete with atomic mark + background purge + progress in DB
await redb.SoftDeleteAsync(ids, user);

Enter fullscreen mode Exit fullscreen mode

Also shipping in the core (this is the short list — the real list is much longer):

  • GroupBy + window functions in one query
  • Built-in users / roles / permissions (no ASP.NET Identity dependency, no separate auth schema)
  • Export / import via .redb files (PostgreSQL ↔ MSSQL portability)
  • Multi-database in one process with full domain isolation — separate connections, separate scheme caches, separate object caches per domain; one app can talk to several independent RedBase databases at the same time without their metadata leaking into each other
  • Scheme cache and object cache (both per-domain), invalidated on SyncSchemeAsync / SaveAsync, used by the materializer to skip repeated metadata lookups
  • Atomic soft-delete with background purge and progress visible in the DB
  • Tree queries with recursive CTE
  • Polymorphic trees — different C# types at each level of the same tree

There's a lot more that doesn't fit in one section. The architecture page has the full map.


What's coming next

The FreePvtQuery release is the nearest milestone. This is a complete rewrite of the free tier's query engine based on the PVT CTE architecture — bringing it to full parity with Pro on the query side:

  • Single-pass _values scan (vs correlated EXISTS per field)
  • Full expression system: arithmetic, string, date, math, regex, FTS, CASE, coalesce, cast
  • Projections with arbitrary computed columns (like the SQL above)
  • HAVING, DISTINCT ON, GROUP BY extensions

After that: MSSQL support for the new engine, C# LINQ-to-JSON bridge so you keep writing .Where(e => e.Salary > 100_000) without touching JSON. Pro will keep its edge on plan caching (parameterized SQL), parallel materialization, and tree-diff change tracking.


This is maybe 10% of what's in there

This post covered the basics: storage model, query engine, free PVT demo, production deployment. Topics I didn't touch:

  • Polymorphic tree hierarchies (different C# types at every level, same tree API)
  • Object graphs with RedbObject<T> references inside Props
  • Data migrations (Pro): fluent API, computed columns, in-place type changes
  • Export/import: .redb files (JSONL/ZIP), PostgreSQL ↔ MSSQL portability
  • Multi-database in one process (domain-isolated caches)
  • redb.CLI — schema management from command line
  • redb.PropsEditor — runtime props editing UI
  • Integration with redb.Route (22-transport pipeline engine) — separate article

If there's a direction you want covered next — migrations, polymorphic trees, benchmarks, internals of the diff-tree change tracking — drop it in the comments and that's what the next post will be.


Links

There's a lot of material. If you want to go deep without reading three articles, paste these into your AI of choice and ask it to analyse the architecture:

Questions in the comments — I'll answer them directly.