Building WeaveLLM: Why .NET Deserves a Better LangChain
Harshil Shah · 2026-05-02 · via DEV Community

Tags: dotnet, ai, csharp, llm
Cover image: architecture diagram of WeaveLLM pipeline


Introduction

Here's a thing I keep running into: .NET developers building serious AI features, and the ecosystem basically telling them to just use Python. LangChain, LlamaIndex, DSPy — every major orchestration framework is Python-first. .NET is an afterthought, if it shows up at all.

But C# developers aren't waiting around. They're shipping customer support bots, RAG pipelines, code-review agents — right now, in production. They're just doing it by calling OpenAI's REST API by hand, copy-pasting retry logic into every service class, and hoping nothing throws in an async chain at 2 AM.

LangChain does have a .NET port. I've used it. It's incomplete, the types don't map cleanly to .NET idioms, and it leans on exceptions as its main error-handling strategy — which is genuinely painful to compose in async code. The deeper issue is that LangChain was designed around Python's dynamic type system. Porting it to C# without rethinking the API from scratch gives you a framework that fights the language the entire time.

So I built WeaveLLM instead. Started as a hobby project, turned into something I actually want to use at work. It's a .NET 8 AI orchestration library designed specifically for C# — railway-oriented results, IAsyncEnumerable<T> streaming, an ASP.NET-style middleware pipeline, and fully generic chains that catch type mismatches at compile time rather than in a 3 AM Slack alert. Here are the four decisions that shaped it.


Design Decision 1: ChainResult<T> Over Exceptions

Let me describe a problem you've almost certainly hit. You call an LLM, it fails — rate limited, timed out, bad input, provider down — and now you need to handle that failure somewhere. In an exception-based framework, that means try/catch at every composition point. Stack a few chains together and you've got nested try/catch blocks all the way down, each one trying to figure out which exception type maps to which recovery strategy.

It's not that exceptions are wrong. It's that they're invisible. A method signature like Task<string> tells you nothing about whether it can fail and how.

WeaveLLM uses railway-oriented programming. Every chain execution returns ChainResult<T> — a result type where errors are values you work with, not surprises that unwind your stack.

// Errors are data, never thrown
public sealed class ChainResult<T>
{
    public bool IsSuccess { get; }
    public bool IsFailure => !IsSuccess;
    public T? Value { get; }
    public ChainError? Error { get; }
    public TokenUsage? TokenUsage { get; }
    public TimeSpan Duration { get; }
    public IReadOnlyDictionary<string, object> Metadata { get; }

    public static ChainResult<T> Success(T value, TokenUsage? usage = null) { ... }
    public static ChainResult<T> Failure(ChainError error) { ... }
    public static ChainResult<T> Failure(string message, string? code = null) { ... }

    // Projects success value to new type; failure passes through unchanged
    public ChainResult<TNext> Map<TNext>(Func<T, TNext> transform) { ... }

    // Destructure into (isSuccess, value, error)
    public void Deconstruct(out bool isSuccess, out T? value, out ChainError? error) { ... }
}


Errors come with structure too, not just a message string:

public record ChainError(string Message, string Code, Exception? InnerException = null)
{
    public static ChainError Timeout(string msg) => new(msg, "Timeout");
    public static ChainError RateLimited(string msg) => new(msg, "RateLimited");
    public static ChainError InvalidInput(string msg) => new(msg, "InvalidInput");
    public static ChainError ProviderError(string msg, Exception? ex = null)
        => new(msg, "ProviderError", ex);
}


Here's what that looks like at the call site, compared to the try/catch version:

// The old way — try/catch at every layer
try
{
    var response = await chain.ExecuteAsync(input);
    return response.Text;
}
catch (RateLimitException) { /* retry */ }
catch (TimeoutException)   { /* fallback */ }
catch (Exception ex)       { /* log */ throw; }

// WeaveLLM — errors are values you can switch on
var result = await chain.ExecuteAsync(input, context);

var (isSuccess, value, error) = result;
if (!isSuccess)
{
    return error!.Code switch
    {
        "RateLimited" => await fallbackChain.ExecuteAsync(input, context),
        "Timeout"     => ChainResult<string>.Failure(error),
        _             => ChainResult<string>.Failure(error)
    };
}

return result.Map(v => v.ToUpperInvariant()); // transforms success, ignores failure


ChainResult<T> also bundles TokenUsage and a Metadata dictionary that middleware layers write into without breaking the type contract. If you've ever used F# or Rust, this is the same railway pattern — errors short-circuit, successes flow through, and Map() lets you transform values without having to unwrap and re-wrap manually.

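| Approach       | Error handling              | Composability           | Observability built-in |
|----------------|-----------------------------|-------------------------|------------------------|
| Raw exceptions | try/catch at every level    | Hard to chain           | None                   |
| LangChain.NET  | Mix of exceptions and nulls | Inconsistent            | None                   |
| WeaveLLM       | Values (ChainResult<T>)     | Monadic Map/Deconstruct | TokenUsage + Metadata  |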
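
To make the railway idea concrete, here is a minimal sketch of a result type with Map and a Bind-style step. It is not WeaveLLM's implementation: Result<T>, Bind, and the Demo helpers are hypothetical names invented for this illustration, and the error is reduced to a plain code string.

```csharp
#nullable enable
using System;

// Two steps that can each fail, chained without any try/catch.
var ok = Demo.ParseInt("4").Bind(n => Demo.Reciprocal(n)).Map(x => x * 100);
var bad = Demo.ParseInt("zero").Bind(n => Demo.Reciprocal(n)).Map(x => x * 100);

Console.WriteLine(ok.IsSuccess ? $"ok: {ok.Value}" : $"failed: {ok.ErrorCode}");    // ok: 25
Console.WriteLine(bad.IsSuccess ? $"ok: {bad.Value}" : $"failed: {bad.ErrorCode}"); // failed: InvalidInput

// Hypothetical stand-in for ChainResult<T>: either a value or an error code.
public sealed class Result<T>
{
    public bool IsSuccess { get; }
    public T? Value { get; }
    public string? ErrorCode { get; }

    private Result(bool isSuccess, T? value, string? errorCode)
    {
        IsSuccess = isSuccess;
        Value = value;
        ErrorCode = errorCode;
    }

    public static Result<T> Success(T value) => new(true, value, null);
    public static Result<T> Failure(string errorCode) => new(false, default, errorCode);

    // Transform the success value; a failure passes through untouched.
    public Result<TNext> Map<TNext>(Func<T, TNext> transform) =>
        IsSuccess ? Result<TNext>.Success(transform(Value!)) : Result<TNext>.Failure(ErrorCode!);

    // Chain a step that can itself fail; the first failure short-circuits the rest.
    public Result<TNext> Bind<TNext>(Func<T, Result<TNext>> next) =>
        IsSuccess ? next(Value!) : Result<TNext>.Failure(ErrorCode!);
}

public static class Demo
{
    public static Result<int> ParseInt(string s) =>
        int.TryParse(s, out var n) ? Result<int>.Success(n) : Result<int>.Failure("InvalidInput");

    public static Result<double> Reciprocal(int n) =>
        n == 0 ? Result<double>.Failure("DivideByZero") : Result<double>.Success(1.0 / n);
}
```

In the second call the parse step fails, so neither Reciprocal nor the Map lambda ever runs; the "InvalidInput" code simply rides along to the end.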

Design Decision 2: IAsyncEnumerable<T> for Streaming

Streaming isn't a polish feature — it's the thing users actually notice. A response that starts rendering in 200ms feels fast, even if the total generation takes 8 seconds. A response that hangs for 8 seconds and then dumps a wall of text feels broken, even if the numbers are the same.

Python gets async generators for this. They work great in Python. In C#, the equivalent is IAsyncEnumerable<T>, which has been in .NET since Core 3.0 and plugs directly into await foreach, LINQ, and ASP.NET Core's response pipeline. WeaveLLM makes it part of the core contract, not an optional add-on.

Every chain has two execution paths: a request/response ExecuteAsync and a token-by-token StreamAsync:

public interface IChain<TInput, TOutput>
{
    string Name { get; }

    Task<ChainResult<TOutput>> ExecuteAsync(
        TInput input,
        ChainContext context,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<TOutput> StreamAsync(
        TInput input,
        ChainContext context,
        CancellationToken cancellationToken = default);
}


The provider layer exposes the same streaming shape through IStreamingChatModel. Here's the interface, followed by what consuming it looks like:

public interface IStreamingChatModel : IChatModel
{
    IAsyncEnumerable<string> StreamChatAsync(
        IReadOnlyList<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);
}

// Each token prints as it arrives
await foreach (var token in model.StreamChatAsync(messages, cancellationToken: ct))
{
    Console.Write(token);
}


And in ASP.NET Core, wiring up a streaming endpoint with Server-Sent Events is about 12 lines:

app.MapGet("/stream", async (HttpContext http, IChain<string, string> chain) =>
{
    http.Response.ContentType = "text/event-stream";
    var context = new ChainContext();
    var ct = http.RequestAborted; // fires when the client disconnects

    await foreach (var token in chain.StreamAsync("Tell me a story", context, ct))
    {
        // SSE framing: a token that contains a newline needs one "data:" line per line of text
        await http.Response.WriteAsync($"data: {token}\n\n", ct);
        await http.Response.Body.FlushAsync(ct);
    }
});


Compare that to callback-based streaming, which is what a lot of .NET AI libraries still use:

// Callback-based — no backpressure, no real cancellation support,
// can't await async work per-token
await chain.StreamAsync(input, onToken: token =>
{
    Console.Write(token);
});


With IAsyncEnumerable<T> you get CancellationToken integration automatically (the consumer cancels mid-stream and the producer stops), full LINQ support via System.Linq.Async, and no adapter layer between your chain and the framework. It's just the language doing what it already does well.

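| Streaming model  | Backpressure | CancellationToken | LINQ composable | ASP.NET SSE     |
|------------------|--------------|-------------------|-----------------|-----------------|
| Callbacks        | Manual       | Awkward           | No              | Custom plumbing |
| Events           | No           | No                | No              | Custom plumbing |
| IAsyncEnumerable | Built-in     | Built-in          | Yes             | Native          |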
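
The cancellation behaviour is easy to try in isolation. The sketch below uses a hypothetical FakeModel producer rather than a real provider; the only framework pieces involved are the standard [EnumeratorCancellation] attribute and the WithCancellation extension.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

using var cts = new CancellationTokenSource();
var received = new List<string>();

try
{
    // WithCancellation flows the token into the iterator's [EnumeratorCancellation] parameter.
    await foreach (var token in FakeModel.StreamTokensAsync().WithCancellation(cts.Token))
    {
        received.Add(token);
        if (received.Count == 3) cts.Cancel(); // consumer gives up mid-stream
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine($"Cancelled after {received.Count} tokens"); // Cancelled after 3 tokens
}

public static class FakeModel
{
    // Hypothetical producer: yields one "token" at a time, pretending each takes 10 ms to generate.
    public static async IAsyncEnumerable<string> StreamTokensAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        string[] tokens = { "Once", " upon", " a", " time", " there", " was" };
        foreach (var token in tokens)
        {
            await Task.Delay(10, cancellationToken); // throws once the token is cancelled
            yield return token;
        }
    }
}
```

The producer never reaches the fourth token: the next MoveNextAsync resumes the iterator, Task.Delay observes the cancelled token, and the exception surfaces in the consumer's await foreach.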

Design Decision 3: The Middleware Pipeline

If you've built anything in ASP.NET Core, you already know how middleware works: components that wrap a request, can inspect or modify it, can short-circuit or pass it through, and compose in a predictable order. Every .NET developer has this mental model. It made sense to borrow it directly for LLM chains.

The interface is deliberately close to ASP.NET's RequestDelegate shape:

public delegate Task<ChainResult<TOutput>> ChainDelegate<TInput, TOutput>(
    TInput input,
    ChainContext context,
    CancellationToken cancellationToken);

public interface IChainMiddleware<TInput, TOutput>
{
    Task<ChainResult<TOutput>> InvokeAsync(
        TInput input,
        ChainContext context,
        ChainDelegate<TInput, TOutput> next,
        CancellationToken cancellationToken = default);
}


next is everything downstream. Call it to continue. Skip it to short-circuit — a cache hit, a circuit breaker tripping. Call it and then do something with the result — tracing, cost tracking, PII scrubbing. WeaveLLM ships six middleware implementations out of the box:

services.AddWeaveLLM()
    .AddOpenAI(o => o.ApiKey = config["OpenAI:ApiKey"])
    .AddReActAgent(maxSteps: 5);

var chain = myLlmChain
    .WithMiddleware(new RetryMiddleware(maxRetries: 3, backoffSeconds: 1.5))
    .WithMiddleware(new CacheMiddleware(ttl: TimeSpan.FromMinutes(10)))
    .WithMiddleware(new RateLimitingMiddleware(requestsPerMinute: 60))
    .WithMiddleware(new TracingMiddleware(activitySource))
    .WithMiddleware(new CostMiddleware(pricing))
    .WithMiddleware(new PiiScrubbingMiddleware());


Writing your own is implementing one method. Here's a logging middleware, complete:

public sealed class LoggingMiddleware<TInput, TOutput>
    : IChainMiddleware<TInput, TOutput>
{
    private readonly ILogger _logger;
    public LoggingMiddleware(ILogger logger) => _logger = logger;

    public async Task<ChainResult<TOutput>> InvokeAsync(
        TInput input,
        ChainContext context,
        ChainDelegate<TInput, TOutput> next,
        CancellationToken cancellationToken = default)
    {
        _logger.LogInformation("Chain {Name} starting", context.ChainName);
        var result = await next(input, context, cancellationToken);
        _logger.LogInformation("Chain {Name} finished: {Status} in {Duration}ms",
            context.ChainName,
            result.IsSuccess ? "OK" : result.Error!.Code,
            result.Duration.TotalMilliseconds);
        return result;
    }
}


LangChain's version of this is "callbacks": a grab-bag of optional hooks (on_llm_start, on_llm_end, on_chain_error) that handler objects implement and the framework invokes as notifications. They can't short-circuit. They can't replace the result. They don't compose with each other in any meaningful way. It's a notification system dressed up as a pipeline.

WeaveLLM's middleware is an actual pipeline. Each component decides whether to call next, what to do with the result, and whether to replace or propagate. That's the difference between observability you can build on and hooks you can only listen to.
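
For readers who haven't built one of these before, here is a stripped-down sketch of how such a pipeline can be assembled and how a short-circuit works. It is an illustration only: Handler, Middleware, and Pipeline are hypothetical simplifications (plain strings, no ChainResult or ChainContext), and the ordering rule, where the first middleware listed is the outermost, is an assumption of this sketch rather than a statement about WithMiddleware.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var calls = 0;
// Innermost handler: stands in for the real LLM call.
Handler llm = input => { calls++; return Task.FromResult($"echo:{input}"); };

var log = new List<string>();
var pipeline = Pipeline.Build(llm,
    Pipeline.Logging(log),   // outermost: sees every request
    Pipeline.Caching());     // inner: may answer without calling the LLM

await pipeline("hi");
await pipeline("hi"); // served from cache; llm is not invoked again

Console.WriteLine($"LLM calls: {calls}, log entries: {log.Count}"); // LLM calls: 1, log entries: 4

// Simplified stand-ins for ChainDelegate and IChainMiddleware.
public delegate Task<string> Handler(string input);
public delegate Task<string> Middleware(string input, Handler next);

public static class Pipeline
{
    // Wrap the terminal handler from the inside out so the first middleware listed runs first.
    public static Handler Build(Handler terminal, params Middleware[] middleware)
    {
        var current = terminal;
        for (var i = middleware.Length - 1; i >= 0; i--)
        {
            var mw = middleware[i];
            var next = current;
            current = input => mw(input, next);
        }
        return current;
    }

    public static Middleware Logging(List<string> log) => async (input, next) =>
    {
        log.Add($"start {input}");
        var output = await next(input);
        log.Add($"end {input}");
        return output;
    };

    public static Middleware Caching()
    {
        var cache = new Dictionary<string, string>();
        return async (input, next) =>
        {
            if (cache.TryGetValue(input, out var hit)) return hit; // short-circuit: next is never called
            var output = await next(input);
            cache[input] = output;
            return output;
        };
    }
}
```

The logging layer still records both requests because it sits outside the cache, while the cache answers the second one itself. A notification-only hook cannot do the latter.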


Design Decision 4: Generic Chains with Compile-Time Type Safety

LangChain chains are stringly typed by default. Inputs and outputs are usually Dictionary<string, string> and the framework resolves what goes where at runtime. In Python, that's fine — you've got a REPL, you find the bug fast. In C#, you find it in production when a downstream chain reaches for a key the upstream chain forgot to set.

It's an unforced error. The type system is right there.

IChain<TInput, TOutput> makes the contract explicit:

public interface IChain<TInput, TOutput>
{
    string Name { get; }
    Task<ChainResult<TOutput>> ExecuteAsync(TInput input, ChainContext context,
        CancellationToken cancellationToken = default);
    IAsyncEnumerable<TOutput> StreamAsync(TInput input, ChainContext context,
        CancellationToken cancellationToken = default);
}

// Connectable variant for fluent Pipe() composition
public interface IConnectableChain<TInput, TOutput> : IChain<TInput, TOutput>
{
    IConnectableChain<TInput, TNext> Pipe<TNext>(IChain<TOutput, TNext> next);
    IConnectableChain<TInput, TOutput> WithMiddleware(IChainMiddleware<TInput, TOutput> middleware);
}


Pipe<TNext>() enforces that the output type of the left chain matches the input type of the right one — at compile time. Not at test time, not in staging. At dotnet build.

record UserQuery(string Text);
record SearchResults(IReadOnlyList<string> Chunks);
record FinalAnswer(string Text, decimal ConfidenceScore);

// This compiles — types line up
IConnectableChain<UserQuery, FinalAnswer> ragPipeline =
    retrievalChain         // IChain<UserQuery, SearchResults>
        .Pipe(rerankerChain)   // IChain<SearchResults, SearchResults>
        .Pipe(generatorChain); // IChain<SearchResults, FinalAnswer>

// This doesn't compile: rerankerChain outputs SearchResults, but retrievalChain
// expects UserQuery, so the mismatch is caught immediately
// rerankerChain.Pipe(retrievalChain); // compile error


ComposedChain also handles failure propagation: if the first chain returns a failure, the second never runs and the error forwards unchanged. No null checks, no manual short-circuiting.
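
As a rough picture of how typed composition and failure forwarding can fit together, here is a self-contained sketch. It does not show ComposedChain itself: Outcome<T>, Step<TIn, TOut>, and this Pipe extension are hypothetical, pared-down stand-ins for the result type and chain interfaces.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

var secondRan = false;

Step<string, int> parse = s => Task.FromResult(
    int.TryParse(s, out var n) ? Outcome<int>.Ok(n) : Outcome<int>.Fail("InvalidInput"));
Step<int, string> describe = n =>
{
    secondRan = true;
    return Task.FromResult(Outcome<string>.Ok(n % 2 == 0 ? "even" : "odd"));
};

// string -> int -> string: the middle type must match or this line does not compile.
Step<string, string> pipeline = parse.Pipe(describe);

var good = await pipeline("42");
Console.WriteLine(good.Value); // even

secondRan = false;
var bad = await pipeline("forty-two");
Console.WriteLine($"{bad.Error}, second step ran: {secondRan}"); // InvalidInput, second step ran: False

// Hypothetical minimal result type and chain shape for this sketch.
public sealed record Outcome<T>(bool IsSuccess, T? Value, string? Error)
{
    public static Outcome<T> Ok(T value) => new(true, value, null);
    public static Outcome<T> Fail(string error) => new(false, default, error);
}

public delegate Task<Outcome<TOut>> Step<TIn, TOut>(TIn input);

public static class StepExtensions
{
    public static Step<TIn, TNext> Pipe<TIn, TMid, TNext>(
        this Step<TIn, TMid> first, Step<TMid, TNext> second) =>
        async input =>
        {
            var r = await first(input);
            // Forward the first failure unchanged; the second step never runs.
            return r.IsSuccess ? await second(r.Value!) : Outcome<TNext>.Fail(r.Error!);
        };
}
```

Swapping the arguments (describe.Pipe(parse)) would also compile here, because string feeds string; replacing describe with a step that takes a different input type would not.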

The practical payoff shows up when you're actually using the results:

var result = await ragPipeline.ExecuteAsync(
    new UserQuery("What is the WeaveLLM license?"),
    new ChainContext { SessionId = "user-123" });

if (result.IsSuccess)
{
    Console.WriteLine($"Answer: {result.Value!.Text}");
    Console.WriteLine($"Confidence: {result.Value.ConfidenceScore:P0}");
    Console.WriteLine($"Cost: ${result.TokenUsage?.EstimatedCostUsd:F4}");
}


Full IntelliSense, no casting, no as checks. If you rename a property on FinalAnswer, the compiler tells you everywhere it breaks.

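| Type safety               | LangChain (Python) | LangChain.NET | WeaveLLM                  |
|---------------------------|--------------------|---------------|---------------------------|
| Input/output types        | Dynamic dict       | Dynamic dict  | Generic IChain<TIn, TOut> |
| Mismatched pipes caught   | Runtime            | Runtime       | Compile time              |
| IDE completion on results | None               | Partial       | Full IntelliSense         |
| Refactor safety           | None               | None          | Compiler-enforced         |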

What Ships in v0.1.0-alpha

WeaveLLM v0.1.0-alpha is on NuGet across five packages:

dotnet add package WeaveLLM.Core
dotnet add package WeaveLLM.Providers
dotnet add package WeaveLLM.Memory
dotnet add package WeaveLLM.Observability
dotnet add package WeaveLLM.Extensions.DependencyInjection


Providers — four, all production-tested:

  • OpenAI (gpt-4o, embeddings, streaming)
  • Anthropic (claude-sonnet, streaming)
  • Ollama (local inference, embeddings — no API key needed)
  • HuggingFace (Inference API, embeddings)

All share IChatModel. Swap provider with one line.

Agents — three patterns:

  • ReActAgent — Thought → Action → Observation loop until Final Answer
  • PlanAndExecuteAgent — separate planning and execution phases for complex tasks
  • AgentGraph<TState> — state machine for multi-agent workflows with typed shared state and conditional branching

Memory and RAG — full pipeline:

  • IMemoryStore with in-memory, Qdrant, and Postgres (pgvector) backends
  • DefaultRagPipeline with recursive text splitting, hybrid BM25 + vector search, and Reciprocal Rank Fusion
  • Document loaders for plain text, Markdown, and directories
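
Reciprocal Rank Fusion, mentioned in the list above, is small enough to sketch in full. This is the textbook formula, score(d) = sum of 1 / (k + rank) over every ranked list that contains d, and not DefaultRagPipeline's code; the Rrf class, the document ids, and the default k = 60 are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Two ranked lists of document ids, best match first (e.g. one from BM25, one from vector search).
var keywordHits = new[] { "doc-a", "doc-b", "doc-c" };
var vectorHits  = new[] { "doc-c", "doc-a", "doc-d" };

foreach (var (id, score) in Rrf.Fuse(new[] { keywordHits, vectorHits }))
    Console.WriteLine($"{id}: {score:F4}");

public static class Rrf
{
    // score(d) = sum over lists of 1 / (k + rank), with 1-based rank.
    // k = 60 is a commonly used default; treat it as a tunable assumption.
    public static List<(string Id, double Score)> Fuse(
        IEnumerable<IReadOnlyList<string>> rankings, int k = 60)
    {
        var scores = new Dictionary<string, double>();
        foreach (var ranking in rankings)
        {
            for (var i = 0; i < ranking.Count; i++)
            {
                scores.TryGetValue(ranking[i], out var current); // 0 when absent
                scores[ranking[i]] = current + 1.0 / (k + i + 1);
            }
        }
        return scores
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal) // deterministic tie-break
            .Select(kv => (Id: kv.Key, Score: kv.Value))
            .ToList();
    }
}
```

With these inputs doc-a comes out on top because it sits near the head of both lists, followed by doc-c, doc-b, and doc-d. Only ranks matter, so the two retrievers' raw scores never need to be normalised against each other.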

Middleware — six built-in:
Retry, caching, rate-limiting, OpenTelemetry tracing, per-request cost estimation, and PII scrubbing.
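
Retrying on error values instead of exceptions looks roughly like the sketch below. The source does not say how RetryMiddleware schedules its delays, so the doubling backoff here is an assumption; CallResult, Retry.RunAsync, and the choice of which codes count as transient are likewise hypothetical.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

var attempts = 0;
// Hypothetical flaky operation: rate-limited twice, then succeeds.
Func<CancellationToken, Task<CallResult>> flaky = _ =>
{
    attempts++;
    return Task.FromResult(attempts < 3
        ? new CallResult(false, null, "RateLimited")
        : new CallResult(true, "hello", null));
};

var result = await Retry.RunAsync(flaky, maxRetries: 3, baseDelay: TimeSpan.FromMilliseconds(10));
Console.WriteLine($"{result.Value} after {attempts} attempts"); // hello after 3 attempts

public sealed record CallResult(bool IsSuccess, string? Value, string? ErrorCode);

public static class Retry
{
    // Retries only while the operation reports a transient error code.
    // The delay doubles after each failed attempt: baseDelay, 2x, 4x, ...
    public static async Task<CallResult> RunAsync(
        Func<CancellationToken, Task<CallResult>> operation,
        int maxRetries,
        TimeSpan baseDelay,
        CancellationToken cancellationToken = default)
    {
        for (var attempt = 0; ; attempt++)
        {
            var result = await operation(cancellationToken);
            var transient = !result.IsSuccess
                && (result.ErrorCode == "RateLimited" || result.ErrorCode == "Timeout");
            if (!transient || attempt == maxRetries) return result;

            var delay = TimeSpan.FromTicks(baseDelay.Ticks * (1L << attempt));
            await Task.Delay(delay, cancellationToken);
        }
    }
}
```

Because failures are values, the retry decision is an ordinary comparison on ErrorCode, and a non-transient code such as "InvalidInput" returns immediately without waiting.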

Observability — baked in, not bolted on:
OpenTelemetry tracing and metrics via the WeaveLLM ActivitySource and Meter. Per-request token usage and estimated USD cost tracked across all providers.


Where It's Going

v0.2.0-alpha (next 6–8 weeks):

  • Azure OpenAI provider
  • Multi-modal input (image + text messages)
  • Streaming agents — IAsyncEnumerable<AgentStep> for real-time reasoning step visibility
  • Redis memory backend
  • Structured output support

v1.0.0 — Q4 2026:

  • Frozen public API with full semver guarantee
  • Docs site with guides, API reference, and runnable samples
  • Benchmark suite against LangChain Python — qualitative claims become numbers

Honest Alpha Warning

This is a hobby project that got serious. The core abstractions are stable and I won't break them before v1.0. All four providers are tested and working. Some edge cases are still being hardened and breaking changes are possible in non-core areas before v1.0.

I built this because I was frustrated, not because I had a product roadmap. If you've hit the same frustrations with .NET LLM tooling, that's the target audience — and the best time to shape what v1.0 looks like is right now.

If you want to contribute, good-first-issue labels are a good starting point: adding provider adapters, writing integration tests against real APIs, extending the middleware library. Adding a new provider is just implementing IChatModel — the middleware, streaming, and type-safety machinery comes for free.


Licence

MIT. Use it, fork it, ship it in your own projects. No credit needed.

Copyright (c) 2026 WeaveLLM


The railway result, the async stream, the ASP.NET-style middleware, the generic chain — none of these are Python idioms with a C# skin on top. They're what this kind of library looks like when it starts from C# instead of ending there.


Source code, samples, and NuGet links: github.com/harshil-inspire2/WeaveLLM

Keywords: .NET AI, LangChain alternative, AI orchestration, C# LLM, dotnet AI framework, IAsyncEnumerable streaming, railway-oriented programming, ASP.NET middleware pattern