惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
WordPress大学
WordPress大学
Google DeepMind News
Google DeepMind News
T
The Exploit Database - CXSecurity.com
阮一峰的网络日志
阮一峰的网络日志
F
Fox-IT International blog
The GitHub Blog
The GitHub Blog
Engineering at Meta
Engineering at Meta
I
Intezer
P
Privacy & Cybersecurity Law Blog
B
Blog RSS Feed
Latest news
Latest news
小众软件
小众软件
A
Arctic Wolf
Attack and Defense Labs
Attack and Defense Labs
L
LINUX DO - 热门话题
博客园 - 聂微东
B
Blog
T
Troy Hunt's Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
Malwarebytes
Malwarebytes
爱范儿
爱范儿
Recorded Future
Recorded Future
Apple Machine Learning Research
Apple Machine Learning Research
人人都是产品经理
人人都是产品经理
D
Docker
T
Threat Research - Cisco Blogs
MyScale Blog
MyScale Blog
Martin Fowler
Martin Fowler
E
Exploit-DB.com RSS Feed
F
Fortinet All Blogs
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
PCI Perspectives
PCI Perspectives
Scott Helme
Scott Helme
N
Netflix TechBlog - Medium
博客园 - 三生石上(FineUI控件)
T
True Tiger Recordings
C
Check Point Blog
Microsoft Azure Blog
Microsoft Azure Blog
D
Darknet – Hacking Tools, Hacker News & Cyber Security
K
Kaspersky official blog
Security Latest
Security Latest
The Hacker News
The Hacker News
Microsoft Security Blog
Microsoft Security Blog
Hacker News - Newest:
Hacker News - Newest: "LLM"
Stack Overflow Blog
Stack Overflow Blog
S
Security @ Cisco Blogs
C
CXSECURITY Database RSS Feed - CXSecurity.com
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
M
Microsoft Research Blog - Microsoft Research

DEV Community

Prompt is Not Runtime: Why I Rejected LLM State-Machines for Deterministic FinTech SDD en proyectos brownfield: pros, contras y la estrategia que realmente funciona Hexagonal Architecture in Practice: Ports, Adapters, and Tests That Skip the Database Your Playwright Tests Will Need Refactoring. Here's How to Make It Painless Development of a custom API layer for Framer CMS integration Stream 24/7 on YouTube with Ant Media Server Chat With Your Raspberry Pi — Control GPIO, Read Sensors, and Manage Services via Telegram Using Garudust Token economics for AI agents: why workflow ownership matters more than task automation Why SMS Codes Are No Longer Enough for Business Security Communicate Ideas Visually: Let AI Run the Feedback Loop Building an Autonomous AI Hiring Agent with Multi-Agent Runtime Orchestration 🚀 Validating lists in Okyline: uniqueness, order, and cross-element rules Base64 encoding visualizer I Built a Browser Game Engine Inside WordPress Without Canvas or WebGL. Here's Why Designing Website Analytics for AI Crawlers Without Surveillance Forget Usernames and Passwords: A Web2 Developer’s Guide to Solana Identity Usage-Based Billing for AI Agents with FastAPI and Kong 30 Days of AI Agents Buying From a Real WooCommerce Store. Here's What the Data Says. AWS - Identity and Access Management Explained for Beginners Token Saving, and Caveman How Superpowers Forces Skill Execution How I Stressed My SQLite Job Queue to 5,000 Continuous Tasks on an Android Phone (And Why It Outperformed the Cloud) Is the job market dead, or has the skill bar increased? Introducing PlanCollab: AI-Powered Cross-Agent Code Planning & Review No More Waiting in Line: How I Built a Web-Based Canteen Queue Management System with Flask and MongoDB Deploying Unbound Validating DNS Resolver on Ubuntu 24.04 Deploying Prometheus Metrics Collection Server on Ubuntu 24.04 AWS IAM Roles Anywhere Hands-On Deploying Grafana Metrics Visualization Platform on Ubuntu 24.04 Deploying Gogs Simple Git Hosting on Ubuntu 24.04 Deploying MongoDB NoSQL Document Database on Ubuntu 24.04 Deploying Passbolt Team Password Manager on Ubuntu 24.04 Deploying OpenWebUI Local AI Interface on Ubuntu 24.04 Deploying Bitwarden Password Management Vault on Ubuntu 24.04 Deploying GitLab CE DevOps Management Suite on Ubuntu 24.04 Panduan Praktis Pasca-Install Ubuntu 24.04 Desktop Agar Sistem Nyaman Dipakai Harian Deploying n8n Workflow Automation Engine on Ubuntu 24.04 Memory Cache: o bug invisível que só aparece quando sua aplicação precisa escalar horizontalmente "this" in JS is SIMPLE as a rock LoRaWAN has ~51 bytes per frame. Your JSON alert doesn't fit. Stop Avoiding Bitwise Operators ERP Product Tree Denormalization: The Maintenance and Scale Conundrum We Leaked 1,368 Customers into Our LIVE Stripe Account via E2E Tests Overlay Widgets vs Real WCAG Scanners: A 2026 Buyer’s Guide How an Accessibility SaaS Broke Its Own Landing (and How We Fixed It) Building the harness around our coding agents: eight failure modes, eight pillars LynxDB - I wanted Splunk's query language without Splunk RAG Is Not Always the Answer Anymore: How AI Agents Search Code in 2026 I Leaked API Keys Through My .env File — Here's What I Learned About Secret Management Score Big with Power Apps: A Step-by-Step Guide to Custom Football APIs IaC Drift Is Inevitable — Design for Detection, Not Prevention I Built a CLI Tool That Writes Better Git Commits Than I Do Adding Text Selection to Bash I Built an Android App With Zero Backend — Here's What Happened I built toklock — the only Anthropic rate-limit proxy that queues requests instead of crashing your agents The Hardest Part of Building an Encrypted Journaling App Wasn’t Encryption Replicate MySQL to ClickHouse with Sling Why I Think the Next Big Blockchains Will Be Built Around AI, Not With AI on Top How to use the Specification Pattern to Clean Up Query Logic in C#, .NET AI may already be turning translators into proofreaders. Coders could be next? One API, every social image - dynamic OG, Twitter, LinkedIn, Pinterest, YouTube AI Agents Need Artifacts, Not Activity. What I Learned Shipping 7 Mac Apps in 12 Months — The Honest Retrospective Being pro-developer in the AI age Circuit Breaker Now Supports LangGraph and Vercel AI SDK Where Does the Data Go? A Comprehensive Guide to Databases Node.js wants to ban AI-generated code. They should. 07/20: Layer 2 – The Data Link Layer: Frames, MAC Addresses & Switches 5 Python Features That Made Me a Better Developer Why "flex" breaks your email in Outlook (and how to catch it in VS Code) Most Organizations Don't Have an AI Problem, They Have an Integration Problem I Built a Privacy-First PDF Toolbox — Your Files Never Leave the Browser The EU AI Act Was Written for Models. Your Agents Need Runtime Compliance. Your AI Agent on Kubernetes Is Probably Exposed to the Internet Right Now 723 Cycles of Zero-Sleep Autonomy: What Running 24/7 for Weeks Actually Looks Like AI Automation vs AI Augmentation: Know Which One You Are Actually Building A .NET Dinosaur in Web3. Day 13 — Access Control Transaction Hooks: A General Primitive for Post-Commit Side Effects (Case Study: Queuert) Lines vs Blocks(CSS): Divide & Grid Explained The Business Context Problem: Why Vulnerability Severity Scores Lie "How I Cut My Go Markdown Linter's Benchmark by 81%" Casting Resurrection on a Dead D&D Table The Story Behind Java: From C++ Limitations to Platform Independence Keep Appium out of your test code: BasePage + lazy locators How I use agents for my personal projects I Built a Compliance Health Scanner for Indian Startups in 24 Hours - Here’s What I Learned What AMQP compatibility means for a local Azure emulator Why I stopped rotating active log files in Python I built a tiny runtime for resumable agent workers The Cost of Showing Up: What the Productivity Advice Does Not Tell You About Being Visible Python Why I Rebuilt My Portfolio with Astro I finally gave my AI agents a shared memory and a team #Crew44 Kimsuky (APT43) — Analysis of the New PebbleDash · AppleSeed Toolset shadcn/ui is Not a Component Library Scaling Monorepos with Turborepo Five Ways to Fail a Transport Terminal themes optimize for syntax highlighting; that's the wrong target Your Clean Domain Could Be Masking an Attack: The Underminr Vulnerability Explained AI Coding Standards at Scale: Versioned AI Rules for Cursor, Claude Code, and Beyond
Run OpenAI Codex CLI on Claude, Gemini, or Llama — in 50 lines of C#
Jung Hyun, N · 2026-05-26 · via DEV Community

OpenAI's Codex CLI ships with a great editor-agent UX: shell tool, apply_patch, plan tracking, the lot. The catch — as of February 2026 it only speaks the OpenAI Responses API. Chat Completion support was dropped (codex-rs/model-provider-info/src/lib.rs: the WireApi enum has one variant, Responses). If you wanted to point it at a Chat-Completion-only endpoint — Ollama, LM Studio, your favorite Llama runner — you're out of luck.

But Codex CLI is happy to talk to any server that speaks Responses. It has a model_provider config block exactly for that. So if you can stand up a Responses-shaped HTTP endpoint backed by the model of your choice, Codex becomes a generic front-end and you choose the brain.

Here's the trick I've been using: a 50-line C# script that runs as both an OpenAI Chat Completion server and a Responses API server, on top of Microsoft.Extensions.AI's vendor-neutral IChatClient abstraction. I then point it at OpenRouter — one API key, hundreds of models including Claude, Gemini, Llama, GPT, you name it — and tell Codex to talk to my local script instead of OpenAI.

End result: OpenAI Codex CLI running on Anthropic's Claude 3.5 Sonnet (or whichever model I'm feeling like that day).

The pieces

I'm using Cadenza.Agent, an MSBuild SDK I ship that turns a single .cs file into a runnable agent server. It's part of a small family of single-file scripting SDKs for .NET 10's file-based programs — same idea as dotnet run script.cs but with a richer Tier-1 API (Tool, UseOllama, UseOpenAi, Run, etc.). The Agent variant exposes:

  • POST /v1/chat/completions — for Aider / Continue / Cursor / Copilot BYOK / sgpt
  • POST /v1/responses — for Codex CLI

Both are backed by the same IChatClient you configure. Switch the backend and the wire-format stays.

For the LLM I'm using OpenRouter, which speaks OpenAI's Chat Completion wire format with a different base URL — perfect for Microsoft.Extensions.AI.OpenAI's drop-in ChatClient. One env var, any model.

For Codex's configuration I'm using its CODEX_HOME environment variable trick: instead of editing ~/.codex/config.toml, you point Codex at a sample-local directory and it loads a fresh config.toml from there. Means I can ship a self-contained sample that never touches the user's global config.

The script

The entire backend, in one file:

#!/usr/bin/env dotnet run
#:sdk Cadenza.Agent@1.0.14

using System.ClientModel;
using OpenAI;

var apiKey = Env.Get("OPENROUTER_API_KEY")
    ?? throw new InvalidOperationException("OPENROUTER_API_KEY env var missing");
var model = Env.Get("OPENROUTER_MODEL") ?? "anthropic/claude-3.5-sonnet";

ServedModelName = "cadenza-codex-openrouter";

// Generate a sample-local Codex home directory.
var codexHome = Path.Combine(Env.Cwd, ".cadenza-codex-openrouter");
MakeDir(codexHome);

var catalogPath = Path.Combine(codexHome, "cadenza-catalog.json").Replace('\\', '/');
var configToml = $"""
    model          = "cadenza-codex-openrouter"
    model_provider = "cadenza"
    model_catalog_json = "{catalogPath}"

    [model_providers.cadenza]
    name     = "Cadenza.Agent (OpenRouter-backed)"
    base_url = "http://localhost:8080/v1"
    wire_api = "responses"
    env_key  = "CADENZA_API_KEY"
    stream_idle_timeout_ms = 300000
    """;
WriteText(Path.Combine(codexHome, "config.toml"), configToml);

// Catalog JSON: declares the served model id to Codex so it stops printing
// "Defaulting to fallback metadata". Fields match codex-rs/protocol/src/
// openai_models.rs ModelInfo schema — every key is required.
var catalogJson = """
    {
      "models": [{
        "slug": "cadenza-codex-openrouter",
        "display_name": "Cadenza (OpenRouter)",
        "description": "OpenRouter-backed agent served by Cadenza.Agent",
        "supported_reasoning_levels": [],
        "shell_type": "default",
        "visibility": "list",
        "supported_in_api": true,
        "priority": 50,
        "availability_nux": null,
        "upgrade": null,
        "base_instructions": "",
        "supports_reasoning_summaries": false,
        "support_verbosity": false,
        "default_verbosity": null,
        "apply_patch_tool_type": "freeform",
        "truncation_policy": { "mode": "tokens", "limit": 8192 },
        "supports_parallel_tool_calls": true,
        "context_window": 200000,
        "max_context_window": 200000,
        "auto_compact_token_limit": 180000,
        "effective_context_window_percent": 95,
        "experimental_supported_tools": []
      }]
    }
    """;
WriteText(Path.Combine(codexHome, "cadenza-catalog.json"), catalogJson);

WriteLine($"Codex config generated at: {codexHome}");
WriteLine("In another terminal, run:");
WriteLine($"  $env:CODEX_HOME      = \"{codexHome}\"");
WriteLine($"  $env:CADENZA_API_KEY = \"any-non-empty-string\"");
WriteLine($"  codex");

// Wire up OpenRouter as the LLM backend.
var openAiOptions = new OpenAIClientOptions { Endpoint = new Uri("https://openrouter.ai/api/v1") };
var chatClient = new OpenAI.Chat.ChatClient(model, new ApiKeyCredential(apiKey), openAiOptions)
    .AsIChatClient();

UseChatClient(chatClient);

await Run();

Enter fullscreen mode Exit fullscreen mode

That's it. No project file, no .csproj, no Program.cs. The #:sdk directive at the top tells the .NET 10 file-based program system to use Cadenza.Agent as the SDK, which pulls in the HTTP server, the Responses wire format, all the package references — and exposes Tool, UseOllama, UseChatClient, Run as bare names you can call directly.

Running it

Save the script as agent-codex-openrouter.cs and:

# Terminal 1 — start the agent server
$env:OPENROUTER_API_KEY = "sk-or-v1-..."
$env:OPENROUTER_MODEL   = "anthropic/claude-3.5-sonnet"  # or any OpenRouter slug
dotnet run agent-codex-openrouter.cs

Enter fullscreen mode Exit fullscreen mode

The first run pulls dependencies — Microsoft.Extensions.AI, the OpenAI SDK, ASP.NET Core. After that it boots in well under a second. The script prints exactly what you need in the second terminal:

Codex config generated at: D:\work\.cadenza-codex-openrouter

In another terminal, run:
  $env:CODEX_HOME      = "D:\work\.cadenza-codex-openrouter"
  $env:CADENZA_API_KEY = "any-non-empty-string"
  codex

Enter fullscreen mode Exit fullscreen mode

Paste those into another terminal, run codex, and you're chatting with Claude 3.5 Sonnet (or whichever OpenRouter model you picked) through the Codex UX. Tools like shell and apply_patch are sent by Codex itself in every request; the agent forwards them to the model and streams the model's function_call outputs back so Codex executes them locally.

What's happening behind the scenes

When Codex sends POST /v1/responses, the agent does this:

  1. Parse the Responses input. Codex sends a message / function_call / function_call_output array; we flatten it into Microsoft.Extensions.AI's IList<ChatMessage> shape.
  2. Honor previous_response_id. Codex chains turns with this id rather than re-sending the full history; the agent keeps a bounded in-memory dictionary of past turns so it can reconstruct context.
  3. Pass through Codex's tools. Codex's shell, apply_patch, update_plan arrive as raw schemas. We declare them to the model as PassthroughFunction instances that have a JSON schema but no real handler — the function-invocation middleware is bypassed for this endpoint, so any function call the model emits streams straight back to Codex.
  4. Call IChatClient.GetStreamingResponseAsync. This dispatches to whichever backend you configured — OpenRouter, Ollama, OpenAI, Anthropic, Azure OpenAI.
  5. Re-emit as Responses SSE. The ChatResponseUpdate stream gets translated into the ~15 SSE event types Codex expects: response.created, response.in_progress, response.output_item.added, response.output_text.delta, response.function_call_arguments.delta, response.completed, and friends.

The IChatClient abstraction is the trick that makes this composable. Cadenza.Agent doesn't care that OpenRouter is "really" Anthropic-this-time-Claude-next-time-Llama; it sees a chat client, calls it, and serializes whatever comes back into the wire format Codex wants.

The CODEX_HOME pattern

I want to stop and praise this. Codex CLI honors a CODEX_HOME environment variable that overrides where it looks for config.toml — instead of ~/.codex/, it reads from whatever directory you point at. The sample uses this to its full effect: it generates a sample-local directory with its own config.toml and cadenza-catalog.json, and prints the exact $env:CODEX_HOME = ... line to paste.

The result: your global ~/.codex/config.toml stays untouched. Different samples — Ollama backend, OpenRouter backend, gpt-5 reasoning effort tweaks — get their own isolated directories. You can have ten of them and they don't interfere. Want to share the setup with a teammate? Hand them the .cs file; their codex command points at the local directory the script generated.

Silencing the "Defaulting to fallback metadata" warning

If you point Codex at a model id it doesn't recognize, it falls back to default metadata for context window and output limits — and prints a warning every turn:

⚠ Model metadata for `cadenza-codex-openrouter` not found.
   Defaulting to fallback metadata; this can degrade performance and cause issues.

Enter fullscreen mode Exit fullscreen mode

This is suppressed by the model_catalog_json config key pointing at a JSON file that declares your slug. The schema is codex-rs/protocol/src/openai_models.rs::ModelInfo — 17 required fields. The sample includes a complete catalog entry; if you swap to a model with a smaller context window (e.g. openai/gpt-4o-mini at 128K), lower the context_window and max_context_window accordingly. Codex truncates prompts to this number, so over-declaring causes silent token overflows on the backing model.

Note also: model_catalog_json replaces Codex's bundled catalog rather than merging. If you want gpt-5-codex to keep working alongside your custom slug, include it in your JSON too.

One footgun I hit (and fixed)

The first time I ran this, Codex refused to start:

Error loading configuration: failed to parse model_catalog_json path
`...\cadenza-catalog.json` as JSON: expected value at line 1 column 1

Enter fullscreen mode Exit fullscreen mode

The cause was a BOM. .NET's Encoding.UTF8 is the BOM-emitting variant, so File.WriteAllText(path, content, Encoding.UTF8) prepends EF BB BF before your data. Rust's serde_json (which Codex uses) rejects this — strict spec compliance: RFC 8259 says JSON implementations MUST NOT add a BOM.

Cadenza's Fs.WriteText had inherited that BOM-emitting default. Fixed by switching to new UTF8Encoding(encoderShouldEmitUTF8Identifier: false) and shipping the SDK as 1.0.14. The same fix applies to Console.OutputEncoding — without it, dotnet-script | jq would corrupt the pipe.

Worth checking your own .NET code that writes files for strict parsers: if it goes through File.WriteAllText(path, text, Encoding.UTF8), you're emitting a BOM. The fix is one line:

File.WriteAllText(path, text, new UTF8Encoding(encoderShouldEmitUTF8Identifier: false));

Enter fullscreen mode Exit fullscreen mode

Why I think this pattern matters

Codex CLI's tool loop is genuinely useful. The Responses API lock-in feels like the kind of vendor coupling that, left unchecked, kills the open-tool ecosystem. The model_providers config + wire_api = "responses" escape hatch is OpenAI explicitly saying "we accept you might want this elsewhere" — and the right move is to take them up on it.

Once you have a Responses server you control, the ecosystem opens up. Want Codex on a $0/month local Ollama model for offline work? Swap UseChatClient for UseOllama — same script, same Codex config, different brain. Want to inject a project-pinned system prompt every Codex session sees? Add it before Run(). Want to log every Codex turn for audit? Wrap the IChatClient with your own middleware. Want to round-robin between OpenRouter and a local model based on prompt size? Write the logic in C# and serve through the same endpoint.

The single-file format is what makes it sustainable. There's no project to maintain, no SDK to manage, no separate binary to ship — just a .cs file you copy into your repo. If dotnet run script.cs is available (it is on .NET 10+), the script runs.

Try it

Install .NET 10, then:

dotnet new install Cadenza.Templates
dotnet new cadenza-agent -n my-codex-backend -o ./my-codex-backend
cd my-codex-backend
# Edit my-codex-backend.cs to use the OpenRouter pattern above
$env:OPENROUTER_API_KEY = "sk-or-v1-..."
dotnet run my-codex-backend.cs

Enter fullscreen mode Exit fullscreen mode

Or grab the ready-to-run sample from the Cadenza repositoryagent-codex-openrouter.cs is the version above. The repo also has agent-codex-backend.cs (Ollama variant) and agent-openrouter.cs (Chat Completion variant for Aider / Continue / Cursor).

If this is useful, let me know what backend you wire up. I'm curious whether anyone gets Codex running on a fine-tuned local model with a local fallback for offline coding — that's the next experiment on my list.


Cadenza is MIT-licensed. Source: https://github.com/rkttu/cadenza. The Cadenza.Agent package ships at 1.0.14 as of writing.

Cover Image Credit: Lukas from Unsplash