tkngate/tkngate: Cloudflare for AI Agents

The Cloudflare for Autonomous AI Agents

Zero-knowledge reverse proxy · Multi-provider failover · P2P token mesh

Install • Why Tkngate • Features • Configuration • Docs

Why Tkngate

Your autonomous agents crash when OpenAI goes down. Your budget bleeds when a rogue loop burns $200 in tokens. Your API keys sit in plain text in .env files.

Tkngate fixes all of this. It sits between your agents and the AI providers, acting as an intelligent shield that manages keys, budgets, security, and failover — automatically.

Install

git clone https://github.com/tkngate/tkngate.git
cd tkngate
cp tkngate.example.yaml tkngate.yaml   # edit with your keys
export TKNGATE_MASTER_KEY="your-32-char-secret-key-here-!!"
go run main.go serve

Point your agents at http://localhost:7477/openai/chat/completions instead of api.openai.com.
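As a rough illustration, an agent written in Go only needs its base URL swapped. The sketch below assumes the proxy accepts the same JSON body as the OpenAI chat completions API and that the key travels as a bearer token; the `newChatRequest` helper, the model name, and the placeholder key are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// proxyBase is the local Tkngate endpoint from the install steps.
const proxyBase = "http://localhost:7477/openai"

// chatMessage and chatRequest mirror the OpenAI chat completions body
// (assumed to pass through the proxy unchanged).
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest (hypothetical helper) builds a POST aimed at the proxy
// instead of api.openai.com.
func newChatRequest(key, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, proxyBase+"/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+key) // assumption: bearer-token auth
	return req, nil
}

func main() {
	req, err := newChatRequest("placeholder-key", "gpt-4o-mini", "ping")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Sending the request is then a normal `http.DefaultClient.Do(req)` call once the proxy is running.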

Features

Universal API Router

Route through OpenAI, Anthropic, DeepSeek, Kimi, and Groq from a single endpoint. If one provider goes down (HTTP 500/502/503), Tkngate automatically fails over to the next.
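The failover rule can be sketched in a few lines of Go. This is a minimal illustration, not Tkngate's router: the `provider` type and `routeWithFailover` function are hypothetical, and treating transport errors the same as a 500/502/503 is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// provider is a hypothetical upstream: a name plus a function that performs
// the call and returns the HTTP status code it received.
type provider struct {
	name string
	call func() (int, error)
}

// isFailoverStatus reports whether a status is one of the codes listed above.
func isFailoverStatus(code int) bool {
	switch code {
	case http.StatusInternalServerError, http.StatusBadGateway, http.StatusServiceUnavailable:
		return true
	}
	return false
}

// routeWithFailover tries providers in order and returns the first one that
// does not fail with a transport error (assumption) or a 500/502/503.
func routeWithFailover(providers []provider) (string, int, error) {
	for _, p := range providers {
		code, err := p.call()
		if err != nil || isFailoverStatus(code) {
			continue // treat as down and move on to the next provider
		}
		return p.name, code, nil
	}
	return "", 0, errors.New("all providers failed")
}

func main() {
	chain := []provider{
		{"openai", func() (int, error) { return 503, nil }},
		{"anthropic", func() (int, error) { return 200, nil }},
	}
	name, code, err := routeWithFailover(chain)
	fmt.Println(name, code, err)
}
```

Other statuses, such as 429, are returned to the caller as-is in this sketch, since the text above names only 500, 502, and 503 as failover triggers.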

AI-WAF & DLP

Block prompt injection attacks and automatically redact PII (credit cards, SSNs, API keys) before they reach the provider.
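To give a feel for the redaction step, here is a small regex-based sketch in Go. The patterns, the `[REDACTED_...]` placeholders, and the `redact` function are assumptions for illustration; a production DLP would need stricter validation (for example a Luhn check on card numbers).

```go
package main

import (
	"fmt"
	"regexp"
)

// rule pairs a label with a pattern. All patterns here are illustrative.
type rule struct {
	label string
	re    *regexp.Regexp
}

var rules = []rule{
	// 16 digits in groups of four, optionally separated by spaces or dashes.
	{"CARD", regexp.MustCompile(`\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b`)},
	// US Social Security number in the 123-45-6789 form.
	{"SSN", regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)},
	// Keys shaped like "sk-" followed by 20+ alphanumerics (assumed format).
	{"API_KEY", regexp.MustCompile(`\bsk-[A-Za-z0-9]{20,}\b`)},
}

// redact replaces every match of every rule with a labelled placeholder.
func redact(prompt string) string {
	for _, r := range rules {
		prompt = r.re.ReplaceAllString(prompt, "[REDACTED_"+r.label+"]")
	}
	return prompt
}

func main() {
	fmt.Println(redact("card 4111 1111 1111 1111, ssn 123-45-6789"))
	// prints: card [REDACTED_CARD], ssn [REDACTED_SSN]
}
```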

Stake-and-Slash Reputation (Mesh)

Protect donated API keys from abuse using cryptographic Fraud Proofs. Nodes that route malicious prompts past the WAF are penalized via a Stake-and-Slash trust ledger and permanently blacklisted.

Virtual Keys (Auth Layer)

Generate secure tkngate-sk-... enterprise virtual keys for your teams. Each key acts as an isolated sandbox with hard budget caps, shielding your physical upstream keys.
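A virtual key with a hard cap might look roughly like the Go sketch below. Only the `tkngate-sk-` prefix comes from this README; the random suffix length, the `virtualKey` struct, and the `charge` method are hypothetical, and the struct is not safe for concurrent use as written.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// virtualKey is a hypothetical record for one issued key and its budget.
type virtualKey struct {
	ID       string
	CapUSD   float64
	SpentUSD float64
}

var errBudgetExceeded = errors.New("virtual key budget exhausted")

// newVirtualKey issues a key with 24 random bytes (assumed length) after the prefix.
func newVirtualKey(capUSD float64) (*virtualKey, error) {
	buf := make([]byte, 24)
	if _, err := rand.Read(buf); err != nil {
		return nil, err
	}
	return &virtualKey{ID: "tkngate-sk-" + hex.EncodeToString(buf), CapUSD: capUSD}, nil
}

// charge records a cost, refusing any request that would exceed the hard cap.
func (k *virtualKey) charge(costUSD float64) error {
	if k.SpentUSD+costUSD > k.CapUSD {
		return errBudgetExceeded
	}
	k.SpentUSD += costUSD
	return nil
}

func main() {
	k, err := newVirtualKey(5.00)
	if err != nil {
		panic(err)
	}
	fmt.Println(k.ID[:11], k.charge(3.00), k.charge(3.00))
}
```

The upstream provider key never appears in this struct, which is the point of the indirection: a leaked virtual key can be revoked or capped without rotating the real one.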

Strict Rate Limiting (Token Bucket)

Protect your budget and providers from autonomous agent "burst loops" with an ultra-low latency, in-memory Token Bucket rate limiter.
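For readers unfamiliar with the technique, a token bucket can be sketched as follows. This is a generic in-memory version with hypothetical names (`tokenBucket`, `allow`), not Tkngate's implementation; the clock is passed in explicitly so the behaviour is easy to follow.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket allows bursts up to capacity and refills at rate tokens per second.
type tokenBucket struct {
	mu       sync.Mutex
	capacity float64
	tokens   float64
	rate     float64
	last     time.Time
}

// newTokenBucket starts with a full bucket.
func newTokenBucket(capacity, ratePerSec float64, now time.Time) *tokenBucket {
	return &tokenBucket{capacity: capacity, tokens: capacity, rate: ratePerSec, last: now}
}

// allow refills the bucket for the elapsed time, then spends one token if available.
func (b *tokenBucket) allow(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	start := time.Now()
	b := newTokenBucket(2, 1, start)
	fmt.Println(b.allow(start), b.allow(start), b.allow(start)) // true true false
	fmt.Println(b.allow(start.Add(time.Second)))                // true
}
```

A burst loop drains the bucket after `capacity` requests and is then throttled to the refill rate, which is what bounds the spend.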

Budget Traffic Lights

Real-time spend tracking with Green → Amber → Red zones. Set global limits, per-session caps, and automatic request blocking when budgets are exhausted.
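The zone logic can be pictured with a short Go sketch. The 80% amber threshold and the `classify` and `allowRequest` names are assumptions for illustration; the actual thresholds are set in configuration.

```go
package main

import "fmt"

type zone string

const (
	green zone = "green"
	amber zone = "amber"
	red   zone = "red"
)

// classify maps spend against a limit to a zone. amberAt is the fraction of
// the limit at which the amber zone starts (assumed value, e.g. 0.8).
func classify(spent, limit, amberAt float64) zone {
	switch {
	case limit <= 0 || spent >= limit:
		return red
	case spent >= amberAt*limit:
		return amber
	default:
		return green
	}
}

// allowRequest blocks only in the red zone, i.e. once the budget is exhausted.
func allowRequest(z zone) bool { return z != red }

func main() {
	for _, spent := range []float64{10, 85, 100} {
		z := classify(spent, 100, 0.8)
		fmt.Printf("spent=$%.0f zone=%s allowed=%t\n", spent, z, allowRequest(z))
	}
}
```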

Distributed Semantic Cache (Redis)

Identical prompts are served from a distributed Redis cache, enabling horizontal scaling across multiple Tkngate proxy nodes. Save tokens and money globally across your fleet. Cache keys are computed from normalised model + messages hashes.
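One way to derive such a key is sketched below in Go. The normalisation rules (lowercased model and role, collapsed whitespace in content), the SHA-256 hash, and the `tkngate:cache:` prefix are all assumptions; the README only states that keys come from normalised model and message hashes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// cacheKey hashes a normalised (model, messages) pair into a Redis-style key.
func cacheKey(model string, msgs []message) (string, error) {
	norm := make([]message, len(msgs))
	for i, m := range msgs {
		norm[i] = message{
			Role:    strings.ToLower(strings.TrimSpace(m.Role)),
			Content: strings.Join(strings.Fields(m.Content), " "), // collapse whitespace (assumption)
		}
	}
	payload, err := json.Marshal(struct {
		Model    string    `json:"model"`
		Messages []message `json:"messages"`
	}{strings.ToLower(strings.TrimSpace(model)), norm})
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(payload)
	return "tkngate:cache:" + hex.EncodeToString(sum[:]), nil // hypothetical key prefix
}

func main() {
	a, _ := cacheKey("GPT-4o", []message{{"user", "hello   world"}})
	b, _ := cacheKey("gpt-4o ", []message{{"User", " hello world "}})
	fmt.Println(a == b) // true: both normalise to the same payload
}
```

Because the key depends only on the request content, any proxy node sharing the same Redis instance computes the same key and can reuse the cached response.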

Context Compressor

Automatically compresses Go, Python, and JavaScript code blocks in prompts — stripping comments and whitespace to reduce token usage by up to 40%.
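A deliberately conservative version of this idea is sketched below in Go. It drops blank lines and whole-line comments and trims trailing whitespace, leaving indentation alone so Python stays valid. The `compress` function is hypothetical, and it does not handle block comments, trailing comments, or comment markers inside multi-line strings.

```go
package main

import (
	"fmt"
	"strings"
)

// compress removes blank lines and lines that consist only of a comment
// starting with commentPrefix ("//" for Go and JavaScript, "#" for Python).
// Leading indentation is preserved; trailing spaces and tabs are trimmed.
func compress(code, commentPrefix string) string {
	var out []string
	for _, line := range strings.Split(code, "\n") {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" || strings.HasPrefix(trimmed, commentPrefix) {
			continue
		}
		out = append(out, strings.TrimRight(line, " \t"))
	}
	return strings.Join(out, "\n")
}

func main() {
	src := "// add two ints\nfunc add(a, b int) int {\n\n\treturn a + b   \n}\n"
	small := compress(src, "//")
	fmt.Printf("%d bytes -> %d bytes\n%s\n", len(src), len(small), small)
}
```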

P2P Token Mesh (DRR)

The world's first BitTorrent-style token pool for LLM APIs. Donate spare API keys to the mesh, and get priority access to the network's capacity during outages. Protected by AES-256 zero-knowledge encryption.
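As an illustration of encrypting a donated key with AES-256, the Go sketch below uses AES in GCM mode with a 32-byte master key. The choice of GCM, the nonce-prefixed layout, and the `sealKey` and `openKey` names are assumptions; the README states only that AES-256 is used.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD; the master key must be exactly 32 bytes.
func newGCM(masterKey []byte) (cipher.AEAD, error) {
	if len(masterKey) != 32 {
		return nil, errors.New("master key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(masterKey)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealKey encrypts plaintext and returns nonce || ciphertext.
func sealKey(masterKey, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(masterKey)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openKey reverses sealKey and fails if the data was tampered with.
func openKey(masterKey, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(masterKey)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	master := []byte("0123456789abcdef0123456789abcdef") // 32-byte demo key, never hard-code a real one
	sealed, err := sealKey(master, []byte("sk-donated-key"))
	if err != nil {
		panic(err)
	}
	plain, err := openKey(master, sealed)
	fmt.Println(string(plain), err)
}
```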

Shadow Mode

Silently mirror a fraction of production traffic to an alternative provider (e.g., DeepSeek) to evaluate cost savings — with zero latency impact on the primary request.
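The mirroring pattern can be sketched in Go with a goroutine that runs off the request path. The `handler` type, `withShadow` wrapper, and random sampling are assumptions for illustration; the returned wait function exists only so a demo or test can let in-flight mirrors finish.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// handler is a hypothetical stand-in for "send this prompt to a provider".
type handler func(prompt string) string

// withShadow wraps primary so that roughly `fraction` of calls are also sent
// to shadow in a separate goroutine. The caller only ever waits on primary.
func withShadow(primary, shadow handler, fraction float64) (handler, func()) {
	var wg sync.WaitGroup
	h := func(prompt string) string {
		if rand.Float64() < fraction {
			wg.Add(1)
			go func() {
				defer wg.Done()
				_ = shadow(prompt) // result would be logged for comparison, never returned
			}()
		}
		return primary(prompt)
	}
	return h, wg.Wait
}

func main() {
	primary := func(p string) string { return "primary answer to " + p }
	shadow := func(p string) string { return "shadow answer to " + p }
	h, wait := withShadow(primary, shadow, 0.1)
	fmt.Println(h("hello"))
	wait() // demo only: let any in-flight mirror finish before exit
}
```

The primary response is returned without waiting on the shadow call, so the only added cost on the hot path in this sketch is the sampling check and the goroutine start.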

Configuration

See tkngate.example.yaml for the full reference.

Documentation

Topic	Link
Budget System & Traffic Lights	docs/budgeting.md
DRR Token Mesh & Reputation	docs/drr-mesh.md
Enterprise Virtual Keys	docs/virtual-keys.md
Strict Rate Limiting	docs/rate-limiting.md
Zero-Knowledge Security	docs/zero-knowledge-security.md
Shadow Mode	docs/shadow-mode.md

Architecture

Agent Request
     │
     ▼
┌──────────────────────────────────────┐
│            TKNGATE PROXY             │
│                                      │
│  Budget Guard → AI-WAF/DLP           │
│       → Context Compressor           │
│       → Semantic Cache               │
│       → Auto-Retry (3x)              │
│       → Universal Router (Failover)  │
│       → Shadow Mode (async mirror)   │
│       → Token Counter → Ledger       │
└──────────────────────────────────────┘
     │
     ▼
  OpenAI / Anthropic / DeepSeek / Kimi / Groq

License

Apache 2.0
