慣性聚合 高效追讀感興趣之博客、新聞、科技資訊
閱原文 以慣性聚合開啟

推薦訂閱源

Google DeepMind News
Google DeepMind News
人人都是产品经理
人人都是产品经理
M
MIT News - Artificial intelligence
博客园 - 叶小钗
MyScale Blog
MyScale Blog
V
Visual Studio Blog
月光博客
月光博客
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
量子位
I
InfoQ
有赞技术团队
有赞技术团队
阮一峰的网络日志
阮一峰的网络日志
Jina AI
Jina AI
V
V2EX
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
Blog — PlanetScale
Blog — PlanetScale
Last Week in AI
Last Week in AI
雷峰网
雷峰网
Stack Overflow Blog
Stack Overflow Blog
博客园 - Franky

DEV Community

Authentication Security Deep Dive: From Brute Force to Salted Hashing (With Java Examples) Why AI Systems Don’t Fail — They Drift Spilling beans for how i learn for exam😁"Reinforcement Learning Cheat Sheet" I Replaced Chrome with Safari for AI Browser Automation. Here's What Broke (and What Finally Worked) How Python Borrows Other People's Work The $40 Architecture: Processing 1 Billion API Requests with 99.99% Uptime Vibe Coding: A Workflow Guide (From Zero to SaaS) Most webhook security guides protect the wrong side. The scary part is delivery. Headless CMS for TanStack Start: Build a Blog with Cosmic EU Age Verification App "Hacked in 2 Minutes" — What Actually Happened Comfy Cloud’s delete function does not actually remove files Running AI Models on GPU Cloud Servers: A Beginner Guide Event-driven media intelligence with AWS Step Functions and Bedrock I scored 500 AI prompts across 8 quality dimensions — here's what broke How to Call Google Gemini API from Next.js (Free Tier, No Backend Needed) The Portal Protocol: Reclaiming Human Connection in the Age of AI How to Fix Your Team's Scattered Knowledge Problem With a Self-Hosted Forum Intro to tc Cloud Functors: A Graph-First Mental Model for the Modern Cloud Designing Multi-Tenant Backends With Both Ownership and Team Access I Built a Neumorphic CSS Library with 77+ Components — Here's What I Learned PostgreSQL Performance Optimization: Why Connection Pooling Is Critical at Scale Cómo construí un SaaS multi-rubro para gestionar expensas en Argentina con FastAPI + Vue 3 🚀 I Built an Ethical Hacking Scanner Tool – Open Source Project I Replaced /usage and /context in Claude Code With a Single Statusline A Pythonic Way to Handle Emails (IMAP/SMTP) with Auto-Discovery and AI-Ready Design I Collected 8.9 Million Polymarket Price Points — Here's What I Found About How Markets Really Move EcoTrack AI — Carbon Footprint Tracker & Dashboard Everyone's Using AI. No One Agrees How. 5 self-hosted ebook managers worth trying in 2026 Building Your First AI Agent with LangChain: From Chatbot to Autonomous Assistant Common SOC 2 Failures (Real World) Stop Vibe-Checking Your AI App: A Practical Guide to Evals How to Use SonarQube and SonarScanner Locally to Level Up Your Code Quality Your Next To-Do App Is Dead — I Replaced Mine with an OpenClaw AI Sign a Nostr event in 60 lines of Python using coincurve — no nostr-sdk, no nbxplorer, no rust toolchain ITGC Audit Explained Like You’re in Big 4 Patch Tuesday abril 2026: Microsoft parcha 163 vulnerabilidades y un zero-day en SharePoint Stop scraping everything: a better way to track competitor price changes Listing on MCPize + the Official MCP Registry while routing payments OUTSIDE the marketplace — how I kept 100% of my x402 revenue Building an AI-Powered Risk Intelligence System Using Serverless Architecture Why We Ripped Function Overloading Out of Our AI Toolchain Testing AI-Generated Code: How to Actually Know If It Works SaaS Churn Is Killing Your Business. Here Is What to Do About It (Without a Support Team) The Speed of AI Is No Longer Linear - And Self-Improving Models Are Why How to Implement RBAC for MCP Tools: A Practical Guide for Engineering Teams From Standard Quote to Persuasive Proposal: AI Automation for Arborists I built a CLI that scaffolds complete multi-tenant SaaS apps Axios CVE-2025–62718: The Silent SSRF Bug That Could Be Hiding in Your Node.js App Right Now The dashboard that ended our friendship Data Pipelines Explained Simply (and How to Build Them with Python)
一AI之门,通AWS Bedrock、Google Vertex AI、Gemini及Anthropic
Kuldeep Paul · 2026-05-25 · via DEV Community

虹桥将AWS Bedrock(亚马逊云科技Bedrock)与Google Vertex AI(谷歌Vertex AI)及Gemini(盖姆ini)、Anthropic(安纳崔克)一并导流,通过一兼容OpenAI之API,共享认证、容错与治理。

企业之AI团队,独用一供应商之模型者,鲜矣。众公司之生产栈,多若此:AWS Bedrock上之Claude,用于一类工作;Google Vertex AI上之Gemini,用于另一类;Anthropic之原生API,用于提示缓存等;Google Gemini API直驱低延迟之消费者路径。诸供应商各说其协议,各需其认证,各送其SDK。Bifrost将此诸般,聚于一OpenAI兼容之端点,前AWS Bedrock、Google Vertex AI、Google Gemini及Anthropic,并内置切换、均衡及治理之能。

Bifrost者,Maxim AI所建之开源AI网关也,每秒可应求五仟,每求仅增十一微秒之负累,且通联二十以上之LLM供者于一API。是篇所述,乃释何以众团队置Bedrock、Vertex、Gemini、Anthropic于Bifrost之后,次详各供者之配置,复示Bifrost之选路层如何使多供者之务可恃。其全负累之状貌,尽录于Bifrost之所载之基准测试.

为何Claude与Gemini散布于众云之境

Anthropic携Claude越AWS Bedrock,复越Google Vertex AI,亦越其自出之原API。Google于Vertex AI与直指Gemini API并供Gemini之用诸企业多栖于此数境,其三因复现,皆系于采办、迟滞与能事:

  • AWS Bedrock,适契已持AWS契约之众,以AWS Organizations统辖权柄,且数据驻留之约,系于AWS疆域。
  • Vertex AI 之胜,常在于已行于 Google Cloud 之组织,或欲统辖 Gemini、Claude 及第三方模型于一控制平面之团队。
  • Anthropic 之原生 API,显诸般之能,如提示缓存及最新之 Beta 头部,此等之新,或需数周乃至数月,方得达于 Bedrock 与 Vertex。
  • 双子座之API,可致双子座模型之最短直路,且附赠慷慨之免费层级,于原型设计之际甚为有益.

一旦工作负荷自原型升为量产,团队几乎必赖此数表面以上。然苦痛实现,乃各面皆以其本然SDK操作之时.

管理四家提供者之独立成本何在

无中门之关,则诸供者皆自引其依存,循其代码之径。

  • 不协之SDKboto3为Bedrock,即Google Cloud SDK for Vertex也。google-genai为 Gemini,及 Anthropic SDK 以达直通 API。
  • 异种认证之范式:Bedrock之IAM凭据与SigV4签名,Vertex之OAuth2服务账户,Gemini之API密钥,Anthropic之承载令牌。
  • 请求形态各异:Bedrock之Converse API不类Anthropic之Messages API,二者亦非Vertex之generateContent端点。
  • 无共通之备用故事。:若Bedrock之Claude端点触及其速率之限,尔之码须知如何转而求Anthropic之直API以为备。
  • :碎片化之用度数据:各供者别报其费与所耗,此使成本之分配于众队或末客户者愈繁。

信实之作,散布于 OpenAI、Anthropic、Google Vertex AI、AWS Bedrock 诸处,不可恃直呼 API 及手制重试之术以持之。Bifrost 所为,正为解此难题。

Bifrost 之术,统摄 Bedrock、Vertex、Gemini、Anthropic 而一之。

Bifrost,居应用之层与四者之间,显一OpenAI相容之端。应用之码呼Bifrost,Bifrost则司协议之译、认证之验、导流之途于上游之供。此乃即插即用之模:易尔既有OpenAI、Anthropic、Bedrock或Google之SDK之基URL,则其余之码犹可如故。

尔所获于易换者:

  • 一端统摄四供,兼十六余者。
  • 钥域项目及IAM角色,一配置而通。
  • 一OpenAI服务器所送之流式格式,无论何供应之。
  • 内置导引之则,可按模型名、虚拟钥、或权重而导请。
  • 共通之可察性、治理之道及约束之规,遍及诸上游之供者。

供者之选,用provider/model之句法。bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0达克劳德于贝德洛克。vertex/gemini-2.5-flash达吉米尼于维克斯。gemini/gemini-2.5-pro呼吉米尼直API。anthropic/claude-sonnet-4-20250514击安斯洛普之原API。

于比弗罗斯特内,各供者之设。

之供者,可藉Bifrost之網面介面、API、config.json之檔案,抑或Go SDK以設置之。下文之簡例,示其設置之形貌;其全貌詳載於文檔.

AWS Bedrock

Bifrost內之AWS Bedrock供者,受靜態IAM憑證、EKS上之IRSA、EC2實例配置等。AWS_ACCESS_KEY_ID 调整环境变量之风格。亦涵盖假定之 IAM 角色及其外部 ID 与会话名,此模式与跨账户 Bedrock 访问之标准相合。

{
  "providers": {
    "bedrock": {
      "keys": [{
        "models": ["*"],
        "weight": 1.0,
        "aliases": {
          "claude-3-5-sonnet": "us.anthropic.claude-3-5-sonnet-20241022-v2:0"
        },
        "bedrock_key_config": {
          "region": "us-east-1",
          "role_arn": "env.AWS_ROLE_ARN",
          "external_id": "env.AWS_EXTERNAL_ID"
        }
      }]
    }
  }
}

全屏模式 退出全屏模式

空置访问密钥与密钥,则使Bifrost回退至AWS默认凭证链,此链按序遍历IRSA、ECS任务角色、EC2实例资料、环境变量及共享凭证文件。

谷歌Vertex AI

彩虹桥(的)谷歌Vertex AI提供者 乃藉 Google Cloud 以达 Gemini、Claude 及第三方之模。其模系(Gemini 与 Anthropic)自辨,而适切之请变遂施。Vertex 上,有三认证径:一为服务账户之 JSON,二为应用默认之凭信(GKE Workload Identity 之推举径),三为仅用于 Gemini 之用例之 API 密钥。

{
  "providers": {
    "vertex": {
      "keys": [{
        "models": ["*"],
        "weight": 1.0,
        "vertex_key_config": {
          "project_id": "env.VERTEX_PROJECT_ID",
          "region": "us-central1",
          "auth_credentials": "env.VERTEX_CREDENTIALS"
        }
      }]
    }
  }
}

入全屏模式 出全屏模式

OAuth2令牌之缓存与刷新,皆由Bifrost自动施行。于Vertex之Claude,anthropic_version之标头设为vertex-2023-10-16,凡不协之beta标头,皆于转发之前剔除之。

Google Gemini

The Gemini提供者以Google AI Studio之简钥认证。当Vertex之项目、区域及IAM机制,非工作所需甚多时,当循此径。

{
  "providers": {
    "gemini": {
      "keys": [{
        "value": "env.GEMINI_API_KEY",
        "models": ["gemini-2.5-flash", "gemini-2.5-pro"],
        "weight": 1.0
      }]
    }
  }
}

Enter fullscreen mode Exit fullscreen mode

Gemini之原生流式格式,为Bifrost化转为OpenAI服务器发送事件之标准形,此形已为汝客户端所期,故同请求体可运行于bedrock/... 亦与 gemini/... 相争,而客无更易。

Anthropic

The Anthropic 供者 直呼 Anthropic 之原 API。当劳需提示缓存、初版标头,或 Claude 之任何未传至 Bedrock 或 Vertex 之特征时,用此表。

{
  "providers": {
    "anthropic": {
      "keys": [{
        "value": "env.ANTHROPIC_API_KEY",
        "models": ["claude-sonnet-4-20250514", "claude-opus-4-20250514"],
        "weight": 1.0
      }]
    }
  }
}

Enter fullscreen mode 退出全屏模式

既四供者皆备,一请于 OpenAI 兼容者,可易其模型之域,以指任一供者。应用之码,无复更易。

跨供者之导,备用,均负

昔Bedrock、Vertex、Gemini、Anthropic皆隐于Bifrost之后,可缀合之,为信实与费计之策,否则须定制代码方得:

  • 自动切换:Bifrost之重试与回退 使汝可立主链与备选链。倘 Bedrock 之 Claude 端点始抛 429 或 5xx 之误,Bifrost 可导此呼至 Vertex 上之 Claude,复由此至 Anthropic 自有之 API,皆无需应用端干预。
  • 权重负载均衡:Bifrost 之 键与负载均衡 分流负载于诸供,如一例,Claude之流七十归Bedrock,其三十则往Vertex,此迁更之序也。
  • 费省之途:廉取或敏迟之请可遣Gemini,而险重之思则驻Claude。
  • 地域之途:欧陆之流可固于Vertex。eu-west1者,虽美利坚之交通,落于us-east-1,而应用之码无改易。

盖因导引之决,断于关隘,故应用之师,不复自虑供者之有否,或其败亡之状。

荷供者众,其事之治与察之明

Bedrock、Vertex、Gemini及Anthropic诸系统,合而为一门,则操作之表亦纳于一控域。Bifrost所供者,

  • 虚位之钥、用度之限、速率之约也:或按队,或按客,虚位之钥可颁,各有专设之用度上限与速率之限,无论何上游之供者应之。Bifrost之 治理之能也。覆虚拟键、RBAC、审计日志及粒度速率限制。
  • 统合可观测性: 本地普罗米修斯与 OpenTelemetry输出者遍告诸供者,其请级之度,散踪之迹,费之数也。
  • 护栏:以 AWS Bedrock Guardrails、Azure Content Safety 或 Patronus AI 之内容安全策,施于诸上游供者,一视同仁。
  • 审计日志:万般请求数之不可易之迹,括供者、模、迟滞、符、费,助 SOC 2、GDPR、HIPAA、ISO 27001 之合规报。

若群僚行 Bifrost 于 AWS Bedrock。 乃在其自之VPC内,此交通绝无出其客户AWS之户者。

起始于Bedrock、Vertex、Gemini及Anthropic于Bifrost。

Bedrock、Vertex、Gemini及Anthropic诸系统,会而归于一桥,四SDK、四认证之术、四异构之制,皆化为一OpenAI相容之端。协议之变、OAuth2与IAM之信、流之正、道之择,皆于关隘内成,使应用之师可专攻一API,而平台之师犹掌权柄于费与治。

欲知Bifrost于君之多元AI架构何能,预定演示与诸君偕行。