Why AI Pipelines Need Kafka, and How Zilla Makes Kafka AI-Ready | Aklivity Blog
Authors: Ankit Kumar, Team Aklivity · 2026-05-28 · via Hacker News - Newest: "AI"

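AI systems rarely fail because of the model.

They fail because the infrastructure underneath was never designed for this kind of workload.

In production, AI workloads bring unpredictable latency, frequent retries, sudden bursts of concurrency, backpressure that is hard to absorb, and multi-tenant governance concerns, none of which traditional synchronous systems model cleanly. A demo can run on an HTTP request-response chain, but production is not a demo.

A thousand queries arrive at once, and the LLM takes eight seconds to answer each one. The embedding service hits its limit while traffic keeps coming in. Retried requests accidentally create duplicate embeddings in the vector database. Meanwhile enterprise, standard, and free users share the same system, and each expects to see exactly what they are authorized to see.

These are not model problems. They are infrastructure problems.

Infrastructure problems need infrastructure solutions.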

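AI workloads are not like traditional APIs

A production RAG pipeline is not a single API call. It is an asynchronous chain whose stages each have their own latency, bounded throughput, and failure modes.

A document chunk arrives and must be embedded through an external API call. The embedding is stored in a vector database. A user query triggers another embedding request, followed by similarity search, context assembly, and an LLM inference step that may take several seconds to complete.

Crucially, each of these stages is independent.

Ingestion must keep going even when embedding slows down. Query processing must be isolated from document indexing load. Retries must not create duplicates. Answers must stream back to the right user rather than be fetched by polling.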
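One common way to meet the "retries must not create duplicates" requirement is to derive each record's ID deterministically from the chunk itself, so a retried write overwrites the earlier one instead of appending. The sketch below illustrates that idea only. The `chunk_id` and `upsert` helpers, the namespace UUID, and the dict standing in for a vector database are all assumptions made for illustration, not part of the pipeline the article describes.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID works.
CHUNK_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def chunk_id(doc_id: str, chunk_index: int, text: str) -> str:
    """Derive a stable ID so the same chunk always maps to the same record."""
    return str(uuid.uuid5(CHUNK_NAMESPACE, f"{doc_id}:{chunk_index}:{text}"))


def upsert(store: dict, doc_id: str, chunk_index: int, text: str, vector: list) -> str:
    """Write keyed by the deterministic ID; a retry overwrites, it does not duplicate."""
    key = chunk_id(doc_id, chunk_index, text)
    store[key] = {"doc_id": doc_id, "text": text, "vector": vector}
    return key


store = {}  # stand-in for a vector database collection
first = upsert(store, "doc-1", 0, "hello world", [0.1, 0.2])
retry = upsert(store, "doc-1", 0, "hello world", [0.1, 0.2])  # simulated retry
```

Because both writes resolve to the same key, the store holds one record however many times the event is redelivered.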

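This is not merely performance optimization. These are architectural requirements that an event-driven system expresses naturally and that a chain of synchronous requests cannot model cleanly.

Why Kafka naturally fits AI pipelines

Kafka maps closely onto the operational behavior AI systems need.

Decoupled services

In a Kafka-based architecture, the ingestion service writes document chunks to a topic without knowing which embedding model is running, how fast the vector database responds, or how loaded the downstream consumers are. The embedder consumes at its own pace, independently. If the embedding model changes from "text-embedding-3-small" to a locally hosted alternative, nothing changes upstream.

Decoupling matters because AI systems are constantly evolving.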

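Replayability

AI systems constantly regenerate derived state. If you upgrade the embedding model, you may need to re-embed every document. With Kafka, replaying the topic rebuilds downstream state without reconstructing the ingestion history. If the RAG pipeline crashes midway, its consumers resume from the last committed offset, without losing requests or silently dropping work.

The event log then serves as both the transport layer and the system of record.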
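The two behaviors described here, resuming from a committed offset and replaying from the start, can be illustrated with a small in-memory model. This is a sketch of the concept rather than Kafka client code. The `EventLog` class and its method names are invented for the example.

```python
class EventLog:
    """Append-only log with per-group committed offsets (in-memory sketch)."""

    def __init__(self):
        self.events = []
        self.committed = {}  # group name -> next offset to read

    def append(self, event):
        self.events.append(event)
        return len(self.events) - 1  # offset of the appended event

    def poll(self, group):
        """Return (offset, event) pairs the group has not yet committed."""
        start = self.committed.get(group, 0)
        return list(enumerate(self.events[start:], start=start))

    def commit(self, group, offset):
        """Record that everything up to and including `offset` is processed."""
        self.committed[group] = offset + 1

    def rewind(self, group):
        """Reset the group to the beginning so it reprocesses the whole log."""
        self.committed[group] = 0


log = EventLog()
for text in ["chunk-a", "chunk-b", "chunk-c"]:
    log.append(text)

# The embedder processes one event, commits it, then "crashes".
offset, _ = log.poll("embedder")[0]
log.commit("embedder", offset)

# After a restart it resumes from the committed offset.
resumed = [event for _, event in log.poll("embedder")]

# After a model upgrade, rewind and re-embed everything.
log.rewind("embedder")
replayed = [event for _, event in log.poll("embedder")]
```

The log itself is never mutated by consumers, which is why the same history can serve both recovery and full re-embedding.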

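Structural backpressure

LLM and embedding APIs have hard throughput limits. In a synchronous system, slow inference propagates latency back up the request chain. Under heavy load this often turns into cascading failure.

Kafka changes this at the root. Slow consumers accumulate lag without blocking producers. Traffic spikes turn into queues that drain at a sustainable rate, which matters a great deal when designing AI systems with highly variable latency.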
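A toy simulation makes the lag behavior concrete: a burst lands in one tick, the producer is never blocked, and the consumer drains the backlog at its own fixed rate. The `simulate` function and its parameters are hypothetical and exist only for this sketch. Real consumer lag depends on partitions, batching, and processing time.

```python
from collections import deque


def simulate(arrivals, capacity_per_tick):
    """Return the consumer lag after each tick.

    arrivals: number of events produced in each tick.
    capacity_per_tick: how many events the consumer can process per tick.
    """
    backlog = deque()
    lag_history = []
    next_id = 0
    ticks = 0
    # Keep ticking until the input is exhausted and the backlog is drained.
    while ticks < len(arrivals) or backlog:
        produced = arrivals[ticks] if ticks < len(arrivals) else 0
        for _ in range(produced):
            backlog.append(next_id)  # the producer never waits on the consumer
            next_id += 1
        for _ in range(min(capacity_per_tick, len(backlog))):
            backlog.popleft()  # the consumer drains at its own fixed rate
        lag_history.append(len(backlog))
        ticks += 1
    return lag_history


# A burst of 10 events against a consumer that handles 3 per tick.
lag = simulate(arrivals=[10, 0, 0, 0], capacity_per_tick=3)
```

The lag rises with the burst and then falls steadily to zero, in contrast with a synchronous chain where the same burst would surface as timeouts on the request path.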

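Independent consumers

AI workflows are not single-hop. The same document event may feed an embedding service, a classifier, an evaluation pipeline, a monitoring system, or an audit consumer, each scaling independently of the others.

Kafka is a backbone, not a client-facing API

Kafka is an excellent event backbone. By itself, it is not a client-facing API.

Your users still expect REST endpoints, JWT authentication, schema validation, streaming responses, tenant isolation, and browser compatibility. The naive approach is to build a custom HTTP service in front of Kafka.

That works at first. Over time, though, every governance concern (authentication, identity propagation, schema enforcement, access control, rate limiting) becomes a conditional in application code, and each new tenant rule means another deployment. Governance ends up scattered across services instead of living in one place, and downstream services can only trust whatever identity the wrapper forwards.

The architecture becomes hard to reason about because policy is not centralized.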

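Why identity propagation is increasingly critical in AI systems

Multi-tenant AI systems need more than authentication. They need trusted identity that propagates through asynchronous workflows.

Consider a RAG system with several visibility tiers: free users see public knowledge, standard users see internal knowledge, and enterprise users see confidential knowledge. The tier originates from the JWT presented at the API boundary. Downstream services need that identity information to filter retrieval results, set the generation context, and enforce delivery permissions.

Kafka does not validate JWTs itself, nor does it propagate trust in message headers. Without central governance, developers typically validate tokens in hand-written middleware and forward the metadata to Kafka. The trust boundary then lives in application code, and every downstream service depends on that middleware being correct.

This is the gap Zilla closes.

How Zilla bridges the gap

The Zilla Platform sits between clients and Kafka, speaking HTTP on one side and the Kafka protocol on the other. Rather than embedding governance logic in application services, Zilla moves governance to the edge.

The request flow looks like this: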

POST /queries
Authorization: Bearer <jwt>
  → Zilla validates JWT
  → extracts user tier claim
  → injects trusted Kafka headers
  → writes event to rag.queries
  → RAG pipeline consumes asynchronously
  → result written to rag.results
  → client receives streamed response over SSE

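AI services focus on AI logic, not operational concerns.

Identity injection at the edge

When a client sends a JWT, Zilla validates the token and injects trusted identity headers into the Kafka message, for example `user-tier: enterprise`. Downstream services read the header directly. The embedder, the retrieval layer, and the RAG chain never need to validate the JWT themselves. The authorization decision is made once at the edge, and its proof travels with the event.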
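To make "validate once, then carry the identity as headers" concrete, here is a minimal standard-library sketch of the pattern. It is not how Zilla is implemented or configured. It assumes an HS256-signed token, assumes the claim names `tier` and `sub`, and invents the `user-id` header; only `user-tier` appears in the text above. It also skips checks a real gateway would make, such as expiry, issuer, and audience.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(segment: str) -> bytes:
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build an HS256 token (used here only to produce sample input)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def trusted_headers(token: str, secret: bytes) -> dict:
    """Verify the signature, then map claims to message headers.

    Raises ValueError if the token is malformed or the signature does not match.
    """
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    # "tier" and "sub" are assumed claim names for this sketch.
    return {"user-tier": claims["tier"], "user-id": claims["sub"]}


SECRET = b"demo-secret"  # illustrative only
token = sign_hs256({"sub": "alice", "tier": "enterprise"}, SECRET)
headers = trusted_headers(token, SECRET)
```

Downstream consumers would read `headers["user-tier"]` from the message and never see the token, which keeps the trust decision in one place.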

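Schema enforcement

Malformed payloads should fail at the boundary, not deep inside an asynchronous pipeline. Zilla validates the JSON schema before an event enters Kafka. A request missing the required `doc_id`, or a query whose `question` is not a string, immediately gets a `400` response. Invalid events never reach the backbone.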
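The two rejections mentioned above can be sketched as a small boundary check. This is an illustration in plain Python, not Zilla's schema mechanism. The `RULES` table is an assumed minimal rule set (the demo's real schemas likely define more fields), and treating `doc_id` as a string is an assumption.

```python
import json

# Assumed minimal rules for this sketch, keyed by request path.
RULES = {
    "/chunks": {"doc_id": str},
    "/queries": {"question": str},
}


def validation_error(path: str, body: str):
    """Return an error message for an invalid body, or None if it is acceptable.

    A caller at the edge would answer 400 whenever this returns a message,
    and only produce to Kafka when it returns None.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "body is not valid JSON"
    if not isinstance(payload, dict):
        return "body must be a JSON object"
    for field, expected_type in RULES[path].items():
        if field not in payload:
            return f"missing required field: {field}"
        if not isinstance(payload[field], expected_type):
            return f"field {field} must be a string"
    return None


missing = validation_error("/chunks", '{"text": "hi"}')
wrong_type = validation_error("/queries", '{"question": 42}')
ok = validation_error("/queries", '{"question": "What is Zilla?"}')
```

Rejecting at this point means the failure is reported synchronously to the caller, instead of surfacing later as a poisoned event in a consumer.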

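Native streaming responses

AI systems are asynchronous by nature, yet browser clients still expect real-time interaction. Zilla bridges the two with Server-Sent Events: the client opens `GET /results/{queryId}`, Zilla subscribes to the Kafka results topic, and responses stream to the browser as they arrive, with no polling infrastructure and no custom SSE service to write and operate.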
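For readers unfamiliar with the wire format, the sketch below shows how a result record could be framed as a Server-Sent Event and how a stream might be narrowed to one query. The record shape (a `queryId` field plus an `answer`) and both function names are assumptions for illustration. In the architecture described, Zilla performs this bridging, so no such code needs to be written.

```python
import json


def sse_frame(data: dict, event_id: str = None) -> str:
    """Encode one result as a Server-Sent Events frame."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # json.dumps emits a single line, so one data field is enough.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def stream_results(results, query_id):
    """Yield frames only for the requested query, in arrival order."""
    for offset, result in enumerate(results):
        if result["queryId"] == query_id:
            yield sse_frame(result, event_id=str(offset))


results = [
    {"queryId": "q1", "answer": "A"},
    {"queryId": "q2", "answer": "B"},
]
frames = list(stream_results(results, "q2"))
```

Using the log offset as the event `id` is one option that lets a reconnecting browser indicate where it left off through the `Last-Event-ID` request header.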

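Per-client filtering

Many clients can subscribe to the same topic. Zilla filters the event stream by the subscriber identity extracted from the JWT, so enterprise users get enterprise-level results and standard users get only what they are permitted to see. This is enforced at the gateway, not in downstream services.

What the architecture looks like in practice: a demo

The Zilla Platform RAG demo is a reference example that ties these patterns together. A single `docker compose up` starts Kafka, Qdrant, the embedding service, the RAG chain service, and Zilla, all governed by one `zilla.yaml` configuration.

The flow looks like this: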

Client (JWT)
  ├── POST /chunks   →  Zilla validates JWT + schema → write to rag.chunks
  ├── POST /queries  →  Zilla injects user-tier header → write to rag.queries
  └── GET /results   →  Zilla subscribes to rag.results → SSE to client

rag.chunks  →  Embedder → Qdrant
rag.queries →  RAG Chain:
                  → embed query
                  → search Qdrant with visibility filter
                  → call LLM
                  → write result to rag.results

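The access model is structural, not decided by the application. Free-user queries reach only public content; standard users reach public and internal content; enterprise users additionally reach confidential content. The visibility tier originates from the JWT and travels through the event stream as trusted metadata; no tier value ever comes from the client itself.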
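The tier rules above reduce to a small lookup that a retrieval step could apply using the trusted header. The sketch below is one way to express it. The tier and visibility strings (other than `enterprise`, which appears in the header example) and the shape of a search hit are assumptions. In practice the filter would normally be pushed into the vector search itself rather than applied afterwards.

```python
# Cumulative visibility per tier, following the rules described in the text.
VISIBLE = {
    "free": {"public"},
    "standard": {"public", "internal"},
    "enterprise": {"public", "internal", "confidential"},
}


def allowed_levels(headers: dict) -> set:
    """Resolve visibility from the trusted header; unknown tiers get nothing."""
    return VISIBLE.get(headers.get("user-tier"), set())


def filter_hits(hits: list, headers: dict) -> list:
    """Keep only the hits whose visibility the caller's tier may see."""
    levels = allowed_levels(headers)
    return [hit for hit in hits if hit["visibility"] in levels]


hits = [
    {"id": 1, "visibility": "public"},
    {"id": 2, "visibility": "internal"},
    {"id": 3, "visibility": "confidential"},
]
standard_view = filter_hits(hits, {"user-tier": "standard"})
```

Defaulting an unknown or missing tier to an empty set makes the check fail closed, which suits a value that is supposed to arrive only from the edge.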

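The Zilla Platform RAG demo is available at https://github.com/aklivity/zilla-platform-demos/tree/main/rag-project and includes a browser UI, multi-tier JWT tokens, and a full walkthrough of the architecture described above.

This is an architecture you will not have to rebuild later.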

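The core argument for event-driven AI infrastructure is not that it is more sophisticated, but that it models the way AI systems inherently behave.

If your embedding model changes, replay the topic. If ingestion traffic spikes, consumers accumulate lag instead of breaking the request path. If your governance rules change, update the central policy instead of rewriting application logic. If your compliance team asks which user received which response, the event log already holds the history.

Zilla strengthens these advantages by centralizing governance at the edge: identity propagation, schema validation, rate limiting, filtered delivery, and streaming APIs. The governance layer stays stable while the AI behind it keeps changing.

Swap the LLM. Change the vector database. Add new consumers. Replay historical data.

The boundary holds.

To learn more about the Zilla Platform and event-driven AI infrastructure, request a demo.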