
shivvr — semantic embedding & cognitive agent service
kordlessagai · 2026-06-17 · via Show HN

shivvr 🔪 v0.3.0

Ephemeral semantic embedding & cognitive agent service.

Chunk text. Embed with GTR-T5-base. Search via hybrid rank fusion of FST-BM25 lexical results and vector results. Stream turn-based cognitive agent reasoning via native Model Context Protocol (MCP).

Capabilities

Ingest

Sentence-boundary chunking + GTR-T5-base embeddings (768d). Chunks are stored in an in-memory RwLock<HashMap>, so compute is purely ephemeral and nothing is persisted.
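As an illustration of sentence-boundary chunking, here is a minimal Python sketch. The boundary regex, the `max_chars` limit, and the function name `chunk_sentences` are assumptions made for this example; shivvr's actual chunker and size limits are not described here.

```python
import re

def chunk_sentences(text, max_chars=200):
    """Split text at sentence boundaries, then pack sentences into chunks.

    Hypothetical helper: the boundary rule and max_chars are assumptions.
    """
    # Naive boundary rule: split after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow the chunk, so start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A sentence is never split in the middle, so a single sentence longer than `max_chars` becomes its own oversized chunk.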

FST-BM25 Hybrid Search

Reciprocal Rank Fusion (RRF) blends dense-vector and sparse lexical rankings. FST dictionary scanning provides microsecond-scale query guardrails and intent-entity score boosting.
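Reciprocal Rank Fusion can be sketched in a few lines of Python. This is the generic textbook form, not shivvr's code: the constant `k=60` is the value commonly used in the literature and is an assumption here, as are the function name and the tie-breaking rule.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

dense = ["a", "b", "c"]    # hypothetical vector-search order
sparse = ["b", "c", "a"]   # hypothetical BM25 order
fused = reciprocal_rank_fusion([dense, sparse])
```

Because RRF only looks at ranks, the dense and sparse scores never need to be normalized against each other.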

Cognitive GhostAgent

Turn-based agent loop with integrated memory search, document ingestion, session indexing, and sandboxed command execution. Powered by OpenAI/Anthropic.

Native MCP Server

Complete Model Context Protocol HTTP/SSE server endpoint (/mcp/sse) allowing immediate, zero-config integration with Claude Code, Antigravity, and Codex.
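MCP messages are JSON-RPC 2.0 objects. A minimal Python sketch of building one is shown below; `jsonrpc_request` is a hypothetical helper, and `tools/list` is a standard MCP method used only as an example, since the tools shivvr exposes are not listed here.

```python
import json

def jsonrpc_request(method, request_id, params=None):
    """Serialize a JSON-RPC 2.0 request (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)

# Example body that a client might POST to the MCP message endpoint.
body = jsonrpc_request("tools/list", 1)
```

In the HTTP/SSE transport, a client typically opens the SSE stream first and then posts bodies like this one to the message endpoint.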

Crypto

Per-agent orthogonal matrix rotation on embeddings. Cosine similarity is preserved under encryption because orthogonal transforms preserve inner products and norms. Keys are in-memory only.
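The cosine-preservation property can be checked with a small Python sketch. It uses a Householder reflection as a stand-in orthogonal transform, which is an assumption for brevity: shivvr's actual key construction is not described here, and a reflection is orthogonal but not strictly a rotation. All function names are hypothetical.

```python
import math
import random

def householder_vector(seed, dim):
    """Derive a key vector from a seed (illustrative key derivation only)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def reflect(x, v):
    """Apply H = I - 2 v v^T / (v^T v), an orthogonal matrix, to x."""
    scale = 2.0 * sum(a * b for a, b in zip(v, x)) / sum(a * a for a in v)
    return [xi - scale * vi for xi, vi in zip(x, v)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

key = householder_vector("agent-7", 8)
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [8.0, 1.0, 7.0, 2.0, 6.0, 3.0, 5.0, 4.0]
enc_a, enc_b = reflect(a, key), reflect(b, key)
```

Here `cosine(enc_a, enc_b)` matches `cosine(a, b)` up to floating-point error, so search can run on transformed vectors. A Householder reflection is its own inverse, so applying `reflect` again with the same key recovers the original vector.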

Dual embedding

The organize role uses GTR-T5-base (768d). The retrieve role uses OpenAI text-embedding-ada-002 (1536d); pass your own key per request or set one server-side.

API

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /health | Status, model info, live counts |
| GET | /mcp/sse | MCP Server SSE handshake |
| POST | /mcp/message | MCP Server JSON-RPC message router |
| POST | /sessions/:id/agent/chat | Stream non-blocking GhostAgent cognitive turns (SSE) |
| POST | /sessions/:id/ingest | Chunk + embed text into session |
| GET | /sessions/:id/search?q=... | Semantic search (supports RRF hybrid & lexical_only) |
| GET | /sessions/:id | Session metadata |
| DELETE | /sessions/:id | Delete session |
| GET | /temp | List temp stores with TTL |
| POST | /temp/:name/ingest | Ingest into temp store (2 hr TTL) |
| GET | /temp/:name/search?q=... | Search temp store |
| DELETE | /temp/:name | Delete temp store |
| POST | /agent/:id/register | Register per-agent orthogonal key |
| POST | /agent/:id/encrypt | Encrypt embeddings |
| POST | /agent/:id/decrypt | Decrypt embeddings |
| POST | /invert | Reconstruct text from embedding vector |
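For client code, a small Python helper can assemble search URLs for the session search endpoint. This is a sketch under assumptions: `search_url` is a hypothetical name, and rendering booleans as lowercase `true`/`false` simply mirrors the curl examples in the quick start.

```python
from urllib.parse import quote, urlencode

def search_url(base_url, session_id, query, **params):
    """Build a GET URL for /sessions/:id/search with optional parameters."""
    pairs = {"q": query}
    for name, value in params.items():
        # Render booleans as lowercase true/false, matching the quick start.
        pairs[name] = str(value).lower() if isinstance(value, bool) else value
    path = f"/sessions/{quote(session_id, safe='')}/search"
    return f"{base_url.rstrip('/')}{path}?{urlencode(pairs)}"

url = search_url("http://localhost:8085", "my-session", "Known Opossum", hybrid=True)
```

The helper only builds the string; sending the request and parsing the response are left to whichever HTTP client you prefer.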

Quick start

# Ingest into session (this example targets the hosted instance; the examples below assume a local server on port 8085)
curl -X POST https://shivvr.nuts.services/sessions/my-session/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "Supreme Raven is protected by Known Opossum.", "source": "vault_specs"}'

# Autonomous Agent Conversational Chat (Streams Thoughts, ToolCalls, & Answer via SSE)
curl -i -X POST http://localhost:8085/sessions/my-session/agent/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Who protects the Supreme Raven?"}'

# Vector-Lexical Hybrid RRF Search
curl "http://localhost:8085/sessions/my-session/search?q=Known+Opossum&hybrid=true"

# High-speed Lexical-Only BM25 Search (Bypasses ONNX embedder)
curl "http://localhost:8085/sessions/my-session/search?q=Opossum&lexical_only=true"

# Synchronize Claude Code or Antigravity with shivvr's Native MCP Server
nemesis8 mcp add http://localhost:8085/mcp/sse

Search parameters

| Param | Default | Description |
| --- | --- | --- |
| q | required | Query text |
| n | 5 | Number of results |
| hybrid | false | Blend semantic vectors + BM25 scores (Reciprocal Rank Fusion) |
| lexical_only | false | Bypass vector embedder, execute pure BM25 search |
| guardrail | true | Enable FST toxic term scanning and automatic query blocking |
| role | organize | organize (768d local) or retrieve (1536d OpenAI) |
| time_weight | 0.0 | Blend semantic + recency score (0–1) |
| decay_halflife_hours | 168 | Recency decay half-life in hours |
| include_nearby | false | Return temporally adjacent chunks |
| agent_id | | Agent ID for encrypted search |
| openai_api_key | | Per-request OpenAI key for retrieve role (overrides server key) |
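One plausible reading of time_weight and decay_halflife_hours is a linear blend of the semantic score with an exponentially decaying recency score. The Python sketch below assumes that form; the exact formula shivvr uses is not stated, and both function names are hypothetical.

```python
def recency_score(age_hours, halflife_hours=168.0):
    """Exponential decay: 1.0 for a fresh chunk, 0.5 after one half-life."""
    return 0.5 ** (age_hours / halflife_hours)

def blended_score(semantic, age_hours, time_weight=0.0, halflife_hours=168.0):
    """Linear blend of semantic and recency scores (assumed formula)."""
    recency = recency_score(age_hours, halflife_hours)
    return (1.0 - time_weight) * semantic + time_weight * recency

# A week-old chunk with semantic score 0.8, weighting recency at 50%.
score = blended_score(0.8, age_hours=168.0, time_weight=0.5)
```

With the default time_weight of 0.0 the result is the semantic score alone, which is consistent with recency weighting being opt-in.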

Environment

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 8080 | Listen port |
| MODEL_PATH | models/gtr-t5-base.onnx | GTR-T5-base ONNX embedder |
| TOKENIZER_PATH | models/tokenizer.json | Tokenizer |
| OPENAI_API_KEY | | Enables OpenAI completions and retrieve embeddings |
| ANTHROPIC_API_KEY | | Enables Anthropic completions and GhostAgent loops |
| NUTS_AUTH_JWKS_URL | | Enable auth (open dev mode if unset) |
| NUTS_AUTH_VALIDATE_URL | https://auth.nuts.services/api/validate | API token validation endpoint |
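When writing deployment tooling around these variables, the defaults can be mirrored as in the Python sketch below. The function `load_config` and the keys of the returned dictionary are hypothetical; only the variable names and default values come from the list above.

```python
import os

def load_config(env=None):
    """Resolve shivvr-style settings from an environment mapping."""
    env = os.environ if env is None else env
    return {
        "port": int(env.get("PORT", "8080")),
        "model_path": env.get("MODEL_PATH", "models/gtr-t5-base.onnx"),
        "tokenizer_path": env.get("TOKENIZER_PATH", "models/tokenizer.json"),
        # Auth is described as enabled only when a JWKS URL is set.
        "auth_enabled": bool(env.get("NUTS_AUTH_JWKS_URL")),
        # A server-side key is optional; requests may supply their own.
        "server_openai_key_set": bool(env.get("OPENAI_API_KEY")),
    }

config = load_config({"PORT": "8085"})
```

Passing a plain dictionary instead of `os.environ` makes the resolution easy to test without touching the real environment.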

Stack

| Layer | Choice |
| --- | --- |
| Runtime | Rust + Tokio + axum |
| Cognition | GhostAgent cognitive RAG turn loop (OpenAI / Anthropic compat) |
| MCP Server | HTTP/SSE JSON-RPC 2.0 Model Context Protocol transport layer |
| Hybrid Index | Tantivy FST deterministic phrase engine + BM25F field indexer |
| Embedding | GTR-T5-base (768d) via ONNX Runtime 2.0; local, required |
| Storage | Ephemeral RwLock<HashMap>; no disk, no volume mounts |
| GPU | CUDA 12.6 via ort EP on Cloud Run L4; CPU fallback automatic |
| Inversion | vec2text gtr-base (projection + T5 enc/dec); optional |