



























Ephemeral semantic embedding & cognitive agent service.
Chunk text. Embed with GTR-T5-base. Search via hybrid FST-BM25 vector rank fusion. Stream turn-based cognitive agent reasoning via native Model Context Protocol (MCP).
Ingest
Sentence-boundary chunking + GTR-T5-base embeddings (768d). Chunks live in an in-memory RwLock<HashMap>; nothing is written to disk.
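A minimal Python sketch of sentence-boundary chunking, for illustration only. The service itself is written in Rust, and its actual boundary rules and chunk size are not documented here; `chunk_sentences` and `max_chars` are hypothetical names.

```python
import re

def chunk_sentences(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars characters."""
    # Naive boundary rule: split after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a sentence boundary
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A sentence longer than `max_chars` is kept whole rather than cut mid-sentence, which is one possible design choice among several.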
FST-BM25 Hybrid Search
Reciprocal Rank Fusion (RRF) blends dense vector rankings with sparse lexical (BM25) rankings. FST dictionary scanning provides microsecond-scale query guardrails and intent-entity score boosting.
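Reciprocal Rank Fusion can be sketched in a few lines of Python. This is an illustration of the general technique, not the service's code: the constant `k=60` is the value commonly used in the RRF literature and is assumed here, and the chunk ids are made up.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

dense = ["c2", "c1", "c3"]    # hypothetical order from vector similarity
lexical = ["c1", "c4", "c2"]  # hypothetical order from BM25
fused = rrf_fuse([dense, lexical])
```

Because RRF uses only ranks, the cosine and BM25 scores never need to be normalized onto a common scale.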
Cognitive GhostAgent
Turn-based agent loop with integrated memory search, document ingestion, session indexing, and sandboxed command execution. Powered by OpenAI/Anthropic.
Native MCP Server
Complete Model Context Protocol HTTP/SSE server endpoint (/mcp/sse) allowing immediate, zero-config integration with Claude Code, Antigravity, and Codex.
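For clients that talk to the endpoint by hand, the sketch below parses a `text/event-stream` payload and builds a JSON-RPC 2.0 request body for `/mcp/message`. It assumes the server follows the standard SSE framing; `parse_sse` and `jsonrpc_request` are hypothetical helper names, and the events this server actually emits are not documented here.

```python
import json

def parse_sse(stream_text):
    """Parse a text/event-stream payload into a list of (event, data) tuples."""
    events, event_name, data_lines = [], "message", []
    for line in stream_text.splitlines():
        if line == "":  # a blank line dispatches the pending event
            if data_lines:
                events.append((event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
        elif line.startswith(":"):  # comment or keep-alive line
            continue
        else:
            field, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)
    return events

def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request body as a JSON string."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)
```

An event is only emitted once its terminating blank line arrives, so a truncated stream yields no partial event.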
Crypto
Per-agent orthogonal matrix rotation on embeddings. Cosine similarity preserved under encryption. Keys are in-memory only.
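Why cosine similarity survives the transform: an orthogonal matrix preserves dot products and norms. The pure-Python sketch below demonstrates this with a Householder reflection as a stand-in orthogonal matrix. How the service actually derives and applies its per-agent key is not documented here, so `make_key`, `transform`, and the seed-based derivation are assumptions.

```python
import math
import random

def make_key(dim, seed):
    """Derive a unit vector from a seed; it defines a Householder reflection."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def transform(vec, key):
    """Apply H = I - 2 * key * key^T. H is orthogonal and its own inverse."""
    dot = sum(a * b for a, b in zip(vec, key))
    return [a - 2.0 * dot * b for a, b in zip(vec, key)]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

With this particular stand-in, applying `transform` twice with the same key returns the original vector, so the same function serves for both directions. A general orthogonal matrix would use its transpose to decrypt.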
Dual embedding
The organize role uses GTR-T5-base (768d). The retrieve role uses OpenAI text-embedding-ada-002 (1536d); pass your own key per request or set one server-side.
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Status, model info, live counts |
| GET | /mcp/sse | MCP Server SSE handshake |
| POST | /mcp/message | MCP Server JSON-RPC message router |
| POST | /sessions/:id/agent/chat | Stream non-blocking GhostAgent cognitive turns (SSE) |
| POST | /sessions/:id/ingest | Chunk + embed text into session |
| GET | /sessions/:id/search?q=... | Semantic search (supports RRF hybrid & lexical_only) |
| GET | /sessions/:id | Session metadata |
| DELETE | /sessions/:id | Delete session |
| GET | /temp | List temp stores with TTL |
| POST | /temp/:name/ingest | Ingest into temp store (2 hr TTL) |
| GET | /temp/:name/search?q=... | Search temp store |
| DELETE | /temp/:name | Delete temp store |
| POST | /agent/:id/register | Register per-agent orthogonal key |
| POST | /agent/:id/encrypt | Encrypt embeddings |
| POST | /agent/:id/decrypt | Decrypt embeddings |
| POST | /invert | Reconstruct text from embedding vector |
# Ingest into session
curl -X POST https://shivvr.nuts.services/sessions/my-session/ingest \
-H "Content-Type: application/json" \
-d '{"text": "Supreme Raven is protected by Known Opossum.", "source": "vault_specs"}'
# Autonomous Agent Conversational Chat (Streams Thoughts, ToolCalls, & Answer via SSE)
curl -i -X POST http://localhost:8085/sessions/my-session/agent/chat \
-H "Content-Type: application/json" \
-d '{"message": "Who protects the Supreme Raven?"}'
# Vector-Lexical Hybrid RRF Search
curl "http://localhost:8085/sessions/my-session/search?q=Known+Opossum&hybrid=true"
# High-speed Lexical-Only BM25 Search (Bypasses ONNX embedder)
curl "http://localhost:8085/sessions/my-session/search?q=Opossum&lexical_only=true"
# Synchronize Claude Code or Antigravity with shivvr's Native MCP Server
nemesis8 mcp add http://localhost:8085/mcp/sse
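The ingest and search calls above can also be built from Python with only the standard library. This is a sketch, assuming a local instance on port 8085 as in the curl examples; `BASE_URL`, `ingest_request`, and `search_url` are hypothetical names, and nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8085"  # assumption: local instance

def ingest_request(session_id, text, source):
    """Build (but do not send) a POST request for /sessions/:id/ingest."""
    body = json.dumps({"text": text, "source": source}).encode("utf-8")
    return Request(
        f"{BASE_URL}/sessions/{quote(session_id, safe='')}/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def search_url(session_id, query, **params):
    """Build the GET URL for /sessions/:id/search with extra query params."""
    query_string = urlencode({"q": query, **params})
    return f"{BASE_URL}/sessions/{quote(session_id, safe='')}/search?{query_string}"
```

For example, `search_url("my-session", "Known Opossum", hybrid="true")` reproduces the hybrid search URL from the curl example.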
| Param | Default | Description |
|---|---|---|
| q | required | Query text |
| n | 5 | Number of results |
| hybrid | false | Blend semantic vectors + BM25 scores (Reciprocal Rank Fusion) |
| lexical_only | false | Bypass vector embedder, execute pure BM25 search |
| guardrail | true | Enable FST toxic term scanning and automatic query blocking |
| role | organize | organize (768d local) or retrieve (1536d OpenAI) |
| time_weight | 0.0 | Blend semantic + recency score (0–1) |
| decay_halflife_hours | 168 | Recency decay half-life in hours |
| include_nearby | false | Return temporally adjacent chunks |
| agent_id | (none) | Agent ID for encrypted search |
| openai_api_key | (none) | Per-request OpenAI key for retrieve role (overrides server key) |
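One way `time_weight` and `decay_halflife_hours` could interact is a linear blend of the semantic score with an exponentially decayed recency score. The formula below is an assumption for illustration, since the service's exact scoring is not documented here; `blended_score` is a hypothetical name.

```python
def blended_score(semantic, age_hours, time_weight=0.0, decay_halflife_hours=168.0):
    """Blend a semantic score with an exponentially decayed recency score."""
    # Recency is 1.0 for a brand-new chunk and halves every half-life.
    recency = 0.5 ** (age_hours / decay_halflife_hours)
    return (1.0 - time_weight) * semantic + time_weight * recency
```

Under this assumption, the default `time_weight` of 0.0 leaves ranking purely semantic, and a week-old chunk (168 hours) contributes a recency score of 0.5.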
| Variable | Default | Description |
|---|---|---|
| PORT | 8080 | Listen port |
| MODEL_PATH | models/gtr-t5-base.onnx | GTR-T5-base ONNX embedder |
| TOKENIZER_PATH | models/tokenizer.json | Tokenizer |
| OPENAI_API_KEY | (unset) | Enables OpenAI completions and retrieve embeddings |
| ANTHROPIC_API_KEY | (unset) | Enables Anthropic completions and GhostAgent loops |
| NUTS_AUTH_JWKS_URL | (unset) | Enable auth (open dev mode if unset) |
| NUTS_AUTH_VALIDATE_URL | https://auth.nuts.services/api/validate | API token validation endpoint |
| Layer | Choice |
|---|---|
| Runtime | Rust + Tokio + axum |
| Cognition | GhostAgent cognitive RAG turn loop (OpenAI / Anthropic compat) |
| MCP Server | HTTP/SSE JSON-RPC 2.0 Model Context Protocol transport layer |
| Hybrid Index | Tantivy FST deterministic phrase engine + BM25F field indexer |
| Embedding | GTR-T5-base (768d) via ONNX Runtime 2.0 — local, required |
| Storage | Ephemeral RwLock<HashMap> — no disk, no volume mounts |
| GPU | CUDA 12.6 via ort EP on Cloud Run L4 — CPU fallback automatic |
| Inversion | vec2text gtr-base (projection + T5 enc/dec) — optional |
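The ephemeral storage model, including the 2-hour TTL on temp stores, can be pictured with a small in-memory sketch. This is a Python illustration rather than the Rust implementation: it uses a plain mutex where the service uses an RwLock, and it assumes the TTL counts from store creation and is not refreshed on later ingests, which the source does not confirm. `TempStore` and its methods are hypothetical names.

```python
import threading
import time

class TempStore:
    """In-memory named stores whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=2 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._lock = threading.Lock()
        self._items = {}  # name -> (expires_at, chunks)

    def ingest(self, name, chunk):
        """Append a chunk, creating the store (and its deadline) if needed."""
        with self._lock:
            now = self._clock()
            expires_at, chunks = self._items.get(name, (now + self._ttl, []))
            if expires_at <= now:  # an expired store starts over
                expires_at, chunks = now + self._ttl, []
            chunks.append(chunk)
            self._items[name] = (expires_at, chunks)

    def list_stores(self):
        """Return {name: seconds_remaining}, dropping expired stores."""
        with self._lock:
            now = self._clock()
            self._items = {n: v for n, v in self._items.items() if v[0] > now}
            return {n: v[0] - now for n, v in self._items.items()}
```

Nothing here touches disk, so a process restart clears every store, matching the "no disk, no volume mounts" storage row.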