惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

GbyAI
GbyAI
L
LINUX DO - 热门话题
月光博客
月光博客
B
Blog
博客园 - 叶小钗
美团技术团队
D
Docker
A
About on SuperTechFans
Stack Overflow Blog
Stack Overflow Blog
酷 壳 – CoolShell
酷 壳 – CoolShell
WordPress大学
WordPress大学
P
Proofpoint News Feed
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
Y
Y Combinator Blog
V
V2EX
Apple Machine Learning Research
Apple Machine Learning Research
博客园 - 三生石上(FineUI控件)
The Register - Security
The Register - Security
博客园_首页
The Cloudflare Blog
I
InfoQ
T
Tailwind CSS Blog
MongoDB | Blog
MongoDB | Blog
Engineering at Meta
Engineering at Meta
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
Microsoft Azure Blog
Microsoft Azure Blog
有赞技术团队
有赞技术团队
C
CERT Recently Published Vulnerability Notes
AWS News Blog
AWS News Blog
Spread Privacy
Spread Privacy
V
Visual Studio Blog
博客园 - Franky
Cloudbric
Cloudbric
Help Net Security
Help Net Security
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
N
News and Events Feed by Topic
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
Webroot Blog
Webroot Blog
博客园 - 【当耐特】
TaoSecurity Blog
TaoSecurity Blog
B
Blog RSS Feed
N
News | PayPal Newsroom
人人都是产品经理
人人都是产品经理
H
Heimdal Security Blog
L
LangChain Blog
PCI Perspectives
PCI Perspectives
Jina AI
Jina AI
Google DeepMind News
Google DeepMind News
Schneier on Security
Schneier on Security

Show HN

CSP Radar GitHub - awebai/aweb-team-coord-worktrees: An aweb team template for a minimum team with a permanent coordinator and worktrees with local developers. GitHub - fujibee/agmsg GitHub - lucastononro/notify: 100% local, free, offline attention skill for Claude Code: plays a sound and speaks a short status update when a long task finishes, blocks, or needs a decision. GitHub - sebastianwessel/skills: AI Skills tivatdoar / workout-to-work · GitLab GitHub - enumura1/py-sql-cleaner: Find, format, and safely extract embedded SQL from Python files. GitHub - intent-bench/intent-bench: Intent fulfillment benchmark for agentic AI engineering GitHub - steveking-gh/firmion: Firmion is DSL and engine for firmware image generation. GitHub - villagesql/villagesql-skills: Agent skills for VillageSQL - gemini-cli-extension; claude-code-plugin GitHub - 0gsd/enough: a personal language system for planning, writing, and translation. GitHub - Kaelio/ktx: ktx is an executable context layer for data and analytics agents 🐙 Allow Claude Code, Codex, and any AI agent to query data accurately through MCP with skills, memory and a semantic layer GitHub - ThatXliner/xtras: Xliner's Claude Code Skills GitHub - flightdeckhq/flightdeck: Observability and control plane for AI agents. GitHub - search-router/simple-search: Open-source reference app on top of the Search Router API: FastAPI + Jinja metasearch service with pluggable backends, deterministic mocks (no API key needed), RTL UI, Redis cache, and a demo ads cabinet. CSP Radar GitHub - Light-Heart-Labs/DreamServer: Turn your PC, Mac, or Linux box into an AI server. LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. GitHub - Diplomat-ai/diplomat-agent-ts: What can your TypeScript AI agent do to the real world? Scan your code. See which tool calls have zero checks Code Block Selector - Visual Studio Marketplace Prometheus dependency graph — interactive showcase | Riftmap Show HN: I made a vi-like modal keyboard plugin for Figma GitHub - run-llama/liteparse: A fast, helpful, and open-source document parser GitHub - dalemyers/Roar: A macOS CLI tool for notifications GitHub - district-solutions/open-agent-tools-coder: Enables small-to-large self-hosted ai models to use local source code when running tool-calling agentic workloads. We actively data mine 20,900+ (2+ TB) popular github repos using large and small ai models to create reuseable: json, markdown and parquet files for local-first tool-calling models. GitHub - progapandist/stripeek: A local TUI proxy for real-time Stripe API debugging, built for navigating complex payloads fast. GitHub - sir1st/hermes-desktop: All-in-one cross-platform desktop app for Hermes Agent — bundles Python + hermes-agent + hermes-web-ui GitHub - astefanutti/shaderbang: Shebang for Shaders Show HN: Generate Claude Code Workflows using Spec Driven Development approach GitHub - nixys/nxs-universal-chart: The Helm chart you can use to install any of your applications into Kubernetes/OpenShift Show HN: AI agents for UK GDAD PCF roles and their skills The Two Pillars: Mixer Mode and Meta-Software in the Reorganization of Software Work After AI GitHub - JaiCode08/teleport-env What 1,000+ Harness Experiments Taught Me About Self-Improving Agents Show HN: Liiists, a Markdown-first, iOS and CLI list app SwiperTab – Get this Extension for 🦊 Firefox (en-US) GitHub - kouhxp/fftext: Summarize, explain, fact-check, or translate any text, URL, or file. No GPU. No cloud. One command GitHub - sweetpad-dev/sweetpad: Develop Swift/iOS projects using VSCode GitHub - dogmaticdev/IRON: IRON a.k.a. Intermediate Representation Object Notation is a Interpreter/Database that is used to create Programming Languages. GitHub - sjhalani7/vaen: Package your AI coding harness into a portable .agent file, and share it across repos, teams, & the community without ever having to copy-paste instructions, skills, MCP config, or secrets. Show HN: Gandalf the Grader Show HN: Citadeld – replay any CI failure locally from a single file GitHub - tdortman/cuSBF: High-Performance GPU Super Bloom Filter coral-ai/claude-code-token-xray at main · Coral-Bricks-AI/coral-ai GitHub - ulyssestenn/funes: Funes is a Git-based framework for LLM-managed knowledge work: an AI Librarian ingests raw sources, builds an interlinked Markdown knowledge base, and uses it to produce cited reports, analyses, and other outputs. GitHub - ThatXliner/gah: Git Add Hunk, built for agents to use GitHub - harmont-dev/harmont-cli: Command-line client for the Harmont CI platform GitHub - brooksmcmillin/mcp-authflow: OAuth 2.0 Authorization Server framework for MCP servers GitHub - javaid-codes/audit-supply-chain-agents GitHub - amorey/gochan: A small library of common channel architectures for Go, inspired by Rust GitHub - arifozgun/OpenGem: Free, Open-Source AI API Gateway with Gemini, OpenAI & Anthropic Compatibility in 1 file GitHub - Pranesh950/BioPetals: 🌸 Run BIOxAI models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading GitHub - cnguyen14/bounty-doctor: Diagnose a GitHub bounty issue before you waste hours: detects honeypot scam repos, AI-bot attempt swarms, and stale contests. Show HN: CoreMCP – MCP Server for On-Prem DBs Show HN: KittyHTML – Render HTML/CSS as an inline image in your terminal GitHub - bingud/filemat: Web-based file manager Show HN: TruthLens – Free multi-signal deepfake image detector GitHub - apexlocal-jz/claude-usage-tray: Windows system-tray app showing your Claude Code rate-limit usage at a glance. Zero deps, ~300 lines of PowerShell. Cross-IDE (works regardless of VS Code, Cursor, plain terminal). Release v0.1.2.1 · kouhxp/yapsnap GitHub - noopolis/moltnet: Self-hostable chat network for AI agents. Pre-built bridges for Claude Code, Codex, and the Claws. Rooms, DMs, history. No Slack bots, no Matrix, no glue code. GitHub - tamerh/enju: Coordinating Humans, AI Agents, and Compute as Peers on a Shared Workflow Graph Show HN: Continuity-auth – Respect-weighted rate limits for the open web GitHub - luml-ai/luml: AI lifecycle platform where engineers and agents track experiments, train models, and ship to production. GitHub - mrdanielcasper/CoreTex: A UNIX-inspired, biomimetic, flat-file AI harness and knowledge engine. GitHub - clemg/pierre-github: Pierre's diffs.com and trees.software for Github GitHub - lyriks-io/unspaghettit: Behavior-driven AI development without prompt spaghetti. GitHub - sofumel/claude-handoff-revive: Resume Claude Code work after rate/usage/context limits without replaying the prior transcript. Auto-saves at 90%/95% usage. Plugin-installable, 10 languages. GitHub - dotexorg/saferpc: Typed, end-to-end encrypted RPC over any bidirectional channel. GitHub - BeeZeeAgent/beezee: Agent harness orchestration Legato Next.js Boilerplate for Internal Tools · CoreUI GitHub - clark-labs-inc/clark-hash: Clark Hash, 32x smaller searchable sketches for embeddings GitHub - ZeroPointRepo/youtube-mcp: The fastest YouTube transcript + YouTube search MCP for AI agents. Try for free. Typing Mastery — climb toward 100+ WPM, deliberately GitHub - Andebugulin/Awareen GitHub - fayzan123/claude-workflow-composer: Visual desktop app for composing multi-agent coding workflows. Drag agents, attach skills and MCPs, wire handoffs, export to .claude/ GitHub - StackOneHQ/stack-nudge We hardened an LLM agent. Each defense we added made it more exploitable. GitHub - alkait/WhatsKept: Agent-queryable WhatsApp history from an iOS backup — a single Go binary. GitHub - octelium/cordium: Open-source, general-purpose sandbox platform for devs and AI agents that provides identity-based secure access to infrastructure without credentials. GitHub - scosman/videowright: Build animated explainer videos with your coding agent GitHub - dipankar/dscode: The code editor you can take apart. GitHub - zoharbabin/web-researcher-mcp: MCP server (Go) for AI assistants: web search, content extraction, academic/patent/news research. Multi-provider routing, 4-tier scraping, search lenses. Works with Claude, Cursor, and any MCP client. GitHub - scanaislop/aislop: Catch the slop AI coding agents leave in your code: narrative comments, swallowed exceptions, as-any casts, dead code, oversized functions. 50+ rules across 7 languages (TypeScript, JavaScript, Python, Go, Rust, Ruby, PHP). Sub-second, deterministic, no LLM at runtime. MIT-licensed. GitHub - kouhxp/cheap-im: CPU-only voice agent approximating Thinking Machines' Interaction Models demo GitHub - unprovable/OrchidMantis: Orchid Mantis — standalone framework for Zero-Knowledge Proofs of eXploit (ZKPoX). GitHub - TangibleResearch/Halgorithem: A Algo designed to detect AI Hallucitions GitHub - CarpseDeam/Aura-IDE: An AI coding harness that shaped itself - Planner/Worker agents, repo awareness, surgical edits, validation, recovery, and safe diff approvals. GitHub - chojs23/concord: A feature-rich TUI client for Discord GitHub - aerf-spec/aerf: Agent Evidence Receipt Format (AERF) — an open specification for tamper-evident, independently verifiable records of AI agent actions. GitHub - Jwrede/tokentoll: Catch LLM cost changes in code review. Infracost for LLM spend. GitHub - samchon/ttsc: A `typescript-go` toolchain for compiler-powered plugins and type-safe execution + 500x faster lint integrated into compiler GitHub - Higangssh/homebutler: 🏠 Manage your homelab from chat. Single binary, zero dependencies. GitHub - olalie/tapmap: See where your computer connects and what stands out on a live world map. GitHub - Diplomat-ai/diplomat-agent: What can your AI agent do to the real world? Scan your code. See which tool calls have zero checks GitHub - Bajusz15/beacon: Open-source agent for secure remote access, monitoring, and deploys across home-lab and self-hosted machines like Raspberry Pi, N100, or any Linux server. Open web based TTY or tunnel Home Assistant and other local services securely without opening ports. BigTech AI News - Chrome 应用商店 GitHub - vinhnx/VTCode: VT Code is an open-source coding agent with LLM-native code understanding and robust shell safety. Supports multiple LLM providers with automatic failover and efficient context management. GitHub - Lumen-Labs/brainapi2: BrainAPI is a knowledge graph–powered AI memory layer that transforms unstructured data into structured knowledge, enabling intelligent search, recommendations, and contextual memory for AI agents and applications. GitHub - familiar-software/familiar: Let AI watch you work. Familiar lets your AI update its memory, skills, and knowledge by watching your screen. make sidebar/address bar rounded corner toggleable
GitHub - zahraarmantech/ZATRON: Privacy-preserving semantic search via modular arithmetic. 98% retrieval quality, 8x faster than FHE. Patent pending.
zahraarman · 2026-06-01 · via Show HN

Zero-Access Transformed Retrieval Over Noise

Privacy-preserving semantic search via multi-channel modular arithmetic. Search sensitive documents by meaning without exposing content — not to the database, not to the server, not even to the key holder.

▶ Live Demo · GitHub · Patent Pending · US Provisional Patent Filed 2026


The Problem

Standard semantic search stores embeddings as plain vectors. Anyone with database access can cluster documents by topic and infer content — without reading a single word.

The Solution

ZATRON transforms embeddings into modular barcodes. Search still works. Structure disappears.

ZATRON Comparison

Left: raw embeddings — same-topic documents cluster together. An attacker immediately sees the structure.

Right: ZATRON protected — random scatter. No topic structure visible.


Attack Analysis

Can an observer recover document similarity from ZATRON barcodes?

Attack Analysis

Left: raw embedding distances perfectly correlate with true similarity (ρ = 1.00). Attacker wins.

Right: ZATRON barcode distances show zero correlation (ρ = 0.09). Attacker gets nothing.


Results

All numbers on real data. All reproducible.

Retrieval Quality

Benchmark Corpus Quality (% of cosine)
MSMARCO 626,906 passages 98.2%
Natural Questions 5,000 passages, 100 real queries 101.6%
SciFact 5,183 scientific docs 95.7%
NFCorpus 3,633 medical docs 89.9%
STS-B 1,379 sentence pairs 100.1%

Real-World Challenge Tests

Dataset What it is Security (ρ) Top-10 Accuracy
Enron Emails 350,652 real corporate emails 0.034 ✓ 98.0%
Natural Questions 5,000 passages, real Google queries 0.031 ✓ 100%
MSMARCO 626,906 web passages 0.10 ✓ 98.2% MRR

Enron Corporate Email Archive — Full Results

Privacy-preserving retrieval on 350,652 real corporate emails from the Enron leak. This is a stress test: corporate emails have extremely high content similarity (forwarded threads, replies, meeting updates), making retrieval harder than typical document search.

Cosine Baseline ZATRON Retention
Top-1 100% 55.5% 55.5%
Top-5 100% 95.0% 95.0%
Top-10 100% 98.0% 98.0%
MRR@10 100% 72.5% 72.5%
Security (ρ) 0.034 ✓ SAFE

The correct result appears in the top 10 results 98% of the time. Top-1 is lower due to near-duplicate emails swapping ranks — not information loss.

Comparison with Existing Methods

Method MRR@10 Encrypted?
Cosine (float32) .530 No
Binary quantization .514 No
Product quantization .520 No
ZATRON (ours) .528 Yes

8× faster than CKKS FHE on identical hardware (5ms vs 39ms per comparison).

Three embedding models. Five languages. Eight security tests passed.

Multilingual

Language Quality
Arabic 93.4%
Spanish 93.8%
Korean 95.1%
Chinese 95.4%
English 93.5%

Speed Optimization: Coarse-to-Fine Search

Brute-force comparison against all documents is slow at scale. We developed a two-stage approach inspired by radio tuning — scan coarsely first, then fine-tune on candidates:

Stage 1 (Coarse): Compute distance using only 2 of 6 primes across all documents. This is one-third the computation and filters out 95% of irrelevant documents.

Stage 2 (Fine): Compute full 6-prime distance on the remaining 5% of candidates.

Dataset Brute Force Radio Tuning Speedup Quality Loss
MSMARCO 626K 2399s 957s 2.5× Zero
Enron 5K 2.1s 0.8s 2.6× Zero

Top-10 accuracy: 100% on both datasets. The coarse filter never drops the correct result.

No stored index or bucket structure. Zero additional leakage — the optimization happens at query time by the key holder.

At 1M+ documents, the speedup grows to an estimated 5–10×, since the fine stage (5% of corpus) dominates.

Try It

Live demo (no install): https://huggingface.co/spaces/zahraarman/ZATRON

Run locally:

pip install sentence-transformers scikit-learn matplotlib
python demo.py

Quick Start

from zatron_search import ModularBarcodeSystem

system = ModularBarcodeSystem(key="your-secret-key", n_channels=200)
system.fit(corpus_embeddings)

barcodes = system.encode(corpus_embeddings, doc_ids)
query_bc = system.encode_query(query_embedding)
distance = system.compare(query_bc, barcodes[0])

How It Works

  1. Decompose: Project embedding onto 200 PCA channels
  2. Quantize: Convert each channel to integer (0–49)
  3. Mask: Apply rejection-sampled salt + wave interference per document
  4. Store: Keep only modular residues (mod prime)
  5. Search: Compare in modular space — raw embedding never reconstructed

Security

Eight independent attack vectors tested:

Attack Result Status
IND-CPA indistinguishability p = 0.48 Pass
Statistical correlation ρ = 0.10 Pass
Entropy analysis 100% Pass
Per-channel leakage |r| = 0.30 Pass
Key recovery 1.0% vs 1.9% baseline Pass
Chosen-plaintext ρ = 0.00 Pass
Timing side-channel p = 1.00 Pass
CRT reconstruction |r| = 0.01 Pass

Threat model: Protected against unauthorized database observers. The key holder computes distances but never reconstructs raw embeddings. This is a randomized privacy-preserving encoding, distinct from reversible block cipher encryption.

Formal proofs under PRF assumption (HMAC-SHA256) in paper/Formal_Security_Proof.pdf.

Project Structure

ZATRON/
├── README.md
├── zatron_search.py              # Core system (self-testing)
├── demo.py                       # One-command demo
├── generate_visuals.py           # Generate comparison images
├── zatron_comparison.png         # t-SNE visualization
├── zatron_attack.png             # Attack analysis visualization
├── demo/
│   └── encrypted_search_demo.jsx # Interactive web demo
├── paper/
│   ├── Lightweight_Encrypted_Semantic_Search.pdf
│   └── Formal_Security_Proof.pdf
└── LICENSE

Cite

@misc{arman2026zatron,
  title={Lightweight Encrypted Semantic Search via Multi-Channel Modular Signaling},
  author={Zahra Arman},
  year={2026},
  note={US Provisional Patent Filed. github.com/zahraarmantech/ZATRON}
}

License

MIT License. The method is covered by a pending US provisional patent.

Author

Zahra Arman — Independent Researcher — zahra.arman.tech@gmail.com