How I built AgentRAM: a memory API for AI agents without a vector DB
Sean Markwei · 2026-05-29 · via DEV Community

I'm a solo developer in Accra, Ghana, and I just shipped my first real product. It's called AgentRAM (agentram.dev), and it's a memory API for AI agents. This is the build story and the stack.

The problem I kept seeing

Over the last year, AI agents have gone from research toys to actual things people ship. But every agent that needs to remember anything across sessions runs into the same wall: where does the memory go?

The existing answers all felt heavy for what they were doing:

  • Mem0, Zep, and Letta want you to set up embedding pipelines and vector databases. That's powerful for RAG-style semantic search, but overkill if you just need "remember that user X likes dark mode."
  • OpenAI's Assistants API memory is locked to their platform and billed per-token, which means costs are unpredictable as conversation length grows.
  • Rolling your own with Postgres or Redis works, but it's a real chunk of infrastructure to maintain for each agent project, including auth, multi-tenancy, TTLs, and an HTTP layer.

I wanted something that handled the 70% case ("remember this fact about this agent") without the 100% solution's setup cost. So I built it.

What AgentRAM actually does

One HTTP call to store. One to retrieve. Scoped by agent ID, with optional TTLs and shared namespaces.

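# Store a memory (JSON body, so declare the content type)
curl -X POST https://api.agentram.dev/memory \
  -H "x-api-key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agent_id":"my-agent","key":"user_pref","value":"dark mode"}'

# Retrieve it (same API key header as the write)
curl "https://api.agentram.dev/memory?agent_id=my-agent&key=user_pref" \
  -H "x-api-key: YOUR_KEY"
# {"value":"dark mode"}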

That's the whole interaction model. No embeddings, no vector similarity, no semantic chunking, no token accounting. Just durable key-value memory scoped per agent.
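For anyone who would rather not shell out to curl, here is a minimal Python sketch of the same two calls using only the standard library. The URL, the `x-api-key` header, and the JSON field names come from the curl examples; the `Content-Type` header, sending the key on reads, and the function names are my assumptions, not a published SDK.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.agentram.dev"  # host taken from the curl examples


def build_store_request(api_key: str, agent_id: str, key: str, value: str) -> urllib.request.Request:
    """Build the POST /memory request. Nothing is sent until send() is called."""
    body = json.dumps({"agent_id": agent_id, "key": key, "value": value}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/memory",
        data=body,
        method="POST",
        # Content-Type is an assumption: JSON APIs usually require it.
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


def build_retrieve_request(api_key: str, agent_id: str, key: str) -> urllib.request.Request:
    """Build the GET /memory request with agent_id and key as query parameters."""
    query = urllib.parse.urlencode({"agent_id": agent_id, "key": key})
    return urllib.request.Request(
        f"{BASE_URL}/memory?{query}",
        method="GET",
        # Assumption: reads are authenticated with the same header as writes.
        headers={"x-api-key": api_key},
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from `send` means the payloads can be inspected or unit-tested without touching the network or spending credits.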

Other endpoints fill in the practical needs: list all memories for an agent, full-text search across them, shared namespaces so multiple agents can read from a common pool, and atomic credit-based usage tracking so cost is predictable.
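To make the data model concrete, here is a toy in-memory version of the semantics described above: values scoped by agent ID (or by a shared namespace name), optional TTLs, listing, and search. This is a mental model only, not the service's code. The class name, the method names, and the `shared:` prefix convention are all hypothetical, and the substring match is a stand-in for real full-text search.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class MemoryStore:
    """Toy model of agent-scoped key-value memory with optional TTLs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        # (scope, key) -> (value, expiry timestamp or None for "never expires")
        self._items: Dict[Tuple[str, str], Tuple[str, Optional[float]]] = {}

    def put(self, scope: str, key: str, value: str, ttl_seconds: Optional[float] = None) -> None:
        """Store a value under an agent ID or a shared namespace name."""
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._items[(scope, key)] = (value, expires_at)

    def get(self, scope: str, key: str) -> Optional[str]:
        """Return the value, or None if it is missing or has expired."""
        item = self._items.get((scope, key))
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[(scope, key)]  # lazily evict expired entries
            return None
        return value

    def list_keys(self, scope: str) -> List[str]:
        """List the live (non-expired) keys in one scope, sorted by name."""
        candidates = [k for (s, k) in list(self._items) if s == scope]
        return sorted(k for k in candidates if self.get(scope, k) is not None)

    def search(self, scope: str, term: str) -> List[str]:
        """Return keys whose value contains the term, ignoring case."""
        needle = term.lower()
        return [k for k in self.list_keys(scope) if needle in (self.get(scope, k) or "").lower()]
```

In this sketch a shared namespace is just another scope string (for example `"shared:team"`) that several agents agree to read from, which is one simple way to get the "common pool" behaviour.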

Why not just Redis or Postgres?

This is the question I keep wrestling with, so let me be honest about it.

If you're already running infrastructure for your product, you should absolutely just add a memory table to your existing database. AgentRAM isn't for you.
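For reference, the "just add a memory table" route can be this small. The sketch below uses SQLite from Python's standard library so it runs anywhere; the table name, column names, and function names are hypothetical, and a Postgres version would look much the same with `INSERT ... ON CONFLICT` in place of `INSERT OR REPLACE`.

```python
import sqlite3
import time
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS agent_memory (
    agent_id   TEXT NOT NULL,
    key        TEXT NOT NULL,
    value      TEXT NOT NULL,
    expires_at REAL,              -- Unix timestamp; NULL means "never expires"
    PRIMARY KEY (agent_id, key)
)
"""


def remember(conn: sqlite3.Connection, agent_id: str, key: str, value: str,
             ttl_seconds: Optional[float] = None) -> None:
    """Insert or overwrite one memory for one agent."""
    expires_at = None if ttl_seconds is None else time.time() + ttl_seconds
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO agent_memory (agent_id, key, value, expires_at) "
            "VALUES (?, ?, ?, ?)",
            (agent_id, key, value, expires_at),
        )


def recall(conn: sqlite3.Connection, agent_id: str, key: str) -> Optional[str]:
    """Return the stored value, or None if it is missing or past its TTL."""
    row = conn.execute(
        "SELECT value FROM agent_memory "
        "WHERE agent_id = ? AND key = ? AND (expires_at IS NULL OR expires_at > ?)",
        (agent_id, key, time.time()),
    ).fetchone()
    return None if row is None else row[0]
```

To try it, open a connection with `sqlite3.connect(":memory:")`, run `conn.execute(SCHEMA)`, then call `remember` and `recall`. What this leaves out is exactly the list further down: auth, an HTTP layer, and multi-tenancy.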

But for everyone else, and there are a lot of "everyone else" right now in the vibe-coding and agent-prototyping era, AgentRAM removes some real friction:

  • No new infra to provision
  • No auth layer to write for your agents
  • No HTTP wrapper to build around your DB
  • No multi-tenant logic to get right
  • Built-in TTLs, search, and shared namespaces
  • Predictable per-operation pricing (no per-token surprises)

It's a Twilio-style argument: yes, you could roll your own SMS gateway, but for most people, paying per-message is cheaper than the time cost.

The stack

Nothing exotic, all standard boring tech:

  • API: Node.js with Express, deployed on Railway. Auto-deploys from GitHub.
  • Database: Supabase (Postgres), with atomic credit-update logic so concurrent payments and reads don't race.
  • Payments: Paystack. I'm in Ghana, where Stripe doesn't operate yet, although Stripe acquired Paystack in 2020. Paystack handles cards globally plus mobile money and Apple Pay, so coverage is actually broader than Stripe alone for some users.
  • Email: Resend, with Cloudflare Email Routing for inbound on hello@agentram.dev.
  • DNS: Cloudflare, with WAF and rate limiting.
  • Frontend: Static HTML on Netlify, with Cabinet Grotesk self-hosted. No framework, no build step, just hand-written HTML and CSS.

The whole thing is six HTML pages, one Node server, one Supabase project. It's not a lot. That's intentional.
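On the "atomic credit-update" point: one common way to keep concurrent charges and top-ups from racing is to do the balance check and the decrement in a single SQL statement, so there is no read-then-write gap. The sketch below shows that pattern with SQLite standing in for Postgres. The `accounts` table, its columns, and the function names are hypothetical, and this is not necessarily how AgentRAM implements it.

```python
import sqlite3


def spend_credit(conn: sqlite3.Connection, account_id: str, cost: int = 1) -> bool:
    """Deduct credits only if the balance covers the cost.

    The check (credits >= cost) and the decrement happen in one UPDATE,
    so two concurrent callers cannot both spend the last credit.
    """
    with conn:  # one transaction: commit on success, roll back on error
        cursor = conn.execute(
            "UPDATE accounts SET credits = credits - ? WHERE id = ? AND credits >= ?",
            (cost, account_id, cost),
        )
    # rowcount is 1 when the row matched and was updated, 0 when the balance was too low.
    return cursor.rowcount == 1


def add_credits(conn: sqlite3.Connection, account_id: str, amount: int) -> None:
    """Apply a top-up as a relative increment rather than writing an absolute balance."""
    with conn:
        conn.execute(
            "UPDATE accounts SET credits = credits + ? WHERE id = ?",
            (amount, account_id),
        )
```

Because both functions use relative updates (`credits - ?`, `credits + ?`), a payment that lands in the middle of a burst of API calls cannot overwrite the deductions that happened around it.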

The pricing model: credits, not subscriptions

After agonising over this, I went with credit-based pricing instead of a monthly subscription.

  • 100 free credits on signup, no card required
  • 1 credit per operation (read, write, delete, search all count as one)
  • Top-ups: $5 for 50,000 ops, $15 for 200,000 ops, $40 for 600,000 ops
  • Founding member tier: $249 one-time for 500,000 ops plus 20% off all future top-ups, for as long as the account is active

Why credits over subscriptions:

  1. Aligns with how AI agent usage actually varies (bursty, unpredictable)
  2. No "wasted" subscription months for users who weren't building that month
  3. No churn anxiety on my side
  4. The unit price is easy to reason about: 1¢ per 100 operations at the $5 Starter top-up ($5 / 50,000 ops)

The downside is it's slightly weirder to project revenue against. But I'd rather have a model my users actually feel good about.
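The unit prices are easy to check by hand, but here is the arithmetic as a short script. The tier prices and operation counts are the ones listed above; applying the founding-member 20% directly to the top-up price is my reading of "20% off all future top-ups", so treat that part as an assumption.

```python
from fractions import Fraction

# Top-up tiers from the pricing list: price in dollars -> operations included.
TOP_UPS = {5: 50_000, 15: 200_000, 40: 600_000}


def cents_per_100_ops(price_dollars: int, operations: int) -> Fraction:
    """Exact cost in cents of 100 operations at a given tier."""
    return Fraction(price_dollars * 100 * 100, operations)


def discounted_price(price_dollars: int, discount_percent: int = 20) -> Fraction:
    """Top-up price in dollars after a percentage discount (assumed to apply to the price)."""
    return Fraction(price_dollars * (100 - discount_percent), 100)


for price, ops in TOP_UPS.items():
    rate = float(cents_per_100_ops(price, ops))
    founding = float(discounted_price(price))
    print(f"${price} for {ops:,} ops: {rate:.3f} cents per 100 ops "
          f"(${founding:.2f} with the founding discount)")
```

The $5 tier works out to exactly 1¢ per 100 operations, and the larger tiers come in below that.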

The "did it actually work" moment

I shipped this today. The full deployment took about a day, including:

  • Pushing the API to Railway
  • Pointing api.agentram.dev at Railway via Cloudflare CNAME
  • Deploying the static site to Netlify
  • Pointing agentram.dev at Netlify via Cloudflare CNAME
  • Verifying Let's Encrypt SSL provisioned on both domains

Then I made my first real $5 test charge to my own account. Watched the credit count tick from 100 to 50,100. Confirmation email landed. The whole pipeline worked end to end.

That's the moment that justifies all the work that came before.

What I'm building next

The build is done. Now the harder thing: distribution. Things I'm planning over the next few weeks:

  • MCP server wrapper. Anthropic's Model Context Protocol is becoming the standard for how AI tools discover and use external services. An MCP server for AgentRAM means Claude Desktop and Cline users can add persistent memory with one config line.
  • LangChain memory backend. Implement BaseMemory so AgentRAM works as a drop-in memory layer for any LangChain agent.
  • LlamaIndex memory module and AutoGen / CrewAI integrations for the same reason.
  • Official SDKs for Python and TypeScript so the curl examples become idiomatic library calls.
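As a rough idea of what the LangChain integration involves, here is a sketch of an adapter exposing the four members LangChain's `BaseMemory` defines (`memory_variables`, `load_memory_variables`, `save_context`, `clear`). It deliberately does not import LangChain, and it talks to an in-memory `DictStore` standing in for the remote API, so every class name, the `history` key, and the `input`/`output` dictionary keys are assumptions for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple


class DictStore:
    """In-memory stand-in for a remote key-value memory service."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], str] = {}

    def get(self, agent_id: str, key: str) -> Optional[str]:
        return self._data.get((agent_id, key))

    def set(self, agent_id: str, key: str, value: str) -> None:
        self._data[(agent_id, key)] = value

    def delete(self, agent_id: str, key: str) -> None:
        self._data.pop((agent_id, key), None)


class AgentMemory:
    """Mirrors the method names of LangChain's BaseMemory without subclassing it."""

    def __init__(self, store: DictStore, agent_id: str, memory_key: str = "history") -> None:
        self.store = store
        self.agent_id = agent_id
        self.memory_key = memory_key

    @property
    def memory_variables(self) -> List[str]:
        """Names of the variables this memory injects into the prompt."""
        return [self.memory_key]

    def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """Fetch the stored transcript for this agent (empty string if none)."""
        return {self.memory_key: self.store.get(self.agent_id, self.memory_key) or ""}

    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        """Append one human/AI exchange to the stored transcript."""
        previous = self.store.get(self.agent_id, self.memory_key) or ""
        turn = f"Human: {inputs['input']}\nAI: {outputs['output']}"
        self.store.set(self.agent_id, self.memory_key, f"{previous}\n{turn}".strip())

    def clear(self) -> None:
        """Forget everything stored for this agent under the memory key."""
        self.store.delete(self.agent_id, self.memory_key)
```

Swapping `DictStore` for a thin HTTP client with the same `get`/`set`/`delete` methods is the part a real integration would need to fill in.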

What I'd love feedback on

I mean this genuinely, not as a marketing-friendly closer. If you've built agents that need memory, I want to know:

  • Does the API surface feel complete enough, or is something missing?
  • What would make you pick this over rolling your own with Postgres?
  • Which integration would actually move the needle for you?
  • What would make you suspicious of this as a solo dev's project?

agentram.dev if you want to poke at it. 100 free credits if you want to try the API. Comments welcome here, or email me directly at hello@agentram.dev.