📎 Paperclip Deep Dive 🤖: A Build Guide for an "AI Company" 🏢 Control Plane
Truong Phung · 2026-04-30 · via DEV Community

Source: github.com/paperclipai/paperclip ("Open-source orchestration for zero-human companies.")

This guide distills the architecture, principles, and engineering choices behind Paperclip into an actionable blueprint you can use to build a similar system. It is written so you can read it top-to-bottom and walk away with a concrete plan.


Table of Contents

  1. 🤖 What Paperclip Actually Is
  2. 🧠 Core Mental Model: Control Plane, Not Framework
  3. The 10 Design Principles
  4. 🏗️ High-Level Architecture
  5. 🗃️ The Domain Model: How "A Company" Maps to Tables
  6. 💚 The Heartbeat: The Heart of the Runtime
  7. 🔌 Adapters: "Bring Your Own Agent"
  8. ✅ The Task System & Atomic Checkout
  9. ⚖️ Governance, Approvals & The Board
  10. 💰 Budgets & Cost Control
  11. 🧩 Plugin System: Capability-Gated Extensions
  12. 📡 MCP Server: Agents Talk to the API
  13. 🎓 Skills: Teaching Agents the API
  14. ⚙️ Tech Stack & Repository Layout
  15. REST API Surface
  16. 🔒 Multi-Company Isolation & Portability
  17. 📋 Audit Trail & Activity Log
  18. Engineering Conventions
  19. 🗺️ Step-by-Step Build Plan
  20. ⚠️ Pitfalls, Tradeoffs & What To Skip First

🤖 1. What Paperclip Actually Is

Paperclip is a Node.js + React self-hosted application that lets you run a "company" of AI agents:

  • You define a company with goals/initiatives.
  • You hire agents (Claude Code, Codex, Cursor, custom CLI, HTTP bot; you pick the runtime).
  • You assign tasks (issues) and budgets.
  • A board operator (human) approves hires, strategic plans, and budget overrides.
  • A scheduler runs each agent on a heartbeat (a short execution window) and tracks cost, status, tool calls, and outputs.

The Paperclip slogan: "If OpenClaw is an employee, Paperclip is the company."

It looks like a task manager (Linear/Jira), but underneath it is an org chart, a budget engine, an approval queue, a multi-runtime executor, and an audit log, all designed for non-human workers.


🧠 2. Core Mental Model: Control Plane, Not Framework

This is the most important idea to internalize before building anything.

| Agent Framework (LangGraph, CrewAI…) | Control Plane (Paperclip) |
| --- | --- |
| Decides how an agent thinks | Decides what an agent works on |
| Owns the prompt + tool loop | Treats the agent loop as a black box |
| One process, in-memory | Many processes, durable state |
| You ship code | You ship a deployment |

Concrete consequences for design:

  • The system never runs a "react+plan+act" loop itself. That is the adapter's job.
  • The system does own: identity, scheduling, task ownership, cost ledger, approvals, audit, persistence.
  • The contract with an agent is shockingly small: "I can invoke you, get status, and cancel you."

If you start building a Paperclip-like system and find yourself writing prompt templates or tool-call parsers in the core, you have drifted into framework territory. Pull back.


3. The 10 Design Principles

Lifted (and de-jargoned) from the spec:

  1. Unopinionated execution. The core does not care which model, prompt, or planner an agent uses. It launches a process and waits.
  2. Task-centric communication. Agents do not talk to each other directly. Delegation = task creation. Coordination = task comments. Status = field updates. This makes everything observable and replayable.
  3. Goal-traced work. Every task descends from a company initiative: Initiative → Project → Milestone → Issue → Sub-issue. No orphan work.
  4. Atomic task ownership. A task can be owned by exactly one agent at a time, enforced at the database layer (not in app code).
  5. Visible problem surfacing. Agents that get stuck must mark issues blocked and escalate. Silent retries are an anti-pattern.
  6. Human board authority. Every irreversible or high-risk action (hiring, big-spend, strategy approval, termination) requires a human approval record.
  7. Cost follows work. Costs are billed against the requesting task chain, not just the executing agent. This makes "who is expensive and why" answerable.
  8. Hard budget ceilings. Soft alert at 80%. At 100%, the agent is auto-paused and further invocations are blocked. No "best-effort."
  9. Progressive deployment. It must run on a laptop with embedded Postgres, then scale to self-hosted / cloud with the same code and the same schema.
  10. Plugin-extensible, not fork-extensible. Capabilities the core doesn't ship come from out-of-process plugins with declared, gated capabilities.

When you design your system, keep this list visible and bounce every PR against it.


🏗️ 4. High-Level Architecture

                            ┌────────────────────────────┐
                            │       React UI (Vite)      │
                            │  Org chart · Tasks · Costs │
                            └──────────────┬─────────────┘
                                           │ REST + SSE
                                           ▼
┌──────────────────────────────────────────────────────────────────┐
│                    Node.js Server (TypeScript / Express)         │
│                                                                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────┐  │
│  │  REST API   │  │  Scheduler  │  │  Approvals  │  │ Plugins │  │
│  │ (handlers)  │  │ (heartbeat) │  │   engine    │  │  host   │  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └────┬────┘  │
│         │                │                │              │       │
│         └────────────────┼────────────────┴──────────────┘       │
│                          ▼                                       │
│                 ┌──────────────────┐    ┌──────────────────┐     │
│                 │   Adapter Mgr    │───▶│   Agent runtime  │     │
│                 │ (claude_local,   │    │ (child process / │     │
│                 │  codex_local,    │    │  HTTP webhook)   │     │
│                 │  http, process)  │    └──────────────────┘     │
│                 └──────────────────┘                             │
└──────────────────────────┬───────────────────────────────────────┘
                           │
                           ▼
              ┌──────────────────────────┐
              │  PostgreSQL (or PGlite)  │
              │  companies · agents ·    │
              │  issues · heartbeats ·   │
              │  costs · approvals ·     │
              │  activity_log            │
              └──────────────────────────┘

      Sidecar (optional):
      ┌───────────────────────────┐
      │   MCP server (thin REST   │  ◀─── agents call here to read/write Paperclip
      │       wrapper)            │
      └───────────────────────────┘


The 12 subsystems the spec calls out (this is the checklist for "feature complete v1"):

  1. Identity & Access
  2. Org Chart & Agents
  3. Work & Task System
  4. Heartbeat Execution
  5. Workspaces & Runtime
  6. Governance & Approvals
  7. Budget & Cost Control
  8. Routines & Schedules
  9. Plugins
  10. Secrets & Storage
  11. Activity & Events
  12. Company Portability (export/import)

🗃️ 5. The Domain Model

This is where most of the cleverness lives. The schema is small but every column matters.

🏢 Companies

companies(
  id uuid pk,
  name, description, status (active|paused|archived),
  pause_reason, paused_at,
  issue_prefix text not null,        -- e.g. "ACME"
  issue_counter int not null,        -- monotonic, used for ACME-123
  budget_monthly_cents int default 0,
  spent_monthly_cents int default 0,
  attachment_max_bytes,
  require_board_approval_for_new_agents bool
)


Why an issue_prefix + issue_counter? So tasks have human-friendly IDs (ACME-42) that are stable, sortable, and unique per company without leaking other tenants' counts.
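As a rough illustration of that counter, here is a minimal TypeScript sketch. `CompanyRow` and `nextIdentifier` are hypothetical names, not Paperclip's actual code, and the in-memory increment stands in for what would presumably be a single atomic UPDATE in the database.

```typescript
// Hypothetical in-memory stand-in for one row of the companies table.
interface CompanyRow {
  issuePrefix: string;  // mirrors companies.issue_prefix
  issueCounter: number; // mirrors companies.issue_counter
}

// Bump the per-company counter and format the human-friendly identifier.
// Assumption: a real system would do the increment atomically in the database.
function nextIdentifier(company: CompanyRow): string {
  company.issueCounter += 1;
  return `${company.issuePrefix}-${company.issueCounter}`;
}

const acme: CompanyRow = { issuePrefix: "ACME", issueCounter: 41 };
const globex: CompanyRow = { issuePrefix: "GLOBEX", issueCounter: 0 };

console.log(nextIdentifier(acme));   // ACME-42
console.log(nextIdentifier(globex)); // GLOBEX-1
console.log(nextIdentifier(acme));   // ACME-43
```

Because each company carries its own counter, one tenant's identifiers reveal nothing about how many issues another tenant has created.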

🤖 Agents

agents(
  id, company_id, name, role, title, icon,
  status (active|paused|idle|running|error|pending_approval|terminated),
  reports_to uuid → agents.id null,            -- the org chart edge
  capabilities text,
  adapter_type text,                           -- claude_local | codex_local | http | ...
  adapter_config jsonb,                        -- adapter-specific
  runtime_config jsonb default {},             -- timeouts, cwd, env
  default_environment_id,
  context_mode (thin|fat) default thin,
  budget_monthly_cents int default 0,
  spent_monthly_cents int default 0
)


Why adapter_type + adapter_config (jsonb)? Lets you support N agent runtimes without N tables. The polymorphism lives in code (the adapter manager) and JSON, not in DDL.

Issues (tasks)

issues(
  id, company_id, project_id, goal_id, parent_id,
  title, description,
  status (backlog|todo|in_progress|in_review|done|blocked|cancelled),
  priority (critical|high|medium|low),
  assignee_agent_id, assignee_user_id,

  -- Atomic checkout fields:
  checkout_run_id, execution_run_id,
  execution_agent_name_key, execution_locked_at,

  -- Provenance:
  created_by_agent_id, created_by_user_id,
  issue_number, identifier,                    -- e.g. ACME-42
  origin_kind, origin_id, origin_run_id, origin_fingerprint,
  request_depth int default 0,                 -- how deep the delegation chain is
  billing_code text                            -- "cost follows work"
)


💚 Heartbeat runs (one row per execution window)

heartbeat_runs(
  id, company_id, agent_id,
  invocation_source (scheduler|manual|callback),
  status (queued|running|succeeded|failed|cancelled|timed_out),
  started_at, finished_at, error,
  external_run_id text,                        -- adapter's run id, for resume
  context_snapshot jsonb                       -- what was passed in
)


💰 Cost events (the ledger)

cost_events(
  id, company_id, agent_id, issue_id, project_id, goal_id,
  billing_code,
  provider text, model text,
  input_tokens, output_tokens, cost_cents,
  occurred_at
)


⚖️ Approvals (governance queue)

approvals(
  id, company_id,
  type (hire_agent|approve_ceo_strategy|budget_override_required|request_board_approval),
  requested_by_agent_id, requested_by_user_id,
  status (pending|revision_requested|approved|rejected|cancelled),
  payload jsonb,                               -- the proposed change
  decision_note, decided_by_user_id, decided_at
)


📋 Activity log (the audit tape)

activity_log(
  id, company_id,
  actor_type (agent|user|system), actor_id,
  action text,                                 -- "issue.checked_out"
  entity_type, entity_id,
  details jsonb,
  created_at
)


Indexes that matter (don't skip)

agents(company_id, status)
agents(company_id, reports_to)                   -- org-chart traversal
issues(company_id, status)
issues(company_id, assignee_agent_id, status)    -- "what's on my plate"
issues(company_id, parent_id)                    -- subtasks
issues(company_id, project_id)
cost_events(company_id, occurred_at)
cost_events(company_id, agent_id, occurred_at)   -- per-agent rollups
heartbeat_runs(company_id, agent_id, started_at desc)
approvals(company_id, status, type)
activity_log(company_id, created_at desc)


Lesson: every index starts with company_id. Tenant isolation is a query-plan concern, not just an auth concern.


💚 6. The Heartbeat

The heartbeat is the runtime kernel. Everything else is plumbing around it.

🔄 Lifecycle of a single tick

1. Scheduler decides "agent A should run now"
       ↓
2. Insert heartbeat_runs row (status=queued)
       ↓
3. Adapter manager looks up agents.adapter_type
       ↓
4. Adapter.invoke(agentConfig, context):
        - Build prompt/context
        - Spawn child process OR fire HTTP webhook
        - Pass session_id from previous run if resumable
       ↓
5. Stream logs, status, tool calls back into the run row
       ↓
6. Wait until: exit | timeout | cancel
        - On timeout: send stop signal, wait graceSec, force-kill
       ↓
7. Persist: token usage, cost_events rows, output snippet, error
       ↓
8. Update heartbeat_runs (status=succeeded|failed|timed_out)
       ↓
9. Emit activity_log entry; broadcast SSE to UI


⚡ Wakeup triggers (only four)

| Trigger | Meaning |
| --- | --- |
| timer | Cron-like, e.g. "every 5 minutes" |
| assignment | A new task was checked out to this agent |
| on_demand | Human or API pressed the "Run now" button |
| automation | System-internal trigger (future) |

Coalescing

"If an agent is already running, new wakeups are merged (coalesced) instead of launching duplicate runs."

This rule alone prevents 90% of the duplicate-spend bugs you'd otherwise hit.
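The coalescing rule can be sketched in a few lines of TypeScript. This is a minimal sketch, not Paperclip's scheduler: `AgentRunState`, `requestWakeup`, and `finishRun` are hypothetical names, and running one follow-up tick after coalesced wakeups is an assumption rather than documented behavior.

```typescript
type WakeupTrigger = "timer" | "assignment" | "on_demand" | "automation";

// Hypothetical per-agent scheduler state; not Paperclip's actual structure.
interface AgentRunState {
  running: boolean;
  pendingTriggers: Set<WakeupTrigger>; // wakeups merged while a run is active
}

// Returns true when a new run should start, false when the wakeup was coalesced.
function requestWakeup(state: AgentRunState, trigger: WakeupTrigger): boolean {
  if (state.running) {
    state.pendingTriggers.add(trigger); // merge instead of launching a duplicate
    return false;
  }
  state.running = true;
  return true;
}

// Called when a run ends. Returns true if one follow-up run should start to
// cover everything that arrived while the agent was busy (assumed policy).
function finishRun(state: AgentRunState): boolean {
  state.running = false;
  const hasPending = state.pendingTriggers.size > 0;
  state.pendingTriggers.clear();
  if (hasPending) state.running = true;
  return hasPending;
}

const agentState: AgentRunState = { running: false, pendingTriggers: new Set() };
console.log(requestWakeup(agentState, "timer"));      // true: starts a run
console.log(requestWakeup(agentState, "assignment")); // false: coalesced
console.log(requestWakeup(agentState, "on_demand"));  // false: coalesced
console.log(finishRun(agentState));                   // true: one follow-up run
console.log(finishRun(agentState));                   // false: nothing pending
```

Three wakeups produce at most two runs here, no matter how many more arrive while the agent is busy.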

▶️ Session resumption

For adapters that support it (Claude CLI, Codex CLI), Paperclip stores the external_run_id / session ID in the heartbeat row. The next tick passes it back so the agent reloads its context. Operators can reset the session when context goes stale.
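One way to picture the lookup is the sketch below. `RunRow` and `resumableSession` are hypothetical names, and modeling an operator reset as a cutoff timestamp is an assumption made for illustration.

```typescript
// Subset of heartbeat_runs columns used for resumption (names are illustrative).
interface RunRow {
  agentId: string;
  startedAt: number;            // epoch millis
  externalRunId: string | null; // adapter's session id, if it reported one
}

// Pick the most recent session id for an agent, or null to start fresh.
// Runs where the adapter reported no session are skipped.
// Assumption: an operator "reset" is modeled as a cutoff timestamp (resetAt).
function resumableSession(runs: RunRow[], agentId: string, resetAt = 0): string | null {
  const candidates = runs
    .filter((r) => r.agentId === agentId && r.externalRunId !== null && r.startedAt > resetAt)
    .sort((a, b) => b.startedAt - a.startedAt); // newest first; filter() returned a copy
  return candidates[0]?.externalRunId ?? null;
}

const runs: RunRow[] = [
  { agentId: "a1", startedAt: 1000, externalRunId: "sess-old" },
  { agentId: "a1", startedAt: 2000, externalRunId: "sess-new" },
  { agentId: "a1", startedAt: 3000, externalRunId: null },
  { agentId: "a2", startedAt: 4000, externalRunId: "sess-other" },
];

console.log(resumableSession(runs, "a1"));       // sess-new
console.log(resumableSession(runs, "a1", 2500)); // null: session was reset at t=2500
```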

⚙️ Runtime config

runtime_config:
  cwd: /workspaces/acme-engineering
  timeoutSec: 1800        # max wall time per heartbeat
  graceSec: 30            # SIGTERM → SIGKILL window
  env:
    ANTHROPIC_API_KEY: ${secret:anthropic_key}
  promptTemplate: ...     # adapter-specific
  args: [...]


🛡️ Safety

"Local CLI adapters run unsandboxed on the host machine."

The spec is honest about this. Mitigations: per-agent OS user, restricted cwd, secrets managed by the host (not in prompts), and capability-gated plugins for anything the agent can't do directly.


🔌 7. Adapters: "Bring Your Own Agent"

The adapter is the only abstraction over agent runtimes. It is intentionally tiny.

interface Adapter {
  invoke(agentConfig: AgentConfig, context?: HeartbeatContext): Promise<RunHandle>;
  status(agentConfig: AgentConfig): Promise<AgentStatus>;
  cancel(agentConfig: AgentConfig): Promise<void>;
}


That's the whole contract. Three methods.

🔌 Built-in adapters

| Adapter | Mechanism |
| --- | --- |
| process | Spawns an arbitrary CLI as a child process |
| http | POSTs to a webhook; agent lives wherever it lives |
| claude_local | Claude Code CLI, supports session resume |
| codex_local | OpenAI Codex CLI |
| cursor | Cursor headless mode |
| gemini-local, pi_local, opencode-local, hermes_local | Other local CLIs |
| openclaw_gateway | Calls a managed cloud service |

🏆 Why this design wins

  • Adding an agent runtime is a self-contained PR. Drop a folder under packages/adapters/<name>/. No core changes.
  • Most adapters are 100 to 300 lines. They're mostly: spawn process, wire stdin/stdout, parse final JSON, report cost.
  • Polymorphism in JSON, not types. adapter_config jsonb lets each adapter define its own shape; the manager just passes it through.
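To make "polymorphism in JSON, not types" concrete, here is a minimal sketch of an adapter registry keyed by adapter_type. Everything in it is illustrative: `makeFakeAdapter`, `adapterRegistry`, and `resolveAdapter` are hypothetical names, the `AgentConfig` fields are assumed, the optional `context` parameter is omitted for brevity, and the fake adapter records state instead of spawning a process.

```typescript
// Simplified shapes. Field names on AgentConfig are assumptions for this sketch.
interface AgentConfig {
  adapterType: string;                    // mirrors agents.adapter_type
  adapterConfig: Record<string, unknown>; // mirrors agents.adapter_config (jsonb)
}
interface RunHandle { runId: string }
type AgentStatus = "idle" | "running";

interface Adapter {
  invoke(agentConfig: AgentConfig): Promise<RunHandle>;
  status(agentConfig: AgentConfig): Promise<AgentStatus>;
  cancel(agentConfig: AgentConfig): Promise<void>;
}

// Stand-in adapter: tracks state instead of spawning a process or calling a webhook.
function makeFakeAdapter(label: string): Adapter {
  let running = false;
  return {
    async invoke(agentConfig: AgentConfig): Promise<RunHandle> {
      running = true;
      // Only the adapter reads adapterConfig; the manager passes it through.
      const target = String(agentConfig.adapterConfig["target"] ?? "default");
      return { runId: `${label}:${target}` };
    },
    async status(): Promise<AgentStatus> {
      return running ? "running" : "idle";
    },
    async cancel(): Promise<void> {
      running = false;
    },
  };
}

const adapterRegistry = new Map<string, Adapter>([
  ["process", makeFakeAdapter("process")],
  ["http", makeFakeAdapter("http")],
]);

// The manager's only job in this sketch: map adapter_type to an implementation.
function resolveAdapter(agentConfig: AgentConfig): Adapter {
  const adapter = adapterRegistry.get(agentConfig.adapterType);
  if (adapter === undefined) {
    throw new Error(`unknown adapter_type: ${agentConfig.adapterType}`);
  }
  return adapter;
}

const webhookBot: AgentConfig = {
  adapterType: "http",
  adapterConfig: { target: "https://example.invalid/hook" },
};
resolveAdapter(webhookBot)
  .invoke(webhookBot)
  .then((run) => console.log(run.runId)); // http:https://example.invalid/hook
```

Adding a runtime in this shape means registering one more entry; nothing in `resolveAdapter` changes.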

📊 Integration levels (acceptable degrees of "support")

| Level | What the adapter does |
| --- | --- |
| Minimum | Callable; reports exit code |
| Status | Reports success/failure/progress |
| Full | Reports cost, updates tasks, calls back into Paperclip API |

You don't need full instrumentation on day one. A new adapter can land at "Minimum" and be useful.


✅ 8. Task System & Atomic Checkout

The task system is what stops two agents from doing the same work at the same time. It is the second-most-important runtime concept after the heartbeat.

🌲 Hierarchy

Initiative   (board-level direction, e.g. "Reach $1M ARR")
  └── Project          (e.g. "Self-serve checkout")
       └── Milestone   (e.g. "Public beta")
            └── Issue   (e.g. "Add Stripe webhook handler")
                 └── Sub-issue


Every task traces up to an initiative; no work is "for nothing."

Atomic checkout (the magic SQL)

// Request
POST /issues/:issueId/checkout
{ "agentId": "uuid", "expectedStatuses": ["todo","backlog","blocked","in_review"] }


Server-side:

UPDATE issues
SET assignee_agent_id = :agentId,
    status            = 'in_progress',
    started_at        = COALESCE(started_at, now())
WHERE id = :issueId
  AND status = ANY (:expectedStatuses)
  AND (assignee_agent_id IS NULL OR assignee_agent_id = :agentId);


If the row count is 0, return 409 Conflict with the current owner/status. Otherwise the row is locked to that agent.

This single update is the entire concurrency story. No queues, no Redis locks, no leases. The DB row is the lock.
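The same compare-and-set logic can be mirrored in memory to see how the 409 path falls out. This is a sketch under assumptions: `IssueRow`, `CheckoutResult`, and `checkout` are hypothetical names, and the in-memory version is only safe because it runs synchronously, whereas the database gets its atomicity from the single UPDATE statement.

```typescript
type IssueStatus =
  | "backlog" | "todo" | "in_progress" | "in_review" | "done" | "blocked" | "cancelled";

interface IssueRow {
  id: string;
  status: IssueStatus;
  assigneeAgentId: string | null;
  startedAt: Date | null;
}

type CheckoutResult =
  | { ok: true; issue: IssueRow }
  | { ok: false; httpStatus: 409; currentStatus: IssueStatus; currentOwner: string | null };

// Mirrors the WHERE clause of the UPDATE: write only if the status is expected
// and the issue is unowned or already owned by the caller.
function checkout(issue: IssueRow, agentId: string, expectedStatuses: IssueStatus[]): CheckoutResult {
  const statusOk = expectedStatuses.includes(issue.status);
  const ownerOk = issue.assigneeAgentId === null || issue.assigneeAgentId === agentId;
  if (!statusOk || !ownerOk) {
    // Equivalent to "row count is 0": report the current owner and status.
    return { ok: false, httpStatus: 409, currentStatus: issue.status, currentOwner: issue.assigneeAgentId };
  }
  issue.assigneeAgentId = agentId;
  issue.status = "in_progress";
  issue.startedAt = issue.startedAt ?? new Date(); // COALESCE(started_at, now())
  return { ok: true, issue };
}

const stripeIssue: IssueRow = { id: "ACME-42", status: "todo", assigneeAgentId: null, startedAt: null };
const expectedStatuses: IssueStatus[] = ["todo", "backlog", "blocked", "in_review"];

console.log(checkout(stripeIssue, "agent-a", expectedStatuses).ok); // true
console.log(checkout(stripeIssue, "agent-b", expectedStatuses).ok); // false: already in_progress
```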

Cross-team work & escalation rules

  • Any agent can create a task for any other agent (no permission walls; visibility is total).
  • The receiving agent must complete, block, or escalate. They cannot silently cancel a cross-team request.
  • Escalation goes up their own reports_to chain.

🏷️ Billing codes

When agent X delegates to agent Y, Y's cost_events are tagged with the billing code from X's task. Roll-ups answer "how much did Initiative #3 actually cost across the whole graph?"
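A small sketch of how that tagging and roll-up might look. The names `TaskRef`, `delegate`, `CostEventRow`, and `rollupByBillingCode` are hypothetical, and the sample amounts are made up purely for illustration.

```typescript
interface TaskRef { title: string; billingCode: string }

// Delegation copies the requester's billing code onto the new task,
// so cost follows the work rather than the executing agent.
function delegate(parentTask: TaskRef, title: string): TaskRef {
  return { title, billingCode: parentTask.billingCode };
}

interface CostEventRow { agentId: string; billingCode: string; costCents: number }

// Sum cost per billing code across every agent that touched the chain.
function rollupByBillingCode(costEvents: CostEventRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of costEvents) {
    totals.set(e.billingCode, (totals.get(e.billingCode) ?? 0) + e.costCents);
  }
  return totals;
}

const taskForX: TaskRef = { title: "Plan checkout flow", billingCode: "INIT-3" };
const taskForY = delegate(taskForX, "Add Stripe webhook handler");

const sampleCosts: CostEventRow[] = [
  { agentId: "agent-x", billingCode: taskForX.billingCode, costCents: 120 },
  { agentId: "agent-y", billingCode: taskForY.billingCode, costCents: 340 },
  { agentId: "agent-y", billingCode: taskForY.billingCode, costCents: 75 },
  { agentId: "agent-z", billingCode: "INIT-7", costCents: 50 },
];

const totals = rollupByBillingCode(sampleCosts);
console.log(totals.get("INIT-3")); // 535
console.log(totals.get("INIT-7")); // 50
```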

🔄 State machine

backlog ─→ todo ─→ in_progress ─→ in_review ─→ done   (terminal)
   │         │           │
   │         └─→ blocked ←┘
   │         │
   └─→ cancelled (terminal)

Side effects:
  → in_progress  : sets started_at if null
  → done         : sets completed_at
  → cancelled    : sets cancelled_at
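The state machine and its side effects could be encoded as a transition table, as in the sketch below. The table is one plausible reading of the diagram combined with the statuses the checkout request accepts; which states may move to cancelled is an assumption, and `IssueTimes` and `transition` are hypothetical names.

```typescript
type IssueStatus =
  | "backlog" | "todo" | "in_progress" | "in_review" | "done" | "blocked" | "cancelled";

// Assumed transition table; the source diagram does not spell out every edge.
const allowedTransitions: Record<IssueStatus, IssueStatus[]> = {
  backlog: ["todo", "in_progress", "cancelled"],
  todo: ["in_progress", "blocked", "cancelled"],
  in_progress: ["in_review", "blocked", "cancelled"],
  in_review: ["in_progress", "done", "cancelled"],
  blocked: ["in_progress", "cancelled"],
  done: [],      // terminal
  cancelled: [], // terminal
};

interface IssueTimes {
  status: IssueStatus;
  startedAt: Date | null;
  completedAt: Date | null;
  cancelledAt: Date | null;
}

// Validate the move, then apply the timestamp side effects listed above.
function transition(issue: IssueTimes, next: IssueStatus, now: Date): void {
  if (!allowedTransitions[issue.status].includes(next)) {
    throw new Error(`illegal transition: ${issue.status} -> ${next}`);
  }
  issue.status = next;
  if (next === "in_progress" && issue.startedAt === null) issue.startedAt = now;
  if (next === "done") issue.completedAt = now;
  if (next === "cancelled") issue.cancelledAt = now;
}

const firstStart = new Date("2026-01-01T00:00:00Z");
const laterOn = new Date("2026-01-02T00:00:00Z");
const work: IssueTimes = { status: "todo", startedAt: null, completedAt: null, cancelledAt: null };

transition(work, "in_progress", firstStart);
transition(work, "blocked", laterOn);
transition(work, "in_progress", laterOn); // startedAt keeps the first start time
transition(work, "in_review", laterOn);
transition(work, "done", laterOn);
console.log(work.status);                   // done
console.log(work.startedAt === firstStart); // true
```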



⚖️ 9. Governance, Approvals & The Board

The "board" is a single human operator (in v1). They have unrestricted authority: pause, resume, override, terminate.

📥 Approval queue

The approvals table is a generic mechanism. Four request types ship by default:

| Type | Who proposes | What it gates |
| --- | --- | --- |
| hire_agent | CEO agent (or any agent if company requires) | Creating a new agent |
| approve_ceo_strategy | CEO agent | Initial org/task plan |
| budget_override_required | Any agent | Spending past hard limit |
| request_board_approval | Any agent | Anything escalated to a human |

Each approval carries a payload jsonb describing the proposed change. Approving the request is what applies the change; nothing takes effect until a decision is recorded.
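A minimal sketch of that "apply on decision" behavior for a hire request follows. The payload shape, the `HireApproval` type, the `decide` function, and the `agentNames` array standing in for the agents table are all assumptions, as is the rule that only pending approvals can be decided.

```typescript
type ApprovalStatus = "pending" | "revision_requested" | "approved" | "rejected" | "cancelled";

interface HireApproval {
  id: string;
  type: "hire_agent";
  status: ApprovalStatus;
  payload: { name: string; role: string }; // proposed change; shape is assumed
  decidedByUserId: string | null;
}

const agentNames: string[] = []; // stand-in for the agents table

// Record the board's decision; the payload is applied only on approval.
function decide(approval: HireApproval, decision: "approved" | "rejected", userId: string): void {
  if (approval.status !== "pending") {
    throw new Error(`approval ${approval.id} is not pending`);
  }
  approval.status = decision;
  approval.decidedByUserId = userId;
  if (decision === "approved") {
    agentNames.push(approval.payload.name); // the change happens here, not at proposal time
  }
}

const hire: HireApproval = {
  id: "ap-1",
  type: "hire_agent",
  status: "pending",
  payload: { name: "QA Engineer", role: "qa" },
  decidedByUserId: null,
};

console.log(agentNames.length); // 0: proposing alone changes nothing
decide(hire, "approved", "board-user");
console.log(agentNames.length); // 1: the hire was applied on approval
```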

🚀 The bootstrap sequence

This is what happens when a user starts a new company:

1. Human creates Company + Initiatives
2. Human writes initial top-level tasks
3. Human creates a "CEO" agent from a default template
4. CEO agent runs, proposes:
     - org structure (sub-agents to hire)
     - task breakdown
     - hiring approvals
5. Board reviews + approves
6. CEO begins delegating; the company is alive


🔑 Decision authority

Agents can propose anything. Agents can execute only on tasks they own. Anything else routes through approvals. This is the rule that prevents an agent from, say, "deciding" to spawn 50 sub-agents and bankrupting the company.


💰 10. Budgets & Cost Control

Cost is treated like rate-limiting: a soft warning, then a hard wall.

📊 Reporting levels

| Level | Question it answers |
| --- | --- |
| Per-agent | "Is this agent expensive?" |
| Per-task | "Did this PR cost too much?" |
| Per-project | "What's our $ on Project X?" |
| Per-billing-code | "What did Initiative #3 cost end-to-end?" |
| Company-wide | "What did the company spend this month?" |

🚧 Enforcement

Soft alert default threshold: 80%
At 100%:
  - Set agent status to paused
  - Block new checkout/invocation for that agent
  - Emit high-priority activity event


The "auto-pause" is the entire mechanism. There is no graceful degradation, no "let it finish the current task." It stops.

⚙️ Budget configuration

  • Periods: daily | weekly | monthly | rolling
  • Per-agent and per-company budgets are independent. Both must allow the run.
  • "Unlimited" is a setting; if you want it, you set it explicitly.
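The enforcement and configuration rules above can be sketched as two small functions. This is an illustration under assumptions: `BudgetState`, `budgetVerdict`, and `mayInvoke` are hypothetical names, and representing an explicit "unlimited" setting as `null` is a modeling choice made here, not something the spec states.

```typescript
type BudgetVerdict = "ok" | "soft_alert" | "hard_stop";

interface BudgetState {
  spentCents: number;
  budgetCents: number | null; // assumption: null models an explicit "unlimited" setting
}

// Integer percent math avoids floating-point surprises at the threshold.
function budgetVerdict(b: BudgetState, softPercent = 80): BudgetVerdict {
  if (b.budgetCents === null) return "ok";
  if (b.spentCents >= b.budgetCents) return "hard_stop";
  if (b.spentCents * 100 >= b.budgetCents * softPercent) return "soft_alert";
  return "ok";
}

// Per-agent and per-company budgets are independent; both must allow the run.
function mayInvoke(agent: BudgetState, company: BudgetState): boolean {
  return budgetVerdict(agent) !== "hard_stop" && budgetVerdict(company) !== "hard_stop";
}

console.log(budgetVerdict({ spentCents: 7900, budgetCents: 10000 }));  // ok
console.log(budgetVerdict({ spentCents: 8000, budgetCents: 10000 }));  // soft_alert
console.log(budgetVerdict({ spentCents: 10000, budgetCents: 10000 })); // hard_stop
console.log(
  mayInvoke({ spentCents: 100, budgetCents: 5000 }, { spentCents: 90000, budgetCents: 90000 }),
); // false: the company budget is exhausted even though the agent has headroom
```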

💳 Cost ingestion

Agents (or their adapter) POST to:

POST /companies/:companyId/cost-events
{ agentId, issueId, provider, model, input_tokens, output_tokens, cost_cents, billing_code, occurred_at }


The server enforces the company scope, denormalizes into rollups, and runs the budget check. Cost events are append-only: no edits, no deletes.
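Here is a minimal in-memory sketch of an append-only ledger with a scope check and a per-agent rollup. `createLedger` and `CostEventInput` are hypothetical names; a real server would persist rows and run the budget check with the returned total.

```typescript
interface CostEventInput {
  companyId: string;
  agentId: string;
  costCents: number;
  occurredAt: string; // ISO timestamp
}

// The returned object exposes append and read helpers only:
// there is deliberately no update or delete.
function createLedger(companyId: string) {
  const rows: Readonly<CostEventInput>[] = [];
  const spentByAgent = new Map<string, number>();
  return {
    // Returns the agent's new running total so a caller could run the budget check.
    append(event: CostEventInput): number {
      if (event.companyId !== companyId) throw new Error("company scope mismatch");
      rows.push(Object.freeze({ ...event })); // stored copy cannot be edited later
      const total = (spentByAgent.get(event.agentId) ?? 0) + event.costCents;
      spentByAgent.set(event.agentId, total); // denormalized rollup
      return total;
    },
    size(): number {
      return rows.length;
    },
    spent(agentId: string): number {
      return spentByAgent.get(agentId) ?? 0;
    },
  };
}

const ledger = createLedger("acme");
console.log(ledger.append({ companyId: "acme", agentId: "a1", costCents: 120, occurredAt: "2026-04-30T10:00:00Z" })); // 120
console.log(ledger.append({ companyId: "acme", agentId: "a1", costCents: 340, occurredAt: "2026-04-30T10:05:00Z" })); // 460
```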


🧩 11. Plugin System

Plugins extend Paperclip without forking it. The architecture is two pieces:

  • Worker: Node.js process running the plugin's logic. Out-of-process by design.
  • UI: React components mounted at named "slots" in the host UI.

🛠️ Worker contract

import { definePlugin } from "@paperclipai/plugin-sdk";

export default definePlugin({
  async setup(ctx) {
    ctx.data.register("widget.summary", async (params) => { ... });
    ctx.actions.register("widget.run",  async (input) => { ... });
    ctx.tools.register("widget.search", schema, async (input) => { ... });
    ctx.events.on("issue.checked_out", async (e) => { ... });
    ctx.jobs.register("daily.rollup",  async () => { ... });
  },
  onConfigChanged(newConfig) {},
  onShutdown() {},
  onValidateConfig(config) {},
  onWebhook(input) {},
  onHealth() {},
});


Capability gating

Every API on ctx requires a declared capability in the plugin manifest:

companies.read, issues.read, issues.create,
events.subscribe, jobs.schedule,
agent.sessions.create, agents.invoke,
ui.sidebar.register, ui.detailTab.register, ...

The host enforces them at call time. A plugin without issues.create cannot create an issue, even if it tries.
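
A minimal sketch of call-time enforcement, assuming the host wraps each API it exposes to a plugin. The `makeGate` helper and its shape are hypothetical; only the capability names come from the list above.

```typescript
// Hypothetical host-side gate: built once per plugin from its manifest.
function makeGate(declared: readonly string[]) {
  const granted = new Set<string>(declared);

  // Wraps a host API so the check runs on every call, not just at load time.
  return function gated<A extends unknown[], R>(
    capability: string,
    fn: (...args: A) => R,
  ): (...args: A) => R {
    return (...args: A): R => {
      if (!granted.has(capability)) {
        throw new Error(`missing capability: ${capability}`);
      }
      return fn(...args);
    };
  };
}

// Usage: this plugin declared issues.read but not issues.create.
const gate = makeGate(["companies.read", "issues.read"]);
const readIssue = gate("issues.read", (id: string) => ({ id, title: "stub" }));
const createIssue = gate("issues.create", (title: string) => ({ id: "new", title }));
```

Here `readIssue("i1")` returns the stub, while `createIssue("x")` throws before the underlying function is reached.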

🖼️ UI slots

Plugins mount React into named slots:

page, sidebar, sidebarPanel, settingsPage, dashboardWidget,
globalToolbarButton, detailTab, taskDetailView,
projectSidebarItem, toolbarButton, contextMenuItem,
commentAnnotation, commentContextMenuItem

The UI side gets typed React hooks:

usePluginData<T>(key, params?)        // fetch worker data
usePluginAction(key)                   // invoke worker action
usePluginStream<T>(channel)            // SSE
useHostContext()                       // { companyId, entityId, entityType }

🧱 Why out-of-process?

  • A crashing plugin doesn't take down the server.
  • Plugins can be in any language that can speak the IPC protocol.
  • Capability gating is enforceable at the IPC boundary, not just by trust.

📡 12. MCP Server

packages/mcp-server is a thin Model Context Protocol wrapper around the REST API. It exists so that any MCP-aware agent runtime (Claude Code, Cursor, etc.) can read and write Paperclip without bespoke integration code.

Configured with:

PAPERCLIP_API_URL
PAPERCLIP_API_KEY
PAPERCLIP_COMPANY_ID    (optional)
PAPERCLIP_AGENT_ID      (optional)
PAPERCLIP_RUN_ID        (optional)

Tool surface (representative)

Read: getMe, listAgents, listIssues, getIssue, listComments, listProjects, listGoals, listApprovals, ...

Write: createIssue, updateIssue, checkoutIssue, addComment, suggestTask, requestConfirmation, decideApproval, ...

Escape hatch: paperclipApiRequest({ path, method, body }), restricted to /api paths and JSON bodies, lets agents reach endpoints that have no dedicated tool yet.
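
The two restrictions on the escape hatch could be enforced with a guard like the one below. This is a sketch under assumptions: the `validateApiRequest` name, the allowed-method list, and the dot-segment check are illustrative, and a real guard would also need to handle percent-encoded paths.

```typescript
interface ApiRequestInput { path: string; method: string; body?: unknown }
interface ValidatedRequest { path: string; method: string; body: string | undefined }

const ALLOWED_METHODS = ["GET", "POST", "PATCH", "PUT", "DELETE"]; // assumed list

function validateApiRequest(input: ApiRequestInput): ValidatedRequest {
  const method = input.method.toUpperCase();
  if (ALLOWED_METHODS.indexOf(method) === -1) throw new Error(`unsupported method: ${method}`);

  // Look only at the path part, ignoring any query string.
  const q = input.path.indexOf("?");
  const pathOnly = q === -1 ? input.path : input.path.slice(0, q);
  const underApi = pathOnly === "/api" || pathOnly.indexOf("/api/") === 0;
  // Reject traversal such as /api/../admin that would escape the prefix.
  const hasDotSegment = pathOnly.split("/").some((seg) => seg === "." || seg === "..");
  if (!underApi || hasDotSegment) throw new Error(`path not allowed: ${input.path}`);

  let body: string | undefined;
  if (input.body !== undefined) {
    // JSON.stringify yields undefined for functions and symbols: not a JSON body.
    const encoded: string | undefined = JSON.stringify(input.body);
    if (encoded === undefined) throw new Error("body is not JSON-serializable");
    body = encoded;
  }
  return { path: input.path, method, body };
}
```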

Lesson: the MCP server has no business logic. It is a translation layer. Single source of truth = the REST API. This is why it can stay tiny.


🎓 13. Skills

A skill is a markdown file (plus optional examples) that teaches an agent how to use the Paperclip API. It is adapter-agnostic: Claude, Codex, and custom adapters all read the same SKILL.md.

The bundled skills (under /skills) include:

  • paperclip: the master skill, covering task CRUD, status reporting, cost logging, and comms rules.
  • paperclip-create-agent: how to propose hiring a new agent (writes to approvals).
  • paperclip-create-plugin: scaffolding a plugin.
  • paperclip-converting-plans-to-tasks: turning a CEO's plan into atomic issues.
  • paperclip-dev: meta-skill for editing Paperclip itself.
  • para-memory-files: managing persistent agent memory.

A skill is not code; it's prose + examples. The agent's runtime loads it as part of its system context. This means upgrading a skill upgrades every agent that uses it, with no redeploy.


⚙️ 14. Tech Stack & Repo Layout

Concern | Choice
Backend | Node.js 20+, TypeScript, Express (REST only, no tRPC)
Frontend | React + Vite
DB | PostgreSQL; PGlite for local/dev, Supabase or Docker Postgres for prod
ORM | Drizzle (drizzle.config.ts in packages/db)
Auth | Better Auth
Tests | Vitest + Playwright
Package mgr | pnpm 9.15+ workspaces
License | MIT

Top-level layout

.agents/skills/      # Agent skill definitions
.claude/skills/      # Claude-specific skills
.github/             # CI, templates
cli/                 # `npx paperclipai onboard` etc.
docker/              # Compose + Dockerfiles
docs/                # Public docs site
doc/                 # Internal SPEC.md, SPEC-implementation.md
evals/               # Agent eval framework
packages/
  adapters/          # claude-local, codex-local, cursor-local, ...
  adapter-utils/     # shared adapter helpers
  db/                # Drizzle schema + migrations
  mcp-server/        # MCP wrapper
  plugins/
    sdk/             # @paperclipai/plugin-sdk
    create-paperclip-plugin/
    sandbox-providers/e2b/
  shared/            # types, utils
patches/             # pnpm patch files
releases/            # release artifacts
report/              # reporting tools
scripts/             # one-off ops scripts
server/              # the Node server
  src/
  scripts/
skills/              # the bundled skills
tests/               # cross-package tests
ui/                  # the React app

One-command onboarding

npx paperclipai onboard --yes
# or:
git clone https://github.com/paperclipai/paperclip.git && cd paperclip
pnpm install
pnpm dev

pnpm dev boots: server (with PGlite embedded), UI (Vite), and a watcher.


🌐 15. REST API Surface

The full v1 surface, grouped. Use this as the spec for your server.

🏢 Companies

GET    /companies
POST   /companies
GET    /companies/:companyId
PATCH  /companies/:companyId
PATCH  /companies/:companyId/branding
POST   /companies/:companyId/archive

🎯 Goals

GET    /companies/:companyId/goals
POST   /companies/:companyId/goals
GET    /goals/:goalId
PATCH  /goals/:goalId
DELETE /goals/:goalId

🤖 Agents

GET    /companies/:companyId/agents
POST   /companies/:companyId/agents
GET    /agents/:agentId
PATCH  /agents/:agentId
POST   /agents/:agentId/pause
POST   /agents/:agentId/resume
POST   /agents/:agentId/terminate
POST   /agents/:agentId/keys                  # mint API key for the agent
POST   /agents/:agentId/heartbeat/invoke      # manual on-demand wakeup

📝 Issues

GET    /companies/:companyId/issues
POST   /companies/:companyId/issues
GET    /issues/:issueId
PATCH  /issues/:issueId
POST   /issues/:issueId/checkout              # atomic
POST   /issues/:issueId/release
POST   /issues/:issueId/admin/force-release   # board-only
POST   /issues/:issueId/comments
GET    /issues/:issueId/comments
POST   /companies/:companyId/issues/:issueId/attachments
GET    /issues/:issueId/attachments

💰 Costs & budgets

POST   /companies/:companyId/cost-events
GET    /companies/:companyId/costs/summary
GET    /companies/:companyId/costs/by-agent
GET    /companies/:companyId/costs/by-project
PATCH  /companies/:companyId/budgets
PATCH  /agents/:agentId/budgets

⚖️ Approvals

GET    /companies/:companyId/approvals?status=pending
POST   /companies/:companyId/approvals
POST   /approvals/:approvalId/approve
POST   /approvals/:approvalId/reject

📊 Activity & dashboard

GET    /companies/:companyId/activity
GET    /companies/:companyId/dashboard

Design notes

  • Every write that mutates state writes one row to activity_log in the same transaction.
  • Authorization is one model: the API key resolves to an actor (user, agent, or system) and a company scope. The same handler serves UI requests and agent requests; only the actor type differs.
  • No RPC, no GraphQL. Plain REST keeps the MCP wrapper trivially thin.
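
The single authorization model can be sketched as one function that turns an API key into an actor and rejects company mismatches. The `authorize` name, the in-memory `keyStore`, and the 401/403 split are assumptions made for illustration; a real server would look up hashed keys in the database.

```typescript
type ActorType = "user" | "agent" | "system";
interface Actor { type: ActorType; id: string; companyId: string }

type AuthResult =
  | { ok: true; actor: Actor }
  | { ok: false; status: 401 | 403 };

// keyStore is a hypothetical lookup table standing in for the key table.
function authorize(
  keyStore: Map<string, Actor>,
  apiKey: string | undefined,
  routeCompanyId: string,
): AuthResult {
  const actor = apiKey === undefined ? undefined : keyStore.get(apiKey);
  if (actor === undefined) return { ok: false, status: 401 }; // unknown or missing key
  if (actor.companyId !== routeCompanyId) return { ok: false, status: 403 }; // cross-company request
  return { ok: true, actor };
}
```

Because the handler only ever sees an `Actor`, the same code path serves a human in the UI and an agent holding a key.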

🔒 16. Multi-Company Isolation & Portability

The deployment is single-tenant for the operator (you run your own server), but multi-company within the deployment (one Paperclip can host several orgs).

Isolation is enforced three ways:

  1. Schema: every domain table has company_id and every index leads with it.
  2. Authorization: the actor's API key carries a company scope; handlers reject mismatches.
  3. Storage: secrets, attachments, plugin state are namespaced by company.

📦 Portability

  • Template export: schema only (org chart, roles, default tasks). Useful for "starter companies."
  • Snapshot export: full state including tasks, comments, and costs, with secrets scrubbed before serialization.
  • Imports are atomic; either the whole company appears or nothing does.
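
One way to approach the scrubbing pass is a recursive walk that redacts values by key name before the snapshot is serialized. This is a sketch only: the key-name pattern and the `"[REDACTED]"` marker are assumptions, and name matching alone will miss secrets stored under innocuous keys.

```typescript
// Hypothetical heuristic: treat any key whose name looks secret as sensitive.
const SECRET_KEY_PATTERN = /(secret|token|password|api[_-]?key)/i;

// Returns a scrubbed deep copy; the input snapshot is left untouched.
function scrubSecrets(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => scrubSecrets(item));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SECRET_KEY_PATTERN.test(key) ? "[REDACTED]" : scrubSecrets(source[key]);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

Running the scrub on a copy, before `JSON.stringify`, keeps raw secrets out of the export buffer entirely.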

📋 17. Audit Trail & Activity Log

Every state mutation produces:

activity_log(
  actor_type ∈ {agent, user, system},
  actor_id,
  action       e.g. "issue.checked_out",
  entity_type, entity_id,
  details jsonb,
  created_at
)

Two consequences:

  • Replay: you can reconstruct any past state by walking the log.
  • Tool-call tracing: when an agent calls the MCP server, those calls become activity entries. "What did agent X actually do at 3:14am?" is a query, not an investigation.
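
Replay can be pictured as a fold over the log. The sketch below assumes that each entry's `details` holds the fields that changed and that `created_at` is a UTC ISO 8601 string (so string comparison orders entries correctly); the text does not confirm either, so treat both as assumptions.

```typescript
interface ActivityEntry {
  actor_type: "agent" | "user" | "system";
  actor_id: string;
  action: string;
  entity_type: string;
  entity_id: string;
  details: Record<string, unknown>; // assumed: the fields changed by this action
  created_at: string;               // assumed: UTC ISO 8601
}

// Rebuilds entity state as of `until` by applying entries in time order.
function replay(log: ActivityEntry[], until: string): Map<string, Record<string, unknown>> {
  const state = new Map<string, Record<string, unknown>>();
  const ordered = log
    .filter((e) => e.created_at <= until) // filter copies, so the sort leaves `log` untouched
    .sort((a, b) => (a.created_at < b.created_at ? -1 : a.created_at > b.created_at ? 1 : 0));
  for (const entry of ordered) {
    const key = `${entry.entity_type}:${entry.entity_id}`;
    state.set(key, { ...(state.get(key) ?? {}), ...entry.details });
  }
  return state;
}
```

The "what did agent X do at 3:14am" question is then a filter on `actor_id` and `created_at` over the same rows.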

📐 18. Engineering Conventions

These are guardrails worth copying verbatim:

  1. Keep changes company-scoped. Every query, every cache key, every authorization check. No cross-tenant code paths exist.
  2. Contracts must be in sync. The DB schema, the OpenAPI spec, the TypeScript types, and the MCP tool definitions are generated from one source. Drift is a bug.
  3. Migrations are append-only. Never edit a migration after it has shipped. Use pnpm db:migrate to generate; never hand-write SQL into old files.
  4. One PR = one logical change.
  5. Each PR declares the model that wrote it. (Cute but useful telemetry.)
  6. All tests pass before merge. CI green. Code-review tool score = 5/5.
  7. Fail visibly. Agents that hit unexpected state mark tasks blocked; servers return errors; UIs show them. No silent fallbacks.
  8. Read SPEC-implementation.md when in doubt. When SPEC.md and the implementation spec disagree, implementation wins for v1.

🗺️ 19. Step-by-Step Build Plan

If you are building a Paperclip-like system from scratch, do it in this order. Each step is shippable on its own.

🌱 Phase 0: Skeleton (1-2 days)

  • pnpm monorepo with server/, ui/, packages/db, packages/shared.
  • Express server, Vite React app, Drizzle + PGlite for dev.
  • Health check endpoint, hello world UI.

🔐 Phase 1: Companies & Auth

  • companies table.
  • Better Auth for human users.
  • API-key model: every key is (actor_type, actor_id, company_id).
  • Middleware that resolves the key into an Actor and rejects on company mismatch.

🏢 Phase 2: Org Chart

  • agents table with reports_to.
  • CRUD endpoints + UI org-chart view.
  • Status field with transitions, but no runtime yet; agents are just data.

📝 Phase 3: Tasks

  • issues + goals + projects tables with the full hierarchy.
  • Implement atomic checkout with the exact SQL above. Write a regression test that races 50 concurrent checkouts and asserts exactly one wins.
  • Kanban / list UI.
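
The shape of that race test can be prototyped without a database. In the sketch below an in-memory object stands in for the issues row, and the check-and-set is atomic only because JavaScript runs it in one synchronous step; in a real test the atomicity has to come from the conditional UPDATE in Postgres. All names here are hypothetical.

```typescript
// In-memory stand-in for one row of the issues table.
type Issue = { id: string; assigneeAgentId: string | null };

function makeStore(issue: Issue) {
  return {
    // Mirrors a conditional UPDATE: succeed only if nobody holds the issue.
    async checkout(agentId: string): Promise<boolean> {
      await Promise.resolve(); // yield so all callers are in flight before any of them checks
      if (issue.assigneeAgentId !== null) return false; // "0 rows updated"
      issue.assigneeAgentId = agentId;                  // "1 row updated"
      return true;
    },
  };
}

// Fires n concurrent checkouts at one issue and reports how many won.
async function raceCheckouts(n: number): Promise<{ winners: number; assignee: string | null }> {
  const issue: Issue = { id: "issue-1", assigneeAgentId: null };
  const store = makeStore(issue);
  const attempts = Array.from({ length: n }, (_, i) => store.checkout(`agent-${i}`));
  const results = await Promise.all(attempts);
  return { winners: results.filter(Boolean).length, assignee: issue.assigneeAgentId };
}
```

The regression assertion is simply that `raceCheckouts(50)` resolves with `winners` equal to 1; swapping `makeStore` for a database-backed implementation keeps the test body unchanged.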

💚 Phase 4: The Heartbeat (the moment your project becomes real)

  • heartbeat_runs table.
  • Adapter manager interface (3 methods: invoke, status, cancel).
  • Build one adapter first: process (just spawn a CLI you control). Don't start with Claude.
  • Scheduler:
    • Cron loop for timer triggers.
    • Hook on issue checkout → emit assignment wakeup.
    • "Run now" button → on_demand.
  • Coalescing: if a run is already running for an agent, drop new wakeups, mark them as merged.
  • Timeouts + grace + force-kill.
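
A minimal sketch of the coalescing rule together with the three-method adapter contract. The method signatures, the `WakeupRecord` shape, and the run-id format are assumptions; timeouts, grace periods, and force-kill are left out to keep the example short.

```typescript
type WakeupSource = "timer" | "assignment" | "on_demand";
type RunStatus = "running" | "succeeded" | "failed";

// The three-method adapter contract from the list above; signatures are assumed.
interface Adapter {
  invoke(agentId: string, runId: string): void;
  status(runId: string): RunStatus;
  cancel(runId: string): void;
}

interface WakeupRecord {
  agentId: string;
  source: WakeupSource;
  outcome: "started" | "merged";
  runId: string;
}

class HeartbeatScheduler {
  readonly wakeups: WakeupRecord[] = [];         // every wakeup is recorded, even merged ones
  private activeRun = new Map<string, string>(); // agentId -> most recent runId
  private nextRunNumber = 1;
  private adapter: Adapter;

  constructor(adapter: Adapter) {
    this.adapter = adapter;
  }

  wake(agentId: string, source: WakeupSource): WakeupRecord {
    const current = this.activeRun.get(agentId);
    let record: WakeupRecord;
    if (current !== undefined && this.adapter.status(current) === "running") {
      // Coalesce: start nothing new and mark the wakeup as merged into the in-flight run.
      record = { agentId, source, outcome: "merged", runId: current };
    } else {
      const runId = `run-${this.nextRunNumber++}`;
      this.activeRun.set(agentId, runId);
      this.adapter.invoke(agentId, runId);
      record = { agentId, source, outcome: "started", runId };
    }
    this.wakeups.push(record);
    return record;
  }
}
```

Recording merged wakeups rather than silently dropping them keeps the run history explainable when a user asks why "Run now" did not start a second run.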

💰 Phase 5: Cost & Budgets

  • cost_events table.
  • Budget fields on companies and agents.
  • Ingestion endpoint with company-scope check.
  • On every cost insert: recompute spent / budget; if past 100%, pause agent + emit activity.
  • Dashboards: per-agent, per-task, per-project rollups (use the indexes you already built).

⚖️ Phase 6: Approvals & Governance

  • approvals table; generic payload + type.
  • request_board_approval flow end-to-end.
  • "Hire agent" requires approval; approving the approval creates the agent row.
  • Board UI with a single "approvals" inbox.

📋 Phase 7: Activity Log + SSE

  • Append activity_log in the same transaction as every mutation.
  • Server-sent events broadcast new activity to subscribed UIs.
  • "Recent activity" feed and per-entity history.
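
For the SSE side, the wire format is simple enough to sketch by hand: each frame is a few `field: value` lines followed by a blank line. The `ActivityBroadcaster` class and its per-company subscriber map are hypothetical; in a real server each `Send` callback would write to an open HTTP response with the `text/event-stream` content type.

```typescript
type Send = (frame: string) => void;

// Formats one activity entry as a Server-Sent Events frame.
function formatSseFrame(id: string, event: string, data: Record<string, unknown>): string {
  // JSON.stringify without indentation emits a single line, so one "data:" field is enough.
  return `id: ${id}\nevent: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

class ActivityBroadcaster {
  private subscribers = new Map<string, Set<Send>>(); // companyId -> open connections

  // Returns an unsubscribe function to call when the connection closes.
  subscribe(companyId: string, send: Send): () => void {
    let set = this.subscribers.get(companyId);
    if (!set) {
      set = new Set<Send>();
      this.subscribers.set(companyId, set);
    }
    set.add(send);
    const joined = set;
    return () => { joined.delete(send); };
  }

  // Call after the transaction that wrote the activity_log row has committed.
  publish(companyId: string, id: string, action: string, details: Record<string, unknown>): void {
    const frame = formatSseFrame(id, action, details);
    const set = this.subscribers.get(companyId);
    if (set) set.forEach((send) => send(frame));
  }
}
```

Keying subscribers by company keeps the broadcast inside the same isolation boundary as the rest of the system.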

🔌 Phase 8: More adapters

  • Wrap a real CLI (Claude Code or Codex). Reuse adapter-utils for stdio framing and JSON parsing.
  • Add http adapter for remote agents.
  • Now you can ship to early users.

📡 Phase 9: MCP Server

  • Standalone package that calls your REST API.
  • One MCP tool per important endpoint, plus the escape-hatch apiRequest.
  • Test it with Claude Code locally.

🎓 Phase 10: Skills

  • Pick the top 3 things agents do badly without guidance and write SKILL.mds for them.
  • Distribute via .agents/skills/ and tell adapters to load them into the system context.

🧩 Phase 11: Plugins

  • Out-of-process worker SDK with definePlugin.
  • IPC: simplest is JSON over stdio with a request-id correlation.
  • Manifest with declared capabilities; host enforces at every IPC call.
  • UI slot system: a registry keyed by slot name, plugins mount React via iframe or shadow DOM.
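
Request-id correlation over newline-delimited JSON can be sketched independently of the actual pipes. The `IpcClient` class and the `{ id, method, params }` / `{ id, result, error }` message shapes are assumptions made for illustration; in practice `write` would go to the worker's stdin and `handleLine` would be fed from its stdout.

```typescript
interface Pending {
  resolve: (value: unknown) => void;
  reject: (err: Error) => void;
}

class IpcClient {
  private pending = new Map<number, Pending>(); // request id -> waiting caller
  private nextId = 1;
  private write: (line: string) => void;

  constructor(write: (line: string) => void) {
    this.write = write;
  }

  // Sends one request line and resolves when the response with the same id arrives.
  request(method: string, params: unknown): Promise<unknown> {
    const id = this.nextId++;
    return new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.write(JSON.stringify({ id, method, params }) + "\n");
    });
  }

  // Call once per newline-delimited line read from the peer.
  handleLine(line: string): void {
    const msg = JSON.parse(line) as { id: number; result?: unknown; error?: string };
    const entry = this.pending.get(msg.id);
    if (!entry) return; // unknown or already-settled id: ignore
    this.pending.delete(msg.id);
    if (msg.error !== undefined) entry.reject(new Error(msg.error));
    else entry.resolve(msg.result);
  }
}
```

Because responses are matched by id rather than by order, a slow handler in the worker does not block faster ones. The host side of the same channel is where the capability check would sit, before a request is dispatched.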

📦 Phase 12: Portability

  • POST /companies/:id/export → JSON snapshot, with a secret_scrub pass.
  • POST /companies/import → atomic, transactional.

✨ Phase 13: Polish

  • One-command onboarding (npx <yourtool> onboard) that generates .env, runs migrations, opens browser.
  • Docker compose for "self-host on a box."
  • Telemetry (anonymous, opt-out).

⚠️ 20. Pitfalls and Tradeoffs

🚫 Things not to do, especially early

  • Don't build your own agent loop. The whole point is to be unopinionated. Wrap a CLI; ship.
  • Don't add tRPC / GraphQL. It makes the MCP wrapper non-trivial. Plain REST is the contract that survives.
  • Don't centralize prompts in the server. Prompts belong in adapters or skills. The core has zero opinion about model behavior.
  • Don't treat budgets as soft. "Best effort" budget enforcement is no enforcement. Build the auto-pause from day one.
  • Don't allow direct agent-to-agent calls. Force everything through tasks/comments. You'll thank yourself when debugging.
  • Don't put company_id on "most" tables. Put it on every table.
  • Don't sandbox plugins via trust. Out-of-process + capability manifest, or nothing.

⚖️ Honest tradeoffs Paperclip makes

Tradeoff | What you get | What you lose
Single human board operator (v1) | Simple authority model | No multi-stakeholder governance
REST + jsonb polymorphism | Easy to extend, MCP is trivial | Less compile-time safety than tRPC
Local CLI adapters unsandboxed | Maximum runtime freedom | You own the host security story
Atomic checkout via SQL | Dead simple, no extra services | Doesn't scale past a single Postgres
Skills as markdown | Hot-swappable; runtime-agnostic | Behavior depends on adapter discipline
Plugins out-of-process | Crash isolation; multi-language | Higher latency than in-proc

🔀 Where to deviate if your domain differs

  • If your "agents" are humans-in-the-loop, keep the same model and assign work through assignee_user_id, which you already have.
  • If you need multi-board governance, generalize decided_by_user_id to a poll-style record on approvals.
  • If costs aren't $/tokens, generalize cost_events to usage_events with provider-defined units. Keep the rollup shape.
  • If you need horizontal scale, the bottleneck is the heartbeat scheduler. Move it to a leader-elected job runner; everything else (REST, DB) already scales.

💡 TL;DR for Building Your Own

  1. It's a control plane, not a framework. Three-method adapter contract. Don't pretend otherwise.
  2. Postgres schema is the architecture. Get companies / agents / issues / heartbeat_runs / cost_events / approvals / activity_log right and 80% of behavior falls out.
  3. The heartbeat is the kernel. Coalesce, timeout, persist runs, log activity.
  4. Atomic SQL UPDATE = your concurrency story.
  5. Hard budget ceilings, not soft ones.
  6. Tasks are the only communication channel between agents.
  7. REST + MCP + skills, in that order. Each is a thin layer over the previous.
  8. Plugins out-of-process, capability-gated.
  9. Every table, query, and index starts with company_id.
  10. Append-only audit log in the same transaction as every mutation.

Build those ten things and you have Paperclip. Everything else is polish.


If you found this helpful, let me know by leaving a 👍 or a comment! If you think this post could help someone, feel free to share it. Thank you very much! 😃