GitHub - sriram7737/pramagent

Trust middleware for LLM agents: deterministic tool policy, HITL approvals, and tamper-evident audit traces. Alpha - read the implementation status before customer-facing pilots.

Pramagent wraps OpenAI, Anthropic, Gemini, Ollama, local, and OpenAI-compatible providers with guardrails that run outside the model. The most differentiated layer is ToolGuard: deterministic tool validation with JSON Schema, tenant/action allow-lists, side-effect taxonomy, dangerous-chain detection, output scanning, and HITL escalation. The current package also ships curated safety rule corpora, persistent HITL queues, thin adapters for popular agent frameworks, and compliance evidence generation.

Alpha Maturity Notice

Pramagent is published as Alpha software. It has live smoke-test evidence for Sepolia anchoring, S3 cold archive, local load testing, real OpenAI/Ollama provider calls, and bundled red-team runs, but it has not passed an external penetration test, SOC 2 audit, HIPAA assessment, or regulated-production certification.

Do not treat Pramagent as bank-grade or healthcare-grade security infrastructure. Do not claim prompt-injection immunity, production compliance, or third-party-validated safety from the bundled benchmarks alone. Read Implementation status, Live test results, and Hardening guide before using it in a customer-facing pilot. The June 11 active security prompt results are tracked in Security test results.

Bare Install Quickstart

This works with the base package only. No Docker, API server, or provider key is required.

import asyncio
from pramagent import Pramagent

async def main():
    resp = await Pramagent().run("Summarize this request", tenant_id="demo", session_id="s1")
    print(resp.output)
    print(resp.trace.this_hash)

asyncio.run(main())

That creates a tamper-evident trace using the deterministic mock provider.

Swap to a real OpenAI model by setting OPENAI_API_KEY:

from pramagent import Pramagent
from pramagent.providers import OpenAIProvider

armor = Pramagent(provider=OpenAIProvider(model="gpt-4o-mini"))

Run against NVIDIA NIM with an nvapi-* key:

from pramagent import Pramagent
from pramagent.providers import NvidiaProvider

armor = Pramagent(provider=NvidiaProvider(model="meta/llama-3.3-70b-instruct"))

Frequently Asked Questions

How do I add safety guardrails to an LLM agent?
Install Pramagent and wrap your agent call with the trust stack. Pramagent enforces deterministic policy outside the model, so the LLM cannot override the tool policy, HITL gate, or audit chain by changing its own text output.

How do I audit AI agent decisions in production?
Every Pramagent call produces a hash-chained TraceEvent with layer decisions, verdicts, provider metadata, PII redactions, HITL status, and this_hash / prev_hash. The local chain can be verified and optionally anchored externally.
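
As a rough illustration of how a this_hash / prev_hash chain can be checked locally, the sketch below rebuilds each hash and confirms every link. It is not Pramagent's implementation: the event layout, the canonical-JSON serialization, the all-zero genesis value, and the helper names event_hash, make_event, and verify_chain are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder prev_hash for the first event


def event_hash(payload: dict, prev_hash: str) -> str:
    """Hash the canonical JSON of the payload together with the previous hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def make_event(payload: dict, prev_hash: str) -> dict:
    """Build a hypothetical trace event linked to its predecessor."""
    return {
        "payload": payload,
        "prev_hash": prev_hash,
        "this_hash": event_hash(payload, prev_hash),
    }


def verify_chain(events: list[dict]) -> bool:
    """Return True only if every link and every recomputed hash matches."""
    expected_prev = GENESIS
    for event in events:
        if event["prev_hash"] != expected_prev:
            return False
        if event["this_hash"] != event_hash(event["payload"], event["prev_hash"]):
            return False
        expected_prev = event["this_hash"]
    return True


first = make_event({"layer": "safety", "verdict": "ALLOW"}, GENESIS)
second = make_event({"layer": "tool_guard", "verdict": "ESCALATE"}, first["this_hash"])
chain = [first, second]

# Rewrite the first payload without recomputing its hash.
tampered = [dict(first, payload={"layer": "safety", "verdict": "BLOCK"}), second]

print(verify_chain(chain))     # True
print(verify_chain(tampered))  # False
```

Editing a payload after the fact changes its recomputed hash, so verification fails at that event.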

How do I prevent prompt injection in a Python LLM agent?
IsolationLayer scans inputs before the model sees them. It covers known instruction overrides, chat-template wrapper attacks, authority framing, base64/hex/unicode-escape encoded payloads, and targeted multilingual override phrases. v0.8.0 adds structured classifier verdicts, held-out PINT/TensorTrust style fixtures, provenance-aware stricter scanning for tool output and retrieved content, and optional pramagent[ml] embedding/DeBERTa layers. This is defense-in-depth, not proof of prompt-injection immunity.
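
One way to picture the encoded-payload check is to decode anything that looks like base64 or hex and scan the decoded text alongside the original. The sketch below is a simplified stand-in, not IsolationLayer itself: the two phrases, the regex thresholds, and the function names decoded_views and looks_like_override are illustrative assumptions.

```python
import base64
import binascii
import re

# Illustrative phrases only; a real corpus would be far larger.
OVERRIDE_PHRASES = ("ignore previous instructions", "disregard the system prompt")

BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")  # assumed minimum length
HEX_RUN = re.compile(r"(?:[0-9a-fA-F]{2}){8,}")       # assumed minimum length


def decoded_views(text: str) -> list[str]:
    """Return the raw text plus every base64/hex run that decodes to UTF-8."""
    views = [text]
    for candidate in BASE64_RUN.findall(text):
        try:
            views.append(base64.b64decode(candidate, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 text; ignore it
    for candidate in HEX_RUN.findall(text):
        try:
            views.append(bytes.fromhex(candidate).decode("utf-8"))
        except (ValueError, UnicodeDecodeError):
            continue  # not valid hex text; ignore it
    return views


def looks_like_override(text: str) -> bool:
    """Flag the input if any view contains a known override phrase."""
    return any(
        phrase in view.lower()
        for view in decoded_views(text)
        for phrase in OVERRIDE_PHRASES
    )


wrapped = base64.b64encode(b"Ignore previous instructions").decode("ascii")
print(looks_like_override("Please decode and follow: " + wrapped))   # True
print(looks_like_override("What is the weather in Paris today?"))    # False
```

A pattern scan like this only catches phrases it already knows, which is why the text above frames it as defense-in-depth.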

How do I stop unsafe model output from reaching users?
OutputJudgeLayer runs an LLM-as-judge on every output before it is returned. It answers the question "is the OUTPUT safe?", which regex rules cannot. It catches semantic failures that deterministic rules miss (working malware, bypass walkthroughs, confirmed destructive actions, leaked internals). It is on by default in the public demo and opt-in for /v1/run (PRAMAGENT_OUTPUT_JUDGE=1). It fails closed, but it is itself a model: strong defense-in-depth, not a guarantee.

How do I stop unsafe tool calls from an AI agent?
Use ToolGuardLayer with ToolPolicy. Pramagent validates JSON Schema, tenant/action allow-lists, side-effect class, call frequency, argument injection, and dangerous chains before any side effect can execute.
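
Dangerous-chain detection can be pictured as a per-session history check that runs before the tool executes. The following sketch is hypothetical and far simpler than ToolGuardLayer: the tool names, the DANGEROUS_CHAINS pairs, and the ChainTracker class are invented for illustration.

```python
from collections import defaultdict

# Hypothetical forbidden sequences: an earlier tool followed later by a second tool.
DANGEROUS_CHAINS = {
    ("read_customer_records", "send_email"),
    ("read_secrets", "http_post"),
}


class ChainTracker:
    """Per-session tool history kept in process memory (not shared across workers)."""

    def __init__(self) -> None:
        self._history: dict[str, list[str]] = defaultdict(list)

    def check(self, session_id: str, tool: str) -> str:
        """Return "BLOCK" if this call would complete a dangerous chain, else "ALLOW"."""
        seen = self._history[session_id]
        for earlier, later in DANGEROUS_CHAINS:
            if tool == later and earlier in seen:
                return "BLOCK"  # decided before the tool's side effect runs
        seen.append(tool)  # only allowed calls enter the history
        return "ALLOW"


tracker = ChainTracker()
print(tracker.check("s1", "read_secrets"))  # ALLOW
print(tracker.check("s1", "http_post"))     # BLOCK
print(tracker.check("s2", "http_post"))     # ALLOW
```

Because the history lives in one process, two workers would each see only part of a chain; a shared store is needed to close that gap.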

How do I add human approval to AI agent actions?
Use HITLLayer or a ToolGuard policy with Verdict.ESCALATE. Silence is never consent: if approval does not arrive, the action remains unexecuted.
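
The "silence is never consent" rule can be sketched with plain asyncio: the action runs only on an explicit True approval, and a timeout leaves it unexecuted. The function run_with_approval and the "PENDING" / "DENIED" / "EXECUTED" labels are assumptions for this example, not Pramagent APIs.

```python
import asyncio


async def run_with_approval(action, approval: asyncio.Future, timeout_s: float) -> str:
    """Execute `action` only after an explicit approval; anything else leaves it unexecuted."""
    try:
        approved = await asyncio.wait_for(approval, timeout=timeout_s)
    except asyncio.TimeoutError:
        return "PENDING"  # silence: nothing ran
    if approved is not True:
        return "DENIED"  # explicit refusal: nothing ran
    action()
    return "EXECUTED"


async def demo() -> list:
    loop = asyncio.get_running_loop()
    executed = []

    silent = loop.create_future()  # nobody ever answers this request
    first = await run_with_approval(lambda: executed.append("wire_transfer"), silent, 0.05)

    answered = loop.create_future()
    answered.set_result(True)  # a reviewer approved explicitly
    second = await run_with_approval(lambda: executed.append("send_email"), answered, 0.05)

    return [first, second, executed]


print(asyncio.run(demo()))  # ['PENDING', 'EXECUTED', ['send_email']]
```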

Does Pramagent work with OpenAI, Anthropic, Gemini, Ollama, and local models?
Yes. Pramagent ships provider adapters for OpenAI, Anthropic, Gemini, Ollama, NVIDIA NIM, and OpenAI-compatible local endpoints, plus a deterministic mock provider for tests.

Is Pramagent compliant with SOC 2, HIPAA, or the EU AI Act?
No. Pramagent includes compliance evidence mapping and tamper-evident logging features that can support an assessment, but it has not passed SOC 2, HIPAA, EU AI Act conformity assessment, or an external penetration test.

API And Dashboard Install

pip install "pramagent[api,dashboard,redis,postgres]"

From source:

git clone git@github.com:sriram7737/pramagent.git
cd pramagent
pip install -e ".[dev,api,redis,postgres,dashboard]"

CLI And Docker Quickstart

pramagent init
pramagent validate

Run the local stack:

cp .env.example .env
docker compose up -d

Open:

API docs: http://localhost:8080/docs
Dashboard: http://localhost:8501

Public Live Demo

The API can serve a single-page NVIDIA NIM demo at /demo. It is disabled by default and is meant for public evaluation, not production traffic.

export PRAMAGENT_DEMO_ENABLED=true
export PRAMAGENT_DEMO_RATE_LIMIT=60
export PRAMAGENT_ALLOW_MEMORY_STORE=1
uvicorn pramagent.api.app:app --host 0.0.0.0 --port 8080

The demo asks the visitor for their own nvapi-* key on each run. Pramagent uses that key only for the current provider call; it is not written to traces, logs, stores, usage records, or the hash-chain payload. Each demo run uses an isolated in-memory trace store and returns the output, trust-layer events, redactions, HITL state, latency, this_hash, prev_hash, and local chain verification.

The public throttle is keyed by client IP plus a short in-memory SHA-256 hash of the visitor's nvapi-* key. If a visitor switches to a different NVIDIA key, they get a fresh demo bucket without Pramagent storing the plaintext key. A DEGRADED demo result means the upstream model call failed and Pramagent returned its safe default with a trace; try another listed NIM model or verify that the key has access to the selected endpoint.
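
A throttle key of this shape could be derived as shown below. This is a sketch under assumptions: the 16-character digest length, the ip:digest layout, and the name demo_bucket_key are not taken from Pramagent's source.

```python
import hashlib


def demo_bucket_key(client_ip: str, api_key: str, digest_chars: int = 16) -> str:
    """Build a throttle key from the client IP and a truncated SHA-256 of the API key."""
    digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()[:digest_chars]
    return f"{client_ip}:{digest}"  # the plaintext key never appears in the bucket name


same_a = demo_bucket_key("203.0.113.7", "nvapi-example-one")
same_b = demo_bucket_key("203.0.113.7", "nvapi-example-one")
other = demo_bucket_key("203.0.113.7", "nvapi-example-two")

print(same_a == same_b)  # True: same visitor and key share a bucket
print(same_a == other)   # False: a different key gets a fresh bucket
```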

Run the release sanity checks:

python -m pytest -q --tb=no
python -m pramagent.cli redteam --json --attacks 100
python -m pramagent.cli redteam --json --dynamic --attacks 200 --seed 999

Current local result: 640 passed, 2 skipped. The latest targeted prompt suite also passed with 0 failures across emergency override, output override, margin/liquidation, IBAN/SWIFT, ambiguous escalation, PHI, false-positive, base64, hex, unicode-escape, multilingual override-token, and chat-template-wrapper cases.

ToolGuard Example

import asyncio

from pramagent import Pramagent, Verdict
from pramagent.layers import ToolGuardLayer, ToolPolicy
from pramagent.layers.tool_guard import SideEffect

guard = ToolGuardLayer(policies=[
    ToolPolicy(
        name="send_payment",
        side_effect=SideEffect.PAYMENT,
        action=Verdict.ESCALATE,
        allowed_tenants={"finance_team"},
        schema={
            "type": "object",
            "required": ["amount_usd", "destination"],
            "properties": {
                "amount_usd": {"type": "number", "minimum": 0.01, "maximum": 5000},
                # Anchored: JSON Schema "pattern" otherwise matches anywhere in the string.
                "destination": {"type": "string", "pattern": r"^acct-\d{6,}$"},
            },
            "additionalProperties": False,
        },
    )
])

armor = Pramagent(tool_guard=guard)

async def main():
    decision = armor.validate_tool(
        "send_payment",
        {"amount_usd": 250.00, "destination": "acct-123456"},
        tenant_id="finance_team",
        session_id="demo",
    )
    print(decision.verdict)  # ESCALATE

    too_large = armor.validate_tool(
        "send_payment",
        {"amount_usd": 9000.00, "destination": "acct-123456"},
        tenant_id="finance_team",
        session_id="demo",
    )
    print(too_large.verdict, too_large.reason)  # BLOCK: schema violation

    wrong_tenant = armor.validate_tool(
        "send_payment",
        {"amount_usd": 250.00, "destination": "acct-123456"},
        tenant_id="marketing_team",
        session_id="demo",
    )
    print(wrong_tenant.verdict, wrong_tenant.reason)  # BLOCK: tenant mismatch

    response = await armor.run(
        "Summarize this payment request",
        tenant_id="finance_team",
        session_id="demo",
        action="send_payment",
    )
    print(response.hitl)
    print(response.trace.this_hash)

asyncio.run(main())

Built-In Rule Corpora

Pramagent now includes deterministic, importable rule bundles. They are plain Python Rule objects, so a reviewer can inspect exactly what is enforced.

from pramagent import Pramagent
from pramagent.layers import SafetyLayer
from pramagent.rules import ALL_RULES, JAILBREAK_PATTERNS, OWASP_LLM_TOP10

armor = Pramagent(
    safety=SafetyLayer(rules=[*JAILBREAK_PATTERNS, *OWASP_LLM_TOP10])
)

strict_armor = Pramagent(safety=SafetyLayer(rules=ALL_RULES))

Included corpora:

JAILBREAK_PATTERNS
OWASP_LLM_TOP10
INJECTION_CORPUS
FICTIONAL_WRAPPER
PHI_PATTERNS
FINANCIAL_PII

Escalation Policy

Verdict.ESCALATE means "suspicious, but not certain enough to block." What the pipeline does with it is configurable per stage: pre (the input pass, before the model runs) and post (the output pass, after the model runs). Each stage takes one of "log" (record and continue), "hitl" (route to the human-in-the-loop gate, which stays idle on silence), or "block" (hard stop). The default is "log", so adding an ESCALATE rule never silently starts gating traffic; the ESCALATE verdict is recorded in the trace in every case.

# Healthcare / finance — maximum caution
Pramagent(safety=SafetyLayer(rules=[...]),
          escalate_policy={"pre": "hitl", "post": "block"})

# Developer tool — minimal interruption (default)
Pramagent(safety=SafetyLayer(rules=[...]),
          escalate_policy="log")

# Internal enterprise — gate suspicious input, log suspicious output
Pramagent(safety=SafetyLayer(rules=[...]),
          escalate_policy={"pre": "hitl", "post": "log"})

A string applies to both stages; a dict sets them independently. Invalid values raise at construction, not at request time.
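
The string-or-dict behavior can be sketched as a small normalizer that validates at construction time. This is an assumption-laden illustration rather than Pramagent's code: normalize_escalate_policy is a hypothetical name, and falling back to "log" for a stage missing from the dict is a guess.

```python
VALID_ACTIONS = {"log", "hitl", "block"}
STAGES = ("pre", "post")


def normalize_escalate_policy(policy="log") -> dict:
    """Expand a string or dict policy into an explicit {"pre": ..., "post": ...} mapping."""
    if isinstance(policy, str):
        expanded = {stage: policy for stage in STAGES}
    elif isinstance(policy, dict):
        unknown = set(policy) - set(STAGES)
        if unknown:
            raise ValueError(f"unknown stage(s): {sorted(unknown)}")
        # Assumption: a stage left out of the dict falls back to "log".
        expanded = {stage: policy.get(stage, "log") for stage in STAGES}
    else:
        raise TypeError("escalate_policy must be a str or a dict")

    for stage, action in expanded.items():
        if action not in VALID_ACTIONS:
            raise ValueError(f"invalid action {action!r} for stage {stage!r}")
    return expanded


print(normalize_escalate_policy("log"))
# {'pre': 'log', 'post': 'log'}
print(normalize_escalate_policy({"pre": "hitl", "post": "block"}))
# {'pre': 'hitl', 'post': 'block'}
```

Validating here means a typo such as "blok" surfaces when the object is built rather than on the first escalated request.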

Persistent HITL Queue

For approval flows that must survive process restarts, use the persistent queue backends:

from pramagent.layers import HITLLayer
from pramagent.queue import SQLiteHITLQueue

hitl = HITLLayer(
    require_approval_for=["send_email", "wire_transfer"],
    store=SQLiteHITLQueue("hitl.db"),
    timeout_s=None,  # wait until another process approves or denies
)

InMemoryHITLQueue, SQLiteHITLQueue, and PostgresHITLQueue are available under pramagent.queue.
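
To show why a file-backed queue survives restarts, here is a minimal sqlite3 sketch. TinyApprovalQueue, its table layout, and its method names are hypothetical and unrelated to the real SQLiteHITLQueue interface.

```python
import os
import sqlite3
import tempfile


class TinyApprovalQueue:
    """Hypothetical approval queue persisted in a SQLite file."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS approvals ("
            "id INTEGER PRIMARY KEY, action TEXT NOT NULL, "
            "status TEXT NOT NULL DEFAULT 'pending')"
        )
        self._db.commit()

    def submit(self, action: str) -> int:
        cur = self._db.execute("INSERT INTO approvals (action) VALUES (?)", (action,))
        self._db.commit()
        return cur.lastrowid

    def decide(self, request_id: int, approved: bool) -> None:
        # Only pending requests can be decided, so a decision is never overwritten.
        self._db.execute(
            "UPDATE approvals SET status = ? WHERE id = ? AND status = 'pending'",
            ("approved" if approved else "denied", request_id),
        )
        self._db.commit()

    def status(self, request_id: int) -> str:
        row = self._db.execute(
            "SELECT status FROM approvals WHERE id = ?", (request_id,)
        ).fetchone()
        return row[0] if row else "unknown"

    def close(self) -> None:
        self._db.close()


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hitl.db")

        first = TinyApprovalQueue(path)
        request_id = first.submit("wire_transfer")
        first.close()  # simulate the submitting process exiting

        second = TinyApprovalQueue(path)  # a separate "process" reopens the file
        before = second.status(request_id)
        second.decide(request_id, approved=True)
        after = second.status(request_id)
        second.close()
        return [before, after]


print(demo())  # ['pending', 'approved']
```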

Framework Adapters

Pramagent is meant to sit under existing agent frameworks, not replace them.

from pramagent.adapters import PramagentNode, PramagentHook, PramagentGuard

# LangGraph
guard_node = PramagentNode(armor=armor)

# AutoGen
PramagentHook(armor=armor).attach(agent)

# CrewAI
safe_tool = PramagentGuard(armor=armor).wrap_tool(send_email)

Generic helpers are also available:

from pramagent.adapters import protect, protect_tool

Compliance Evidence

ComplianceReporter.generate() can produce point-in-time evidence packages from Pramagent traces and mappings:

from pramagent.compliance import ComplianceReporter

ComplianceReporter(store=store, audit=audit).generate(
    framework="SOC2",
    period_start="2026-01-01",
    period_end="2026-06-30",
    tenant_id="demo",
    output="evidence.json",
)

Supported mapping targets include SOC2, HIPAA, GDPR, NIST AI RMF, EU AI Act, and PCI DSS. This is engineering evidence, not a certification.

When To Use Pramagent

You are wrapping LLM calls or agent workflows and need audit trails, policy checks, HITL approvals, PII scrubbing, and provider fallback in one place.
You want deterministic tool policy outside the model, especially for actions like payments, data export, account changes, or admin operations.
You are building an internal tool or pilot where honest safety evidence matters more than marketing claims.
You need tamper-evident traces with optional Sepolia anchoring and encrypted S3 cold archive support.
You already use LangGraph, AutoGen, CrewAI, or a custom loop and want a thin trust layer around prompts, tool calls, and approvals.

When Not To Use Pramagent Yet

You need certified bank-grade, healthcare-grade, or SOC2-audited production infrastructure today.
You need proven jailbreak resistance against a serious red team; the bundled benchmark is only a deterministic smoke test, not third-party assurance.
You need mature enterprise dashboard auth such as SSO/OIDC/RBAC. Optional generated dashboard keys and SQL users exist, but this is not an enterprise IAM plane yet.
You need production-grade scale evidence, chaos engineering, or SLA-backed capacity numbers beyond the published local Docker Compose load run.
You need billing-grade Stripe/Chargebee metering rather than the local usage ledger and event hooks.

What Works Today

Capability	Status	Notes
Provider adapters	Implemented	Mock, OpenAI, Anthropic, Gemini, Ollama, NVIDIA NIM, OpenAI-compatible/local
Rule corpora	MVP	129 deterministic rules across jailbreaks, OWASP LLM risks, injection, fictional-wrapper bypasses, PHI, and financial PII
ToolGuard	Strong MVP	Draft 2020-12 JSON Schema, allow-lists, side-effect taxonomy, output scanning, Redis-backed chain state
HITL	Beta	Slack callbacks, persistent SQLite/Postgres queues, quorum/escalation primitives, ServiceNow/PagerDuty/email/webhook notifiers
Audit trail	Strong MVP	SHA-256 hash chain; optional real Sepolia anchoring
PII redaction	Strong MVP	Context-aware patterns for common regulated data; bounded email scrubbing avoids long-input regex DoS
Auth/rate limits/quotas	Beta	JWT/API keys, token buckets, per-tenant quotas
Framework adapters	MVP	LangGraph node, AutoGen hook, CrewAI guard, generic protect/protect_tool helpers
Dashboard	Prototype	Shared-key fallback, optional SQL users with generated keys, tenant scoping, traces, approvals, metrics, usage page, CSRF
Redis/Postgres backends	Beta	Wired and tested locally; needs scale/load testing
OpenTelemetry	Partial	Per-layer spans exist; dashboards and alerting need hardening
Red-team benchmark	MVP	Static and dynamic mutation modes; includes base64, translation-wrapper, and authority-framing regressions
Billing hooks	MVP	In-memory hash-chain usage ledger plus fail-open webhook; no Stripe/Chargebee provider yet
S3 cold archive	MVP	Gzip + encrypted trace archive wrapper; metadata sink hook
Compliance evidence	MVP	ComplianceReporter.generate() for JSON/text/PDF-style evidence packages

Integration Safety Contract

Pramagent should not replace human workflows that already work. Treat it as a policy and evidence layer around risky agent actions, not as a mandate to put AI into every decision path.

Before integrating a new feature or agent workflow, require three gates:

Isolation contract: declare which trust layers the feature touches. HITL features need a negative test proving the action cannot proceed without an authenticated approval. Isolation features need tenant/session boundary tests.
Regression baseline: run the full suite plus the new feature tests. Zero regressions are allowed for previously passing safety, trace, auth, and store behavior.
Consequence traceability: every approved or triggered action must leave a trace that explains why it was allowed, who/what approved it, what policy applied, and which downstream side effect was attempted.

The reusable reviewer prompt for this is in Security audit prompt.

Honest Limits

Prompt-injection defense is not complete. The bundled static corpus and seeded dynamic mutation smoke tests now include base64, translation-wrapper, and authority-framing regressions. v0.8.0 adds structured verdicts, provenance-aware stricter scanning, held-out PINT/TensorTrust-style fixtures, and optional pramagent[ml] embedding/DeBERTa layers, but the project still needs larger third-party red-team sets and external assessment.
ToolGuard is a hard policy gate outside the model, but it is not a sandbox.
ToolGuard chain detection and per-session call limits are per-process unless a shared Redis backend is configured (PRAMAGENT_TOOL_GUARD_REDIS_URL or PRAMAGENT_REDIS_URL). When running multiple uvicorn workers, a dangerous tool chain whose steps land on different workers is only detected with a shared Redis backend; the Redis path uses an atomic Lua append so concurrent same-session calls never lose history.
Slack is the main decision-collecting HITL adapter today. ServiceNow, PagerDuty, email, and generic webhooks are useful notification/escalation adapters. Persistent SQLite/Postgres approval queues exist, but broader enterprise approval workflows are still in development.
Dashboard auth has tenant-scoped shared-key fallback plus optional SQL-backed users with generated dashboard keys and key regeneration. It is still not SSO/OIDC/RBAC-grade.
Ethereum anchoring is Sepolia/testnet-oriented; no mainnet runbook, verifier contract, HSM/KMS key-management story, or enterprise anchoring operating model is included yet.
The usage ledger is local audit evidence for pilots, not an invoice-grade billing system.
Redis/Postgres support exists, but the stack has not been chaos-tested or load-tested for high-stakes deployments.
No external penetration test or formal compliance certification has been run.
QuantumLayer is future research only. It is not implemented, advertised as a feature, or exposed as a production API.

Optional Anchoring And Archive

pip install "pramagent[ethereum,s3]"

Ethereum/Sepolia anchoring submits the audit head as transaction calldata and stores the tx hash plus block number on the trace when configured. S3 cold archive wraps a primary store and archives pruned/erased traces as encrypted gzip JSON while keeping metadata available for compliance reporting.
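
The archive step can be approximated with the standard library. The sketch below gzips a trace as JSON and keeps a small metadata record beside it; the encryption step is deliberately left out because it would need a third-party library, and the field names and the archive_trace / restore_trace helpers are assumptions.

```python
import gzip
import json


def archive_trace(trace: dict) -> tuple[bytes, dict]:
    """Compress a trace and return (blob, metadata) for a hypothetical cold store."""
    raw = json.dumps(trace, sort_keys=True).encode("utf-8")
    blob = gzip.compress(raw)
    # Encryption is omitted in this sketch; it would be applied to `blob` here.
    metadata = {
        "tenant_id": trace["tenant_id"],
        "this_hash": trace["this_hash"],
        "raw_bytes": len(raw),
    }
    return blob, metadata


def restore_trace(blob: bytes) -> dict:
    """Reverse archive_trace for an unencrypted blob."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


example = {"tenant_id": "demo", "this_hash": "ab" * 32, "verdict": "ALLOW"}
blob, metadata = archive_trace(example)

print(restore_trace(blob) == example)  # True
print(metadata["tenant_id"])           # demo
```

Keeping the metadata outside the blob is what lets a report reference an archived trace without fetching and decompressing it.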

Demo Flow

pramagent init
docker compose up -d
python -m pytest -q --tb=no
python -m pramagent.cli redteam --json --dynamic --attacks 200 --seed 999

Then use the dashboard to inspect traces, pending HITL approvals, audit status, metrics, and per-tenant usage.
