Normalize Provider Error JSON So Your Agent Can Actually Handle Failures

Anthropic's rate limit error looks like this:

{"type": "error", "error": {"type": "rate_limit_error", "message": "Rate limit exceeded"}}

OpenAI's rate limit error looks like this:

{"error": {"code": "rate_limit_exceeded", "message": "...", "param": null, "type": "tokens"}}

Google's looks like this:

{"error": {"code": 429, "message": "Quota exceeded", "status": "RESOURCE_EXHAUSTED"}}

If your agent handles multiple providers, you end up with provider-specific error parsing code scattered everywhere. llm-pretty-error normalizes them into a single LLMError object.
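
To make the mismatch concrete, here is a minimal sketch of the check you would otherwise have to write by hand just to answer "is this a rate limit?" for the three bodies above. The function name is hypothetical and is not part of llm-pretty-error; it only illustrates that each provider puts the signal in a different field.

```python
import json


def is_rate_limit(provider: str, raw_body: str) -> bool:
    """Hand-rolled check, one branch per provider (illustrative only)."""
    error = json.loads(raw_body).get("error", {})
    if provider == "anthropic":
        # Anthropic puts the signal in error.type
        return error.get("type") == "rate_limit_error"
    if provider == "openai":
        # OpenAI puts it in error.code, as a string
        return error.get("code") == "rate_limit_exceeded"
    if provider == "google":
        # Google uses a numeric code plus a status string
        return error.get("code") == 429 or error.get("status") == "RESOURCE_EXHAUSTED"
    return False


anthropic_body = '{"error": {"type": "rate_limit_error", "message": "Rate limit exceeded"}}'
openai_body = '{"error": {"code": "rate_limit_exceeded", "message": "...", "param": null, "type": "tokens"}}'
google_body = '{"error": {"code": 429, "message": "Quota exceeded", "status": "RESOURCE_EXHAUSTED"}}'
```

Three branches for one error kind. Multiply that by auth, not found, context length, and content filter, and the case for a single normalized object is clear.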

The Shape of the Fix

from llm_pretty_error import normalize_error, LLMError

try:
    response = call_provider(prompt)
except Exception as raw_error:
    error: LLMError = normalize_error(raw_error)

    print(f"Provider: {error.provider}")     # "anthropic"
    print(f"Kind: {error.kind}")             # "rate_limit"
    print(f"Message: {error.message}")       # human-readable
    print(f"Retry after: {error.retry_after_seconds}")  # int or None
    print(f"Status code: {error.status_code}")  # 429
    print(f"Request id: {error.request_id}")  # provider request ID for support

One object. Same fields regardless of which provider threw the error.

What It Does NOT Do

llm-pretty-error does not retry for you. Normalization and retry are separate concerns. Use llm-retry-py or llm-fallback-chain for retry logic.

It does not cover every possible provider error. It covers the most common ones (rate limit, auth, not found, context length, content filter) from Anthropic, OpenAI, and Google. Novel error types from less common providers fall through to kind: "unknown".

It does not modify the original exception. normalize_error() returns a new LLMError dataclass. The original exception is attached as error.original_exception for full stack trace access.
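
As a rough picture of what "a new dataclass with the original attached" means, here is a sketch built from the field names used in this post. The class name LLMErrorSketch, the defaults, and the wrap() helper are assumptions for illustration; the library's real definition may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMErrorSketch:
    # Field names mirror the ones shown in this post; types and defaults are guesses.
    provider: str
    kind: str
    message: str
    status_code: Optional[int] = None
    retry_after_seconds: Optional[int] = None
    request_id: Optional[str] = None
    original_exception: Optional[BaseException] = None


def wrap(exc: BaseException) -> LLMErrorSketch:
    """Hypothetical helper: build a new object and leave `exc` untouched."""
    return LLMErrorSketch(
        provider="unknown",
        kind="unknown",
        message=str(exc),
        original_exception=exc,
    )


try:
    raise TimeoutError("upstream timed out")
except TimeoutError as caught:
    original = caught
    err = wrap(caught)
```

Because the original exception object is stored rather than copied, its traceback is still reachable through err.original_exception.__traceback__.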

Inside the Library

Detection is provider-first, then error type:

def normalize_error(exc: Exception) -> LLMError:
    provider = detect_provider(exc)
    status_code = extract_status_code(exc)
    raw_body = extract_response_body(exc)

    if provider == "anthropic":
        return parse_anthropic_error(exc, status_code, raw_body)
    elif provider == "openai":
        return parse_openai_error(exc, status_code, raw_body)
    elif provider == "google":
        return parse_google_error(exc, status_code, raw_body)
    else:
        return LLMError(
            provider="unknown",
            kind="unknown",
            status_code=status_code,
            message=str(exc),
            original_exception=exc,
        )

Provider detection: for Anthropic SDK exceptions, type(exc).__module__ starts with "anthropic"; for OpenAI SDK exceptions, it starts with "openai". Google SDK exceptions are matched by class name, against a list of known Google error prefixes.
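
The detection rules described above could be sketched roughly as follows. This is not the library's source: the GOOGLE_CLASS_PREFIXES tuple is a placeholder, since the post does not list the actual prefixes, and the fake exception classes exist only so the example runs without any SDK installed.

```python
# Placeholder value: the real prefix list is not given in this post.
GOOGLE_CLASS_PREFIXES = ("GoogleAPI",)


def detect_provider(exc: BaseException) -> str:
    module = type(exc).__module__ or ""
    class_name = type(exc).__name__
    if module.startswith("anthropic"):
        return "anthropic"
    if module.startswith("openai"):
        return "openai"
    if class_name.startswith(GOOGLE_CLASS_PREFIXES):
        return "google"
    return "unknown"


# Stand-ins for SDK exception classes, with __module__ set by hand.
FakeAnthropicError = type("RateLimitError", (Exception,), {"__module__": "anthropic._exceptions"})
FakeOpenAIError = type("RateLimitError", (Exception,), {"__module__": "openai"})
FakeGoogleError = type("GoogleAPICallError", (Exception,), {"__module__": "some.module"})
```

One design note: matching on the module string means the check works without importing any provider SDK, so the normalizer does not need all three SDKs installed.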

request_id: Anthropic includes a request-id header in error responses. OpenAI includes x-request-id. These are invaluable for filing support tickets. normalize_error() extracts them if available.
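
A minimal sketch of that header lookup, assuming you already have the response headers as a plain mapping. The function name and signature are hypothetical. The lookup is case-insensitive because HTTP header names are.

```python
from typing import Mapping, Optional

# Header names mentioned above: Anthropic's request-id and OpenAI's x-request-id.
REQUEST_ID_HEADERS = ("request-id", "x-request-id")


def extract_request_id(headers: Optional[Mapping[str, str]]) -> Optional[str]:
    if not headers:
        return None
    # Lowercase the keys so "Request-Id" and "request-id" both match.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in REQUEST_ID_HEADERS:
        value = lowered.get(name)
        if value:
            return value
    return None
```

Returning None instead of raising keeps the "missing request ID" case a normal outcome rather than a second error on the error path.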

The 31 tests cover: each provider's major error types (rate limit, auth, not found, context length, content filter, server error), unknown provider fallthrough, missing status code, missing request ID, and the original_exception attachment.

When to Use It

Use it in any multi-provider agent where you want unified error handling. The normalization adds a small overhead (a few microseconds) per error. For error paths, that is negligible.

The request_id field matters most when a provider error needs support investigation. Include it in your error logs:

logger.error(
    "llm_call_failed",
    provider=error.provider,
    kind=error.kind,
    status_code=error.status_code,
    request_id=error.request_id,
    message=error.message,
)

When you file a support ticket with Anthropic or OpenAI, the request_id is the first thing they will ask for.

Install

pip install git+https://github.com/MukundaKatta/llm-pretty-error

A fuller example, combining normalize_error() with llm-retry-py so that only retryable kinds are retried:

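# Assumption: the keyword-argument logger calls below match structlog's API.
import structlog

from llm_pretty_error import normalize_error, LLMError, ErrorKindLike
from llm_retry_py import LLMRetry

logger = structlog.get_logger()

# call_llm() and call_with_shorter_prompt() are placeholders for your own functions.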

retry = LLMRetry(max_attempts=3, base_delay=1.0)

def call_with_normalized_errors(prompt: str) -> str:
    try:
        return retry.call(
            fn=lambda: call_llm(prompt),
            classify_error=lambda e: normalize_error(e).kind,
            retryable_kinds={"rate_limit", "server_error"},
        )
    except Exception as e:
        error = normalize_error(e)
        if error.kind == "context_length":
            # Trim and retry
            return call_with_shorter_prompt(prompt)
        elif error.kind == "content_filter":
            return "I cannot help with that request."
        else:
            logger.error("unhandled_llm_error",
                        kind=error.kind,
                        provider=error.provider,
                        request_id=error.request_id)
            raise

Sibling Libraries

Library	What it solves
`tool-error-classify`	Classify tool call exceptions by ErrorKind enum
`llm-retry-py`	Retry logic using error classification
`llm-circuit-breaker-py`	Open circuit after repeated failures
`llm-fallback-chain`	Route to backup provider on failure
`agent-run-id`	Thread run IDs through logs for correlation

llm-pretty-error + tool-error-classify cover both sides of agent error handling: LLM API errors and tool execution errors, both normalized to comparable representations.

What's Next

Provider-specific error metadata: Anthropic and OpenAI rate limit errors both include a Retry-After header (HTTP header names are case-insensitive, so retry-after and Retry-After are the same header). Some errors also include quota information. Surfacing these in LLMError.metadata as a dict would make the full error context accessible without digging into original_exception.
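
As a sketch of what that metadata extraction might involve, here is one way to parse a Retry-After value using only the standard library. The function name is hypothetical. Per the HTTP spec, the header can hold either a number of seconds or an HTTP date, so both forms are handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return seconds to wait, or None if the header is missing or unparseable."""
    if not value:
        return None
    value = value.strip()
    # Form 1: delay in seconds, e.g. "30"
    try:
        return max(0.0, float(value))
    except ValueError:
        pass
    # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

The now parameter exists only to make the date branch testable; in normal use you would leave it out.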

Streaming error normalization: streaming responses can fail mid-stream with errors that appear differently than synchronous call errors. Adding stream error parsing would cover the streaming use case.
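
To illustrate why streaming needs its own parsing, here is a rough sketch that scans server-sent event lines for an error event. It assumes the mid-stream error arrives as an "event: error" frame with a JSON body shaped like the Anthropic example at the top of this post; other providers may frame stream errors differently, and both function names are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Group SSE lines into (event_name, data) pairs; a blank line ends an event."""
    event_name: Optional[str] = None
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if event_name is not None or data_lines:
                yield (event_name or "message", "\n".join(data_lines))
            event_name, data_lines = None, []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    # Flush a trailing event that was not followed by a blank line.
    if event_name is not None or data_lines:
        yield (event_name or "message", "\n".join(data_lines))


def find_stream_error(lines: Iterable[str]) -> Optional[Dict[str, Optional[str]]]:
    """Return the first error event's raw type and message, or None if there is none."""
    for event_name, data in iter_sse_events(lines):
        if event_name != "error":
            continue
        try:
            body = json.loads(data)
        except json.JSONDecodeError:
            return {"raw_type": None, "message": data}
        error = body.get("error", {}) if isinstance(body, dict) else {}
        return {"raw_type": error.get("type"), "message": error.get("message")}
    return None
```

The key difference from the synchronous path is that there is no exception and no HTTP status to inspect: the request already returned 200, and the failure shows up as data.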

Error frequency tracking: normalize_error() could optionally emit to a counter (by provider and kind) for observability. The counters would be per-process but useful for detecting error rate spikes.
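
A per-process counter like the one described could be as small as this sketch. The ErrorCounter class is hypothetical and not part of the library; it uses a lock so that threads calling normalize_error() concurrently do not lose increments.

```python
import threading
from collections import Counter
from typing import Dict, Tuple


class ErrorCounter:
    """Counts errors by (provider, kind) within a single process."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Counter = Counter()

    def record(self, provider: str, kind: str) -> None:
        with self._lock:
            self._counts[(provider, kind)] += 1

    def snapshot(self) -> Dict[Tuple[str, str], int]:
        # Return a copy so callers cannot mutate the live counts.
        with self._lock:
            return dict(self._counts)


counter = ErrorCounter()
counter.record("anthropic", "rate_limit")
counter.record("anthropic", "rate_limit")
counter.record("openai", "auth")
```

A periodic job could read snapshot() and forward the numbers to whatever metrics system you already run.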

Built as part of the agent-stack family: composable Python primitives for production LLM agents.
