Agent-Specific RFC 9457 Extensions

Structured error fields that enable agents to diagnose and recover from failures autonomously

Summary

Beyond RFC 9457's five core members (type, title, status, detail, instance), agents need is_retriable, retry_after_ms, doc_uri, trace_id, suggestions, and errors. These extensions enable autonomous recovery: agents check is_retriable to decide whether to retry, read suggestions for correction steps, and pass trace_id to support when a failure must be escalated.

| Field | Type | Purpose |
| --- | --- | --- |
| is_retriable | boolean | Retry same request unchanged? |
| retry_after_ms | number | Milliseconds to wait before retry |
| doc_uri | URL | Link to detailed error documentation |
| trace_id | string | Correlation ID for internal lookup |
| suggestions | string[] | Ordered recovery steps (first is most likely fix) |
| errors | Problem[] | Nested errors for batch operations |

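RFC 9457 Problem Details provides a foundation. Agents need additional fields to decide whether to retry unchanged, modify and retry, or escalate. These fields are not defined by the RFC itself; they use its extension-member mechanism (Section 3.2), which lets an API add its own members to a problem object. Clients that do not recognize an extension member must ignore it, so adding these fields does not break generic RFC 9457 consumers.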

Core Agent Extensions

is_retriable (boolean)

True if retrying the same request unchanged might succeed. False for terminal errors where the request itself is wrong.

{
  "type": "https://example.com/errors/rate-limit-exceeded",
  "status": 429,
  "is_retriable": true,
  "retry_after_ms": 60000
}

retry_after_ms (number)

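Milliseconds to wait before retrying. Preferred over the Retry-After header because it is always a single integer, whereas the header may carry either a number of seconds or an HTTP-date. Agents wait for that duration and then retry: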

if (error.is_retriable && error.retry_after_ms) {
  await sleep(error.retry_after_ms);
  return retry();
}
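Servers that only send the Retry-After header can still be handled with a small fallback. The sketch below is one possible way to resolve a wait time, assuming the body field takes priority over the header; the helper name resolveRetryDelayMs and the RetryHints type are hypothetical and not part of any library.

```typescript
// Hypothetical helper: prefer the body field, fall back to the Retry-After header.
interface RetryHints {
  retry_after_ms?: number;
}

function resolveRetryDelayMs(
  body: RetryHints,
  retryAfterHeader: string | null,
  nowMs: number = Date.now()
): number | undefined {
  // 1. The body field is already a number of milliseconds.
  if (typeof body.retry_after_ms === 'number' && body.retry_after_ms >= 0) {
    return body.retry_after_ms;
  }
  if (retryAfterHeader === null) {
    return undefined;
  }

  const value = retryAfterHeader.trim();

  // 2. Header form one: a non-negative integer number of seconds.
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000;
  }

  // 3. Header form two: an HTTP-date, converted to a delay relative to now.
  const dateMs = Date.parse(value);
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - nowMs);
  }

  // Unparseable header: let the caller choose its own default.
  return undefined;
}
```

An agent would typically call this as resolveRetryDelayMs(error, response.headers.get('Retry-After')) and apply its own default backoff when the result is undefined.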

doc_uri (string, URL)

Link to detailed error documentation, troubleshooting, and examples. Gives agents and human operators a stable place to look up what went wrong:

{
  "type": "https://example.com/errors/auth-expired",
  "doc_uri": "https://docs.example.com/auth/token-refresh"
}

trace_id (string)

Correlation ID matching server logs and OpenTelemetry spans. Enables support teams to investigate failures without the agent having to reproduce the issue:

{
  "type": "https://example.com/errors/internal-error",
  "status": 500,
  "trace_id": "01HV3K8MNP2QRS3TUVWX",
  "is_retriable": true,
  "retry_after_ms": 5000
}

suggestions (array of strings)

Ordered list of concrete recovery steps. The first suggestion is the most likely fix:

{
  "type": "https://example.com/errors/validation-error",
  "suggestions": [
    "Provide a value for the required 'amount' field",
    "The 'currency' field must be a 3-letter ISO 4217 code (e.g., 'USD')"
  ]
}

errors (array of Problem Details)

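RFC 9457 does not define a standard member for aggregating errors, but its own validation example uses an errors extension array for this purpose. The extension is useful for batch operations and multi-field validation. Here each element follows the same Problem Details schema: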

{
  "type": "https://example.com/errors/batch-failure",
  "status": 422,
  "detail": "2 of 5 items failed validation",
  "errors": [
    {
      "type": "https://example.com/errors/validation-error",
      "detail": "Item 0: 'amount' is required",
      "is_retriable": false
    },
    {
      "type": "https://example.com/errors/validation-error",
      "detail": "Item 2: 'currency' is invalid",
      "is_retriable": false
    }
    }
  ]
}
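Because each element can itself carry an errors array, an agent usually wants the individual leaf problems and not the wrapper. A minimal sketch of that traversal follows; the Problem interface and the collectLeafProblems name are illustrative assumptions, and only the fields used here are declared.

```typescript
// Illustrative subset of the Problem Details shape used for traversal.
interface Problem {
  type?: string;
  detail?: string;
  is_retriable?: boolean;
  errors?: Problem[];
}

// Depth-first walk that returns only problems with no nested errors.
function collectLeafProblems(problem: Problem): Problem[] {
  if (!problem.errors || problem.errors.length === 0) {
    return [problem];
  }
  const leaves: Problem[] = [];
  for (const child of problem.errors) {
    leaves.push(...collectLeafProblems(child));
  }
  return leaves;
}

// Usage: list the per-item details from a batch failure.
const batchProblem: Problem = {
  type: 'https://example.com/errors/batch-failure',
  detail: '2 of 5 items failed validation',
  errors: [
    { detail: "Item 0: 'amount' is required" },
    { detail: "Item 2: 'currency' is invalid" },
  ],
};
const leafDetails = collectLeafProblems(batchProblem).map((p) => p.detail);
```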

TypeScript Schema

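import { z } from 'zod';

// Non-recursive members. Declared separately because TypeScript cannot infer
// the type of a schema that references itself.
const BaseProblemDetails = z.object({
  type: z.string().url().optional().default('about:blank'),
  title: z.string(),
  status: z.number().int().min(400).max(599),
  detail: z.string(),
  instance: z.string().optional(),
  is_retriable: z.boolean().optional(),
  retry_after_ms: z.number().int().optional(),
  doc_uri: z.string().url().optional(),
  trace_id: z.string().optional(),
  suggestions: z.array(z.string()).optional(),
});

// Parsed (output) and accepted (input) shapes differ because `type` has a default.
type ErrorResponse = z.infer<typeof BaseProblemDetails> & {
  errors?: ErrorResponse[];
};
type ErrorResponseInput = z.input<typeof BaseProblemDetails> & {
  errors?: ErrorResponseInput[];
};

// The explicit annotation (Zod 3 recursive-schema pattern) lets z.lazy refer back to the schema.
const ProblemDetails: z.ZodType<ErrorResponse, z.ZodTypeDef, ErrorResponseInput> =
  BaseProblemDetails.extend({
    errors: z.lazy(() => ProblemDetails.array()).optional(),
  });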

Use this schema in API responses and CLI error output for consistency. See /templates/errors-and-auth/problem-details.ts for a production-ready implementation.

Agent-Side Error Handling

When an agent encounters a Problem Details response, it branches on is_retriable:

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function callTool(
  tool: string,
  params: Record<string, unknown>,
  attemptsLeft = 3 // cap retries so a persistent failure cannot loop forever
): Promise<unknown> {
  const response = await fetch(`/api/tools/${tool}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(params),
  });

  if (response.ok) {
    return response.json();
  }

  // Read the body once; a Response body cannot be consumed twice
  const error = await response.json();

  if (error.is_retriable && error.retry_after_ms && attemptsLeft > 1) {
    // Transient error with explicit wait time
    await sleep(error.retry_after_ms);
    return callTool(tool, params, attemptsLeft - 1); // Retry unchanged
  }

  if (error.suggestions && error.suggestions.length > 0) {
    // Recoverable with modification: agent rewrites params and calls again
    // (implementation depends on agent framework)
  }

  // Terminal error, retries exhausted, or no automatic fix: escalate or fail the task
  const docs = error.doc_uri ? ` See ${error.doc_uri}` : '';
  throw new Error(`${error.title}: ${error.detail}.${docs}`);
}
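The three outcomes described earlier (retry unchanged, modify and retry, escalate) can also be separated from the HTTP call as a pure decision function, which makes the policy easy to unit test. This is a sketch under stated assumptions: the Action type, the decideAction name, and the 1000 ms default wait are all hypothetical choices, not part of RFC 9457 or any agent framework.

```typescript
// Hypothetical result type describing what the agent should do next.
type Action =
  | { kind: 'retry'; waitMs: number }
  | { kind: 'modify'; steps: string[] }
  | { kind: 'escalate'; reference?: string };

// Only the extension fields the decision depends on.
interface AgentProblem {
  is_retriable?: boolean;
  retry_after_ms?: number;
  suggestions?: string[];
  trace_id?: string;
}

function decideAction(problem: AgentProblem, defaultWaitMs = 1000): Action {
  // Retry unchanged only when the server explicitly says it may succeed.
  if (problem.is_retriable === true) {
    return { kind: 'retry', waitMs: problem.retry_after_ms ?? defaultWaitMs };
  }
  // Otherwise try to fix the request using the ordered suggestions.
  if (problem.suggestions && problem.suggestions.length > 0) {
    return { kind: 'modify', steps: problem.suggestions };
  }
  // Nothing actionable: hand off, carrying the trace ID for support.
  return { kind: 'escalate', reference: problem.trace_id };
}
```

Treating a missing is_retriable as "do not retry" is the conservative choice here; an agent that prefers optimistic retries could invert that default.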

Batch Error Aggregation

When processing multiple items, return partial success with detailed per-item errors:

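async function batchCreateInvoices(invoices: InvoiceData[]): Promise<Response> {
  const results: unknown[] = [];
  const errors: Array<Record<string, unknown>> = [];

  for (let i = 0; i < invoices.length; i++) {
    try {
      results.push(await createInvoice(invoices[i]));
    } catch (err) {
      // `err` is `unknown` in strict TypeScript, so narrow before reading `.message`
      const message = err instanceof Error ? err.message : String(err);
      errors.push({
        type: 'https://example.com/errors/validation-error',
        title: 'Validation failed',
        status: 422,
        detail: `Item ${i}: ${message}`,
        instance: `batch-item-${i}`,
        // The item itself is invalid, so resubmitting it unchanged would fail again
        is_retriable: false,
      });
    }
  }

  if (errors.length > 0) {
    return Response.json(
      {
        type: 'https://example.com/errors/batch-failure',
        title: 'Batch operation partially failed',
        // Multi-Status; note that the schema above only accepts 400-599
        status: 207,
        detail: `${results.length} succeeded, ${errors.length} failed`,
        errors,
      },
      { status: 207, headers: { 'Content-Type': 'application/problem+json' } }
    );
  }

  return Response.json(results);
}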
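On the agent side, the instance values written by the handler above identify which items need attention. The following sketch assumes the batch-item-N convention used in that handler; the helper names failedItemIndices and itemsToResubmit are hypothetical.

```typescript
// Only the fields needed to locate the failed item.
interface ItemProblem {
  instance?: string;
  detail?: string;
}

// Extracts zero-based item indices from instance values such as "batch-item-2".
function failedItemIndices(errors: ItemProblem[]): number[] {
  const indices: number[] = [];
  for (const e of errors) {
    const match = /^batch-item-(\d+)$/.exec(e.instance ?? '');
    if (match) {
      indices.push(Number(match[1]));
    }
  }
  return indices;
}

// Returns the original items that failed, ready to be corrected and resubmitted.
function itemsToResubmit<T>(items: T[], errors: ItemProblem[]): T[] {
  return failedItemIndices(errors)
    .filter((i) => i < items.length) // ignore indices outside the submitted batch
    .map((i) => items[i]);
}
```

Entries whose instance does not match the convention are skipped, so an agent should treat an empty result alongside a non-empty errors array as a signal to escalate.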

Intent Tracing on Cancellation

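When an agent cancels a request in flight, the server may already have started work. Include an intent object describing the operation, the affected resource, and the last known state so the agent can reason about cleanup: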

{
  "type": "https://example.com/errors/operation-cancelled",
  "title": "Operation Cancelled",
  "status": 408,
  "detail": "Request aborted by client after 5000ms",
  "trace_id": "span-xyz789",
  "is_retriable": true,
  "intent": {
    "operation": "create_invoice",
    "resource_id": "inv-pending-123",
    "state_before_cancel": "payment_processing",
    "recovery": "Operation may be in progress server-side. Poll GET /invoices/inv-pending-123 to check status."
  }
}
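The recovery hint above asks the agent to poll until the operation settles. A bounded polling loop might look like the sketch below. Everything in it is an assumption for illustration: the pollUntilSettled name, the PollOptions shape, and the idea that the caller supplies fetchState (for example, a function wrapping GET /invoices/inv-pending-123 that returns the resource's state string).

```typescript
interface PollOptions {
  maxAttempts: number;
  intervalMs: number;
  // Injectable so tests can skip real waiting.
  sleep?: (ms: number) => Promise<void>;
}

// Polls until the state is no longer one of the in-progress states.
// Resolves to the settled state, or undefined if attempts run out.
async function pollUntilSettled(
  fetchState: () => Promise<string>,
  inProgressStates: string[],
  options: PollOptions
): Promise<string | undefined> {
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    const state = await fetchState();
    if (inProgressStates.indexOf(state) === -1) {
      return state; // settled: no longer in progress
    }
    if (attempt < options.maxAttempts - 1) {
      await sleep(options.intervalMs);
    }
  }
  return undefined; // still in progress after maxAttempts checks
}
```

If the result is undefined, the agent has no confirmation either way and should escalate with the trace_id; it should not resubmit blindly.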
