AEO Implementation Checklist

Complete tiered checklist for making your site and API discoverable and consumable by AI agents

Summary

This tiered checklist organizes AEO implementation from essential foundation items through comprehensive support. Tier 1 includes robots.txt, llms.txt, and OpenGraph tags — low effort, high impact. Later tiers add JSON-LD, Markdown content negotiation, Content Signals, OpenAPI links, API Catalog, MCP Server Cards, Agent Skills indexes, OAuth metadata, Web Bot Auth, and commerce protocols where relevant.

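  • Tier 1: robots.txt, llms.txt, OpenGraph meta tags, sitemap, semantic HTML
  • Tier 2: Answer-first content structure, self-contained sections, Markdown content negotiation
  • Tier 3: JSON-LD structured data
  • Tier 4: Well-known endpoints, OpenAPI, API Catalog, MCP Server Cards, Agent Skills, auth discovery
  • Tier 5: AGENTS.md, llms-full.txt, token count disclosure, analytics
  • Tier 6: NLWeb /ask, MCP endpoint, Content Signals, agents.json, agentic commerce
  • Start with Tier 1 before advancing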

This checklist covers the full AEO implementation surface organized by tier. Tier 1 items are the minimum viable foundation — ship these first. Later tiers add depth and capability for teams investing seriously in agent discoverability.

Each item links to the relevant section of this documentation where applicable.

Tier 1 — Foundation

These items cost little to implement and unlock all downstream tiers. Complete these before anything else.

robots.txt

  • robots.txt exists at /robots.txt and is served with Content-Type: text/plain
  • User-agent: * with Allow: / covers all legitimate bots
  • Training crawlers explicitly disallowed: GPTBot, ClaudeBot, Google-Extended, Applebot-Extended, cohere-ai, CCBot (or explicitly allowed — the decision is made and documented)
  • Retrieval bots allowed: ChatGPT-User, Claude-SearchBot, PerplexityBot, Googlebot are not blocked
  • Sitemap: directive points to your primary sitemap
  • /robots.txt is validated with a robots.txt checker (for example, the robots.txt report in Google Search Console)

See robots.txt for AI Agents.
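
The allow and block rules in this checklist can be verified mechanically before deploy. The sketch below uses Python's standard urllib.robotparser against a hypothetical robots.txt for example.com. The file contents and the decision to block every training crawler are assumptions for illustration, and the standard-library matcher may differ in detail from how an individual crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: training crawlers blocked, everything else allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: Applebot-Extended
User-agent: cohere-ai
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

TRAINING_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended",
                 "Applebot-Extended", "cohere-ai", "CCBot"]
RETRIEVAL_BOTS = ["ChatGPT-User", "Claude-SearchBot", "PerplexityBot", "Googlebot"]


def check_robots(robots_txt: str, url: str = "https://example.com/docs/"):
    """Return ({bot: may_fetch}, sitemap_urls) for the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    access = {bot: parser.can_fetch(bot, url)
              for bot in TRAINING_BOTS + RETRIEVAL_BOTS}
    return access, parser.site_maps()


access, sitemaps = check_robots(ROBOTS_TXT)
blocked_retrieval = [bot for bot in RETRIEVAL_BOTS if not access[bot]]
allowed_training = [bot for bot in TRAINING_BOTS if access[bot]]
```

Running a check like this in CI and failing the build when blocked_retrieval or allowed_training is non-empty keeps the documented decision and the deployed file from drifting apart.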

llms.txt

  • /llms.txt exists and is served as Content-Type: text/plain; charset=utf-8
  • File begins with # Product Name (H1 required)
  • Blockquote summary follows the H1 (> Description here)
  • All significant documentation sections appear as H2 groups with link lists
  • Each link includes a colon-separated description: - [Title](url): Description
  • Links point to .md URLs where available, HTML otherwise
  • Total file size is under 5,000 tokens (approximately 20,000 characters)
  • ## Optional section used for supplementary content

See llms.txt.
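
The structural rules for llms.txt are simple enough to lint automatically. The following is a minimal sketch, not an official validator: the function name lint_llms_txt, the sample file, and the example.com URLs are hypothetical, and the size check uses the 20,000 character approximation from the checklist rather than a real tokenizer.

```python
import re

# A link line must look like: - [Title](url): Description
LINK_RE = re.compile(r"^- \[[^\]]+\]\([^)]+\): \S")


def lint_llms_txt(text: str, max_chars: int = 20_000) -> list[str]:
    """Return a list of problems found in an llms.txt file (empty if none)."""
    problems = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("first line must be an H1 ('# Product Name')")
    if len(lines) < 2 or not lines[1].startswith("> "):
        problems.append("a blockquote summary must follow the H1")
    if not any(ln.startswith("## ") for ln in lines):
        problems.append("no H2 sections found")
    for ln in lines:
        if ln.startswith("- [") and not LINK_RE.match(ln):
            problems.append(f"link is missing ': description': {ln}")
    if len(text) > max_chars:
        problems.append(f"file is {len(text)} characters, over {max_chars}")
    return problems


SAMPLE = """\
# Example Product

> Example Product is a hypothetical API for sending notifications.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): Send a first notification
- [Rate limits](https://example.com/docs/rate-limits.md): Per-plan request limits

## Optional

- [Changelog](https://example.com/changelog.md): Release history
"""

sample_problems = lint_llms_txt(SAMPLE)
```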

OpenGraph Meta Tags

  • og:title is entity-first: "Product Name — Page Topic"
  • og:description is 150–300 characters, factual, answer-dense
  • article:modified_time is present and accurate (W3C datetime format)
  • article:published_time is present
  • article:tag values are present for topic categorization
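
As an illustration of the tag set above, the sketch below renders the five properties from page metadata and rejects descriptions outside the 150 to 300 character range. The function name og_meta_tags and the sample product are hypothetical; a real site would normally emit these tags from its templating layer.

```python
from datetime import datetime, timezone
from html import escape


def og_meta_tags(product: str, topic: str, description: str,
                 published: datetime, modified: datetime,
                 tags: list[str]) -> str:
    """Render OpenGraph and article meta tags as an HTML fragment."""
    if not 150 <= len(description) <= 300:
        raise ValueError(
            f"og:description is {len(description)} characters, expected 150 to 300")
    pairs = [
        # Entity first, then the page topic, separated by the dash the checklist uses.
        ("og:title", f"{product} \u2014 {topic}"),
        ("og:description", description),
        # isoformat() on a timezone-aware datetime yields a W3C datetime string.
        ("article:published_time", published.isoformat()),
        ("article:modified_time", modified.isoformat()),
    ]
    pairs += [("article:tag", tag) for tag in tags]
    return "\n".join(
        f'<meta property="{escape(name)}" content="{escape(value)}">'
        for name, value in pairs
    )


DESCRIPTION = (
    "Example Product is a hypothetical notification API. This page lists the "
    "rate limits for each plan, the response headers that report remaining "
    "quota, and how to handle HTTP 429 responses."
)

head = og_meta_tags(
    "Example Product", "Rate Limits", DESCRIPTION,
    published=datetime(2025, 1, 15, tzinfo=timezone.utc),
    modified=datetime(2025, 3, 2, tzinfo=timezone.utc),
    tags=["rate limits", "api"],
)
```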

Sitemap

  • /sitemap.xml serves either an XML sitemap or a sitemap index that references sub-sitemaps
  • Every documentation page appears in the sitemap
  • <lastmod> is present and accurate for every URL
  • <lastmod> format is W3C datetime: YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ
  • Each sitemap file contains at most 50,000 URLs and is no larger than 50 MB uncompressed (use a sitemap index if larger)
  • Sitemap is referenced in robots.txt
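
A sitemap that satisfies the lastmod and size items can be generated from a page inventory. This is a minimal sketch using only the standard library; the build_sitemap name and the page list are hypothetical, and it assumes the last-modified dates come from a trustworthy source such as version control.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file limit in the sitemap protocol


def build_sitemap(pages: dict[str, date]) -> str:
    """Build a sitemap XML string from {url: last_modified_date}."""
    if len(pages) > MAX_URLS:
        raise ValueError("too many URLs: split into sub-sitemaps behind a sitemap index")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # date.isoformat() gives YYYY-MM-DD, one of the accepted W3C datetime forms.
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap({
    "https://example.com/docs/quickstart": date(2025, 1, 15),
    "https://example.com/docs/rate-limits": date(2025, 3, 2),
})
```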

Semantic HTML

  • Main content is wrapped in <article> or <main>
  • Navigation is in <nav> with aria-label
  • Headings use semantic tags (<h1> through <h6>), not styled <div> elements
  • Page has exactly one <h1> matching the <title> tag
  • <time datetime="..."> used for dates

Tier 2 — Content Quality

Content structure improvements that directly affect whether agents extract accurate answers.

Answer-First Structure

  • Every page puts the direct answer or key fact in the first 200 words
  • Introductory paragraphs do not delay the answer with background or context
  • Tables appear before prose explanations of the same data
  • Rate limits, pricing, and other quantitative facts appear as early as possible

See Content Structure for AI Consumption.

Self-Contained Sections

  • Each H2 section can be read and understood without context from preceding sections
  • No section references "the previous step" or "as described above" without restating what was described
  • Code examples include all necessary imports and context
  • Error messages are quoted verbatim with exact capitalization and punctuation

Markdown Availability

  • Documentation pages are available in Markdown format at .md URLs or via Accept: text/markdown
  • <link rel="alternate" type="text/markdown" href="..."> present in HTML <head>
  • Markdown output strips navigation and footer, preserves content

See Serving Markdown to Agents.

Vary Header

  • All responses that differ by Accept header include Vary: Accept
  • CDN configuration honors Vary: Accept (separate cache entries per format)
  • Content-Type: text/markdown; charset=utf-8 used for Markdown responses (not text/plain)
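
The Markdown and Vary items describe one piece of server logic: pick a representation from the Accept header and tell caches that the choice depended on it. The sketch below is framework-neutral and deliberately simplified. The negotiate function is hypothetical, and it ignores q-values, which a production implementation should honor.

```python
def negotiate(accept: str, html_body: str, markdown_body: str):
    """Return (headers, body) for a page available as HTML and Markdown."""
    # Simplification: any mention of text/markdown wins; q-values are ignored.
    wants_markdown = any(
        part.split(";")[0].strip().lower() == "text/markdown"
        for part in accept.split(",")
    )
    if wants_markdown:
        content_type, body = "text/markdown; charset=utf-8", markdown_body
    else:
        content_type, body = "text/html; charset=utf-8", html_body
    # Vary: Accept goes on BOTH variants so caches store them separately.
    return {"Content-Type": content_type, "Vary": "Accept"}, body


md_headers, md_body = negotiate(
    "text/markdown, text/html;q=0.8", "<h1>Rate limits</h1>", "# Rate limits")
html_headers, html_body = negotiate(
    "text/html,application/xhtml+xml", "<h1>Rate limits</h1>", "# Rate limits")
```

Setting Vary on the HTML response as well as the Markdown one matters: a cache that stored the HTML variant without it could serve HTML to an agent that asked for Markdown.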

Tier 3 — Structured Data

JSON-LD schema markup for AI Overview inclusion and retrieval system accuracy.

FAQPage Schema

  • Pages with FAQ or Q&A content include FAQPage JSON-LD
  • Every Question.name exactly matches visible heading text
  • Every Answer.text exactly matches visible answer text
  • No schema content exists that is not visible on the page
  • Schema validated with Google Rich Results Test

See Structured Data for Agents.
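
One way to guarantee that schema text matches visible text is to generate both from the same data. The sketch below builds FAQPage JSON-LD from the question and answer pairs that the page template also renders. The faq_jsonld function and the sample question are hypothetical.

```python
import json


def faq_jsonld(visible_qa: list[tuple[str, str]]) -> str:
    """Build FAQPage JSON-LD from the exact (question, answer) text shown on the page."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in visible_qa
        ],
    }
    return json.dumps(data, indent=2)


# The same list would drive the visible headings and answers in the template.
QA = [("What is the rate limit?", "100 requests per minute on the free plan.")]
faq_script_body = faq_jsonld(QA)
```

The resulting string belongs inside a script element of type application/ld+json in the page head.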

TechArticle Schema

  • All technical documentation pages include TechArticle JSON-LD
  • dateModified is accurate and kept current
  • author and publisher are present
  • proficiencyLevel is set (Beginner or Expert)

HowTo Schema

  • Step-by-step guides and tutorials include HowTo JSON-LD
  • Each step includes position, name, text, and url (anchored to heading)
  • totalTime is present in ISO 8601 duration format
  • tool array lists prerequisites
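
The HowTo items can be sketched the same way. The example below assumes each step is described by a title, its text, and the id of the heading it is anchored to; howto_jsonld and iso_duration are hypothetical helpers, and the duration helper only covers whole minutes.

```python
def iso_duration(minutes: int) -> str:
    """Format whole minutes as an ISO 8601 duration such as PT1H30M."""
    hours, mins = divmod(minutes, 60)
    return "PT" + (f"{hours}H" if hours else "") + (f"{mins}M" if mins or not hours else "")


def howto_jsonld(name: str, page_url: str, minutes: int,
                 tools: list[str], steps: list[tuple[str, str, str]]) -> dict:
    """Build HowTo JSON-LD; each step is (title, text, heading_anchor)."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "totalTime": iso_duration(minutes),
        "tool": [{"@type": "HowToTool", "name": tool} for tool in tools],
        "step": [
            {
                "@type": "HowToStep",
                "position": position,
                "name": title,
                "text": text,
                "url": f"{page_url}#{anchor}",  # anchored to the step's heading
            }
            for position, (title, text, anchor) in enumerate(steps, start=1)
        ],
    }


howto = howto_jsonld(
    "Send a first notification", "https://example.com/docs/quickstart", 15,
    tools=["An API key"],
    steps=[
        ("Install the SDK", "Install the package with your package manager.", "install"),
        ("Send a request", "Call the send endpoint with your API key.", "send"),
    ],
)
```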

WebAPI Schema

  • API reference pages include WebAPI JSON-LD
  • documentation, endpointUrl, and provider are present
  • version reflects the current API version

SoftwareApplication Schema

  • Product and tool pages include SoftwareApplication JSON-LD
  • softwareVersion is kept current
  • downloadUrl and codeRepository present where applicable

Tier 4 — Discovery Endpoints

Machine-readable capability declarations at standard paths.

/.well-known/ai

  • /.well-known/ai exists and is valid JSON
  • capabilities array accurately reflects what you support
  • api.spec points to a valid OpenAPI JSON file
  • mcp.endpoint present if you run an MCP server
  • llms.index points to your llms.txt

See Well-Known Discovery Endpoints.
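
For orientation, here is a sketch of what a /.well-known/ai document with the fields named above might look like. This is an assumption-heavy illustration: the checklist names capabilities, api.spec, mcp.endpoint, and llms.index, and the code reads the dotted names as nested JSON keys. The capability strings, the /mcp path, and the well_known_ai helper are hypothetical.

```python
import json


def well_known_ai(base_url: str, capabilities: list[str], has_mcp: bool) -> dict:
    """Assemble a discovery document; key nesting is an assumed reading of the checklist."""
    doc = {
        "capabilities": capabilities,
        "api": {"spec": f"{base_url}/openapi.json"},
        "llms": {"index": f"{base_url}/llms.txt"},
    }
    if has_mcp:
        # Only declared when an MCP server actually runs.
        doc["mcp"] = {"endpoint": f"{base_url}/mcp"}
    return doc


discovery = well_known_ai("https://example.com", ["openapi", "llms-txt"], has_mcp=False)
discovery_json = json.dumps(discovery, indent=2)  # serve as application/json
```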

OpenAPI Specification

  • OpenAPI spec available at /openapi.json or /api/openapi.json
  • Spec validates against OpenAPI 3.1 schema
  • Every operation has a summary and description
  • Every parameter has a description
  • Every response has a description and schema
  • Spec is kept in sync with actual API behavior (tested in CI)
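
The description requirements above lend themselves to a CI lint. The sketch below walks an already-parsed OpenAPI document (a plain dict) and reports missing fields. It is a simplified illustration: lint_openapi is a hypothetical helper, $ref pointers are not resolved, and it does not replace validating against the OpenAPI 3.1 schema.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def lint_openapi(spec: dict) -> list[str]:
    """Report operations, parameters, and responses that lack descriptions or schemas."""
    problems = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            where = f"{method.upper()} {path}"
            for field in ("summary", "description"):
                if not op.get(field):
                    problems.append(f"{where}: missing {field}")
            for param in op.get("parameters", []):
                if not param.get("description"):
                    problems.append(
                        f"{where}: parameter {param.get('name')!r} missing description")
            for status, response in op.get("responses", {}).items():
                if not response.get("description"):
                    problems.append(f"{where}: response {status} missing description")
                for media_type, media in response.get("content", {}).items():
                    if "schema" not in media:
                        problems.append(
                            f"{where}: response {status} {media_type} missing schema")
    return problems


SPEC = {
    "openapi": "3.1.0",
    "paths": {
        "/notifications": {
            "get": {
                "summary": "List notifications",
                "parameters": [{"name": "limit", "in": "query"}],
                "responses": {"200": {"description": "OK",
                                      "content": {"application/json": {}}}},
            }
        }
    },
}

spec_problems = lint_openapi(SPEC)
```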

API Catalog

  • /.well-known/api-catalog exists if you expose public APIs
  • Catalog links to canonical OpenAPI specs, documentation, status pages, and auth docs
  • HTTP Link: </.well-known/api-catalog>; rel="api-catalog" header present where practical
  • Catalog omits internal-only APIs that agents should not discover
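
As a rough sketch of the catalog items above, the example below builds a catalog in the linkset JSON shape, using the service-desc, service-doc, and status link relations. Treat the exact structure as an assumption to check against the API Catalog specification; the api_catalog helper, the internal flag, and all URLs are hypothetical.

```python
def api_catalog(apis: list[dict]) -> dict:
    """Build a linkset-style catalog, leaving out APIs flagged as internal."""
    return {
        "linkset": [
            {
                "anchor": api["anchor"],
                "service-desc": [{"href": api["openapi"], "type": "application/json"}],
                "service-doc": [{"href": api["docs"], "type": "text/html"}],
                "status": [{"href": api["status"]}],
            }
            for api in apis
            if not api.get("internal", False)
        ]
    }


# Header value from the checklist, added to responses where practical.
LINK_HEADER = '</.well-known/api-catalog>; rel="api-catalog"'

catalog = api_catalog([
    {"anchor": "https://api.example.com/v1",
     "openapi": "https://example.com/openapi.json",
     "docs": "https://example.com/docs/api",
     "status": "https://status.example.com"},
    {"anchor": "https://internal.example.com/admin",
     "openapi": "https://internal.example.com/openapi.json",
     "docs": "https://internal.example.com/docs",
     "status": "https://internal.example.com/status",
     "internal": True},
])
```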

/.well-known/agent-card.json

  • agent-card.json published if your service can be delegated to by other agents
  • skills array describes what the agent can do with realistic examples
  • authentication specifies how to obtain credentials
  • capabilities.streaming accurately reflects whether you support streaming

MCP Server Card

  • /.well-known/mcp/server-card.json published if you run an HTTP MCP server
  • transport correctly specifies streamable-http, sse, or stdio
  • authentication section accurate and complete
  • Card describes the server before connection: name, purpose, tools, transport, and auth
  • /.well-known/mcp.json redirects or points to the server card if used for compatibility

Agent Skills Index

  • /.well-known/agent-skills/index.json exists when your site teaches task-specific agent workflows
  • Each skill is narrow, procedural, and linked as a separate SKILL.md
  • Linked skills follow the Agent Skills format: required name and description, optional scripts/, references/, and assets/
  • Skill descriptions explain when to use the skill, not only what topic it covers
  • Skills are generated from canonical docs or reviewed with the same rigor as public docs

OAuth Protected Resource Metadata

  • /.well-known/oauth-protected-resource exists for OAuth-protected APIs
  • Metadata points to the correct authorization server, JWKS, supported scopes, and DPoP requirement
  • Protected resource metadata is linked from API docs and MCP metadata where applicable
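
A protected resource metadata document covering the items above might be assembled as follows. The field names follow OAuth 2.0 Protected Resource Metadata as best understood here; the helper name, the issuer, the JWKS path, and the scopes are hypothetical and should be replaced with real values.

```python
def protected_resource_metadata(resource: str, issuer: str,
                                scopes: list[str], require_dpop: bool) -> dict:
    """Build the JSON body served at /.well-known/oauth-protected-resource."""
    return {
        "resource": resource,
        "authorization_servers": [issuer],
        # Hypothetical JWKS location; point this at wherever the keys are published.
        "jwks_uri": f"{resource}/.well-known/jwks.json",
        "scopes_supported": scopes,
        "bearer_methods_supported": ["header"],
        "dpop_bound_access_tokens_required": require_dpop,
    }


metadata = protected_resource_metadata(
    "https://api.example.com",
    "https://auth.example.com",
    scopes=["notifications:read", "notifications:write"],
    require_dpop=True,
)
```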

Web Bot Auth

  • /.well-known/http-message-signatures-directory exists if your service sends signed bot or agent requests
  • Published keys are rotated and documented
  • Receiving services can verify signed requests before granting high-trust access

Tier 5 — Developer Experience

Improvements for developers using AI coding assistants with your APIs and SDKs.

AGENTS.md

  • AGENTS.md exists at repository root
  • ## Commands section lists all essential commands (build, test, lint, dev server)
  • Non-standard tooling (unusual package manager, monorepo setup) documented
  • ## Agent Boundaries section with three tiers: autonomous / ask first / never
  • Testing rules stated explicitly: framework, location, required coverage
  • File is under 500 lines
  • No prose paragraphs — only lists, code blocks, tables

See AGENTS.md and Context Files.
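
Two of the AGENTS.md items, the line limit and the required sections, are easy to enforce in CI. The sketch below is a minimal check under the assumption that the section headings are spelled exactly as in this checklist; lint_agents_md and the sample file are hypothetical.

```python
REQUIRED_HEADINGS = ("## Commands", "## Agent Boundaries")
MAX_LINES = 500  # the file should stay under this many lines


def lint_agents_md(text: str) -> list[str]:
    """Return problems found in an AGENTS.md file (empty if none)."""
    lines = [line.rstrip() for line in text.splitlines()]
    problems = []
    if len(lines) >= MAX_LINES:
        problems.append(f"file has {len(lines)} lines, expected under {MAX_LINES}")
    for heading in REQUIRED_HEADINGS:
        if heading not in lines:
            problems.append(f"missing section: {heading}")
    return problems


SAMPLE_AGENTS_MD = """\
# AGENTS.md

## Commands

- build: pnpm build
- test: pnpm test
- lint: pnpm lint

## Agent Boundaries

- autonomous: edit files under src/ and tests/
- ask first: add or upgrade dependencies
- never: modify CI secrets or deployment config
"""

agents_problems = lint_agents_md(SAMPLE_AGENTS_MD)
```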

llms-full.txt

  • /llms-full.txt exists with all documentation content inlined
  • File is generated automatically from source content (not maintained by hand)
  • Content-Type: text/plain; charset=utf-8
  • Cache-Control TTL is appropriate for how frequently docs change
  • Token count documented (in comments or referenced from llms.txt)

Copy for AI Button

  • Documentation pages include a "Copy for AI" button that copies Markdown to clipboard
  • Button displays estimated token count after copy
  • Markdown copy excludes navigation and footer content

Token Count Disclosure

  • x-markdown-tokens header present on Markdown responses
  • Token counts visible to developers (in UI, in response headers, or in llms.txt)
  • Pages over 20K tokens are flagged for splitting
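
Token disclosure does not need an exact tokenizer to be useful. The sketch below estimates tokens at four characters each, the same ratio implied by the llms.txt size guidance in this checklist, and flags pages over the 20K threshold. The function names are hypothetical, and actual counts vary by model and tokenizer.

```python
CHARS_PER_TOKEN = 4        # rough heuristic, not a real tokenizer
SPLIT_THRESHOLD = 20_000   # pages above this estimate should be split


def estimate_tokens(markdown: str) -> int:
    """Estimate the token count of a Markdown document (rounded up)."""
    return -(-len(markdown) // CHARS_PER_TOKEN)  # ceiling division


def markdown_headers(markdown: str) -> dict[str, str]:
    """Response headers for a Markdown page, including the token estimate."""
    return {
        "Content-Type": "text/markdown; charset=utf-8",
        "x-markdown-tokens": str(estimate_tokens(markdown)),
    }


def needs_split(markdown: str) -> bool:
    return estimate_tokens(markdown) > SPLIT_THRESHOLD


page = "# Rate limits\n\nThe free plan allows 100 requests per minute.\n"
page_headers = markdown_headers(page)
```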

Analytics

  • AI bot traffic tracked separately from human traffic (by user-agent)
  • Analytics show which pages retrieval bots visit most frequently
  • Retry-After header compliance monitored (are bots respecting rate limits?)
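
Separating bot traffic by user-agent can start as simple substring matching on access logs. The sketch below reuses the bot names from the robots.txt section of this checklist; the log entries and full user-agent strings are made up for illustration, and user-agent strings can be spoofed, so treat the result as an estimate.

```python
from collections import Counter

TRAINING = ("GPTBot", "ClaudeBot", "Google-Extended",
            "Applebot-Extended", "cohere-ai", "CCBot")
RETRIEVAL = ("ChatGPT-User", "Claude-SearchBot", "PerplexityBot", "Googlebot")


def classify(user_agent: str) -> str:
    """Classify a request as 'retrieval', 'training', or 'other' by user-agent."""
    ua = user_agent.lower()
    if any(token.lower() in ua for token in RETRIEVAL):
        return "retrieval"
    if any(token.lower() in ua for token in TRAINING):
        return "training"
    return "other"


def top_retrieval_pages(log: list[tuple[str, str]], n: int = 3) -> list[tuple[str, int]]:
    """Most-visited paths among retrieval bots; log entries are (user_agent, path)."""
    counts = Counter(path for ua, path in log if classify(ua) == "retrieval")
    return counts.most_common(n)


# Hypothetical log entries.
LOG = [
    ("Mozilla/5.0 (compatible; ChatGPT-User/1.0)", "/docs/rate-limits"),
    ("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "/docs/rate-limits"),
    ("Mozilla/5.0 (compatible; GPTBot/1.2)", "/docs/quickstart"),
    ("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0", "/pricing"),
]

top_pages = top_retrieval_pages(LOG)
```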

Tier 6 — Advanced

High-investment items for teams deeply committed to AEO.

NLWeb /ask Endpoint

  • /ask endpoint returns direct answers to natural-language questions
  • Endpoint is accessible both over HTTP GET and as an MCP server
  • Response includes sources array with URLs and excerpts
  • Indexed against current documentation content (not stale snapshots)

See Well-Known Discovery Endpoints.

/mcp Endpoint

  • MCP server running and accessible to agents
  • All API operations exposed as MCP tools with accurate descriptions
  • Resource endpoints serve documentation and configuration data
  • Authentication works with the flow documented for your agent clients, usually OAuth 2.1 Client Credentials, token exchange, or host-managed delegated auth
  • Tool definitions tested with MCP inspector
  • Health check endpoint for uptime monitoring

See the MCP Servers section.

Content Signals

  • Content-Signal: search=yes, ai-input=yes, ai-train=no declared in robots.txt or response headers
  • Policy header overridden appropriately for gated content
  • robots.txt and Content Signals policy are consistent with each other

See robots.txt for AI Agents.
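
The consistency item can be checked by parsing the Content-Signal line and comparing it with the robots.txt decision on training crawlers. This is a minimal sketch: parse_content_signal and is_consistent are hypothetical helpers, and the parser assumes the comma-separated key=yes/no form shown in the checklist.

```python
def parse_content_signal(line: str) -> dict[str, bool]:
    """Parse 'Content-Signal: search=yes, ai-input=yes, ai-train=no' into booleans."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError("not a Content-Signal line")
    signals = {}
    for pair in value.split(","):
        key, _, setting = pair.strip().partition("=")
        signals[key.strip()] = setting.strip().lower() == "yes"
    return signals


def is_consistent(signals: dict[str, bool], training_bots_blocked: bool) -> bool:
    """ai-train=no should go with blocked training crawlers, and vice versa."""
    return signals.get("ai-train", False) != training_bots_blocked


signals = parse_content_signal("Content-Signal: search=yes, ai-input=yes, ai-train=no")
policy_ok = is_consistent(signals, training_bots_blocked=True)
```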

Instructions Section in llms.txt

  • ## Instructions section added to llms.txt
  • Instructions correct known misconceptions agents have from training data
  • Instructions flag rapidly-changing content (pricing, feature availability)
  • Instructions clarify naming conventions that conflict with industry defaults
  • Instructions point to authoritative sources for high-stakes questions

See llms.txt.

agents.json / Wildcard AI Flows

  • agents.json at root describes common workflows with step-by-step actions
  • Workflows cover the top 3–5 agent use cases for your product
  • Each flow includes success_output and failure_output descriptions
  • Flows tested with an agent to verify they execute correctly

See Well-Known Discovery Endpoints.

Agentic Commerce

  • x402 implemented for machine-payable resources if agents must pay for access
  • Universal Commerce Protocol or Agentic Commerce Protocol manifests exist only if agent-mediated commerce is a real product surface
  • Commerce manifests link to canonical product, price, refund, and support policies

Scoring Your Implementation

Use the tier count to assess where you stand:

Tiers completed    AEO score    Description
0                  0            No AEO implementation
Tier 1             1            Minimum viable: AI agents can find you
Tier 1–2           2            Content optimized for agent consumption
Tier 1–3           3            Structured data enables AI Overview inclusion
Tier 1–4           4            Fully machine-discoverable via standard protocols
Tier 1–5           5            Developer-grade AEO, IDE integrations supported
Tier 1–6           6            Reference implementation, advanced agent integration

Most production SaaS products targeting developer audiences should aim for Tier 4–5. Tier 6 items are appropriate for companies that want to be the primary source agents cite in their domain, or that build infrastructure other agents depend on.
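
The scoring rule can be written down in a few lines. The sketch below assumes, based on the cumulative "Tier 1–N" rows, that the score is the highest tier reached without skipping an earlier one; the aeo_score function is a hypothetical helper.

```python
def aeo_score(completed_tiers: set[int]) -> int:
    """Highest tier N such that tiers 1 through N are all complete (0 if Tier 1 is not)."""
    score = 0
    for tier in range(1, 7):
        if tier not in completed_tiers:
            break
        score = tier
    return score


score_in_order = aeo_score({1, 2, 3})
score_with_gap = aeo_score({1, 3, 4})  # Tier 2 is missing, so later tiers do not count
```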
