Agent Readiness: How to Prepare Your Site for the Agentic Web
AI agents have moved past reading pages. They book flights, order parts, run checkout, and sign into APIs on behalf of their users. Sites that expose the right signals get used. Sites that don't get skipped. The new Agent Protocol Readiness Checker scans your URL for the exact signals agents now look for.
- AI Agents
- MCP
- Agentic Commerce
- Technical SEO
Why Agent Readiness Is Its Own Problem Now
For years, the only visitor you had to design for was a human with a browser. Then Google’s crawler showed up, and SEO became a discipline. Then LLM crawlers arrived, and Answer Engine Optimization became the next thing you had to worry about.
Now a third kind of visitor is here. Agents built on OpenAI, Anthropic, and Google models browse sites with intent. They read product pages, fill forms, negotiate pricing, and call APIs. They also decide, in the first second or two, whether your site is worth the round trip.
Agents behave less like crawlers and more like impatient power users. They check /.well-known/ paths before they touch your HTML. They send Accept: text/markdown to skip your navigation. They look for an MCP Server Card so they can talk to your app as a tool, not a document. If those signals are missing, the agent either falls back to scraping your HTML (slow and lossy) or moves on to a competitor that speaks its language.
This is what we built the Agent Protocol Readiness Checker to measure. It is not an SEO audit. It is a protocol audit. It checks whether your site is ready to be used by an agent rather than read by a person.
What the Agent Protocol Readiness Checker Does
Give the tool a URL. It runs 20-plus probes against your origin and groups the findings into five scored categories:
- Discoverability. Can agents find your robots.txt, sitemap, and Link headers?
- Content Accessibility. Does your server honor `Accept: text/markdown` the way the acceptmarkdown.com spec describes?
- Bot Access Control. Have you written an explicit policy for GPTBot, ClaudeBot, Google-Extended, and PerplexityBot? Have you published a Web Bot Auth key directory?
- Protocol Discovery. Do you expose an MCP Server Card, an Agent Skills manifest, an RFC 9727 API Catalog, OAuth discovery metadata, and WebMCP tool annotations?
- Agentic Commerce. Can an agent pay you? The checker looks for x402, ACP, UCP, and MPP signals.
Each check returns pass, warn, or fail, with the server evidence behind the verdict. Failures roll up into a prioritized recommendation list, and categories combine into a weighted grade from A to F. The goal is not another vanity score. The goal is to hand you the six fixes that will most change how agents treat your site.
Let’s walk through what each category actually measures and why it matters.
1. Discoverability: The Basics That Still Decide Everything
Before an agent can use anything fancier, it needs to know what exists at your origin. That starts with three plain files.
robots.txt
The checker fetches /robots.txt and confirms it returned a valid response. RFC 9309 has been the formal standard since 2022, and an agent that cannot read your robots file assumes the worst: that you haven’t thought about machine access at all. If you only do one thing from this post, publish a valid robots.txt. Our Robots.txt Validator catches the common mistakes (stray BOMs, wildcard traps, unquoted Sitemap directives).
Sitemap declaration
A sitemap at /sitemap.xml is nice. A Sitemap: line inside robots.txt that points at it is better, because agents resolve the robots file first. The checker rewards sites that do both.
Link response headers
Modern agents also read the Link header on your homepage for hints about resources at /.well-known/*. Sites that serve, for example, `Link: </.well-known/mcp/server-card.json>; rel="mcp-server-card"` give an agent its next step without a second round trip. If you want to see what headers your origin sends today, run it through our HTTP Header Checker.
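If your origin sits behind an edge worker, adding that hint takes a few lines. A minimal sketch, assuming a Cloudflare Worker-style fetch handler and the rel value from the example above:

```ts
export default {
  async fetch(request: Request): Promise<Response> {
    const upstream = await fetch(request);
    // Re-wrap the response so its headers become mutable.
    const res = new Response(upstream.body, upstream);
    res.headers.append(
      "link",
      '</.well-known/mcp/server-card.json>; rel="mcp-server-card"'
    );
    return res;
  },
};
```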
Discoverability is a small category (15% of the final grade), but the failures here cascade. If robots is broken, the bot-access category can’t be verified. If the Link headers are missing, downstream categories have to guess.
2. Content Accessibility: The Markdown Negotiation Checks
This is the category most sites fail on. It also happens to be the easiest to fix.
The acceptmarkdown.com proposal is simple: when a client sends Accept: text/markdown, the server should return the same content as Markdown instead of HTML. Agents love this because HTML is noisy. Stripping chrome, navigation, and JavaScript out of a page eats tokens and introduces errors. A Markdown representation of your article is half the size and ten times easier to parse.
The checker runs four probes against your URL:
- `Accept: text/markdown` is honored. The server should return `Content-Type: text/markdown; charset=utf-8`. A UTF-8 declaration matters because agents feed the body into a tokenizer that assumes the encoding.
- `Vary: Accept` is set. Without this header, a CDN that cached the HTML response will serve that HTML to the next agent asking for Markdown. One missing header corrupts an entire origin for every AI client behind the same CDN.
- Unsupported Accept types return 406. If an agent sends `Accept: application/x-weird-type`, the right answer is `406 Not Acceptable`, not a silent HTML fallback. Returning 406 tells the agent's retry logic that it asked for the wrong thing.
- q-values are respected. An agent that sends `Accept: text/html;q=0.1, text/markdown;q=1.0` is saying "I will take HTML if I must, but I strongly prefer Markdown." The server should honor that weighting.
Most origins get zero out of four on this category. A CDN worker that transforms HTML to Markdown on demand (sketched below) fixes all four in an afternoon. The payoff compounds: every agent that hits your site from that point forward pulls a clean, token-efficient representation of your content. For a deeper look at how agents read AI-friendly content, see our AI Readiness Checker.
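Here is a minimal sketch of that worker, assuming a Cloudflare Worker-style fetch handler. The `htmlToMarkdown` helper is a crude placeholder you would swap for a real converter such as turndown; everything else maps directly onto the four probes:

```ts
// Crude placeholder converter: strips tags and collapses whitespace.
// Swap in a real HTML-to-Markdown library (e.g. turndown) in production.
function htmlToMarkdown(html: string): string {
  return html.replace(/<[^>]+>/g, " ").replace(/\s{3,}/g, "\n\n").trim();
}

const SUPPORTED = ["text/markdown", "text/html"];

// Parse "text/html;q=0.1, text/markdown" into { type, q } entries.
function parseAccept(header: string): Array<{ type: string; q: number }> {
  return header.split(",").map((part) => {
    const [type, ...params] = part.trim().split(";");
    const qParam = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
    const q = qParam ? Number(qParam.slice(2)) : 1;
    return { type: type.trim().toLowerCase(), q: Number.isFinite(q) ? q : 0 };
  });
}

// Probe 4: pick the highest-q type we can serve; undefined means none match.
function negotiate(header: string): string | undefined {
  const entries = parseAccept(header)
    .filter((e) => e.q > 0)
    .sort((a, b) => b.q - a.q);
  for (const e of entries) {
    if (SUPPORTED.includes(e.type)) return e.type;
    if (e.type === "*/*" || e.type === "text/*") return "text/html";
  }
  return undefined;
}

export default {
  async fetch(request: Request): Promise<Response> {
    const chosen = negotiate(request.headers.get("accept") ?? "*/*");

    // Probe 3: an explicit 406 beats a silent HTML fallback.
    if (!chosen) {
      return new Response("Not Acceptable", {
        status: 406,
        headers: { vary: "Accept" },
      });
    }

    const upstream = await fetch(request); // origin serves HTML
    if (chosen === "text/html") {
      const res = new Response(upstream.body, upstream);
      res.headers.set("vary", "Accept"); // probe 2: keep shared caches honest
      return res;
    }

    // Probe 1: convert on the fly and declare the encoding explicitly.
    return new Response(htmlToMarkdown(await upstream.text()), {
      headers: {
        "content-type": "text/markdown; charset=utf-8",
        vary: "Accept",
      },
    });
  },
};
```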
3. Bot Access Control: Saying Yes Clearly
The default robots.txt says nothing about AI agents. Silence gets interpreted two ways, depending on the agent. Some assume silence means “fine, keep going.” Others assume silence means “this site hasn’t opted in.” Both interpretations hurt you, because neither matches what you actually want.
The Agent Protocol Readiness Checker looks for three explicit signals.
AI bot user-agents in robots.txt
The checker searches your robots.txt for rules targeting the crawlers that matter today: GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, Claude-Web, anthropic-ai, Google-Extended, PerplexityBot, Perplexity-User, Meta-ExternalAgent, Applebot-Extended, Bytespider, CCBot, cohere-ai, DuckAssistBot, Amazonbot, MistralAI-User. Three or more named agents earns a pass. A shorter list earns a warn. Zero earns a fail, because at that point you have no AI policy at all.
Writing User-agent: GPTBot followed by Allow: / is not the same as saying nothing. It is a public commitment that a specific company’s agent may read your site under a specific rule. That commitment is load-bearing when an agent’s policy engine decides whether to fetch you.
If you want to probe how a specific AI bot currently behaves against one of your paths, our AI Bot Path Tester will simulate the request against the rules in your live robots.txt.
Cloudflare Content Signals
Cloudflare proposed Content Signals in late 2025: three directives (search, ai-input, ai-train) that sit inside robots.txt and declare separate policies for crawling, retrieval-for-response, and training. The checker scans your robots file for any Content-Signal: directive. One is enough to pass.
Content Signals matter because “block GPTBot” is a blunt instrument. It blocks training, retrieval, and grounded answers in one swing. Content Signals let you allow grounded answers (so your brand shows up in ChatGPT citations) while blocking training (so your content doesn’t get compressed into a model’s weights).
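A sketch of a robots.txt that combines explicit AI user-agent rules with a Content Signals line, served from one edge route. The `Content-Signal` directive syntax shown follows Cloudflare's proposal as we read it; verify it against the current draft before shipping:

```ts
// Four named AI agents (enough to pass the bot-access check), one
// Content-Signal policy, and a Sitemap line for discoverability.
const robotsTxt = `User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Allow: /

Sitemap: https://example.com/sitemap.xml
`;

export default {
  async fetch(request: Request): Promise<Response> {
    if (new URL(request.url).pathname === "/robots.txt") {
      return new Response(robotsTxt, {
        headers: { "content-type": "text/plain; charset=utf-8" },
      });
    }
    return fetch(request);
  },
};
```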
Web Bot Auth
Web Bot Auth is the newest piece of this category. Agents sign their requests with an Ed25519 keypair. The public key is discoverable at /.well-known/http-message-signature-directory as a JWKS. When an agent hits your server, you verify the signature against the published key and know for certain which agent sent the request.
The checker probes that directory and confirms it returns JSON. If you haven't published one, you cannot tell a legitimate agent from a scraper wearing its user-agent. The security case alone is compelling. The practical case is bigger: agents that authenticate with Web Bot Auth can be throttled less aggressively and granted access to more of your site. Published keys pay back immediately.
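Publishing the directory can be one static JSON route. A minimal sketch, using the JWK shape RFC 8037 defines for Ed25519 public keys; the `x` value is a placeholder, not a real key:

```ts
// JWKS with one Ed25519 public key (RFC 8037 "OKP" shape). Publish the
// real public half of your signing key in place of the placeholder.
const directory = {
  keys: [
    {
      kty: "OKP",
      crv: "Ed25519",
      x: "JrQLj5P_89iXES9-vFgrIy29clF9CC_oPPsw3c5D0bs", // placeholder
      kid: "agent-key-2025",
      use: "sig",
    },
  ],
};

export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    if (pathname === "/.well-known/http-message-signature-directory") {
      return Response.json(directory);
    }
    return fetch(request);
  },
};
```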
4. Protocol Discovery: The Well-Known Endpoints
This is the heart of the tool and, tied with Agentic Commerce, the category with the most weight (25% of the final score). It is also the part of the agent stack that changes monthly, so the checker leans on well-specified endpoints rather than vendor-specific tricks.
MCP Server Card
The Model Context Protocol is how Claude, ChatGPT, and a growing list of agents discover callable tools on a remote server. The MCP Server Card at /.well-known/mcp/server-card.json advertises your server’s name, capabilities, transport, and auth model. The checker probes that path and falls back to /.well-known/mcp.json if it is missing.
If your product has any kind of API, an MCP Server Card is the move that turns your site from a document into a tool. An agent that finds a Server Card stops scraping and starts invoking. That is a better experience for the user and a cheaper interaction for you.
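The MCP spec is still moving, so treat the card below as illustrative rather than canonical. A minimal sketch that also answers on the fallback path the checker probes:

```ts
// Illustrative Server Card; check field names against the current MCP spec.
const serverCard = {
  name: "Example Parts API",
  description: "Search the catalog and place orders for machine parts",
  version: "1.0.0",
  transport: "streamable-http",
  auth: { type: "oauth2" },
};

export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    if (
      pathname === "/.well-known/mcp/server-card.json" ||
      pathname === "/.well-known/mcp.json" // fallback path the checker probes
    ) {
      return Response.json(serverCard);
    }
    return fetch(request);
  },
};
```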
Agent Skills
Agent Skills is a newer manifest format that lives at /.well-known/agent-skills/index.json. It complements MCP by describing agent-usable workflows, not just tools: “create a shipment,” “file a refund,” “look up a booking.” The checker probes that path and looks for a valid response.
If your site already publishes an OpenAPI spec or an MCP Server Card, generating an Agent Skills manifest is mostly a translation exercise. The ROI is that Claude Code and similar clients will surface your skills to users by name.
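The manifest below is a hypothetical sketch: the field names are our assumptions, not the spec's, and the workflows mirror the examples above. Check the current Agent Skills format before publishing:

```ts
// Hypothetical manifest shape: field names are assumptions, not spec.
const skills = {
  skills: [
    {
      name: "create_shipment",
      description: "Create a shipment and return a tracking number",
      endpoint: "/api/shipments",
    },
    {
      name: "file_refund",
      description: "File a refund against an existing order",
      endpoint: "/api/refunds",
    },
  ],
};

export default {
  async fetch(request: Request): Promise<Response> {
    if (new URL(request.url).pathname === "/.well-known/agent-skills/index.json") {
      return Response.json(skills);
    }
    return fetch(request);
  },
};
```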
WebMCP
WebMCP is the browser-side cousin of MCP. Instead of advertising tools through a /.well-known/ URL, you annotate `<form>` elements directly in your HTML with `toolname` and `tooldescription` attributes, or declare tools via a `<meta name="webmcp" ...>` tag. The checker scans your homepage HTML for either pattern.
The benefit is that an agent using your page in a browser can discover and invoke those tools without leaving the tab. WebMCP is a small amount of markup for a large amount of agent fluency.
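A sketch of that markup, using the `toolname` and `tooldescription` attributes described above; WebMCP is early, so confirm the attribute names against the current proposal:

```ts
// Hypothetical page served as plain HTML; the annotated form is the part
// the checker scans for.
const page = `<!doctype html>
<html>
  <body>
    <form action="/track" method="get"
          toolname="track_shipment"
          tooldescription="Look up a shipment's status by tracking number">
      <input name="tracking_number" required>
      <button type="submit">Track</button>
    </form>
  </body>
</html>`;

export default {
  async fetch(): Promise<Response> {
    return new Response(page, {
      headers: { "content-type": "text/html; charset=utf-8" },
    });
  },
};
```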
API Catalog (RFC 9727)
RFC 9727 defines /.well-known/api-catalog as a pointer to all the APIs your origin exposes, served as application/linkset+json. The checker confirms the endpoint exists and that its content type is correct. Many origins get a warn here: they serve the path, but with application/json instead of application/linkset+json. Fixing the content type is one header on one route.
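A minimal sketch of the route, returning an RFC 9264 linkset. The `application/openapi+json` media type on the service description is illustrative; the content type on the response itself is the part the checker grades:

```ts
// RFC 9264 linkset pointing agents at the origin's API descriptions.
const catalog = {
  linkset: [
    {
      anchor: "https://example.com/api",
      "service-desc": [
        {
          href: "https://example.com/api/openapi.json",
          type: "application/openapi+json", // illustrative media type
        },
      ],
    },
  ],
};

export default {
  async fetch(request: Request): Promise<Response> {
    if (new URL(request.url).pathname === "/.well-known/api-catalog") {
      return new Response(JSON.stringify(catalog), {
        // The detail many origins miss: linkset+json, not plain json.
        headers: { "content-type": "application/linkset+json" },
      });
    }
    return fetch(request);
  },
};
```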
OAuth Discovery
Two specs matter here:
- RFC 8414 describes OAuth Authorization Server metadata at `/.well-known/oauth-authorization-server`. This tells an agent how to start an OAuth flow against your issuer.
- RFC 9728 describes OAuth Protected Resource metadata at `/.well-known/oauth-protected-resource`. This tells an agent, when it hits a 401 from your API, which issuer to authenticate against and which scopes to request.
An agent that cannot do OAuth discovery cannot automate a signed-in action on your site without human intervention. If your product has a user account, publish both.
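A minimal sketch of both documents with placeholder URLs. The field names come from RFC 8414 and RFC 9728, though a production deployment would carry more of each spec's required metadata:

```ts
// RFC 8414 document: how to start an OAuth flow against your issuer.
const authorizationServer = {
  issuer: "https://auth.example.com",
  authorization_endpoint: "https://auth.example.com/authorize",
  token_endpoint: "https://auth.example.com/token",
  response_types_supported: ["code"],
  scopes_supported: ["openid", "orders:read", "orders:write"],
};

// RFC 9728 document: which issuer and scopes protect your API.
const protectedResource = {
  resource: "https://api.example.com",
  authorization_servers: ["https://auth.example.com"],
  scopes_supported: ["orders:read", "orders:write"],
};

export default {
  async fetch(request: Request): Promise<Response> {
    switch (new URL(request.url).pathname) {
      case "/.well-known/oauth-authorization-server":
        return Response.json(authorizationServer);
      case "/.well-known/oauth-protected-resource":
        return Response.json(protectedResource);
      default:
        return fetch(request);
    }
  },
};
```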
5. Agentic Commerce: Can an Agent Pay You?
This is the newest category and the one that gets the most pushback from skeptics. The question underneath it is straightforward: when an agent wants to buy something from you on behalf of its user, what does that transaction look like?
The checker measures four competing answers.
x402
x402 revives HTTP status code 402 (“Payment Required”) and adds a PAYMENT-REQUIRED header with a machine-readable offer: price, currency, accepted payment rails, settlement endpoint. An agent that receives a 402 signs a payment, re-submits the request, and gets the resource. The checker looks for a 402 status or a PAYMENT-REQUIRED header on your homepage and any endpoint it probes.
x402 is the lowest-commitment option. You pick one paid endpoint, return a 402 with terms, and you are in. Stripe, Coinbase, and several crypto settlement providers support the flow today.
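A sketch of that flow for a single paid endpoint. The `PAYMENT-REQUIRED` payload fields and the `X-PAYMENT` retry header follow the description above, but the wire format is still settling, so treat every name here as an assumption to verify against the current x402 spec:

```ts
export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);

    // One paid endpoint; everything else passes through untouched.
    if (url.pathname === "/api/report" && !request.headers.has("x-payment")) {
      return new Response("Payment required", {
        status: 402,
        headers: {
          // Machine-readable offer: price, currency, rails, settlement.
          "payment-required": JSON.stringify({
            price: "0.10",
            currency: "USD",
            rails: ["stripe", "base-usdc"], // illustrative rail names
            settlement: "https://example.com/x402/settle",
          }),
        },
      });
    }
    return fetch(request);
  },
};
```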
ACP (Agentic Commerce Protocol)
ACP is OpenAI’s standard. It lives at /.well-known/agentic-commerce and describes a fuller checkout surface: product catalog, pricing, tax, shipping, returns. If you sell physical or digital goods and you want ChatGPT to transact with your store directly, ACP is the lane.
UCP (Universal Commerce Protocol)
UCP piggybacks on OAuth. You declare commerce scopes like ucp:scopes:checkout_session inside your OAuth Authorization Server metadata. The checker fetches your OAuth AS document and searches for any ucp:scopes:* value. One match earns a pass.
UCP is the lightest-weight commerce protocol of the four because it reuses the OAuth layer you already have. If you ship tokens for anything, you are halfway there.
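The signal itself is one array entry. A sketch of the same RFC 8414 document from the OAuth section, extended with a commerce scope; only the `ucp:scopes:checkout_session` value comes from the protocol, the rest is placeholder:

```ts
// Served at /.well-known/oauth-authorization-server, as in the OAuth section.
const authorizationServer = {
  issuer: "https://auth.example.com",
  authorization_endpoint: "https://auth.example.com/authorize",
  token_endpoint: "https://auth.example.com/token",
  scopes_supported: [
    "openid",
    "checkout:read",               // placeholder business scope
    "ucp:scopes:checkout_session", // the value the checker greps for
  ],
};
```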
MPP (Machine Payments Protocol)
MPP, advertised at /.well-known/machine-payments, is the most general. It is less about checkout flow and more about advertising what kinds of machine-to-machine payments your service accepts: stablecoins, account-to-account bank rails, per-token metering.
A passing score on commerce does not require all four. It requires at least one, because “agents can pay you” is a single capability with four standards competing to own it. Pick the one that fits your business and ship it.
What the Final Score Actually Means
The tool combines the five category scores into a weighted overall grade.
| Category | Weight |
|---|---|
| Discoverability | 15% |
| Content Accessibility | 20% |
| Bot Access Control | 15% |
| Protocol Discovery | 25% |
| Agentic Commerce | 25% |
Scores of 85 and above earn an A. Between 70 and 84 is a B. The lower grades fall off fast, and that is on purpose. A site that scores in the D range is not just imperfect for agents; it is functionally invisible to them. It does not advertise tools, does not serve Markdown, does not declare a bot policy, does not support agentic payments. For the fraction of traffic that is already agent-driven, that site reads as a parked domain.
Most sites we have audited score between 10 and 30 on their first run. That is fine. The tool is designed to meet you where you are and surface the six highest-leverage changes. Fixing three of them usually moves a site from F to C in under a day.
How to Run the Check
Go to the Agent Protocol Readiness Checker, paste a URL, and wait about ten seconds. The result page includes:
- Your overall score and letter grade.
- The five category scores with a colored status per check.
- The raw evidence (headers, status codes, matched substrings) behind each check, so you can verify the tool’s read against your own logs.
- A prioritized recommendation list of the top fixes.
You can run the check against staging domains, internal origins, and production. It respects outbound URL safety rules and caps body reads at 512 KB so a misconfigured server cannot burn your rate limit.
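That body cap is worth copying into any probing tool you build yourself. A minimal sketch of a capped read over the Fetch streams API, using a hypothetical `readCapped` helper of our own:

```ts
// Stream the response body and stop after the cap (default 512 KB) so a
// misbehaving origin cannot run away with your memory or rate limit.
async function readCapped(res: Response, cap = 512 * 1024): Promise<string> {
  if (!res.body) return "";
  const reader = res.body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  while (total < cap) {
    const { done, value } = await reader.read();
    if (done) break;
    // Trim the final chunk so we never buffer more than the cap.
    const room = cap - total;
    chunks.push(value.byteLength > room ? value.subarray(0, room) : value);
    total = Math.min(total + value.byteLength, cap);
  }
  await reader.cancel().catch(() => {}); // stop pulling from the socket
  const buf = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    buf.set(c, offset);
    offset += c.byteLength;
  }
  return new TextDecoder().decode(buf);
}
```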
A Practical Order to Fix Things
If you want a fast lift, do these in order:
- Publish a valid robots.txt with explicit `User-agent:` rules for GPTBot, ClaudeBot, Google-Extended, and PerplexityBot, plus a `Sitemap:` directive. Validate it with our Robots.txt Validator.
- Add Markdown content negotiation at your CDN edge. Check the `Accept` header, convert HTML to Markdown on the fly, set `Content-Type: text/markdown; charset=utf-8` and `Vary: Accept`. Return 406 for unsupported types.
- Publish an llms.txt at your root with pointers to the pages you most want agents to cite. Generate and validate it with our LLMs.txt Generator and Validator.
- Expose an MCP Server Card at `/.well-known/mcp/server-card.json`. Even a minimal card (name, description, version, transport) unlocks agent discovery.
- Add Content Signals to robots.txt. One line declaring `Content-Signal: ai-input=yes, ai-train=no` is enough to pass the check and to publish a real policy.
- Publish a Web Bot Auth JWKS so legitimate agents can sign requests against your origin.
That list is roughly two engineer-days of work for a team with a normal CDN and a normal auth server. It moves a site from F to B, and it future-proofs the origin against the next year of agent protocol churn.
What Agent Readiness Looks Like in a Year
The exact list of checks the tool runs today will not be the list it runs in twelve months. MCP will ratify a formal capability negotiation spec. Agent Skills will merge with, or replace, parts of OpenAPI. ACP, UCP, and MPP will consolidate into fewer, stronger standards. We will add probes as they land and retire probes as they become default.
What will not change is the shape of the problem. Agents decide in the first few requests whether your site is worth using. Your job is to expose, quickly and clearly, what your site can do and how to use it. Every signal the checker looks for is a shortcut that lets an agent commit to your origin instead of giving up on it.
Run the Agent Protocol Readiness Checker against your homepage. Fix the top three recommendations. Run it again. Watch how differently agents treat your site after those three changes land.
Related Reading
- The Complete Guide to Answer Engine Optimization (AEO) and GEO
- AI Readiness Checker for content-structure and LLM crawl scoring
- LLMs.txt Generator and Validator for publishing a clean map of your site to language models
- HTTP Header Checker for verifying `Vary`, `Link`, and content-type headers agents depend on