How to Audit a Page for AI Readiness
A page audit for AI readiness evaluates a single URL across seven dimensions to determine whether AI systems can access, understand, trust, and extract content from it. Unlike traditional site crawls, a page-level audit goes deep on one URL — checking 120+ signals and outputting a scored report with prioritized fixes.
Why Page-Level Audits Matter More Than Site Audits
AI systems cite pages, not domains. Your homepage might score 8.5 while your pricing page scores 3.2 — and the pricing page is the one people are searching for. Traditional site crawls find technical issues across thousands of pages but give you no prioritized action plan for any single URL.
82.5% of AI citations link to nested pages, not homepages (Onely 2025). 47% of Google AI Overview citations come from pages ranking below position #5 — content quality and structure, not ranking position, determines what gets cited.
Key Takeaway
The use case: You wrote a 3,000-word guide and it never appears in AI answers. A page audit shows you exactly why — whether it's a blocked crawler, JS-only rendering, missing author attribution, or no schema. Site audits can't do this.
The 7-Dimension Framework
Every AI Search Visibility page report evaluates these seven dimensions. Dimensions 1–6 are scored on a 0–10 scale and combined into an overall score. Dimension 7 (Risk) issues flags rather than scores — a single flag can disqualify an otherwise high-scoring page.
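If the overall score is a weighted average of the six scored dimensions, the combination looks like the sketch below. The weights come from the framework itself; the aggregation formula is an assumption, since the article lists the weights but not the math:

```python
# Dimension weights as listed in the framework (they sum to 100%).
# Dimension 7 (Risk) issues flags, not a score, so it is excluded here.
WEIGHTS = {
    "crawlability": 0.15,
    "snippet": 0.15,
    "intent": 0.20,
    "trust": 0.20,
    "schema": 0.10,
    "extractability": 0.20,
}

def overall_score(dimension_scores: dict) -> float:
    """Combine per-dimension 0-10 scores into a weighted overall 0-10 score."""
    return round(sum(dimension_scores[d] * w for d, w in WEIGHTS.items()), 1)

example = {"crawlability": 9, "snippet": 7, "intent": 6,
           "trust": 4, "schema": 2, "extractability": 6}
print(overall_score(example))  # 5.8
```

Note how a page can pass the crawlability gate with a 9 yet still land in the "Needs Fix" band because trust and schema drag the weighted total down.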
Crawlability & Access
Weight: 15%. Can AI crawlers reach, fetch, and parse your page? This is a binary gate — if AI crawlers are blocked, no other optimization matters.
What it checks
- robots.txt directives for GPTBot, OAI-SearchBot, Claude-SearchBot, PerplexityBot
- HTTP status code (200 vs 4xx/5xx)
- Canonical tag (self-referencing, not pointing away)
- JavaScript rendering — does content exist without JS execution?
- Mixed content (HTTPS page loading HTTP resources)
- noindex / nosnippet meta tags
69% of AI crawlers cannot execute JavaScript (SearchVIU 2025). If your content is JS-rendered, it simply doesn't exist for most AI systems.
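You can run the robots.txt check yourself with Python's standard library. A minimal sketch — the user-agent list comes from the checklist above; the sample file and helper name are illustrative:

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"]

def ai_crawler_access(robots_txt: str, path: str = "/") -> dict:
    """Map each AI crawler user agent to whether robots.txt lets it fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_BOTS}

# A robots.txt that blocks GPTBot but leaves everyone else alone:
sample = """User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
"""
print(ai_crawler_access(sample))
```

In practice you would fetch the live file from `https://yoursite.com/robots.txt` and pass its text in; the parsing and matching logic is the same.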
Snippet & CTR Signals
Weight: 15%. Does your page produce a clear, compelling snippet in AI results? AI systems read title and description first to assess topic fit.
What it checks
- Title tag: length, keyword presence, boilerplate detection
- Meta description: unique, not auto-generated
- H1: present, single, matches title intent
- Open Graph tags: og:title, og:description, og:image
- Breadcrumb presence (visible and/or schema)
- Date visibility in SERP (datePublished in schema)
Boilerplate titles ('Home | Company Name') are the #1 Google title rewrite trigger. Rewritten titles are less likely to be selected by AI systems as representative of the page.
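The boilerplate-title check can be approximated in a few lines of Python. The length bounds and the pattern list here are illustrative assumptions, not the audit's actual rules:

```python
import re

# Patterns like "Home | Acme" or "Welcome - Acme" (assumed rewrite triggers).
BOILERPLATE = re.compile(r"^(home|welcome|untitled)\s*[|\-–]", re.IGNORECASE)

def title_issues(title: str) -> list:
    """Flag common title-tag problems for the snippet dimension."""
    issues = []
    if not 30 <= len(title) <= 60:
        issues.append("length outside ~30-60 characters")
    if BOILERPLATE.match(title.strip()):
        issues.append("boilerplate pattern likely to trigger a rewrite")
    return issues

print(title_issues("Home | Acme Corp"))
```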
Intent & Content Value
Weight: 20%. Does your content match what users are actually searching for, and does it add unique value? Content-intent mismatch is a hard blocker.
What it checks
- Search intent match (informational / navigational / commercial / transactional)
- Content depth — thin content under 200 words flagged automatically
- Information gain: unique value beyond competitor pages
- First-hand experience signals (specific details, named examples, original data)
- Answer-first architecture: direct answer in first 60 words
- Filler content detection: data point density per 500 words
- AI writing pattern detection: formulaic structure, low burstiness
Google's OriginalContentScore (confirmed in API leak) evaluates uniqueness regardless of length. Princeton GEO paper: keyword stuffing has near-zero or negative effect on AI citation rates.
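The filler-content check above can be approximated with a simple heuristic: count data points per 500 words. A rough sketch, not the audit's actual implementation — what counts as a "data point" here is an assumption:

```python
import re

# Numbers, percentages, and years ("24%", "2024", "1.2") as a crude
# stand-in for data points; the audit's exact definition isn't public.
DATA_POINT = re.compile(r"\d[\d,.]*%?")

def data_point_density(text: str) -> float:
    """Data points per 500 words; low values suggest filler content."""
    words = text.split()
    if not words:
        return 0.0
    return len(DATA_POINT.findall(text)) / len(words) * 500
```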
Trust & E-E-A-T
Weight: 20%. Does your page demonstrate genuine expertise and transparency? 96% of AI Overview citations come from verified authoritative sources.
What it checks
- Author byline present and linking to bio with credentials
- Person schema in JSON-LD (name, jobTitle, affiliation, sameAs)
- About page existence and quality
- External source citations on factual claims (3+ per page)
- Publication date and last-updated date visible
- YMYL classification + appropriate disclaimers
- AI content disclosure (if applicable)
Pages with expert author attribution are cited at 2.4x the rate of anonymous pages (PresenceAI). 70.4% of sources cited by ChatGPT include Person schema in JSON-LD (EverTune).
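A minimal Person block of the kind this check looks for, as JSON-LD in the page's head. All names and URLs below are placeholders:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Head of Email Marketing",
  "affiliation": { "@type": "Organization", "name": "Example Co" },
  "sameAs": [
    "https://www.linkedin.com/in/jane-doe-example",
    "https://github.com/jane-doe-example"
  ]
}
</script>
```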
Schema Markup
Weight: 10%. Is your content machine-readable via JSON-LD schema? Only 12.4% of websites implement structured data — a major competitive advantage for those who do.
What it checks
- JSON-LD presence in <script type="application/ld+json">
- @context validity (missing = BLOCKER — silently ignored)
- Schema type appropriateness for page type
- Required properties for each schema type
- Content-schema match (schema must match visible content)
- datePublished / dateModified in Article schema
FAQPage schema: 3.2x more likely to appear in Google AI Overviews (Frase). GPT-4 accuracy improves from 16% to 54% when content uses structured data.
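The @context blocker from the checklist above is straightforward to test for: extract every ld+json block and parse it. A sketch using only the Python standard library (the issue strings are illustrative):

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect the raw contents of <script type="application/ld+json"> tags."""
    def __init__(self):
        super().__init__()
        self.blocks = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._capturing = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self.blocks.append(data)

def check_jsonld(html: str) -> list:
    """Return schema issues; a missing @context means the block is ignored."""
    extractor = JSONLDExtractor()
    extractor.feed(html)
    if not extractor.blocks:
        return ["no JSON-LD found"]
    issues = []
    for raw in extractor.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            issues.append("invalid JSON in ld+json block")
            continue
        if not (isinstance(data, dict) and "@context" in data):
            issues.append("BLOCKER: @context missing")
    return issues
```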
AI Extractability / Citeability
Weight: 20%. Can AI systems pull clean, standalone quotes from your content? 44.2% of all LLM citations come from the first 30% of text.
What it checks
- Answer-first architecture: direct answer in first 60 words
- Self-contained 'answer capsules' after each H2/H3 (40–60 words)
- External citation count and quality (3+ credible sources per page)
- Content structure: tables, ordered lists, numbered steps
- Marketing language density (superlatives, vague qualifiers)
- Entity density (target: 15–20 named entities per 1,000 words)
- llms.txt implementation
Pages with external citations: 34.9% AI selection rate vs. 3.2% without (PresenceAI). Comparison tables increase citation rates 2.5x vs. unstructured text.
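The entity-density target can be approximated with a regex heuristic. A real audit would use a proper NER model, so treat this as a rough sketch:

```python
import re

# Multi-word capitalized runs ("Example Corp") and acronyms ("FTC") as a
# stand-in for named entities; this over- and under-counts versus real NER.
ENTITY = re.compile(r"\b(?:[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+|[A-Z]{2,})\b")

def entity_density(text: str) -> float:
    """Approximate named entities per 1,000 words (target: 15-20)."""
    words = text.split()
    if not words:
        return 0.0
    return len(ENTITY.findall(text)) / len(words) * 1000
```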
Risk Analysis (Red Team)
Output: flags, not a score. Does your page have signals that make AI systems avoid citing it? Risk flags are disqualifiers — a 9.2/10 page can still be penalized.
What it checks
- Google 2024 spam policy violations (scaled content, site reputation abuse)
- FTC violations: fake reviews, undisclosed affiliate content
- EU AI Act disclosure requirements
- Cookie consent / GDPR compliance
- Hidden content signals (invisible to users, readable by crawlers)
- Malware / cryptomining script signatures
- Deceptive dark patterns (manipulative urgency, subscription traps)
Scaled AI content was the #1 reason for domain-level manual actions in 2025. FTC fake review violations: up to $50,000 per violation.
Severity Classification
Every issue found in an audit is classified at one of four severity levels. The classification determines the order of fixes — Blockers must be resolved first because they prevent all other optimizations from mattering.
| Severity | Definition | Expected Action |
|---|---|---|
| Blocker | Prevents AI from accessing or citing the page entirely | Fix immediately — other optimizations are moot until resolved |
| High | Significantly reduces citation probability | Fix within 1–2 weeks |
| Medium | Reduces citation quality or frequency | Fix within 1–2 months |
| Low | Minor improvement opportunity | Fix when convenient |
Effort Estimation
Every fix also receives an effort estimate so you can prioritize quick wins over high-effort changes when both have similar impact.
| Label | Time | Examples |
|---|---|---|
| XS | < 30 minutes | Adding FAQPage schema, fixing robots.txt, adding author byline, updating dateModified |
| S | 30 min – 2 hours | Rewriting opening paragraphs, adding external citations, fixing meta descriptions, adding Person schema |
| M | 2–8 hours | Adding comparison tables, improving content depth, implementing full schema suite, creating author bio page |
| L | > 8 hours | Full content rewrite, building topic cluster, establishing original research, YMYL compliance overhaul |
Score Interpretation
Pass (8.0–10)
Strong AI visibility. Page is likely being cited or is close to it. Focus on maintaining freshness and expanding content depth.
Needs Fix (5.0–7.9)
Significant gaps. AI may occasionally cite this page but inconsistently. Address High-severity issues first.
Critical (below 5.0)
Multiple blockers. Page is unlikely to appear in any AI citations. Fix Blockers immediately before any other work.
Sample Audit Walkthrough
Here's what a typical audit result looks like for a mid-performing content page. This example reflects real patterns we see across pages in the 5–7 score range.
Sample page: example.com/blog/email-marketing-guide, scored 6.1/10 (Needs Fix)
Top 3 fixes (priority order):
1. Add author byline + Person schema. No author attribution on a commercial-intent content page. Add a named author, link to a bio, and add Person schema with jobTitle and sameAs.
2. Add external citations (3+). Zero cited sources. Add 3–5 external links to credible studies or primary sources. This alone moves selection rate from 3.2% to 34.9%.
3. Rewrite opening 60 words to answer-first. Content opens with background context. Restructure to place the direct answer in the first 60 words, then expand.
Run Your Own Audit
Two options: automated (60 seconds, full 120+ signal report) or a manual surface-level check (about 10 minutes) working through the per-dimension checklists above.
Frequently Asked Questions
What's the difference between a site audit and a page audit?
A site audit crawls hundreds or thousands of pages for technical issues but gives you no prioritized action plan for any single page. A page audit goes deep on one URL — checking 120+ signals across all 7 dimensions — and outputs a scored report with specific, prioritized fixes. AI systems cite pages, not domains, so page-level analysis is more actionable for AI visibility.
Which pages should I audit first?
Start with your highest-value pages: the pages you most want to appear in AI answers for your target queries. This typically means your pricing page, key product or service pages, and your most-trafficked content articles. 82.5% of AI citations link to nested pages, not homepages — so your homepage is rarely the highest priority.
How often should I re-audit?
Re-audit after every significant content change, after major algorithm updates, and on a quarterly schedule for your highest-priority pages. Time-sensitive content (news, pricing, product specs) should be re-audited monthly. For evergreen content, quarterly is sufficient if no major changes occurred.
Do I need to fix every issue?
No — focus on Blocker and High severity issues first. Blockers prevent AI from accessing or citing the page entirely; fixing them has immediate impact. High issues significantly reduce citation probability. Medium and Low issues improve quality incrementally. A page with all Blockers and High issues resolved is likely to perform better than a page with everything fixed except one Blocker.
What's the most common blocker?
AI crawlers blocked in robots.txt — found in 31% of audited pages. The second most common is JavaScript-only rendering: 69% of AI crawlers cannot execute JavaScript, so content inside React/Vue/Angular components that isn't server-rendered is invisible. The nosnippet meta tag is less common but completely prevents AI from extracting any text from the page.
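If robots.txt is the blocker, the fix is usually a few lines. A sample fragment — adapt the bot list and paths to your site:

```
# Explicitly allow the major AI crawlers
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: Claude-SearchBot
User-agent: PerplexityBot
Allow: /
```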
Can I audit competitor pages?
Yes. Auditing competitor pages is one of the highest-value uses of the tool. Understanding why a competitor's page gets cited over yours — and which specific signals they have that you don't — gives you a concrete implementation checklist. Focus on their schema implementation, content structure, and author attribution.
Does fixing AI visibility issues also help traditional SEO?
Significantly. Many AI visibility signals overlap with Google's core ranking factors: content depth, E-E-A-T, schema markup, page speed, and canonicalization. Fixing AI blockers typically improves traditional rankings simultaneously. The main divergence is that AI citation prioritizes extractability and author attribution more than traditional PageRank signals.
What score do I need to get cited?
Pages scoring 8.0+ on the 0–10 scale have a significantly higher probability of appearing in AI citations. The 8.0 threshold maps to: AI crawlers allowed, answer-first structure in place, 3+ external citations, author attribution with Person schema, and at least FAQPage or Article schema implemented. Scores below 5.0 indicate multiple blockers that need immediate attention.