Did Google's May 2026 guide end GEO and AEO as separate disciplines?

Google's position is that, from Google Search's perspective, optimizing for generative AI is still SEO. The position is internally consistent for Google's own ecosystem (Search, AI Overviews, Discover). It does not extend to ChatGPT, Claude or Perplexity, which run on independent retrieval stacks with different source preferences. GEO and AEO remain meaningful disciplines precisely where Google's perimeter ends.

Should I create an llms.txt file?

Only if your audience includes agents. Four independent studies (Limy, OtterlyAI, ALLMO, SE Ranking) converge on the same finding: llms.txt has essentially no measurable impact on AI Search citations today. The genuine use case is the agentic web: products consumed by Cursor, Claude Code, Copilot or MCP-aware tooling. If you publish a developer-facing API or docs site, ship it. If your audience is end users searching on ChatGPT, your time is better spent on content actions #6 to #10.

How long before this checklist produces measurable results?

Technical actions (#1 to #5) ship in one to two weeks and produce no immediate visibility lift, just the conditions for one. Content actions (#6 to #10) generally start showing up in citations within four to eight weeks. Authority signals (#11 to #15) compound over three to six months. Multi-LLM actions (#16 to #20) depend on the engine: Perplexity updates fastest, Gemini is slower. The realistic horizon to see a Share of Voice trend reverse is one quarter.

Which AI assistant should I prioritize in 2026?

Similarweb's March 2026 data put ChatGPT at 56.7% of AI assistant traffic, down from 86.7% in January 2025. Gemini is around 18%, the remainder distributed across Claude, Perplexity, Copilot and Grok. Concentration is decreasing, not increasing. For most B2B audiences, ChatGPT remains the largest single channel, but pure ChatGPT optimization leaves the other 43% of traffic uncaptured. Track all four major engines.

AEO Checklist 2026: 25 Actions to Get Cited

Can I measure AI visibility without a dedicated tool?

For the first 20 to 30 strategic prompts, yes. Run each prompt weekly on ChatGPT, Claude, Gemini and Perplexity. Log whether you are cited, where in the answer, with what surrounding context. Track referral traffic from chatgpt.com, perplexity.ai, claude.ai and gemini.google.com in your analytics. Past 30 prompts and four engines, the manual cost exceeds the cost of a dedicated tracker. The methodology is what matters: keep it stable across weeks so the data is comparable.

TL;DR: 25 concrete actions to improve how often ChatGPT, Claude, Gemini and Perplexity cite your brand. Every statistic is sourced. Action #2 (llms.txt) has been rewritten to reflect what four independent studies, and Google itself, now say about it.

On 15 May 2026, Google published its AI Optimization Guide. The position is blunt: for Google Search, optimizing for generative AI is still SEO, and tactics like llms.txt, AI-specific Schema.org, content chunking and inauthentic mentions are dismissed. Google has a commercial reason to draw that line. Its ecosystem (Search, AI Overviews, Ads) depends on remaining the single entry point. Tools that measure visibility on ChatGPT, Claude or Perplexity sit outside that perimeter.

The empirical picture is more nuanced than either reading suggests. Limy (515 million bot events), OtterlyAI (90 days of data), ALLMO (94 614 cited URLs) and SE Ranking (300 000 domains) all converge: llms.txt has essentially no impact on AI citations today. Google is right on that specific point. But the Princeton GEO paper (KDD 2024) shows that other techniques, ones Google never names, produce measurable lifts. Quotation Addition is the single top-performing technique tested: +41% on Position-Adjusted Word Count, +28% on Subjective Impression. Statistics Addition and Cite Sources follow at +30 to +40% on the same metric. Cite Sources alone drove a +115% visibility increase for websites that started in fifth position on the SERP, what the paper calls a democratisation effect.

The checklist below is built from that empirical layer, not from vendor positioning.

The 2026 baseline

58.5% of US Google searches end without a click (SparkToro & Datos, 2024)

-25% search engine volume by 2026 (Gartner, Feb 2024 scenario)

56.7% ChatGPT share of AI assistant traffic in March 2026, down from 86.7% in Jan 2025 (Similarweb)

How to use this list

Work through the five categories in order. Technical foundations come first because they gate everything downstream. Then sharpen content structure, build authority signals, extend reach beyond Google, and finally measure. The middle of the list points back to one specific tool because, beyond a certain volume, manual tracking stops scaling. The rest is method, not product.

1. Technical foundations

The crawl, render and discovery layer that conditions everything else

#1 Ship standard Schema.org markup (Article, Organization, FAQ, HowTo)

Why it matters

The honest version: Ahrefs analysed 1 885 pages in May 2026 and found no measurable lift on AI Overviews citations from Schema.org, and a slight negative correlation in some segments. Google's May 2026 guide says the same. The reason to ship structured data is unchanged though: rich results in classic SERPs, better parsing by Bing and DuckDuckGo, cleaner ingestion by downstream knowledge graphs. Treat it as table stakes, not as a GEO lever.

Implementation tip

Start with Organization on the homepage and Article on every editorial page. Add FAQPage where you already have a real FAQ block. Validate with Google's Rich Results Test. Skip the AI-specific extensions Google explicitly tells you to ignore.
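As an illustration of the two baseline types named above, here is a minimal sketch that builds Organization and Article JSON-LD and wraps it in a script tag. The function names and the example values are hypothetical; the property names are standard Schema.org vocabulary, and the output should still go through the Rich Results Test.

```python
import json


def organization_jsonld(name: str, url: str, logo_url: str) -> dict:
    """Organization markup, intended for the homepage."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,
    }


def article_jsonld(headline: str, author_name: str, date_published: str,
                   date_modified: str, page_url: str) -> dict:
    """Article markup, intended for each editorial page (dates in ISO 8601)."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
        "dateModified": date_modified,
        "mainEntityOfPage": page_url,
    }


def as_script_tag(data: dict) -> str:
    """Serialise to JSON-LD; '</' is escaped so the payload cannot close the tag early."""
    payload = json.dumps(data, indent=2, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(as_script_tag(organization_jsonld(
        "Example Co", "https://example.com", "https://example.com/logo.png")))
```

Generating the markup from the same fields your CMS already stores keeps dateModified honest, which matters again in action #15.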

#2 Publish llms.txt only if your audience is agents, not humans

Why it matters

Four independent measurements published between late 2025 and early 2026 reach the same conclusion. Limy logged 408 hits on /llms.txt out of 515 million bot events (statistically zero). OtterlyAI saw 84 hits across 62 100 bot visits (0.1%). ALLMO found llms.txt in 0.00106% of cited URLs across 11 867 AI responses. SE Ranking ran an XGBoost model on 300 000 domains and found no effect on citation frequency. The actual use case is agentic web: products consumed by Cursor, Claude Code, Copilot or MCP-aware tooling. Stripe, Vercel and Anthropic publish llms.txt for exactly that reason.

Implementation tip

If you publish a developer-facing API or docs site, ship llms.txt and llms-full.txt with concise endpoint references. If your audience is end users searching on ChatGPT or Perplexity, your time is better spent on actions #6 to #10. KIME's nuanced read is worth the eight minutes.
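For a docs site that does qualify, a minimal sketch of generating the file could look like the following. It assumes the layout described in the llmstxt.org proposal (an H1 title, a blockquote summary, then H2 sections of annotated links); the function name, section names and URLs are hypothetical, so check the current proposal before shipping.

```python
def render_llms_txt(title: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt body from (link name, URL, one-line note) triples."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Single trailing newline, no trailing blank lines.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical API docs, for illustration only.
    print(render_llms_txt(
        "Example API",
        "REST API for managing invoices.",
        {"Docs": [
            ("Authentication", "https://example.com/docs/auth.md", "API keys and scopes"),
            ("Invoices", "https://example.com/docs/invoices.md", "Create, list and void invoices"),
        ]},
    ))
```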

#3 Hold LCP under 2.5 seconds on your editorial pages

Why it matters

Search Engine Land's 107 000-page analysis reframes Core Web Vitals as a gatekeeper rather than a lever. Poor performance correlates negatively with AI Overviews visibility (correlation between -0.12 and -0.18), but pushing an already-passable score higher does not produce a measurable boost. The mechanism still matters: the RAG systems behind generative engines run with tight per-source timeouts, so a slow page is silently dropped before the model ever sees it. Aim to clear the bar, not to win it.

Implementation tip

Run PageSpeed Insights on your ten most-cited pages first. WebP or AVIF for hero images, deferred third-party scripts, edge caching, and a critical CSS subset usually move the needle more than micro-optimizations.
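If you would rather script that first pass than click through the UI, the sketch below queries the PageSpeed Insights v5 API and reads the lab LCP value. It assumes the response exposes the metric at lighthouseResult.audits["largest-contentful-paint"].numericValue in milliseconds; verify the field path against the current API reference. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
LCP_BUDGET_SECONDS = 2.5  # the bar set in action #3


def psi_url(page_url: str, strategy: str = "mobile") -> str:
    """Build the request URL for one page."""
    return f"{PSI_ENDPOINT}?{urlencode({'url': page_url, 'strategy': strategy})}"


def fetch_report(page_url: str) -> dict:
    """Network call; kept separate so the parsing below can be tested offline."""
    with urlopen(psi_url(page_url), timeout=60) as response:
        return json.load(response)


def lab_lcp_seconds(report: dict) -> float:
    """Assumed field path; numericValue is taken to be in milliseconds."""
    audit = report["lighthouseResult"]["audits"]["largest-contentful-paint"]
    return audit["numericValue"] / 1000.0


def passes_budget(report: dict) -> bool:
    """True when lab LCP is at or under the 2.5 s bar."""
    return lab_lcp_seconds(report) <= LCP_BUDGET_SECONDS
```

Lab values from a single run are noisy, so treat a fail as a prompt to investigate rather than as a verdict.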

#4 Build mobile-first, then verify on real devices

Why it matters

Google's mobile-first index is now the only index. The ChatGPT mobile app and the Perplexity mobile app both render full pages before extracting citations; layout shift and unreadable typography reduce the chance that your page survives the citation filter.

Implementation tip

Check on three real devices, not just the responsive emulator. Tap targets ≥ 44px, no horizontal scroll, viewport meta in place, font ≥ 16px in body copy.

#5 Decide your robots.txt stance on AI crawlers, then write it down

Why it matters

GPTBot, ClaudeBot, PerplexityBot, GoogleOther, Applebot-Extended, Bytespider and the rest behave differently. Some respect Allow / Disallow strictly, others ignore it. Blocking GPTBot does not block ChatGPT's real-time browsing (that uses a different agent). The question is editorial: do you want your content to feed training corpora, real-time retrieval, both, or neither?

Implementation tip

Audit your current robots.txt. For most B2B publishers, allowing GPTBot, ClaudeBot, PerplexityBot and GoogleOther maximises citation surface. If you want to exclude training while keeping retrieval, document the policy publicly and use the user-agent strings each provider publishes.
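One way to check that a written policy does what you intend is to parse it with Python's standard urllib.robotparser and ask it about each agent. The sketch below blocks a training crawler while leaving user-initiated retrieval open. The sample policy is hypothetical, and the agent tokens (GPTBot for training, ChatGPT-User for real-time browsing) should be confirmed against what each provider currently publishes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: no training crawl, retrieval allowed, everyone else allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Report, per user agent, whether the policy permits fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    result = allowed_agents(
        ROBOTS_TXT,
        ["GPTBot", "ChatGPT-User", "PerplexityBot"],
        "https://example.com/blog/post",
    )
    for agent, allowed in result.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

This only tells you what a compliant crawler would do. As noted above, some agents ignore the file, so server logs remain the ground truth.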

2. Content structure

What the Princeton GEO paper actually validates, with the numbers

#6 Lead with the answer, then defend it

Why it matters

Generative engines extract the most extractable passage. A page that gives its core answer in the first 80-120 words is parseable. A page that builds suspense for 600 words before delivering the answer rarely makes it past the chunking step.

Implementation tip

For each page, write the answer in two sentences before you draft the rest. Put those two sentences right after the H1. Use the body to nuance, defend, and back the claim.

#7 Write H2/H3 as actual questions or sharp affirmations

Why it matters

Generative engines treat headings as semantic anchors. A heading like “Pricing” is ambiguous. “How much does an audit cost for a 50-page site” matches an actual prompt pattern. The Princeton paper notes meaningful gains on retrieval rank when section boundaries map to user-phrased questions.

Implementation tip

Pull your top 20 questions from sales calls, support tickets or AlsoAsked. Reuse the exact phrasing as H2s. Avoid keyword stuffing in headings, which Google's May 2026 guide explicitly calls out.

#8 Embed real FAQ sections built from real questions

Why it matters

Q/A structure mirrors the prompt format that generative engines optimise around. Made-up FAQ blocks built for SEO obfuscation no longer work and trigger Google's scaled-content classifier (May 2026 update). Real FAQs built from genuine recurring questions still help.

Implementation tip

Five to eight questions per page, sourced from sales transcripts or support data. Pair with FAQPage Schema. Update quarterly: question phrasing drifts faster than people think.
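Pairing the block with FAQPage Schema can be sketched as follows, assuming the questions and answers are already collected as plain-text pairs. The function name and sample pair are hypothetical; the structure (FAQPage, Question, acceptedAnswer, Answer) is standard Schema.org vocabulary.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build FAQPage markup from (question, answer) pairs that appear on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question.strip(),
                "acceptedAnswer": {"@type": "Answer", "text": answer.strip()},
            }
            for question, answer in pairs
        ],
    }


if __name__ == "__main__":
    # Hypothetical pair, for illustration only.
    markup = faq_jsonld([
        ("How much does an audit cost for a 50-page site?",
         "Pricing depends on scope; see the pricing page for current rates."),
    ])
    print(json.dumps(markup, indent=2, ensure_ascii=False))
```

The markup should mirror the visible FAQ text exactly, so generating both from the same source list avoids drift when the questions are updated each quarter.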

#9 Quote named sources, not anonymous ones

Why it matters

The Princeton GEO paper (KDD 2024, 10 000 queries across 25 domains) measured a +41% lift in Position-Adjusted Word Count when pages added quotations attributed to a named source. Generative engines weight provenance heavily when reconciling competing answers.

Implementation tip

One quote from a named expert, analyst or executive per major section. Include the role and the publication or date. Avoid “industry experts say” phrasing, which carries zero retrieval weight.

#10 Cite specific statistics with named sources

Why it matters

Princeton measured a +30 to +40% lift on Position-Adjusted Word Count for pages enriched with named statistics. The paper groups Cite Sources, Quotation Addition and Statistics Addition as the three top-performing methods on this metric. The mechanism is similar to action #9: provenance reduces model uncertainty, so the passage gets selected. Anonymous statistics produce no measurable lift.

Implementation tip

Every figure linked. Every figure dated. Never recycle a statistic without checking the original. This article is built that way on purpose, and it is reviewable by anyone.

See where you actually stand today

Before working through the remaining 15 actions, get a snapshot of how often your brand is cited on ChatGPT, Claude, Gemini and Perplexity. A free audit takes a few minutes and gives you a baseline to measure against.


3. Authority & E-E-A-T

Provenance signals that survive both Google's ranking and LLM citation logic

#11 Ship author pages with verifiable credentials

Why it matters

Generative engines reconcile competing answers in part by author signal: identifiable humans with LinkedIn profiles, publication histories and credentials carry more retrieval weight than anonymous bylines. E-E-A-T remains the most useful frame Google has published, and Princeton's notion of source trustworthiness aligns with it.

Implementation tip

One page per contributor at /author/{slug}: real photo, 150-200 word biography, credentials, list of published pieces, link to LinkedIn. Cross-link from every article.

#12 Show first-hand experience, not theoretical write-ups

Why it matters

Case studies with measurable outcomes, named clients (where contractually allowed), proprietary datasets and screenshots from actual dashboards are weighted higher than generic explainers. This is the first “E” in E-E-A-T and the dimension hardest to fake.

Implementation tip

One case study per quarter per service line. Include the brief, the work, and the measurable outcome. If the data is anonymised, say so and explain why.

#13 Build topical authority clusters around three to five themes

Why it matters

An isolated article carries less weight than a coherent set of ten interconnected pieces covering one theme in depth. Generative engines look for breadth and internal consistency before privileging a source on a topic.

Implementation tip

Pick three to five themes that match your actual expertise. For each, write one pillar page (2 500+ words) plus eight to ten satellites (1 200-1 800 words). Systematic internal linking. Refresh the pillar every six months.

#14 Earn citations in publications the models already trust

Why it matters

Princeton measured a +30% lift on Position-Adjusted Word Count from citing external sources. More specifically, the Cite Sources method drove +115% visibility for websites that started in fifth position on the SERP, a result the paper frames as a democratisation effect. The reverse is also true: being cited by trade publications, sector reports or peer-reviewed work boosts retrieval probability.

Implementation tip

Target three to five publications per quarter for guest contributions, expert interviews or co-published research. Avoid mass guest-post platforms. Wikipedia is worth pursuing if you genuinely meet notability criteria.

#15 Refresh strategic pages on a 30-day cadence

Why it matters

SE Ranking data, surfaced by ZipTie, reports that around 76% of pages most frequently cited by ChatGPT had been substantively updated in the previous 30 days. Recency acts as a confidence proxy for the model when topics evolve quickly.

Implementation tip

Maintain a refresh calendar for your 20 most strategic pages. A refresh must be substantive (new section, updated statistics, removed obsolete claims), not a date change. Make the change visible by updating the on-page modified date.
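The calendar itself can be as simple as a mapping from page to last substantive update. A minimal sketch, assuming you record those dates yourself (the paths and dates are hypothetical):

```python
from datetime import date, timedelta

REFRESH_WINDOW = timedelta(days=30)  # the cadence from action #15


def pages_due(last_updated: dict[str, date], today: date) -> list[str]:
    """Return pages whose last substantive update is older than the window, stalest first."""
    overdue = [
        (today - updated, url)
        for url, updated in last_updated.items()
        if today - updated > REFRESH_WINDOW
    ]
    return [url for _, url in sorted(overdue, reverse=True)]


if __name__ == "__main__":
    # Hypothetical calendar, for illustration only.
    calendar = {
        "/guide/aeo-checklist": date(2026, 5, 20),
        "/guide/llms-txt": date(2026, 4, 1),
        "/guide/schema": date(2026, 3, 1),
    }
    for url in pages_due(calendar, today=date(2026, 6, 1)):
        print(url)
```

The date you log should be the date of the last substantive change, not the last deploy, otherwise the calendar rewards exactly the date-only edits this action warns against.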

4. Multi-LLM presence

Where the GEO/AEO discipline keeps its meaning, away from Google's perimeter

#16 Test your visibility on ChatGPT, Claude, Perplexity and Gemini, beyond Google

Why it matters

Google's May 2026 guide stops at Google's ecosystem. ChatGPT, Claude and Perplexity run on different retrieval stacks with different source preferences. Similarweb tracked ChatGPT at 56.7% of AI assistant traffic in March 2026, Gemini at around 18%, with the rest distributed across Claude, Perplexity, Copilot and Grok. A brand cited consistently on ChatGPT can be completely absent from Perplexity.

Implementation tip

Pick ten business-critical prompts your prospects might ask. Run them on each engine, manually if needed. Log whether you are cited, where in the answer, and what the surrounding context says. Identify the engine where you under-perform most.

#17 Monitor Google AI Overviews coverage in your verticals

Why it matters

BrightEdge (February 2026) reported AI Overviews on close to half of queries across nine commercial verticals, with peaks at 88% in Healthcare, 83% in Education and 82% in B2B Tech. Seer Interactive measured a 61% CTR drop on the page-1 results sitting below an AI Overview (3 119 queries, 25M impressions). seoClarity reports that 97% of AIO sources come from the top 20 organic results, meaning classic SEO still gates entry.

Implementation tip

Run your priority keywords on Google US desktop. Note which ones trigger an AIO. Capture the cited sources. Build a separate tracker for AIO presence, distinct from organic ranking.

#18 Build a real Reddit and Quora presence

Why it matters

ChatGPT-Search and Perplexity both index Reddit threads and Quora answers heavily. Authentic discussions, ranked answers and user reviews carry community-trust signals that generative engines lean on, particularly for product comparisons and how-to queries.

Implementation tip

Identify five subreddits and ten Quora topics that map to your prospect journey. Contribute genuinely. Mention your product only where it answers the question. Spam gets you banned and the bans propagate.

#19 List in the directories your sector actually uses

Why it matters

For comparison and recommendation queries, generative engines pull from sector directories (G2, Capterra, vertical lists). A complete profile on the right five directories outperforms presence on fifty random ones.

Implementation tip

Identify the five most-cited directories in your sector. Complete each profile to 100%. Solicit reviews from satisfied clients. Keep NAP data consistent across all sources.

#20 Match the conversational query patterns generative engines optimise around

Why it matters

Voice assistants and chat interfaces favour natural-language queries: longer, full sentences, often phrased as questions. A page that mirrors that phrasing in its content is easier to surface as an answer.

Implementation tip

Target multi-word natural-language queries (“how do I choose a CRM for a 20-person sales team” rather than “best CRM”). Keep the in-section answers to 40-60 words. Use full sentences. SpeakableSpecification Schema is worth shipping if you also publish audio content.

5. Measurement

Without measurement, every other action becomes vibes

#21 Set up AI visibility tracking before you change anything

Why it matters

Without a baseline, every subsequent change becomes anecdotal. Manual prompt logging works for the first 20 to 30 queries. Beyond that, a dedicated tracker is needed to keep methodology stable across weeks and engines.

Implementation tip

Configure tracking on 20 to 50 strategic prompts. Run them weekly across four engines. Log referral traffic from chatgpt.com, perplexity.ai, claude.ai and gemini.google.com in your analytics. Alert on new mentions or losses.
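If your analytics tool exports raw referrer URLs, the referral logging can be sketched as below. The four domains are the ones listed above; the function names are hypothetical, and the code expects full URLs including the scheme.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}


def classify_referrer(referrer: str) -> Optional[str]:
    """Map a referrer URL to an engine name, or None if it is not an AI assistant."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, engine in AI_REFERRERS.items():
        # Match the domain itself or any subdomain (www.perplexity.ai, etc.).
        if host == domain or host.endswith("." + domain):
            return engine
    return None


def count_ai_referrals(referrers: list[str]) -> Counter:
    """Count sessions per engine, ignoring non-AI referrers."""
    engines = (classify_referrer(r) for r in referrers)
    return Counter(engine for engine in engines if engine is not None)
```

Many assistant visits arrive without a referrer at all, so these counts are a floor rather than a total.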

#22 Define the 20 to 50 prompts that matter for your business

Why it matters

You cannot track every possible query. You can track the ones that map to actual buying decisions. A well-chosen set of 30 prompts gives a sharper signal than a fuzzy set of 300.

Implementation tip

Pull from: pre-purchase questions in sales calls, competitor comparison queries, “best X for Y” patterns, vertical-specific questions. Categorise by buying stage. Review the list quarterly.

#23 Calculate your AI Share of Voice

Why it matters

AI Share of Voice measures the fraction of citations you capture versus competitors on a defined set of prompts. It is the closest analogue to ranking in the SEO world and the most stable KPI to track month over month.

Implementation tip

For each strategic prompt, log every brand cited. Compute your share as (your citations / total citations) × 100. Track monthly. Cross-reference with sentiment, since a negative mention is not worth the same as a favourable one.
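The formula translates directly into code. A minimal sketch, assuming the log is a mapping from prompt to the list of brands cited in the answer (the brand names below are placeholders):

```python
from collections import Counter


def share_of_voice(citations_by_prompt: dict[str, list[str]], brand: str) -> float:
    """(brand citations / total citations) x 100 across the whole prompt set."""
    counts = Counter(
        cited for brands in citations_by_prompt.values() for cited in brands
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no citations logged yet
    return counts[brand] / total * 100


if __name__ == "__main__":
    # Placeholder log: three prompts, four citations in total.
    log = {
        "best crm for a 20-person sales team": ["Acme", "Globex"],
        "acme vs globex pricing": ["Globex", "Initech"],
        "how to migrate crm data": [],
    }
    print(share_of_voice(log, "Globex"))  # 2 of 4 citations -> 50.0
```

Computing the share over the pooled citations, rather than averaging per-prompt shares, keeps prompts with no citations from distorting the figure. Either choice is defensible as long as it stays the same month to month.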

#24 A/B test content formats on a quarterly rhythm

Why it matters

What works in one vertical does not necessarily work in another. The only reliable way to know which formats win for your prompts is to test them: FAQ versus prose, numbered lists versus paragraphs, short answers versus extended ones.

Implementation tip

Hold each test for four to six weeks. Change one variable at a time. Compare citation frequency before and after on the prompts the page targets. Generative engines update on slower cycles than Google, so short tests produce noise.
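The before/after comparison can be sketched as a difference in citation rate, assuming each weekly run on a target prompt is logged as cited or not cited. The function names are hypothetical, and this is a descriptive comparison, not a significance test.

```python
def citation_rate(runs: list[bool]) -> float:
    """Fraction of logged runs in which the page was cited."""
    return sum(runs) / len(runs) if runs else 0.0


def rate_change_points(before: list[bool], after: list[bool]) -> float:
    """Change in citation rate, in percentage points (after minus before)."""
    return (citation_rate(after) - citation_rate(before)) * 100


if __name__ == "__main__":
    # Placeholder logs: one entry per prompt per weekly run.
    before = [True, False, False, False]
    after = [True, True, False, False]
    print(rate_change_points(before, after))  # 25% -> 50%, so 25.0 points
```

With only a handful of runs per period, a swing of this size can be chance, which is one more reason to hold the test for the full four to six weeks.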

#25 Ship a monthly AI visibility report that survives stakeholder review

Why it matters

Regular reporting builds the institutional memory of what worked and the credibility to keep investing in the discipline. It is also the most efficient way to surface unexpected drift between two model updates.

Implementation tip

Two pages maximum. Share of Voice trend, new citations gained, citations lost, key wins and losses, planned actions for the next month. Same template every month so changes are visible at a glance.

The five categories at a glance

Technical (1-5): Schema as table stakes, llms.txt only for agents, LCP under 2.5s, mobile-first, deliberate robots.txt
Content (6-10): answer-first, question-form headings, real FAQ, named quotations (+41%), named statistics (+30 to +40%)
Authority (11-15): author pages, first-hand experience, topical clusters, external citations (+30 to +115%), 30-day refresh
Multi-LLM (16-20): test four engines, monitor AIO, Reddit/Quora, sector directories, conversational phrasing
Measurement (21-25): tracking, prompt set, Share of Voice, A/B tests, monthly reporting

Closing note

The best argument against guessing is measuring. The discipline of GEO and AEO, whatever Google calls it, lives or dies on whether you can show a stakeholder that something moved between month one and month two. That requires named prompts, named engines, and a method that does not change every time a model gets retrained. The 25 actions above are the bare minimum to get there.

Article reworked in May 2026 following Google's AI Optimization Guide. Every statistic links to its primary source. Method, dataset and limitations are reviewable. If a number proves wrong or a study is superseded, we update.

About the author

Davy Abderrahman

Founder & CEO at AI Labs Audit

As a specialist in AI visibility (AEO/GEO/LLMO), I help agencies and consultants measure and optimize their clients' presence on ChatGPT, Claude, Gemini, Perplexity and other AI answer engines. Pioneer in AI visibility auditing since 2024.

