TL;DR: 25 concrete actions to improve how often ChatGPT, Claude, Gemini and Perplexity cite your brand. Every statistic is sourced. Action #2 (llms.txt) has been rewritten to reflect what four independent studies, and Google itself, now say about it.
On 15 May 2026, Google published its AI Optimization Guide. The position is blunt: for Google Search, optimizing for generative AI is still SEO, and tactics like llms.txt, AI-specific Schema.org, content chunking and inauthentic mentions are dismissed. Google has a commercial reason to draw that line. Its ecosystem (Search, AI Overviews, Ads) depends on remaining the single entry point. Tools that measure visibility on ChatGPT, Claude or Perplexity sit outside that perimeter.
The empirical picture is more nuanced than either reading suggests. Limy (515 million bot events), OtterlyAI (90 days of data), ALLMO (94 614 cited URLs) and SE Ranking (300 000 domains) all converge: llms.txt has essentially no impact on AI citations today. Google is right on that specific point. But the Princeton GEO paper (KDD 2024) shows that other techniques, ones Google never names, produce measurable lifts. Quotation Addition is the single top-performing technique tested: +41% on Position-Adjusted Word Count, +28% on Subjective Impression. Statistics Addition and Cite Sources follow at +30 to +40% on the same metric. Cite Sources alone drove a +115% visibility increase for websites that started in fifth position on the SERP, what the paper calls a democratisation effect.
The checklist below is built from that empirical layer, not from vendor positioning.
The 2026 baseline
How to use this list
Work through the five categories in order. Technical foundations come first because they gate everything downstream. Then sharpen content structure, build authority signals, extend reach beyond Google, and finally measure. The middle of the list points back to one specific tool because, beyond a certain volume, manual tracking stops scaling. The rest is method, not product.
1. Technical foundations
The crawl, render and discovery layer that conditions everything else
The honest version: Ahrefs analysed 1 885 pages in May 2026 and found no measurable lift on AI Overviews citations from Schema.org, and a slight negative correlation in some segments. Google's May 2026 guide says the same. The reason to ship structured data is unchanged though: rich results in classic SERPs, better parsing by Bing and DuckDuckGo, cleaner ingestion by downstream knowledge graphs. Treat it as table stakes, not as a GEO lever.
Start with Organization on the homepage and Article on every editorial page. Add FAQPage where you already have a real FAQ block. Validate with Google's Rich Results Test. Skip the AI-specific extensions Google explicitly tells you to ignore.
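As a rough illustration of the Organization and Article markup described above, the sketch below builds both objects as JSON-LD and wraps them in a script tag. The function names, the chosen properties and the example.com values are assumptions made for illustration; check the output against Google's Rich Results Test before relying on it.

```python
import json


def organization_jsonld(name: str, url: str, logo_url: str) -> dict:
    """Minimal Schema.org Organization object for the homepage (illustrative subset)."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,
    }


def article_jsonld(headline: str, author_name: str, date_published: str,
                   date_modified: str, url: str) -> dict:
    """Minimal Schema.org Article object for an editorial page (illustrative subset)."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date strings
        "dateModified": date_modified,
        "mainEntityOfPage": url,
    }


def to_script_tag(data: dict) -> str:
    """Serialise a JSON-LD object into a script tag ready to paste into the page head."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(to_script_tag(organization_jsonld(
        "Example Co", "https://example.com", "https://example.com/logo.png")))
```

Generating the markup from one function per type keeps the properties identical across templates, which makes validation failures easier to trace.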
Four independent measurements published between late 2025 and early 2026 reach the same conclusion. Limy logged 408 hits on /llms.txt out of 515 million bot events (statistically zero). OtterlyAI saw 84 hits across 62 100 bot visits (0.1%). ALLMO found llms.txt in 0.00106% of cited URLs across 11 867 AI responses. SE Ranking ran an XGBoost model on 300 000 domains and found no effect on citation frequency. The actual use case is agentic web: products consumed by Cursor, Claude Code, Copilot or MCP-aware tooling. Stripe, Vercel and Anthropic publish llms.txt for exactly that reason.
If you publish a developer-facing API or docs site, ship llms.txt and llms-full.txt with concise endpoint references. If your audience is end users searching on ChatGPT or Perplexity, your time is better spent on actions #6 to #10. KIME's nuanced read is worth the eight minutes.
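For a developer-facing docs site, an llms.txt file can be generated from the same source as the documentation index. The sketch below assumes the layout commonly described for the llms.txt proposal (a title heading, a short blockquote summary, then sections of annotated links); the function name, the section names and the example.com URLs are hypothetical.

```python
def render_llms_txt(title: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt body.

    `sections` maps a section heading to (link name, URL, short note) tuples.
    The layout follows the llms.txt proposal as assumed here, not a formal standard.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{name}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical API docs, for illustration only.
    print(render_llms_txt(
        "Example API",
        "REST API for managing invoices.",
        {"Endpoints": [
            ("Authentication", "https://example.com/docs/auth.md", "How to obtain tokens"),
            ("Invoices", "https://example.com/docs/invoices.md", "Create, list and void invoices"),
        ]},
    ))
```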
Search Engine Land's 107 000-page analysis reframes Core Web Vitals as a gatekeeper rather than a lever. Poor performance correlates negatively with AI Overviews visibility (correlation between -0.12 and -0.18), but pushing an already-passable score higher does not produce a measurable boost. The mechanism still matters: the RAG systems behind generative engines run with tight per-source timeouts, so a slow page is silently dropped before the model ever sees it. Aim to clear the bar, not to win it.
Run PageSpeed Insights on your ten most-cited pages first. WebP or AVIF for hero images, deferred third-party scripts, edge caching, and a critical CSS subset usually move the needle more than micro-optimizations.
Google's mobile-first index is now the only index. The ChatGPT mobile app and the Perplexity mobile app both render full pages before extracting citations; layout shift and unreadable typography reduce the chance that your page survives the citation filter.
Check on three real devices, not just the responsive emulator. Tap targets ≥ 44px, no horizontal scroll, viewport meta in place, font ≥ 16px in body copy.
GPTBot, ClaudeBot, PerplexityBot, GoogleOther, Applebot-Extended, Bytespider and the rest behave differently. Some respect Allow / Disallow directives strictly, others ignore them. Blocking GPTBot does not block ChatGPT's real-time browsing (that uses a different agent). The question is editorial: do you want your content to feed training corpora, real-time retrieval, both, or neither?
Audit your current robots.txt. For most B2B publishers, allowing GPTBot, ClaudeBot, PerplexityBot and GoogleOther maximises citation surface. If you want to exclude training while keeping retrieval, document the policy publicly and use the user-agent strings each provider publishes.
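One way to run that audit is to parse the robots.txt text with Python's standard-library parser and check each crawler in turn. This is a minimal sketch: the bot list is taken from the paragraph above, the function name is an assumption, and the result only reflects what the file declares, not whether a given crawler actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article; confirm the exact strings each provider publishes.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "GoogleOther",
           "Applebot-Extended", "Bytespider"]


def audit_robots(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Return, per AI crawler, whether the given robots.txt text allows fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_BOTS}


if __name__ == "__main__":
    # Hypothetical policy: block GPTBot, allow everyone else.
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    for bot, allowed in audit_robots(sample).items():
        print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Running the same check against several representative paths (a blog post, a pricing page, a docs page) catches rules that block more than intended.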
2. Content structure
What the Princeton GEO paper actually validates, with the numbers
Generative engines extract the most extractable passage. A page that gives its core answer in the first 80-120 words is parseable. A page that builds suspense for 600 words before delivering the answer rarely makes it past the chunking step.
For each page, write the answer in two sentences before you draft the rest. Put those two sentences right after the H1. Use the body to nuance, defend, and back the claim.
Generative engines treat headings as semantic anchors. A heading like “Pricing” is ambiguous. “How much does an audit cost for a 50-page site” matches an actual prompt pattern. The Princeton paper notes meaningful gains on retrieval rank when section boundaries map to user-phrased questions.
Pull your top 20 questions from sales calls, support tickets or AlsoAsked. Reuse the exact phrasing as H2s. Avoid keyword stuffing in headings, which Google's May 2026 guide explicitly calls out.
Q/A structure mirrors the prompt format that generative engines optimise around. Made-up FAQ blocks built for SEO obfuscation no longer work and trigger Google's scaled-content classifier (May 2026 update). Real FAQs built from genuine recurring questions still help.
Five to eight questions per page, sourced from sales transcripts or support data. Pair with FAQPage Schema. Update quarterly: question phrasing drifts faster than people think.
The Princeton GEO paper (KDD 2024, 10 000 queries across 25 domains) measured a +41% lift in Position-Adjusted Word Count when pages added quotations attributed to a named source. Generative engines weight provenance heavily when reconciling competing answers.
One quote from a named expert, analyst or executive per major section. Include the role and the publication or date. Avoid “industry experts say” phrasing, which carries zero retrieval weight.
Princeton measured a +30 to +40% lift on Position-Adjusted Word Count for pages enriched with named statistics. The paper groups Cite Sources, Quotation Addition and Statistics Addition as the three top-performing methods on this metric. The mechanism is similar to action #9: provenance reduces model uncertainty, so the passage gets selected. Anonymous statistics produce no measurable lift.
Every figure linked. Every figure dated. Never recycle a statistic without checking the original. This article is built that way on purpose, and it is reviewable by anyone.
See where you actually stand today
Before working through the remaining 15 actions, get a snapshot of how often your brand is cited on ChatGPT, Claude, Gemini and Perplexity. A free audit takes a few minutes and gives you a baseline to measure against.
3. Authority & E-E-A-T
Provenance signals that survive both Google's ranking and LLM citation logic
Generative engines reconcile competing answers in part by author signal: identifiable humans with LinkedIn profiles, publication histories and credentials carry more retrieval weight than anonymous bylines. E-E-A-T remains the most useful frame Google has published, and Princeton's notion of source trustworthiness aligns with it.
One page per contributor at /author/{slug}: real photo, 150-200 word biography, credentials, list of published pieces, link to LinkedIn. Cross-link from every article.
Case studies with measurable outcomes, named clients (where contractually allowed), proprietary datasets and screenshots from actual dashboards are weighted higher than generic explainers. This is the first “E” in E-E-A-T and the dimension hardest to fake.
One case study per quarter per service line. Include the brief, the work, and the measurable outcome. If the data is anonymised, say so and explain why.
An isolated article carries less weight than a coherent set of ten interconnected pieces covering one theme in depth. Generative engines look for breadth and internal consistency before privileging a source on a topic.
Pick three to five themes that match your actual expertise. For each, write one pillar page (2 500+ words) plus eight to ten satellites (1 200-1 800 words). Systematic internal linking. Refresh the pillar every six months.
Princeton measured a +30% lift on Position-Adjusted Word Count from citing external sources. More specifically, the Cite Sources method drove +115% visibility for websites that started in fifth position on the SERP, a result the paper frames as a democratisation effect. The reverse is also true: being cited by trade publications, sector reports or peer-reviewed work boosts retrieval probability.
Target three to five publications per quarter for guest contributions, expert interviews or co-published research. Avoid mass guest-post platforms. Wikipedia is worth pursuing if you genuinely meet notability criteria.
SE Ranking data, surfaced by ZipTie, reports that around 76% of pages most frequently cited by ChatGPT had been substantively updated in the previous 30 days. Recency acts as a confidence proxy for the model when topics evolve quickly.
Maintain a refresh calendar for your 20 most strategic pages. A refresh must be substantive (new section, updated statistics, removed obsolete claims), not a date change. Make the change visible by updating the on-page modified date.
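A refresh calendar can start as a simple staleness check over the last substantive update date of each strategic page. The sketch below assumes you already record those dates somewhere; the 30-day default mirrors the window in the SE Ranking figure above and is an assumption, not a threshold any engine publishes.

```python
from datetime import date, timedelta


def pages_due_for_refresh(last_updated: dict[str, date], today: date,
                          max_age_days: int = 30) -> list[str]:
    """Return URLs whose last substantive update is older than `max_age_days`, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    due = [url for url, updated in last_updated.items() if updated < cutoff]
    return sorted(due, key=lambda url: last_updated[url])


if __name__ == "__main__":
    # Hypothetical pages and dates, for illustration only.
    pages = {
        "/pricing": date(2026, 5, 20),
        "/guide/geo": date(2026, 4, 1),
        "/guide/aeo": date(2026, 3, 1),
    }
    print(pages_due_for_refresh(pages, today=date(2026, 6, 1)))
```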
4. Multi-LLM presence
Where the GEO/AEO discipline keeps its meaning, away from Google's perimeter
Google's May 2026 guide stops at Google's ecosystem. ChatGPT, Claude and Perplexity run on different retrieval stacks with different source preferences. Similarweb tracked ChatGPT at 56.7% of AI assistant traffic in March 2026, Gemini at around 18%, with the rest distributed across Claude, Perplexity, Copilot and Grok. A brand cited consistently on ChatGPT can be completely absent from Perplexity.
Pick ten business-critical prompts your prospects might ask. Run them on each engine, manually if needed. Log whether you are cited, where in the answer, and what the surrounding context says. Identify the engine where you under-perform most.
BrightEdge (February 2026) reported AI Overviews on close to half of queries across nine commercial verticals, with peaks at 88% in Healthcare, 83% in Education and 82% in B2B Tech. Seer Interactive measured a 61% CTR drop on the page-1 results sitting below an AI Overview (3 119 queries, 25M impressions). seoClarity reports that 97% of AIO sources come from the top 20 organic results, meaning classic SEO still gates entry.
Run your priority keywords on Google US desktop. Note which ones trigger an AIO. Capture the cited sources. Build a separate tracker for AIO presence, distinct from organic ranking.
ChatGPT-Search and Perplexity both index Reddit threads and Quora answers heavily. Authentic discussions, ranked answers and user reviews carry community-trust signals that generative engines lean on, particularly for product comparisons and how-to queries.
Identify five subreddits and ten Quora topics that map to your prospect journey. Contribute genuinely. Mention your product only where it answers the question. Spam gets you banned and the bans propagate.
For comparison and recommendation queries, generative engines pull from sector directories (G2, Capterra, vertical lists). A complete profile on the right five directories outperforms presence on fifty random ones.
Identify the five most-cited directories in your sector. Complete each profile to 100%. Solicit reviews from satisfied clients. Keep NAP data consistent across all sources.
Voice assistants and chat interfaces favour natural-language queries: longer, full sentences, often phrased as questions. A page that mirrors that phrasing in its content is easier to surface as an answer.
Target multi-word natural-language queries (“how do I choose a CRM for a 20-person sales team” rather than “best CRM”). Keep the in-section answers to 40-60 words. Use full sentences. SpeakableSpecification Schema is worth shipping if you also publish audio content.
5. Measurement
Without measurement, every other action becomes vibes
Without a baseline, every subsequent change becomes anecdotal. Manual prompt logging works for the first 20 to 30 queries. Beyond that, a dedicated tracker is needed to keep methodology stable across weeks and engines.
Configure tracking on 20 to 50 strategic prompts. Run them weekly across four engines. Log referral traffic from chatgpt.com, perplexity.ai, claude.ai and gemini.google.com in your analytics. Alert on new mentions or losses.
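If your analytics export includes raw referrer URLs, the four hostnames above can be mapped to engines with a few lines of code. This is a sketch under that assumption; the function names and the shape of the export are hypothetical, and some AI traffic arrives with no referrer at all, so treat the counts as a lower bound.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlparse

# Referrer hostnames listed in this article, mapped to an engine label.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}


def classify_referrer(referrer_url: str) -> Optional[str]:
    """Return the engine label for a referrer URL, or None if it is not a tracked AI engine."""
    host = (urlparse(referrer_url).hostname or "").lower()
    for domain, engine in AI_REFERRERS.items():
        # Match the domain itself or any subdomain, but not lookalike hosts.
        if host == domain or host.endswith("." + domain):
            return engine
    return None


def count_ai_referrals(referrers: Iterable[str]) -> Counter:
    """Count visits per AI engine from an iterable of referrer URLs."""
    engines = (classify_referrer(r) for r in referrers)
    return Counter(e for e in engines if e is not None)
```

For example, `count_ai_referrals(row["referrer"] for row in export)` would give a per-engine tally for one reporting period, where `export` stands for whatever rows your analytics tool returns.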
You cannot track every possible query. You can track the ones that map to actual buying decisions. A well-chosen set of 30 prompts gives a sharper signal than a fuzzy set of 300.
Pull from: pre-purchase questions in sales calls, competitor comparison queries, “best X for Y” patterns, vertical-specific questions. Categorise by buying stage. Review the list quarterly.
AI Share of Voice measures the fraction of citations you capture versus competitors on a defined set of prompts. It is the closest analogue to ranking in the SEO world and the most stable KPI to track month over month.
For each strategic prompt, log every brand cited. Compute your share as (your citations / total citations) × 100. Track monthly. Cross-reference with sentiment, since being mentioned negatively is not the same as being mentioned at all.
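The share calculation above is straightforward to script once each prompt's cited brands are logged. A minimal sketch, assuming the log is a mapping from prompt to the list of brands cited in the answer; the function name and the brand names in the usage note are hypothetical.

```python
def share_of_voice(citations_by_prompt: dict[str, list[str]], brand: str) -> float:
    """Compute (your citations / total citations) x 100 across a prompt set.

    Brand matching is case-insensitive; a brand cited twice in one answer counts twice.
    """
    target = brand.lower()
    total = 0
    ours = 0
    for cited_brands in citations_by_prompt.values():
        total += len(cited_brands)
        ours += sum(1 for b in cited_brands if b.lower() == target)
    if total == 0:
        return 0.0  # No citations logged yet, so there is no share to report.
    return ours / total * 100
```

With a log such as `{"best crm for smb": ["Acme", "Globex", "Initech"], "crm pricing": ["Globex"]}`, the brand "Acme" holds one of four citations. Sentiment is not captured here and still needs a separate column in the log.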
What works in one vertical does not necessarily work in another. The only reliable way to know which formats win for your prompts is to test them: FAQ versus prose, numbered lists versus paragraphs, short answers versus extended ones.
Hold each test for four to six weeks. Change one variable at a time. Compare citation frequency before and after on the prompts the page targets. Generative engines update on slower cycles than Google, so short tests produce noise.
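The before/after comparison can be reduced to a citation rate per period. The sketch below assumes each tracked prompt run is logged as cited or not cited; the function names are hypothetical, and the difference is a raw delta in percentage points, not a significance test.

```python
def citation_rate(runs: list[bool]) -> float:
    """Fraction of prompt runs in which the page was cited (0.0 when no runs were logged)."""
    if not runs:
        return 0.0
    return sum(runs) / len(runs)


def compare_periods(before: list[bool], after: list[bool]) -> dict[str, float]:
    """Compare citation rates for the runs logged before and after a single change."""
    rate_before = citation_rate(before)
    rate_after = citation_rate(after)
    return {
        "before": rate_before,
        "after": rate_after,
        "delta_points": (rate_after - rate_before) * 100,  # percentage points
    }
```

Keep the prompt set and the engines identical across both periods. With only a handful of runs per period, a change of a few points is well within noise.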
Regular reporting builds the institutional memory of what worked and the credibility to keep investing in the discipline. It is also the most efficient way to surface unexpected drift between two model updates.
Two pages maximum. Share of Voice trend, new citations gained, citations lost, key wins and losses, planned actions for the next month. Same template every month so changes are visible at a glance.
The five categories at a glance
- Technical (1-5): Schema as table stakes, llms.txt only for agents, LCP under 2.5s, mobile-first, deliberate robots.txt
- Content (6-10): answer-first, question-form headings, real FAQ, named quotations (+41%), named statistics (+30 to +40%)
- Authority (11-15): author pages, first-hand experience, topical clusters, external citations (+30 to +115%), 30-day refresh
- Multi-LLM (16-20): test four engines, monitor AIO, Reddit/Quora, sector directories, conversational phrasing
- Measurement (21-25): tracking, prompt set, Share of Voice, A/B tests, monthly reporting
Closing note
The best argument against guessing is measuring. The discipline of GEO and AEO, whatever Google calls it, lives or dies on whether you can show a stakeholder that something moved between month one and month two. That requires named prompts, named engines, and a method that does not change every time a model gets retrained. The 25 actions above are the bare minimum to get there.
Article reworked in May 2026 following Google's AI Optimization Guide. Every statistic links to its primary source. Method, dataset and limitations are reviewable. If a number proves wrong or a study is superseded, we update.