The robots.txt file is a plain-text file placed at the root of a website that tells crawlers which pages they may or may not explore, a rule set that now applies to AI crawlers as well.
What is the Robots.txt file?
The robots.txt file is a standard plain-text file placed at a website's root (example.com/robots.txt) that communicates access directives to crawlers: which parts of the site to explore and which to ignore. It's the first thing Googlebot, GPTBot, and other crawlers check before exploring your site.
Robots.txt and AI crawlers
With the rise of AI, new crawlers have appeared:
- GPTBot: OpenAI's crawler for feeding ChatGPT
- ClaudeBot: Anthropic's crawler for feeding Claude
- PerplexityBot: Perplexity AI's crawler
- Google-Extended: Google's robots.txt token controlling whether content may be used for Gemini (a control token rather than a separate crawler; crawling is still done by Googlebot)
- Bytespider: ByteDance's crawler (TikTok's parent company)
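To see how these user agents interact with a robots.txt file, you can evaluate directives locally with Python's standard `urllib.robotparser`. The file content below is a hypothetical example, not a real site's policy:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real file lives at https://example.com/robots.txt
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Bytespider"]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# GPTBot has its own group (blocked from /private/); the others fall back to *
for agent in AI_CRAWLERS:
    print(agent, parser.can_fetch(agent, "/private/page.html"))
```

Here GPTBot is denied `/private/page.html` by its dedicated group, while ClaudeBot, PerplexityBot, and Bytespider match the wildcard group and are allowed.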
Recommended configuration for AI visibility
To maximize your AI visibility, your robots.txt should allow AI crawlers:
# Allow AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Sitemap
Sitemap: https://example.com/sitemap.xml
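A configuration like the one above can be sanity-checked before deployment by parsing it locally, again with the standard `urllib.robotparser` (the URLs and paths below are illustrative):

```python
from urllib.robotparser import RobotFileParser

# The recommended configuration, parsed in memory as a sanity check
RECOMMENDED = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RECOMMENDED.splitlines())

# Every AI crawler we care about should be allowed everywhere
for agent in ("GPTBot", "ClaudeBot", "PerplexityBot"):
    assert parser.can_fetch(agent, "/any/page")

# The sitemap directive is also picked up
print(parser.site_maps())  # ['https://example.com/sitemap.xml']
```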
Common mistakes
- Blocking all bots: "User-agent: *" with "Disallow: /" blocks AI crawlers along with everything else
- Forgetting AI crawlers: If GPTBot is blocked, OpenAI can't crawl your content and ChatGPT can't surface it
- No robots.txt: Crawlers assume everything is allowed and explore the site with no guidance on what to prioritize
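The first mistake is easy to demonstrate: a wildcard group applies to every crawler that has no group of its own, so a blanket block locks out AI crawlers too. A minimal check with `urllib.robotparser`:

```python
from urllib.robotparser import RobotFileParser

# A blanket block: no crawler is named, so the wildcard catches them all
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(BLOCK_ALL.splitlines())

# Search engine and AI crawlers alike are denied every path
for agent in ("Googlebot", "GPTBot", "ClaudeBot"):
    print(agent, parser.can_fetch(agent, "/"))  # all False
```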
Robots.txt and AI Labs Audit
AI Labs Audit's GEO Score automatically checks your robots.txt file and alerts you if important AI crawlers are blocked.