An XML sitemap is a file that lists all important pages of a website, helping search engines and AI crawlers discover and efficiently index content.
What is an XML Sitemap?
An XML sitemap is an XML file that lists the URLs of the important pages on a website, optionally with metadata such as each page's last modification date. It gives search engines and AI crawlers a map of your site, which makes your content easier to discover and index.
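To make the file format concrete, here is a minimal sketch that builds a sitemap with Python's standard library. The function name build_sitemap and the example.com URLs are illustrative assumptions, not part of any particular tool; the namespace URI is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for an iterable of (url, lastmod) pairs.

    build_sitemap is a hypothetical helper; lastmod may be None or a
    W3C date string such as "2024-01-15".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL for us.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    print(build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about", None),
    ]))
```

Each url entry needs only a loc element; lastmod is optional and is simply left out when no date is known.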
Types of sitemaps
- Pages sitemap: Lists the main site pages
- Blog sitemap: Lists blog articles with publication and modification dates
- Image sitemap: References important images
- Video sitemap: References video content
- Sitemap index: Master file pointing to sub-sitemaps
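A sitemap index uses the same protocol with a different root element. The sketch below assumes the sub-sitemaps already exist at the given URLs; build_sitemap_index and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (as a string) pointing to each sub-sitemap URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder file names, one per sitemap type.
    print(build_sitemap_index([
        "https://example.com/sitemap-pages.xml",
        "https://example.com/sitemap-blog.xml",
        "https://example.com/sitemap-images.xml",
    ]))
```

Only the index then has to be declared or submitted; crawlers follow it to the individual files.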
Why is the XML Sitemap important for AI?
AI crawlers such as GPTBot, ClaudeBot, and PerplexityBot can use sitemaps in much the same way as Googlebot:
- Content discovery: AIs quickly find your important pages
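- Freshness: Modification dates (the lastmod element) tell crawlers when a page last changed, so they can decide when to revisit it
- Priority: The optional priority element lets you indicate which pages matter most, but it is only a hint, and Google states that it ignores it
- Completeness: Crawlers can find pages that internal links alone might not lead them to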
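As an illustration of the freshness point, the sketch below shows how a crawler might use lastmod to pick pages to revisit. It is an assumption about crawler behaviour in general, not the documented logic of any specific bot; urls_modified_since is a hypothetical helper, and it expects lastmod values that begin with a full YYYY-MM-DD date.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Prefix-to-namespace mapping for the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_modified_since(sitemap_xml, last_crawl):
    """Return the URLs whose lastmod is later than last_crawl (a date).

    Entries without lastmod are returned too, because nothing tells
    the crawler whether they changed.
    """
    root = ET.fromstring(sitemap_xml)
    to_revisit = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        if loc is None:
            continue  # skip malformed entries
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        # Keep only the date part of values like 2024-03-01T08:00:00+00:00.
        if lastmod is None or date.fromisoformat(lastmod.strip()[:10]) > last_crawl:
            to_revisit.append(loc.strip())
    return to_revisit


if __name__ == "__main__":
    sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-03-01</lastmod></url>
  <url><loc>https://example.com/old</loc><lastmod>2023-01-01</lastmod></url>
</urlset>"""
    print(urls_modified_since(sample, date(2024, 1, 1)))
```

This only works if the dates are accurate: a lastmod that changes on every build, whether or not the page changed, gives crawlers no usable signal.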
Best practices
- Declare the sitemap in the robots.txt file
- Automatically update the sitemap when content is added, changed, or removed
- Include only published, indexable pages
- Limit each sitemap file to 50,000 URLs and 50 MB uncompressed; split larger sites across several files listed in a sitemap index
- Submit the sitemap via Google Search Console and Bing Webmaster Tools
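Two of these practices can be checked with a few lines of code. The sketch below splits a URL list at the 50,000-URL limit and reads the Sitemap lines declared in a robots.txt file, using urllib.robotparser from Python's standard library (site_maps() requires Python 3.8 or later). The helper names chunk_urls and declared_sitemaps, and the sample robots.txt, are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Per-file URL limit from the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split URLs into lists of at most `size` items, one list per sitemap file."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    # Placeholder robots.txt content.
    robots = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(declared_sitemaps(robots))
```

The Sitemap line takes an absolute URL and applies regardless of the User-agent group it appears near.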
XML Sitemap and AI Labs Audit
AI Labs Audit's GEO Score checks the presence and validity of your XML sitemap and verifies that it is properly referenced in your robots.txt.