Free tool · No signup
AI-friendly robots.txt generator
Generate a robots.txt with an explicit rule for each major AI crawler (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, Bytespider) so your pages can be read and cited by ChatGPT, Claude, Perplexity and Gemini.
Your robots.txt
# robots.txt — generated by Aapta GEO
# Drop this at the root of your site: https://yoursite.com/robots.txt
# AI crawlers: one explicit rule per bot so accidental blocking can't creep in.
User-agent: GPTBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: CCBot
Allow: /

User-agent: Applebot-Extended
Allow: /

# Bytespider is optional and is blocked in this example.
User-agent: Bytespider
Disallow: /

# Default: allow well-behaved crawlers, lock down admin areas.
# Note: bots with their own group above do not inherit these Disallow rules;
# repeat them in each group if the paths must stay off-limits to AI crawlers too.
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/

Sitemap: https://yoursite.com/sitemap.xml
Why your robots.txt matters more in 2026
Robots.txt has always controlled which crawlers can read your site. Starting in 2023, AI companies introduced their own user-agents: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended. These are separate from classic search bots, so a rule for one does not affect the others.
Many sites still run robots.txt files written years ago, often with overly broad Disallow: / rules that accidentally block AI crawlers. A crawler follows the group that names it and falls back to User-agent: * only when no group does, so the fix is an explicit Allow: / rule per AI bot, hence this tool.
If you want your site cited by AI engines, the floor is to let them read it. Adding the right robots.txt is the cheapest, fastest signal you can fix today.
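To make the matching behaviour concrete, here is a minimal Python sketch of how a standards-following crawler might decide whether a path is allowed. It assumes exact, case-insensitive user-agent tokens and plain prefix rules (no * or $ wildcards); `parse_groups` and `is_allowed` are illustrative names, not part of any library or of this tool.

```python
def parse_groups(text):
    """Split robots.txt text into (agents, rules) groups."""
    groups, agents, rules = [], [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a user-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field == "allow", value))
    if agents:
        groups.append((agents, rules))
    return groups


def is_allowed(text, user_agent, path):
    """Named group wins; '*' is only a fallback. Longest matching prefix decides."""
    groups = parse_groups(text)
    ua = user_agent.lower()
    matching = [rules for agents, rules in groups if ua in agents]
    if not matching:
        matching = [rules for agents, rules in groups if "*" in agents]
    best_len, allowed = -1, True  # no matching rule means allowed
    for rules in matching:
        for allow, prefix in rules:
            if prefix and path.startswith(prefix):
                longer = len(prefix) > best_len
                tie_allow = len(prefix) == best_len and allow
                if longer or tie_allow:
                    best_len, allowed = len(prefix), allow
    return allowed


OLD = "User-agent: *\nDisallow: /\n"
FIXED = "User-agent: GPTBot\nAllow: /\n\nUser-agent: *\nDisallow: /\n"
GENERATED = (
    "User-agent: GPTBot\nAllow: /\n\n"
    "User-agent: *\nAllow: /\nDisallow: /admin/\n"
)

print(is_allowed(OLD, "GPTBot", "/pricing"))             # False: caught by '*'
print(is_allowed(FIXED, "GPTBot", "/pricing"))           # True: own group wins
print(is_allowed(FIXED, "ClaudeBot", "/pricing"))        # False: no own group
print(is_allowed(GENERATED, "GPTBot", "/admin/"))        # True: '*' rules not inherited
print(is_allowed(GENERATED, "SomeOtherBot", "/admin/"))  # False
```

The fourth result is worth noting: a bot with its own group ignores the default group entirely, so any path restrictions meant for every crawler have to be repeated in each named group.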
The 8 AI user-agents worth allowing
- GPTBot: OpenAI’s crawler for collecting training data. ChatGPT’s browsing and search use separate user-agents (ChatGPT-User and OAI-SearchBot).
- Google-Extended: Google’s control token for Gemini training and grounding. It is not a separate crawler, and it is independent of Googlebot and of Search ranking.
- PerplexityBot: Perplexity’s crawler. Critical for being cited as a source in Perplexity answers.
- ClaudeBot: Anthropic’s web crawler for Claude.
- anthropic-ai: Older Anthropic user-agent, which may still appear on some Claude surfaces.
- CCBot: Common Crawl. Many AI training corpora are built from Common Crawl data.
- Applebot-Extended: Apple’s control token for whether content crawled by Applebot may be used to train Apple’s AI models. It does not crawl pages itself.
- Bytespider: ByteDance crawler feeding Doubao and TikTok Search. Optional: large APAC footprint but not always wanted.
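The per-bot choices in this list map directly onto generated rules. The following is a minimal sketch of how such a generator could work, not Aapta GEO's actual implementation; `build_robots_txt`, its parameters and the default paths are hypothetical names chosen to mirror the sample output on this page.

```python
def build_robots_txt(bots, sitemap_url, disallowed_paths=("/admin/", "/private/")):
    """Build robots.txt text.

    bots maps a user-agent token to True (allow) or False (block).
    """
    lines = []
    for token, allowed in bots.items():
        lines.append(f"User-agent: {token}")
        lines.append("Allow: /" if allowed else "Disallow: /")
        lines.append("")  # blank line between groups
    # Default group for every crawler not named above.
    lines.append("User-agent: *")
    lines.append("Allow: /")
    lines.extend(f"Disallow: {path}" for path in disallowed_paths)
    lines.append("")
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


choices = {
    "GPTBot": True,
    "Google-Extended": True,
    "PerplexityBot": True,
    "ClaudeBot": True,
    "anthropic-ai": True,
    "CCBot": True,
    "Applebot-Extended": True,
    "Bytespider": False,  # optional bot, blocked here
}

print(build_robots_txt(choices, "https://yoursite.com/sitemap.xml"))
```

Flipping a value in `choices` switches that bot between Allow: / and Disallow: / without touching the other groups.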
FAQ
Why should I allow GPTBot, ClaudeBot and PerplexityBot?
These crawlers fetch the pages that AI engines can later draw on and cite in answers. If they're blocked, ChatGPT, Claude, Perplexity and others are far less likely to reference you. Allowing them is the first step in any GEO (Generative Engine Optimization) strategy.
What is Google-Extended and is it different from Googlebot?
Google-Extended is a control token that tells Google whether content from your site may be used to train and ground its Gemini models. It has no crawler of its own; fetching is still done by Google's regular crawlers. It is separate from Googlebot, which handles Search, including AI Overviews. Blocking Googlebot keeps you out of search; blocking Google-Extended opts you out of Gemini training and grounding without affecting search ranking. You can allow one and block the other.
Can I block AI bots while still ranking in classic Google search?
Yes. Block Google-Extended in robots.txt to opt out of Gemini training and grounding while keeping Googlebot allowed for the normal search results page. AI Overviews are part of Google Search, so they follow your Googlebot rules rather than Google-Extended. Most other AI crawlers (GPTBot, ClaudeBot, PerplexityBot) are independent of search ranking.
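As a quick sanity check, the rules for this setup can be tested locally with Python's standard-library `urllib.robotparser`. This is only a sketch: the two-group file and the /pricing URL are made-up examples, and the check shows what the rules say, not how Google applies them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, keep Googlebot.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://yoursite.com/pricing"
googlebot_ok = parser.can_fetch("Googlebot", url)
extended_ok = parser.can_fetch("Google-Extended", url)

print(googlebot_ok, extended_ok)  # True False
```

One caveat: `urllib.robotparser` applies the first matching rule within a group rather than the longest one, so it suits simple files like this better than files that mix Allow and Disallow in the same group.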
Does adding these rules guarantee my site gets cited?
No — robots.txt only controls access. Getting cited also requires good structured data, AI-friendly content, authority signals and Core Web Vitals. The full Aapta GEO scan checks all 6 categories. This generator only solves the access layer.
Want a full AI-Readiness audit?
This tool fixes one signal. The full scan checks 40+ signals across 6 categories: bot access, structured data, content answerability, authority, technical foundations, and emerging AI standards.
Run the free AI-Readiness scan →