The Google-Extended myth: it doesn't opt you out of AI Overviews
Almost every SEO post gets this wrong. Google-Extended blocks Gemini training data, not AI Overview citation. The actual AI Overview opt-out is nosnippet - and it kills your regular SERP snippet too. Here's the breakdown.
If you’ve blocked Google-Extended in robots.txt to keep your content out of AI Overviews, here’s the bad news: you’re still being cited. The signal you set does something — just not the thing you wanted.
What Google-Extended actually does
Per Google’s own documentation, Google-Extended is a robots.txt user-agent token. Disallowing it removes your content from:
- Gemini training data. Future versions of Gemini won’t be trained on you.
- Gemini Apps (the chat product) generative responses.
It does not remove your content from:
- The regular Google Search index (that’s Googlebot).
- AI Overviews in Search results.
- Sitelinks, knowledge panels, or any other search surface.
AI Overviews are powered by the regular Googlebot index, not the Google-Extended training corpus. The model that generates the overview text draws from web search results, the same ones that rank in the regular SERP.
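To see how narrowly the token is scoped, here is a minimal sketch that feeds a two-line robots.txt to Python's standard-library urllib.robotparser. It only demonstrates generic robots.txt user-agent matching, not how Google assembles AI Overviews, and the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt most "AI opt-out" guides recommend.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""

PAGE_URL = "https://example.com/some-post"  # placeholder URL

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The rule applies only to the Google-Extended token.
extended_allowed = rp.can_fetch("Google-Extended", PAGE_URL)
# Googlebot has no matching group, so it is still allowed to crawl.
googlebot_allowed = rp.can_fetch("Googlebot", PAGE_URL)

print("Google-Extended allowed:", extended_allowed)  # False
print("Googlebot allowed:", googlebot_allowed)       # True
```

Googlebot is untouched by the rule, which is why the page stays in the index that AI Overviews draw from.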
So what is the AI Overview opt-out?
There’s exactly one signal that removes a page from AI Overview citation:
<meta name="robots" content="nosnippet">
Or its equivalent header X-Robots-Tag: nosnippet.
This tells Google: no text snippet from this page may be shown anywhere. AI Overview generation can’t quote you. Done.
The catch: it also removes your regular SERP snippet. Searchers see just a title and URL. Click-through rate drops by 20-40% depending on industry.
There is no signal in 2026 that opts out of AI Overviews while keeping regular snippets.
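If you want to confirm which pages actually carry the signal, a rough check might look like the sketch below. The helper name has_nosnippet is hypothetical, it uses only the standard-library html.parser, and it deliberately ignores the user-agent-prefixed header form (for example "googlebot: nosnippet"), so treat it as a starting point rather than a complete audit.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").strip().lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_nosnippet(html, headers=None):
    """Return True if the page HTML or an X-Robots-Tag header sets nosnippet."""
    parser = RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(p.strip().lower() for p in value.split(",") if p.strip())
    return "nosnippet" in directives


page = '<html><head><meta name="robots" content="nosnippet"></head></html>'
print(has_nosnippet(page))
print(has_nosnippet("<html></html>", {"X-Robots-Tag": "nosnippet"}))
```

Both calls report that the signal is present, once via the meta tag and once via the header.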
The four-cell decision matrix
| Goal | What to set |
|---|---|
| Stay in regular Search, opt out of Gemini training, accept AI Overview citation | User-agent: Google-Extended with Disallow: / in robots.txt |
| Stay in regular Search, fully opt out of AI Overview citation | <meta name="robots" content="nosnippet"> (loses regular snippet) |
| Stay everywhere, full participation | Do nothing |
| Out of everything | User-agent: Googlebot with Disallow: / in robots.txt (nuclear; tanks Search ranking) |
Most sites start with the third option (full participation), later decide they want the second (AI Overview opt-out), set the first (training opt-out) by mistake, and then wonder why their text still shows up in AI Overviews.
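The matrix can also be read in reverse: given the signals a site has actually set, which cell is it in? The sketch below encodes this article's table as a small function. SiteSignals and describe are hypothetical names, and the outcomes are the ones claimed in the table above, not something queried from Google.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSignals:
    blocks_googlebot: bool        # User-agent: Googlebot / Disallow: /
    blocks_google_extended: bool  # User-agent: Google-Extended / Disallow: /
    has_nosnippet: bool           # meta robots or X-Robots-Tag: nosnippet


def describe(signals):
    """Map a site's signals to the outcome described in the decision matrix."""
    if signals.blocks_googlebot:
        return "out of everything"
    training = (
        "opted out of Gemini training"
        if signals.blocks_google_extended
        else "available for Gemini training"
    )
    overview = (
        "not citable in AI Overviews (and no regular snippet)"
        if signals.has_nosnippet
        else "citable in AI Overviews"
    )
    return f"in Search, {training}, {overview}"


# The common mistake: only the training opt-out is set.
print(describe(SiteSignals(False, True, False)))
```

The example models the misconfiguration described above: the site has opted out of training but is still reported as citable in AI Overviews.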
What about max-snippet:0?
Google has historically said max-snippet:0 is equivalent to nosnippet. It removes the snippet entirely. Same behavior, same trade-off.
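If you audit pages for snippet suppression, it is worth treating both spellings as the same thing. A minimal sketch, assuming you already have the content value of the robots meta tag (or the X-Robots-Tag header) as a string; suppresses_snippet is a hypothetical helper name.

```python
def suppresses_snippet(content):
    """Return True if a robots directive string removes the text snippet.

    Treats nosnippet and max-snippet:0 as equivalent, per the claim above.
    """
    for raw in content.split(","):
        directive = raw.strip().lower()
        if directive == "nosnippet":
            return True
        name, sep, value = directive.partition(":")
        if sep and name.strip() == "max-snippet" and value.strip() == "0":
            return True
    return False


for example in ("nosnippet", "max-snippet:0", "max-snippet:50", "index, follow"):
    print(example, "->", suppresses_snippet(example))
```

Any other max-snippet value, such as a positive character limit, is left alone here because it shortens the snippet rather than removing it.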
Why this matters more in 2026
AI Overview citation has become a real traffic source: not as big as classic SERP click-through, but measurable. If you’re cited and the user clicks the citation, you get the visit. Pick the wrong signal and you lose either way: blocking Google-Extended leaves you in AI Overviews, while setting nosnippet without weighing the trade-off removes your regular snippet and, with it, the 20-40% of CTR noted above.
The cost of the misconfiguration is significant. The fact that most SEO content perpetuates the myth is a problem worth solving.
The honest checklist
- Decide what you actually want. Most sites want full participation.
- If you want training opt-out:
User-agent: Google-Extended + Disallow: / in robots.txt.
- If you want AI Overview opt-out: nosnippet meta robots, but accept the regular snippet loss.
- Don’t confuse the two.
- Re-audit. The Metaspry AI crawler signal panel surfaces this distinction explicitly.
The bigger picture
Google deliberately separated training and inclusion as two different controls. The reason isn’t malicious — it’s that training and citation are genuinely different things, with different commercial calculus per publisher. But the result is a control surface that’s easy to misuse.
If you publish content for a living and you’ve been touching robots.txt thinking you’re managing AI Overview citation, audit your setup this week. The chances you got the signal you wanted are roughly 50-50.
Further reading
- Metaspry docs: AI crawler signals
- llms.txt in 2026: a cargo cult most tools are afraid to call out
- Google Search Central docs on Google-Extended and nosnippet