Googlebot Dominates AI Crawler Traffic: Key Insights from Cloudflare’s Yearly Analysis

Googlebot Dominates AI Crawler Traffic, Cloudflare Report Reveals

Googlebot reached roughly 200 times as many web pages as PerplexityBot, according to Cloudflare's sixth annual Year in Review. The report, which analyzes data from Cloudflare's network spanning 330+ cities across 125 countries, also found that global internet traffic grew 19% year-over-year in 2025.

The findings highlight Google's dual-purpose approach to web crawling, in which a single crawler serves both search indexing and AI training. This creates a dilemma for publishers who want to block AI training crawlers but can't do so without risking their search visibility. Meanwhile, civil society organizations became the most-attacked sector for the first time.

AI Crawler Landscape Reveals Significant Disparities

Cloudflare's analysis of successful requests for HTML content during October and November 2025 showed striking differences in crawler reach. Googlebot accessed 11.6% of unique web pages, more than triple the 3.6% reached by OpenAI's GPTBot. PerplexityBot lagged far behind at just 0.06% of pages.

Bingbot secured third place at 2.6%, while Meta-ExternalAgent and ClaudeBot tied at 2.4% each. The report emphasizes the strategic advantage Google holds by using the same crawler for both search indexing and AI training.
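
The reach figures above can be compared directly with a little arithmetic. The following is a minimal sketch that uses only the percentages quoted from the report; the function name reach_multiple is illustrative. Because the published percentages are rounded, the resulting multiples are approximate.

```python
# Share of unique web pages reached by each crawler, in percent,
# as quoted from the report for October and November 2025.
reach = {
    "Googlebot": 11.6,
    "GPTBot": 3.6,
    "Bingbot": 2.6,
    "Meta-ExternalAgent": 2.4,
    "ClaudeBot": 2.4,
    "PerplexityBot": 0.06,
}


def reach_multiple(crawler: str, baseline: str) -> float:
    """Return how many times more pages `crawler` reached than `baseline`."""
    return reach[crawler] / reach[baseline]


if __name__ == "__main__":
    for name in reach:
        if name != "Googlebot":
            multiple = reach_multiple("Googlebot", name)
            print(f"Googlebot vs {name}: {multiple:.1f}x")
```

With these rounded inputs, Googlebot's reach works out to a little over three times GPTBot's and somewhat under 200 times PerplexityBot's.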

"Because Googlebot is used to crawl content for both search indexing and AI model training, and because of Google's long-established dominance in search, website operators are essentially unable to block Googlebot's AI training without risking search discoverability," Cloudflare noted in the report.

Throughout 2025, AI bots (excluding Googlebot) accounted for an average of 4.2% of HTML requests across Cloudflare's customer base. This figure fluctuated between 2.4% in early April and 6.4% in late June. Notably, Googlebot alone was responsible for 4.5% of HTML requests, slightly exceeding the combined total of all other AI bots.

The distribution of HTML traffic showed interesting shifts throughout the year. Human-generated traffic started 2025 at seven percentage points below non-AI bot traffic. By September, human traffic occasionally surpassed non-AI bot traffic. As of December 2, humans generated 47% of HTML requests while non-AI bots generated 44%.

This disparity in crawler reach matters for traffic strategy and search engine optimization, particularly for businesses that rely on organic search visibility for customer acquisition.

Crawl-to-Refer Ratios Show AI Platforms Take More Than They Give

One of the most revealing metrics in Cloudflare's report is the crawl-to-refer ratio, which compares how many pages an AI or search platform crawls with how many visitors it sends back to those sites. A ratio of 100:1 means 100 pages crawled for each referral. A higher ratio indicates heavy crawling with minimal referral traffic.

Anthropic had the highest ratios among AI platforms, ranging from approximately 25,000:1 to 100,000:1 during the second half of the year, once earlier volatility had settled. In other words, for every 25,000 to 100,000 pages Anthropic crawled, it referred a visitor back to a source site only once.

OpenAI's ratios reached as high as 3,700:1 in March, though they showed some decline later in the year as ChatGPT search usage increased. Perplexity maintained the lowest ratios among leading AI platforms, generally below 400:1 and under 200:1 from September onward.

For comparison, Google's search crawl-to-refer ratio remained much lower, typically between 3:1 and 30:1 throughout the year, demonstrating a more balanced relationship between crawling and referring traffic.
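
A site owner could estimate a similar ratio from their own logs. The sketch below assumes you have already counted crawler requests and referred visits per platform. The function names and the sample counts are hypothetical and are not taken from the report, and Cloudflare's own methodology may differ.

```python
def crawl_to_refer(crawl_requests: int, referral_visits: int) -> float:
    """Pages crawled per referred visit; higher means less traffic returned."""
    if referral_visits <= 0:
        # No referrals at all: the ratio is unbounded.
        return float("inf")
    return crawl_requests / referral_visits


def format_ratio(ratio: float) -> str:
    """Render a ratio in the N:1 form used in the article."""
    if ratio == float("inf"):
        return "no referrals"
    return f"{ratio:,.0f}:1"


if __name__ == "__main__":
    # Hypothetical (crawl requests, referred visits) counts for illustration.
    sample = {
        "platform_a": (2_500_000, 100),
        "platform_b": (40_000, 200),
        "search_engine": (30_000, 3_000),
    }
    for name, (crawls, referrals) in sample.items():
        print(name, format_ratio(crawl_to_refer(crawls, referrals)))
```

Treating zero referrals as an unbounded ratio, rather than raising an error, keeps the report readable for platforms that crawl but never send a visitor.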

The report also identified rapid growth in "user-action" crawling, in which bots visit sites in response to questions users pose to chatbots. This category saw a more than 15-fold increase from January through early December, with OpenAI's ChatGPT-User bot showing a clear weekly usage pattern starting in mid-February.

These findings should prompt website owners to carefully consider how artificial intelligence systems impact their business operations and content strategy, especially as AI platforms continue to consume vast amounts of web content while returning minimal traffic.

AI Crawlers Most Commonly Blocked in Robots.txt

Cloudflare's analysis of robots.txt files across nearly 3,900 of the top 10,000 domains revealed that AI crawlers were the most frequently blocked user agents. GPTBot, ClaudeBot, and CCBot had the highest number of full disallow directives, indicating site owners' preference to block these crawlers entirely.

By contrast, Googlebot and Bingbot showed a different pattern, with their disallow directives primarily focused on partial blocks. This suggests site owners are mainly restricting these crawlers from accessing login endpoints and non-content areas rather than blocking them completely.
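
The difference between a full disallow and a partial one can be checked with Python's standard urllib.robotparser module. The robots.txt content below is a hypothetical example, not one drawn from the report. The /login and /admin/ paths and the example.com host are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: full blocks for three AI crawlers,
# partial blocks for Googlebot, and everything allowed for other agents.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Googlebot
Disallow: /login
Disallow: /admin/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    return parser.can_fetch(agent, f"https://example.com{path}")


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "CCBot", "Googlebot"):
        for path in ("/articles/ai", "/login"):
            print(f"{agent:10} {path:14} allowed={allowed(agent, path)}")
```

Note that robots.txt is advisory: it expresses the site owner's preference, and compliance depends on the crawler.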

The selective blocking approach demonstrates how publishers are attempting to maintain search visibility while limiting AI training access, a challenging balancing act given Google's dual-purpose crawling strategy. According to Google's documentation on crawlers, there's no official way to allow Googlebot to crawl for search while preventing it from using content for AI training.

Security Landscape: Civil Society Organizations Now Most Targeted

For the first time in Cloudflare's reporting history, organizations in the "People and Society" vertical became the most targeted by attacks. This category, which includes religious institutions, nonprofits, civic organizations, and libraries, received 4.4% of global mitigated traffic, up from under 2% at the start of the year.

Attack share for this sector jumped dramatically to over 17% in late March and peaked at 23.2% in early July. Many of these organizations receive protection through Cloudflare's Project Galileo.

The gambling and games vertical, previously the most-attacked sector in 2024, saw its share drop by more than half to 2.6%.

This alarming trend highlights the increasing importance of implementing comprehensive website security measures for all organizations, particularly those in the nonprofit and civic sectors that may have previously considered themselves at lower risk.

Additional Key Findings

Cloudflare's comprehensive report included several other significant findings:

  • Global internet traffic grew 19% year-over-year, with growth remaining relatively flat through mid-April before accelerating after mid-August
  • Post-quantum encryption now secures 52% of human traffic to Cloudflare, nearly double the 29% share at the start of the year
  • ChatGPT maintained its position as the top generative AI service globally
  • Google Gemini, Windsurf AI, Grok/xAI, and DeepSeek emerged as new entrants to the top 10 AI services
  • Starlink traffic doubled in 2025, with service launching in more than 20 new countries
  • Nearly half of the 174 major internet outages observed globally resulted from government-directed shutdowns
  • Cable cut outages decreased by nearly 50%, while power failure outages doubled
  • European countries dominated internet quality metrics, with Spain topping the list for overall internet quality and average download speeds exceeding 300 Mbps

How This Information Can Help You

The findings from Cloudflare's report have several practical implications for website owners, publishers, and digital marketers:

  1. Reevaluate your crawler policies: With AI crawlers being the most blocked user agents, consider whether your current robots.txt directives align with your content strategy and business goals.

  2. Understand the Google dilemma: Recognize that blocking Googlebot for AI training means potentially sacrificing search visibility. Consider whether the trade-off makes sense for your content.

  3. Monitor referral traffic from AI services: The wide variation in crawl-to-refer ratios suggests that some AI platforms may provide more value in terms of traffic than others. Track which services send users to your site.

  4. Enhance security for nonprofit websites: If you manage websites for civil society organizations, be aware of the increased targeting and consider additional security measures.

  5. Prepare for continued traffic growth: With global internet traffic growing 19% year-over-year, ensure your infrastructure can scale to handle increasing demands.
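
For point 3, one starting point is to tally visits by referrer. This is a minimal sketch that assumes you can export referrer URLs from your analytics or server logs. The hostname-to-service mapping is an assumption to verify against your own data, and some AI services may send no referrer at all.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlparse

# Assumed referrer hostnames for a few AI services; adjust to match your logs.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}


def classify_referrer(referrer: str) -> Optional[str]:
    """Map a referrer URL to an AI service label, or None if unrecognized."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    for known, label in AI_REFERRER_HOSTS.items():
        # Match the host itself or any subdomain of it.
        if host == known or host.endswith("." + known):
            return label
    return None


def count_ai_referrals(referrers: Iterable[str]) -> Counter:
    """Count referred visits per recognized AI service."""
    counts: Counter = Counter()
    for referrer in referrers:
        label = classify_referrer(referrer)
        if label is not None:
            counts[label] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "https://chatgpt.com/",
        "https://www.perplexity.ai/search?q=example",
        "https://www.google.com/",
        "https://chatgpt.com/c/abc",
        "",
    ]
    print(count_ai_referrals(sample))
```

Dividing your crawler request counts for each platform by these referral counts gives a site-level estimate of the crawl-to-refer ratio discussed earlier.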

As AI continues to reshape the digital landscape, Cloudflare expects these metrics to evolve further. The company has added several new AI-related datasets to this year's report that weren't available in previous editions, signaling the growing importance of understanding AI's impact on web traffic and content consumption.
