Cloudflare AI Crawler Rules Can Block Googlebot
Cloudflare announced on September 15th that all websites can now manage AI crawlers, which it categorizes as Search, Agent, or Training. The update gives site owners granular control over how AI bots interact with their content, and the default settings are intended as a baseline level of protection and management for AI-driven traffic.
Crucially, the new rules can block Googlebot if a site owner chooses to disallow training data collection. A website configured to keep its content out of AI model training may therefore inadvertently block Google's primary search crawler, an overlap that reflects how web crawling and AI data acquisition are evolving.
The implications of this change are significant for website owners concerned about their content being used for AI model training. By offering explicit controls, Cloudflare empowers users to define their stance on data usage for AI development. However, it also necessitates careful configuration to avoid unintended consequences, such as disrupting organic search visibility.
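One way to guard against that kind of unintended consequence is to check a crawler policy before deploying it. The sketch below is only an illustration under stated assumptions: it assumes the policy is expressed as a robots.txt file, which the reporting does not confirm for Cloudflare's rules, and it uses a hypothetical training crawler name, `ExampleTrainingBot`, along with a hypothetical helper, `allowed_agents`. It relies on Python's standard `urllib.robotparser` and would not detect blocking applied at another layer, such as a CDN or firewall.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: disallow one training crawler, allow everyone else.
# "ExampleTrainingBot" is an invented name used purely for illustration.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url="https://example.com/"):
    """Return a dict mapping each user agent to whether it may fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(ROBOTS_TXT, ["Googlebot", "ExampleTrainingBot"])
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running a check like this after each policy change makes it easier to notice when a rule aimed at training crawlers also catches a search crawler, at least for the robots.txt layer.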
Search Engine Journal reported on this development, emphasizing the potential impact on search engine optimization (SEO) strategies. The ability to block specific AI crawlers, or to categorize them in ways that might lead to blocking, introduces a new layer of complexity for website administrators and SEO professionals. The default settings are intended to be a starting point, but customization will be key.
Original source: Search Engine Journal