Search Engine Journal · 2 min read

More News Sites Default To Blocking AI Crawlers via @sejournal, @MattGSouthern

Reuters and Time began blocking AI web crawlers by default this week, joining a growing number of news organizations that are tightening control over how AI models access and use their content. Both publishers now admit only explicitly approved crawlers through curated allowlists, a move meant to prevent unauthorized scraping of their articles and data.

The shift reflects publishers' concerns about intellectual property and lost revenue when their work is used without compensation or attribution. For AI companies that train models on large volumes of news content, decisions like these by major outlets suggest more friction ahead.

An allowlist gives publishers granular control: they can grant access selectively to the AI developers or research institutions they consider legitimate or beneficial. That contrasts with the open, default-access policies common in the earlier days of web crawling.
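To make the default-block, allowlist idea concrete, here is a minimal sketch of one way it could be expressed. The article does not say how Reuters or Time enforce their policies, so the use of robots.txt is an assumption, and the crawler names (`ExampleApprovedBot`, `SomeOtherCrawler`) and the `news.example.com` URL are hypothetical. The check uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one approved crawler is allowed everywhere,
# and every other user agent is disallowed by default.
ROBOTS_TXT = """\
User-agent: ExampleApprovedBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent, url="https://news.example.com/article"):
    """Return True if the sample policy lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleApprovedBot", "SomeOtherCrawler"):
        print(agent, "allowed" if is_allowed(agent) else "blocked")
```

Note that robots.txt is advisory: it only restrains crawlers that choose to honor it, so a publisher that wants hard enforcement would need server-side or network-level checks as well.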
