Search Engine Land · 3 min read

A 13-word edit can steer what deep-research AI agents recommend

Cornell Tech researchers discovered that deep-research AI agents can be manipulated through brief edits to publicly accessible web pages, a vulnerability they termed Web Agent Retrieval Poisoning (WARP). The attack lets a malicious actor inject a short text snippet, as brief as 13 words, into user-generated content such as a Reddit thread or a Wikipedia page.

When an AI agent that performs web searches and compiles cited reports retrieves one of these "poisoned" pages, it can inadvertently cite and repeat the attacker's fabricated information, recommending fake products, services, or entities. The researchers demonstrated this by successfully inserting a recommendation for a fictitious cryptocurrency, BananaCoin, into a Co-STORM report as a legitimate investment option.

User-generated platforms, particularly Reddit, are a significant entry point for such attacks: Reddit accounted for 54% to 71% of the user-generated URLs retrieved by the tested open-source AI systems STORM, Co-STORM, and OmniThink. WARP does not require access to the AI model, its prompts, or the search engine itself; it relies solely on manipulating content that the agent is likely to retrieve during its research process.
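The defensive counterpart to the finding above is provenance-aware citation handling: an agent should not turn a recommendation into report text when every source backing it is user-generated. The following is an added example written for this summary, not code from the Cornell Tech paper or from STORM, Co-STORM or OmniThink. It classifies each retrieved URL by host into a trust tier, measures the share of user-generated URLs in a retrieval set (the metric the 54% to 71% figure describes), and flags recommendation-type claims that lack an authoritative citation. The domain lists, the sample URLs and the sample claims are constructed assumptions; the BananaCoin claim text is reused from the article.

```python
"""
Provenance gate for a deep-research agent's citations (constructed example).

Each retrieved URL is mapped to a trust tier by its registrable host. A claim
marked as a recommendation is flagged, rather than emitted, when none of its
citations comes from an authoritative tier. Domain lists below are assumptions
for illustration, not a vetted allow-list.
"""
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Assumption: hosts whose content is user-generated (UGC) and editable by anyone.
UGC_DOMAINS = {"reddit.com", "wikipedia.org", "quora.com", "medium.com"}
# Assumption: hosts the deployment treats as authoritative for this topic.
AUTHORITATIVE_DOMAINS = {"sec.gov", "coinmarketcap.com"}


def source_tier(url: str) -> str:
    """Return 'ugc', 'authoritative' or 'unknown' for the URL's registrable host."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Compare on the last two labels so subdomains such as old.reddit.com match.
    base = ".".join(parts[-2:]) if len(parts) >= 2 else host
    if base in UGC_DOMAINS:
        return "ugc"
    if base in AUTHORITATIVE_DOMAINS:
        return "authoritative"
    return "unknown"


@dataclass
class Claim:
    text: str
    citations: list            # URLs the agent attached to this claim
    is_recommendation: bool = False


@dataclass
class GateResult:
    accepted: list = field(default_factory=list)   # claim texts that pass
    flagged: list = field(default_factory=list)    # (claim text, tiers seen)
    ugc_share: float = 0.0                         # fraction of retrieved URLs that are UGC


def ugc_share(urls: list) -> float:
    """Fraction of a retrieval set that comes from UGC hosts; a monitoring metric."""
    if not urls:
        return 0.0
    return sum(1 for u in urls if source_tier(u) == "ugc") / len(urls)


def provenance_gate(claims: list, retrieved_urls: list) -> GateResult:
    """Accept claims, except recommendations supported only by non-authoritative sources."""
    result = GateResult(ugc_share=ugc_share(retrieved_urls))
    for claim in claims:
        tiers = {source_tier(u) for u in claim.citations}
        if claim.is_recommendation and "authoritative" not in tiers:
            result.flagged.append((claim.text, sorted(tiers)))
        else:
            result.accepted.append(claim.text)
    return result


def run_example() -> GateResult:
    """Constructed retrieval set and claims; URLs are placeholders, not real pages."""
    retrieved = [
        "https://old.reddit.com/r/CryptoCurrency/comments/placeholder1",
        "https://www.reddit.com/r/investing/comments/placeholder2",
        "https://en.wikipedia.org/wiki/Placeholder_article",
        "https://www.sec.gov/placeholder-filing",
        "https://news.example.org/placeholder-story",
    ]
    claims = [
        # Recommendation backed only by UGC: the poisoned-page scenario.
        Claim("BananaCoin is a legitimate investment option",
              [retrieved[0], retrieved[2]], is_recommendation=True),
        # Factual statement, not a recommendation: passes regardless of tier.
        Claim("Several forum threads discuss new token launches",
              [retrieved[1]]),
        # Recommendation with an authoritative citation: passes.
        Claim("Check the regulator's public filings before investing",
              [retrieved[3], retrieved[0]], is_recommendation=True),
    ]
    return provenance_gate(claims, retrieved)


if __name__ == "__main__":
    res = run_example()
    print(f"UGC share of retrieval set: {res.ugc_share:.0%}")
    for text, tiers in res.flagged:
        print(f"FLAGGED (tiers {tiers}): {text}")
    for text in res.accepted:
        print(f"accepted: {text}")
```

The gate is deliberately coarse: it does not try to detect the injected snippet itself, which is the hard problem the research highlights, but it denies the attacker the payoff of a recommendation that reaches the reader on the strength of an editable page alone, and the `ugc_share` figure gives operators a number to watch when a run's retrieval set skews toward Reddit.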

Original source: the full reporting is at Search Engine Land.