MarkTechPost (3 min read)

WebBrain Launches Open-Source AI Browser Agent

WebBrain, a new free and open-source browser agent, launched this week for Chrome and Firefox. Developed by Emre Sokullu and released under the MIT license, the agent reads web pages, extracts data, and automates multi-step tasks directly in the browser. It can run entirely on a local AI model, in which case no page data leaves the user's machine, a local-first design aimed at privacy and security. Users who need more advanced capabilities can also connect WebBrain to cloud-based AI APIs.

The agent runs in the browser's side panel. In Chrome it uses Manifest V3 and the sidePanel API; in Firefox it uses Manifest V2 and sidebar_action. Each browser tab keeps its own conversation history. The extension works within the user's existing authenticated sessions, so it acts with the accounts the user is already logged in to, and it neither stores data externally nor collects telemetry. WebBrain is available in English, Spanish, French, Turkish, and Chinese, and detects the browser language automatically on first launch.
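As a rough illustration of the last two points, the sketch below maps a browser language tag to one of the five listed languages and keeps a separate message list per tab. It is not taken from WebBrain's source: the locale codes, `pickLocale`, `appendMessage`, and the `Message` shape are all hypothetical names chosen for this example.

```typescript
// The five languages named in the article; the two-letter codes are assumed.
const SUPPORTED_LOCALES = ["en", "es", "fr", "tr", "zh"] as const;
type Locale = (typeof SUPPORTED_LOCALES)[number];

// Hypothetical helper: reduce a tag such as "fr-CA" to its primary subtag
// and fall back to English when the language is not supported.
function pickLocale(browserLanguage: string, fallback: Locale = "en"): Locale {
  const primary = browserLanguage.toLowerCase().split(/[-_]/)[0];
  const match = SUPPORTED_LOCALES.find((code) => code === primary);
  return match ?? fallback;
}

// Hypothetical per-tab store: each tab ID owns an independent message list.
type Message = { role: "user" | "assistant"; text: string };
const historyByTab = new Map<number, Message[]>();

function appendMessage(tabId: number, message: Message): Message[] {
  const history = historyByTab.get(tabId) ?? [];
  history.push(message);
  historyByTab.set(tabId, history);
  return history;
}
```

Inside an extension, the input to `pickLocale` would typically come from `navigator.language` or `chrome.i18n.getUILanguage()`; which of these WebBrain actually uses is not stated in the article.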

WebBrain offers two modes, Ask and Act. Ask mode is read-only: it reads pages through standard content scripts and cannot alter them. Act mode can click, type, scroll, navigate, and run workflows. To do so, it drives the page over the Chrome DevTools Protocol (CDP) through the chrome.debugger API, which generates trusted input events that modern websites recognize. This route also reaches cross-origin iframes and shadow DOM elements that content scripts cannot. The debugger is attached per tab, and only when an action requires it; Chrome displays a banner while WebBrain is debugging.
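The debugger-based click path can be sketched roughly as follows. This is an illustration under stated assumptions, not WebBrain's implementation: `DebuggerLike`, `buildClickCommands`, and `clickAt` are hypothetical names, the small interface is assumed to match the promise-based `chrome.debugger` methods available to Manifest V3 extensions, and detaching right after each action is a choice made for this example. `Input.dispatchMouseEvent` is a standard CDP method.

```typescript
// Minimal shape of the calls used here; chrome.debugger's promise-based
// attach, sendCommand, and detach are assumed to fit it.
interface DebuggerLike {
  attach(target: { tabId: number }, requiredVersion: string): Promise<void>;
  sendCommand(target: { tabId: number }, method: string, params?: object): Promise<unknown>;
  detach(target: { tabId: number }): Promise<void>;
}

interface CdpCommand {
  method: string;
  params: Record<string, unknown>;
}

// A click is a press followed by a release at the same viewport coordinates.
function buildClickCommands(x: number, y: number): CdpCommand[] {
  const base = { x, y, button: "left", clickCount: 1 };
  return [
    { method: "Input.dispatchMouseEvent", params: { type: "mousePressed", ...base } },
    { method: "Input.dispatchMouseEvent", params: { type: "mouseReleased", ...base } },
  ];
}

// Hypothetical helper: attach for one action, send the events, then detach
// (detaching immediately is an assumption of this sketch).
async function clickAt(dbg: DebuggerLike, tabId: number, x: number, y: number): Promise<void> {
  const target = { tabId };
  await dbg.attach(target, "1.3"); // CDP protocol version string
  try {
    for (const { method, params } of buildClickCommands(x, y)) {
      await dbg.sendCommand(target, method, params);
    }
  } finally {
    await dbg.detach(target);
  }
}
```

In an extension with the `debugger` permission, a call might look like `clickAt(chrome.debugger, tabId, 120, 240)`. Keeping the command list in a pure function makes the event sequence easy to inspect without a browser.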

Act mode's capabilities differ between browsers, however. Chrome's debugger API provides robust capabilities, but Firefox lacks an equivalent Chrome DevTools Protocol (CDP) implementation, which makes Act mode there "meaningfully weaker." For predictable output, the agent uses fixed temperature settings: 0.15 for Act mode, 0.3 for Ask mode, and 0 for dedicated vision calls that describe screenshots. The design also addresses a security concern inherent in browser agents, which can be vulnerable to prompt injection attacks. WebBrain's architecture is intended to mitigate this risk by carefully controlling how the agent interacts with web pages.
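The temperature settings reported here could be expressed as a small lookup, sketched below. Only the three numeric values come from the article; the `Task` names, `TEMPERATURE`, `buildRequest`, and the request field names (modeled on common chat-completion style APIs) are assumptions for illustration.

```typescript
type Task = "act" | "ask" | "vision";

// Values as reported in the article: lower temperature means less random output.
const TEMPERATURE: Record<Task, number> = {
  act: 0.15,
  ask: 0.3,
  vision: 0,
};

// Hypothetical request builder; the field names are an assumption, not WebBrain's API.
function buildRequest(task: Task, prompt: string) {
  return {
    temperature: TEMPERATURE[task],
    messages: [{ role: "user", content: prompt }],
  };
}
```

Keeping the values in one table means a mode can never silently fall back to a provider's default temperature.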
