A0 browser use extension #1069

Open
TerminallyLazy wants to merge 13 commits into agent0ai:development from TerminallyLazy:a0-browser-use-extension
@TerminallyLazy
Contributor

Summary

  • Adds a complete plugins/browser_use/ plugin that implements browser automation for Agent Zero's upcoming plugin system (PR #998, "feat: Unified Plugin System & Memory Plugin PoC")
  • Two tools: browser_step (13 deterministic actions: open, click, type, scroll, etc.) and browser_auto (autonomous browser-use Agent wrapper with configurable max_steps, vision, flash_mode)
  • Shared SessionManager with double-checked locking manages a single headed Chromium session per agent context
  • WebUI browser viewer modal with screenshot polling, URL bar, agent-busy overlay, and navigation
  • CDP proxy with whitelisted methods (screencast + input only) for future real-time streaming
  • Settings tab for browser mode, max steps, vision, flash mode, screencast quality, and window size
  • Sidebar globe button, agent_init cleanup extension, and 3 API handlers (connect, interact, settings)
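
The CDP proxy bullet above describes a whitelist limited to screencast and input methods. A minimal sketch of that kind of filter follows. The method names are real Chrome DevTools Protocol methods, but the set itself, `ALLOWED_CDP_METHODS`, and `filter_cdp_message` are assumptions for illustration; the actual contents of cdp_proxy.py are not shown in this PR description.

```python
import json
from typing import Optional

# Assumed whitelist: real CDP method names, but the plugin's actual list
# is not shown in this PR, so treat this set as illustrative only.
ALLOWED_CDP_METHODS = frozenset({
    "Page.startScreencast",
    "Page.stopScreencast",
    "Page.screencastFrameAck",
    "Input.dispatchMouseEvent",
    "Input.dispatchKeyEvent",
})


def filter_cdp_message(raw: str) -> Optional[str]:
    """Return the raw message if its method is whitelisted, else None."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return None  # drop anything that is not valid JSON
    if not isinstance(message, dict):
        return None  # CDP commands are JSON objects
    if message.get("method") not in ALLOWED_CDP_METHODS:
        return None  # e.g. Runtime.evaluate is rejected
    return raw
```

A proxy built this way would forward only the messages for which the function returns a value, so a viewer client could stream frames and send input but could not evaluate arbitrary scripts through the socket.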

Files (14 plugin files + 2 docs)

  plugins/browser_use/
  ├── api/
  │   ├── browser_use_connect.py      # Session lifecycle (start/stop/status)
  │   ├── browser_use_interact.py     # HTTP interaction (navigate/screenshot/state)
  │   └── browser_use_settings.py     # Settings CRUD
  ├── tools/
  │   ├── browser_step.py             # 13-action deterministic tool
  │   └── browser_auto.py             # Autonomous browser-use Agent wrapper
  ├── helpers/
  │   ├── session_manager.py          # Shared browser session lifecycle
  │   └── cdp_proxy.py                # CDP WebSocket proxy with whitelisting
  ├── extensions/
  │   ├── python/agent_init/
  │   │   └── _10_browser_cleanup.py  # Cleanup on agent init/reset
  │   └── webui/sidebar-quick-actions-main-start/
  │       └── browser-entry.html      # Sidebar globe button
  ├── webui/
  │   ├── browser-viewer.html         # Viewer modal (canvas + URL bar + overlay)
  │   ├── browser-viewer-store.js     # Alpine store for viewer state
  │   ├── browser-settings.html       # Settings tab component
  │   └── browser-settings-store.js   # Alpine store for settings
  └── prompts/
      └── agent.system.tool.browser_step.md

Dependencies

TerminallyLazy and others added 13 commits February 18, 2026 00:49
Design for a new browser-use plugin that provides step-by-step and
autonomous browser tools with CDP screencast viewer in the WebUI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
12-task implementation plan with complete code for all 14 plugin files,
covering session manager, tools, API handlers, CDP proxy, extensions,
and WebUI components.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Foundation for the browser_use plugin. Manages a single headed Chromium
session per agent context with asyncio.Lock for tool/viewer concurrency,
agent.get_data/set_data persistence, and clean shutdown.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Step-by-step browser control tool where each call performs exactly one
action and returns the result. Uses the shared SessionManager for
browser lifecycle. Includes: open, state, click, type, input,
screenshot, scroll, back, keys, select, extract, eval, close.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
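
The one-action-per-call design described in this commit can be sketched as a dispatch table. The thirteen action names come from the commit message; the handlers are hypothetical stubs, and `run_step`, `make_stub`, and `HANDLERS` are illustrative names rather than the plugin's own.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Action names as listed in the commit message.
ACTIONS = (
    "open", "state", "click", "type", "input", "screenshot", "scroll",
    "back", "keys", "select", "extract", "eval", "close",
)

Handler = Callable[..., Awaitable[str]]


def make_stub(name: str) -> Handler:
    """Build a placeholder handler; a real one would drive the browser."""
    async def handler(**kwargs: Any) -> str:
        return f"{name}: {sorted(kwargs)}"
    return handler


HANDLERS: Dict[str, Handler] = {name: make_stub(name) for name in ACTIONS}


async def run_step(action: str, **kwargs: Any) -> str:
    """Perform exactly one action and return its textual result."""
    handler = HANDLERS.get(action)
    if handler is None:
        valid = ", ".join(ACTIONS)
        return f"unknown action {action!r}; expected one of: {valid}"
    return await handler(**kwargs)


if __name__ == "__main__":
    print(asyncio.run(run_step("click", index=3)))
```

Returning an error string for an unknown action, instead of raising, is a choice made for this sketch so the calling agent can read the list of valid actions and retry.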
Wraps browser-use's Agent class with shared SessionManager, configurable
vision/max_steps/flash_mode params, secrets masking, cancellation via
InterventionException, and asyncio.wait_for timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
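
The timeout and secrets-masking behaviour mentioned in this commit might look roughly like the sketch below. `asyncio.wait_for` is the standard-library call named in the commit message; `mask_secrets`, `run_with_timeout`, and the `<NAME>` placeholder format are assumptions, and cancellation via InterventionException is not covered here.

```python
import asyncio
from typing import Awaitable, Mapping


def mask_secrets(text: str, secrets: Mapping[str, str]) -> str:
    """Replace each secret value in text with a <NAME> placeholder."""
    for name, value in secrets.items():
        if value:  # skip empty values, which would match everywhere
            text = text.replace(value, f"<{name}>")
    return text


async def run_with_timeout(
    task: Awaitable[str], timeout_s: float, secrets: Mapping[str, str]
) -> str:
    """Await task with a deadline and mask secrets in its result."""
    try:
        result = await asyncio.wait_for(task, timeout=timeout_s)
    except asyncio.TimeoutError:
        return f"Timed out after {timeout_s} seconds"
    return mask_secrets(result, secrets)


async def fast_task() -> str:
    """Hypothetical stand-in for an agent run that finishes in time."""
    return "logged in with hunter2"


async def slow_task() -> str:
    """Hypothetical stand-in for an agent run that exceeds the deadline."""
    await asyncio.sleep(1)
    return "never returned"


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(fast_task(), 1.0, {"PASSWORD": "hunter2"})))
    print(asyncio.run(run_with_timeout(slow_task(), 0.01, {})))
```

Masking the result after the wait completes keeps secret values out of whatever text the tool hands back to the calling agent.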
Browser viewer store manages connection lifecycle, polling for
screenshots/status, navigation, and cleanup. Modal HTML provides
URL bar with status indicator, canvas for live screenshots, agent
busy overlay, and footer with screenshot/close actions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SessionManager.ensure_started(): add double-checked locking to prevent
  concurrent callers from launching duplicate browser instances
- Remove __del__ method that unsafely created a new event loop via
  asyncio.set_event_loop(), which could corrupt the running app loop.
  Cleanup is already handled by the agent_init extension and explicit close()
- Move x-create from outer div to inner template container so $refs.screenCanvas
  exists when open() fires, and add null guard in store

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
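
The double-checked locking fix to ensure_started() can be illustrated with a minimal sketch. It assumes an `asyncio.Lock` guard, as the earlier SessionManager commit mentions; `FakeBrowser`, `launch_count`, and the simulated launch delay are hypothetical stand-ins, not the plugin's code.

```python
import asyncio
from typing import Optional


class FakeBrowser:
    """Hypothetical stand-in for a real browser session."""


class SessionManager:
    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._browser: Optional[FakeBrowser] = None
        self.launch_count = 0  # illustration only: counts real launches

    async def ensure_started(self) -> FakeBrowser:
        # First check, without the lock: the common fast path.
        if self._browser is not None:
            return self._browser
        async with self._lock:
            # Second check, under the lock: another caller may have
            # finished launching while this one was waiting.
            if self._browser is None:
                await asyncio.sleep(0.01)  # simulate a slow browser launch
                self._browser = FakeBrowser()
                self.launch_count += 1
            return self._browser


async def demo() -> int:
    """Start five concurrent callers and report how many launches ran."""
    manager = SessionManager()
    sessions = await asyncio.gather(
        *(manager.ensure_started() for _ in range(5))
    )
    assert all(session is sessions[0] for session in sessions)
    return manager.launch_count


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the second check, every caller that queued on the lock during the first launch would start its own browser once it acquired the lock.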