Examples:
- deterministic clicks
- file handling
- calling APIs
- human-in-the-loop
- browser interactions
- calling LLMs
- getting 2FA codes
- sending emails
- Playwright integration (see GitHub example)
- …
Simply add @tools.action(...) to your function.
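from openbrowser import Tools, Agent, ActionResult

tools = Tools()

@tools.action(description='Ask human for help with a question')
def ask_human(question: str) -> ActionResult:
    # Block until a person types an answer in the terminal
    answer = input(f'{question} > ')
    return ActionResult(extracted_content=f'The human responded with: {answer}')

# llm is your chat model instance, created elsewhere
agent = Agent(task='...', llm=llm, tools=tools)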
description (required) - What the tool does. The LLM uses this to decide when to call it.
allowed_domains - List of domains where the tool can run (e.g. ['*.example.com']). Defaults to all domains.
The Agent fills your function parameters based on their names, type hints, and defaults.
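To make the parameter-filling rule concrete, here is a minimal sketch of how a function's parameter names, type hints, and defaults can be read with Python's standard `inspect` module. It illustrates the general idea only and is not OpenBrowser's actual implementation. `describe_parameters` and `search_products` are hypothetical names introduced for this example.

```python
import inspect
from typing import Any, get_type_hints


def describe_parameters(func) -> dict:
    """Collect the name, type hint, and default of each parameter of func."""
    hints = get_type_hints(func)
    described = {}
    for name, param in inspect.signature(func).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        described[name] = {
            'type': hints.get(name, Any),
            'required': not has_default,
            'default': param.default if has_default else None,
        }
    return described


# Hypothetical tool function, used only to demonstrate the inspection.
def search_products(query: str, max_results: int = 5) -> str:
    return f'Searching for {query!r} (up to {max_results} results)'


params = describe_parameters(search_products)
for name, info in params.items():
    print(name, info)
```

A parameter without a default shows up as required, which is why clear names and type hints matter: they are most of what a caller has to go on when deciding what value to supply.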
Available Objects
Your function has access to these objects:
browser_session: BrowserSession - Current browser session for CDP access
cdp_client - Direct Chrome DevTools Protocol client
page_extraction_llm: BaseChatModel - The LLM you passed to the agent. Use it to make custom LLM calls inside your tool.
file_system: FileSystem - File system access
available_file_paths: list[str] - Available files for upload/processing
has_sensitive_data: bool - Whether action contains sensitive data
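A tool receives one of these objects by declaring a parameter with the matching name. The sketch below shows one way such name-based injection could work, assuming the framework inspects the function signature before calling it. It is a simplified illustration, not OpenBrowser's code: `SPECIAL_OBJECTS`, `call_with_injection`, and `upload_file` are hypothetical, and plain placeholder values stand in for the real session and file system objects.

```python
import inspect

# Placeholder values standing in for the real objects (assumption for this sketch).
SPECIAL_OBJECTS = {
    'browser_session': '<BrowserSession>',
    'file_system': '<FileSystem>',
    'available_file_paths': ['report.pdf'],
    'has_sensitive_data': False,
}


def call_with_injection(func, llm_args: dict):
    """Call func with the LLM-chosen arguments plus any special objects it declares."""
    wanted = inspect.signature(func).parameters
    injected = {name: obj for name, obj in SPECIAL_OBJECTS.items() if name in wanted}
    return func(**llm_args, **injected)


# Hypothetical tool: one regular parameter and one special object.
def upload_file(filename: str, available_file_paths: list) -> str:
    if filename not in available_file_paths:
        return f'{filename} is not available for upload'
    return f'Uploading {filename}'


result = call_with_injection(upload_file, {'filename': 'report.pdf'})
print(result)
```

Only the objects the function asks for are passed in, so a tool that needs nothing beyond its own arguments can leave all of them out of its signature.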
Browser Interaction Examples
You can use browser_session to directly interact with page elements using CSS selectors:
from openbrowser import Tools, Agent, ActionResult, BrowserSession

tools = Tools()

@tools.action(description='Click the submit button using CSS selector')
async def click_submit_button(browser_session: BrowserSession):
    # Get the current page
    page = await browser_session.must_get_current_page()

    # Get element(s) by CSS selector
    elements = await page.get_elements_by_css_selector('button[type="submit"]')
    if not elements:
        return ActionResult(extracted_content='No submit button found')

    # Click the first matching element
    await elements[0].click()
    return ActionResult(extracted_content='Submit button clicked!')
Available methods on Page:
get_elements_by_css_selector(selector: str) - Returns list of matching elements
get_element_by_prompt(prompt: str, llm) - Returns element or None using LLM
must_get_element_by_prompt(prompt: str, llm) - Returns element or raises error
Available methods on Element:
click() - Click the element
type(text: str) - Type text into the element
get_text() - Get element text content
See openbrowser/actor/element.py for more methods.
You can use Pydantic for the tool parameters:
import json

from pydantic import BaseModel, Field

class Cars(BaseModel):
    name: str = Field(description='The name of the car, e.g. "Toyota Camry"')
    price: int = Field(description='The price of the car as int in USD, e.g. 25000')

@tools.action(description='Save cars to file')
def save_cars(cars: list[Cars]) -> str:
    # Models are not JSON-serializable directly, so convert each one to a dict
    # (model_dump() is Pydantic v2; on v1 use car.dict() instead)
    with open('cars.json', 'w') as f:
        json.dump([car.model_dump() for car in cars], f)
    return f'Saved {len(cars)} cars to file'

task = "find cars and save them to file"
Domain Restrictions
Limit tools to specific domains:
@tools.action(
    description='Fill out banking forms',
    allowed_domains=['https://mybank.com']
)
def fill_bank_form(account_number: str) -> str:
    # Only works on mybank.com
    return f'Filled form for account {account_number}'
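To illustrate how a pattern such as '*.example.com' might be checked against the current URL, here is a minimal sketch using the standard library's `fnmatch` and `urllib.parse`. The matching rules are an assumption made for illustration, and OpenBrowser's real behavior may differ (for example in how it treats schemes or the bare domain). `is_allowed` is a hypothetical helper, not part of the library.

```python
from fnmatch import fnmatch
from typing import Optional
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: Optional[list] = None) -> bool:
    """Return True if the URL's host matches any pattern; None allows all domains."""
    if allowed_domains is None:
        return True
    host = urlparse(url).hostname or ''
    for pattern in allowed_domains:
        # Accept patterns written with or without a scheme, e.g. 'https://mybank.com'.
        pattern_host = pattern.split('://', 1)[-1]
        if fnmatch(host, pattern_host):
            return True
    return False


print(is_allowed('https://shop.example.com/cart', ['*.example.com']))
print(is_allowed('https://mybank.com/login', ['https://mybank.com']))
print(is_allowed('https://evil.com/', ['https://mybank.com']))
```

Under these rules '*.example.com' matches subdomains but not 'example.com' itself, so a list that should cover both would need both entries.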
Advanced Example
For a comprehensive example of custom tools with Playwright integration, see:
Playwright Integration Example
This shows how to create custom actions that use Playwright’s precise browser automation alongside OpenBrowser.