OpenBrowser supports a wide variety of LLM providers. Choose the one that best fits your needs.
Gemini
As of May 2025, GEMINI_API_KEY has been deprecated; name the variable GOOGLE_API_KEY instead.
from openbrowser import Agent, ChatGoogle
from dotenv import load_dotenv
# Read GOOGLE_API_KEY into env
load_dotenv()
# Initialize the model
llm = ChatGoogle(model='gemini-flash-latest')
# Create agent with the model
agent = Agent(
    task="Your task here",
    llm=llm
)
Required environment variables:
GOOGLE_API_KEY=
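load_dotenv() reads this from a .env file next to your script; a minimal sketch with a placeholder value:
# .env (placeholder value; never commit real keys)
GOOGLE_API_KEY=your-google-api-key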
OpenAI
The o3 model is recommended for the best accuracy.
from openbrowser import Agent, ChatOpenAI
# Initialize the model
llm = ChatOpenAI(
    model="o3",
)
# Create agent with the model
agent = Agent(
    task="...",  # Your task here
    llm=llm
)
Required environment variables:
OPENAI_API_KEY=
You can use any OpenAI-compatible model by passing the model name to the ChatOpenAI class along with a custom base URL (or any other parameter that would go into a normal OpenAI API call).
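A minimal sketch, assuming a hypothetical OpenAI-compatible endpoint; the URL, model name, and environment variable below are placeholders:
import os
from openbrowser import Agent, ChatOpenAI

# Point ChatOpenAI at any OpenAI-compatible server via base_url.
llm = ChatOpenAI(
    model='your-model-name',                     # placeholder
    base_url='https://example.com/v1',           # placeholder endpoint
    api_key=os.getenv('YOUR_PROVIDER_API_KEY'),  # placeholder variable
)
agent = Agent(
    task='Your task here',
    llm=llm,
)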
Anthropic
from openbrowser import Agent, ChatAnthropic
# Initialize the model
llm = ChatAnthropic(
    model="claude-sonnet-4-0",
)
# Create agent with the model
agent = Agent(
    task="...",  # Your task here
    llm=llm
)
And add the variable:
ANTHROPIC_API_KEY=
Azure OpenAI
from openbrowser import Agent, ChatAzureOpenAI
# Initialize the model
llm = ChatAzureOpenAI(
    model="o4-mini",
)
# Create agent with the model
agent = Agent(
    task="...",  # Your task here
    llm=llm
)
Required environment variables:
AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com/
AZURE_OPENAI_API_KEY=
AWS Bedrock
AWS Bedrock provides access to multiple model providers through a single API. We support both a general AWS Bedrock client and provider-specific convenience classes.
General AWS Bedrock (supports all providers)
from openbrowser import Agent, ChatAWSBedrock
# Works with any Bedrock model (Anthropic, Meta, AI21, etc.)
llm = ChatAWSBedrock(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",  # or any Bedrock model
    aws_region="us-east-1",
)
# Create agent with the model
agent = Agent(
    task="Your task here",
    llm=llm
)
Anthropic Claude via AWS Bedrock (convenience class)
from openbrowser import Agent, ChatAnthropicBedrock
# Anthropic-specific class with Claude defaults
llm = ChatAnthropicBedrock(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    aws_region="us-east-1",
)
# Create agent with the model
agent = Agent(
    task="Your task here",
    llm=llm
)
AWS Authentication
Required environment variables:
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=us-east-1
You can also use AWS profiles or IAM roles instead of environment variables. The implementation supports the following (see the sketch after this list):
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION)
- AWS profiles and credential files
- IAM roles (when running on EC2)
- Session tokens for temporary credentials
- AWS SSO authentication (aws_sso_auth=True)
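A minimal sketch of the SSO path, assuming you have already logged in with aws sso login; aws_sso_auth comes from the list above, and the rest mirrors the earlier Bedrock example:
from openbrowser import Agent, ChatAnthropicBedrock

# aws_sso_auth=True selects AWS SSO authentication instead of access keys.
llm = ChatAnthropicBedrock(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    aws_region="us-east-1",
    aws_sso_auth=True,
)
agent = Agent(
    task="Your task here",
    llm=llm
)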
Groq
from openbrowser import Agent, ChatGroq
llm = ChatGroq(model="meta-llama/llama-4-maverick-17b-128e-instruct")
agent = Agent(
    task="Your task here",
    llm=llm
)
Required environment variables:
GROQ_API_KEY=
Oracle Cloud Infrastructure (OCI)
OCI provides access to various generative AI models including Meta Llama, Cohere, and other providers through their Generative AI service.
from openbrowser import Agent, ChatOCIRaw
# Initialize the OCI model
llm = ChatOCIRaw(
    model_id="ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceya...",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.tenancy.oc1..aaaaaaaayeiis5uk2nuubznrekd...",
    provider="meta",  # or "cohere"
    temperature=0.7,
    max_tokens=800,
    top_p=0.9,
    auth_type="API_KEY",
    auth_profile="DEFAULT"
)
# Create agent with the model
agent = Agent(
    task="Your task here",
    llm=llm
)
Required setup:
- Set up the OCI configuration file at ~/.oci/config (a sketch follows this list)
- Have access to OCI Generative AI models in your tenancy
- Install the OCI Python SDK: uv add oci or pip install oci
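For auth_type="API_KEY", the ~/.oci/config file typically looks like the following; all values are placeholders and the OCIDs are truncated:
[DEFAULT]
user=ocid1.user.oc1..aaaa...
fingerprint=aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..aaaa...
region=us-chicago-1
The auth_profile="DEFAULT" argument in the snippet above selects the [DEFAULT] section of this file.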
Authentication methods supported:
- API_KEY: Uses API key authentication (default)
- INSTANCE_PRINCIPAL: Uses instance principal authentication
- RESOURCE_PRINCIPAL: Uses resource principal authentication
Ollama
- Install Ollama: https://github.com/ollama/ollama
- Run ollama serve to start the server
- In a new terminal, pull the model you want to use: ollama pull llama3.1:8b (a 4.9 GB download)
from openbrowser import Agent, ChatOllama
llm = ChatOllama(model="llama3.1:8b")
agent = Agent(task="Your task here", llm=llm)
Langchain
Example of how to use LangChain models with OpenBrowser.
Qwen
Currently, only qwen-vl-max is recommended for OpenBrowser. Other Qwen models, including qwen-max, have issues with the action schema format.
Smaller Qwen models may return incorrect action schema formats (e.g., actions: [{"navigate": "google.com"}] instead of [{"navigate": {"url": "google.com"}}]). If you want to use other models, add concrete examples of the correct action format to your prompt (see the sketch after the example below).
from openbrowser import Agent, ChatOpenAI
from dotenv import load_dotenv
import os
load_dotenv()
# Get API key from https://modelstudio.console.alibabacloud.com/?tab=playground#/api-key
api_key = os.getenv('ALIBABA_CLOUD')
base_url = 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
llm = ChatOpenAI(model='qwen-vl-max', api_key=api_key, base_url=base_url)
agent = Agent(
    task="Your task here",
    llm=llm,
    use_vision=True
)
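If you do try another Qwen model, one workaround is to embed a concrete example of the correct action format directly in the task text; a minimal sketch (the hint wording below is hypothetical):
# Hypothetical hint showing the model the expected action schema.
schema_hint = (
    'Format every action as an object with named parameters, e.g. '
    '[{"navigate": {"url": "google.com"}}], never [{"navigate": "google.com"}].'
)
agent = Agent(
    task="Your task here\n" + schema_hint,
    llm=llm,
    use_vision=True
)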
Required environment variables:
ALIBABA_CLOUD=
ModelScope
from openbrowser import Agent, ChatOpenAI
from dotenv import load_dotenv
import os
load_dotenv()
# Get API key from https://www.modelscope.cn/docs/model-service/API-Inference/intro
api_key = os.getenv('MODELSCOPE_API_KEY')
base_url = 'https://api-inference.modelscope.cn/v1/'
llm = ChatOpenAI(model='Qwen/Qwen2.5-VL-72B-Instruct', api_key=api_key, base_url=base_url)
agent = Agent(
    task="Your task here",
    llm=llm,
    use_vision=True
)
Required environment variables:
MODELSCOPE_API_KEY=
Other models (DeepSeek, Novita, OpenRouter…)
We support all other models that can be called via an OpenAI-compatible API, and we are open to PRs for more providers.
DeepSeek
from openbrowser import Agent
from openbrowser.llm import ChatDeepSeek
import os
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
llm = ChatDeepSeek(
    base_url='https://api.deepseek.com/v1',
    model='deepseek-chat',
    api_key=deepseek_api_key,
)
agent = Agent(
    task='Your task here',
    llm=llm,
    use_vision=False,
)
Required environment variables:
DEEPSEEK_API_KEY=
Novita
from openbrowser import Agent, ChatOpenAI
import os
api_key = os.getenv('NOVITA_API_KEY')
agent = Agent(
    task='Your task here',
    llm=ChatOpenAI(
        base_url='https://api.novita.ai/v3/openai',
        model='deepseek/deepseek-v3-0324',
        api_key=api_key,
    ),
    use_vision=False,
)
Required environment variables:
NOVITA_API_KEY=
OpenRouter
from openbrowser import Agent, ChatOpenAI
import os
llm = ChatOpenAI(
    model='x-ai/grok-4',
    base_url='https://openrouter.ai/api/v1',
    api_key=os.getenv('OPENROUTER_API_KEY'),
)
agent = Agent(
    task='Your task here',
    llm=llm,
)
Required environment variables:
OPENROUTER_API_KEY=
ChatBrowserUse
ChatBrowserUse connects to an external LLM service from browser-use.com that is optimized for browser-automation tasks.
from openbrowser import Agent, ChatBrowserUse
# Initialize the model
llm = ChatBrowserUse()
# Create agent with the model
agent = Agent(
    task="...",  # Your task here
    llm=llm
)
Required environment variables:
BROWSER_USE_API_KEY=
Get your API key from browser-use.com.