
browser-use

browser-use turns an LLM into a web-navigating agent by exposing Playwright-driven browser actions as tools. It flattens the DOM into a numbered list of interactive elements that the model can reference by index, handles login flows, iframes, and shadow DOM, and is widely used as a starting point for both web automation research and production agents.

Framework facts

Category: agents
Language: Python
License: MIT
Repository: https://github.com/browser-use/browser-use

Install

pip install browser-use
playwright install chromium --with-deps

Quickstart

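# Requires the langchain-anthropic package and an ANTHROPIC_API_KEY
# environment variable in addition to the install steps above.
import asyncio
from browser_use import Agent
from langchain_anthropic import ChatAnthropic

async def main():
    # The agent receives a natural-language task and the LLM that drives it.
    agent = Agent(
        task='Find the top Hacker News story and summarise it.',
        llm=ChatAnthropic(model='claude-opus-4-7'),
    )
    # run() drives the browser until the task finishes and returns the run history.
    history = await agent.run()
    print(history.final_result())

asyncio.run(main())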

Alternatives

Skyvern — self-hosted browser agents with vision-first flows
LaVague — action engine with a world model
Stagehand — TypeScript browser agent by Browserbase
AgentQL — query-language layer over Playwright

Frequently asked questions

Does browser-use need vision models?

No. It works with text-only LLMs by flattening the DOM into a numbered list of interactive elements. Vision mode adds screenshots, which improves grounding on visually complex sites at the cost of more tokens per step.
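
To make the flattening idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration of the general technique, not browser-use's actual implementation: the class name `InteractiveElementIndexer`, the function `flatten`, the set of tags treated as interactive, and the `[index] <tag> label` output format are all assumptions made for this example.

```python
from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not browser-use's list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
# Interactive tags that never have a closing tag or inner text.
VOID_TAGS = {"input"}


class InteractiveElementIndexer(HTMLParser):
    """Collects interactive elements and gives each one a numeric index."""

    def __init__(self):
        super().__init__()
        self.elements = []  # every interactive element seen, in document order
        self._open = []     # stack of elements still collecting inner text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        entry = {"index": len(self.elements), "tag": tag,
                 "attrs": dict(attrs), "text": ""}
        self.elements.append(entry)
        if tag not in VOID_TAGS:
            self._open.append(entry)

    def handle_endtag(self, tag):
        if self._open and self._open[-1]["tag"] == tag:
            self._open.pop()

    def handle_data(self, data):
        # Text only counts when it sits inside an open interactive element.
        if self._open:
            self._open[-1]["text"] += data


def flatten(html: str) -> list[str]:
    """Return one line per interactive element, e.g. '[0] <a> Top stories'."""
    parser = InteractiveElementIndexer()
    parser.feed(html)
    parser.close()
    lines = []
    for e in parser.elements:
        label = (" ".join(e["text"].split())
                 or e["attrs"].get("placeholder")
                 or e["attrs"].get("name")
                 or "")
        lines.append(f"[{e['index']}] <{e['tag']}> {label}".rstrip())
    return lines
```

A text-only model given such a list can answer with an index ("click 2") instead of a CSS selector, which is why no screenshot is strictly needed.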

Is it production-safe?

For many workflows, yes, but login-gated or bot-protected sites may still require human handoff. Always add retries, human-in-the-loop checkpoints, and rate limits.
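
The retry advice can be sketched as a small wrapper around any coroutine. This is a generic pattern and not part of browser-use; the helper name `run_with_retries` and its parameters (`make_attempt`, `max_attempts`, `base_delay`, `sleep`) are hypothetical and chosen for this example.

```python
import asyncio
import random


async def run_with_retries(make_attempt, *, max_attempts=3, base_delay=1.0,
                           sleep=asyncio.sleep):
    """Await make_attempt() until it succeeds or max_attempts is exhausted.

    make_attempt is a zero-argument callable returning a fresh awaitable,
    so each retry starts from a clean state.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_attempt()
        except Exception as exc:
            last_error = exc
            if attempt == max_attempts:
                break
            # Exponential backoff with jitter; the pause between attempts
            # also acts as a crude rate limit against the target site.
            delay = base_delay * 2 ** (attempt - 1)
            await sleep(delay + random.uniform(0, delay / 2))
    # Surface the failure so a caller can hand the task to a human.
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Under these assumptions, an agent run could be wrapped as `await run_with_retries(lambda: Agent(task=..., llm=...).run())`, with the final `RuntimeError` serving as the trigger for a human-in-the-loop handoff.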

Sources

browser-use — GitHub — accessed 2026-04-20
browser-use — docs — accessed 2026-04-20