browser-use
browser-use turns any LLM into a web-navigating agent by exposing Playwright-driven browser actions as tools. It flattens the DOM into a numbered list of interactive elements that the model can reference by number, handles login flows, iframes, and shadow DOM, and is widely used to bootstrap both web automation research and production agents.
Framework facts
- Category: agents
- Language: Python
- License: MIT
- Repository: https://github.com/browser-use/browser-use
Install
pip install browser-use
playwright install chromium --with-deps
Quickstart
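import asyncio

from browser_use import Agent
# Imported from the separate langchain-anthropic package.
from langchain_anthropic import ChatAnthropic


async def main():
    # The agent pairs a natural-language task with the LLM that drives the browser.
    agent = Agent(
        task='Find the top Hacker News story and summarise it.',
        llm=ChatAnthropic(model='claude-opus-4-7'),
    )
    # Runs the agent loop for this task.
    await agent.run()


asyncio.run(main())
Alternatives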
- Skyvern — self-hosted browser agents with vision-first flows
- LaVague — action engine with a world model
- Stagehand — TypeScript browser agent by Browserbase
- AgentQL — query-language layer over Playwright
Frequently asked questions
Does browser-use need vision models?
No. It works with text-only LLMs by flattening the DOM into a numbered list of interactive elements. Vision mode adds screenshots and improves grounding on complex sites but costs more.
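To make the text-only mode concrete, here is a minimal sketch of what "flattening the DOM into a numbered list of interactive elements" can look like. It is not browser-use's actual implementation: it parses static HTML with Python's standard library instead of a live Playwright page, and the names `INTERACTIVE_TAGS`, `InteractiveElementIndexer`, and `flatten` are hypothetical, chosen only for this illustration.

```python
from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not browser-use's rule set).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
VOID_TAGS = {"input"}  # elements that never have a closing tag

SAMPLE_HTML = (
    '<div><a href="/news">Top stories</a>'
    '<input name="q" placeholder="Search"><button>Go</button></div>'
)


class InteractiveElementIndexer(HTMLParser):
    """Collects interactive elements in document order and numbers them."""

    def __init__(self):
        super().__init__()
        self.elements = []  # one dict per interactive element
        self._open = []     # stack of interactive elements still awaiting a close tag

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        entry = {"index": len(self.elements), "tag": tag, "attrs": dict(attrs), "text": ""}
        self.elements.append(entry)
        if tag not in VOID_TAGS:
            self._open.append(entry)

    def handle_endtag(self, tag):
        if self._open and self._open[-1]["tag"] == tag:
            self._open.pop()

    def handle_data(self, data):
        # Attribute visible text to the innermost open interactive element.
        if self._open and data.strip():
            entry = self._open[-1]
            entry["text"] = (entry["text"] + " " + data.strip()).strip()


def flatten(html):
    """Return one '[index] <tag> label' line per interactive element."""
    parser = InteractiveElementIndexer()
    parser.feed(html)
    parser.close()
    lines = []
    for el in parser.elements:
        attrs = el["attrs"]
        label = el["text"] or attrs.get("placeholder") or attrs.get("name") or ""
        lines.append(f"[{el['index']}] <{el['tag']}> {label}".rstrip())
    return lines


if __name__ == "__main__":
    print("\n".join(flatten(SAMPLE_HTML)))
    # [0] <a> Top stories
    # [1] <input> Search
    # [2] <button> Go
```

A text-only model can then answer with something like "click element 2" instead of a CSS selector, which is the reason a numbered list is enough for grounding on simple pages.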
Is it production-safe?
For many workflows, yes, but login-gated or bot-protected sites may still require human handoff. Always add retries, human-in-the-loop checkpoints, and rate limits.
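One way to combine the three safeguards above is a small wrapper around each agent run. The sketch below is illustrative and not part of browser-use: `run_with_retries`, its parameters, and the `confirm` hook are hypothetical names, and the exponential backoff stands in for a real rate limiter.

```python
import asyncio


async def run_with_retries(make_attempt, max_attempts=3, base_delay=1.0, confirm=None):
    """Run an async callable, retrying failures with exponential backoff.

    make_attempt: zero-argument callable that returns a fresh awaitable per attempt.
    confirm: optional callable(attempt_number, error) -> bool. A hypothetical
             human-in-the-loop hook; returning False stops further retries.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_attempt()
        except Exception as error:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            if confirm is not None and not confirm(attempt, error):
                raise  # the human checkpoint vetoed another try
            # The delay doubles after each failure, which also spaces out
            # repeated requests to the same site.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    attempts = {"count": 0}

    async def flaky_task():
        # Stand-in for an agent run that fails twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RuntimeError("transient failure")
        return "done"

    print(asyncio.run(run_with_retries(flaky_task, base_delay=0.1)))
```

With the quickstart's agent, `make_attempt` could be a function that builds a new `Agent` and returns `agent.run()`, so each retry starts from a clean state. In an interactive setting, `confirm` might prompt an operator before the next attempt.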
Sources
- browser-use — GitHub — accessed 2026-04-20
- browser-use — docs — accessed 2026-04-20