Open
Description
Goal: which tool can I use to extract the page content?
extract_structured_data was removed in browser-use/browser-use#3167.
code investigation
extract_clean_markdown at agent/page.py
async def _extract_clean_markdown(self, extract_links: bool = False) -> tuple[str, dict]:
    """Extract clean markdown from the current page using enhanced DOM tree.

    Uses the shared markdown extractor for consistency with tools/service.py.
    """
    from browser_use.dom.markdown_extractor import extract_clean_markdown

    dom_service = self.dom_service
    return await extract_clean_markdown(dom_service=dom_service, target_id=self._target_id, extract_links=extract_links)

- recommend execute_js for full-page semantic extraction.
- Magnus updated the system prompt: "You can call execute to gather structured semantic information from the entire page, including parts not currently visible."
https://github.com/browser-use/browser-use/blob/5e4d75a3bbf1d811dfc90845844d808c5f27dcf2/browser_use/agent/system_prompt.md?plain=1#L72C1-L73C166
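To make the execute_js recommendation concrete, here is a minimal sketch of the kind of script one might hand to a JavaScript-execution tool. It only builds the JavaScript source as a string; `build_extraction_script` and the selector names are hypothetical, and the call that passes the script to the tool is left out because its signature is not shown in this issue.

```python
import json


def build_extraction_script(selectors: dict[str, str]) -> str:
    """Return JavaScript source (hypothetical helper, not part of browser-use).

    The generated arrow function collects the trimmed text of every element
    matching each CSS selector and returns the result as a JSON string.
    """
    # json.dumps produces a literal that is also valid JavaScript for plain strings.
    selectors_js = json.dumps(selectors)
    return (
        "() => {\n"
        f"  const selectors = {selectors_js};\n"
        "  const result = {};\n"
        "  for (const [name, css] of Object.entries(selectors)) {\n"
        "    // querySelectorAll searches the whole document, not just the viewport.\n"
        "    result[name] = Array.from(document.querySelectorAll(css))\n"
        "      .map((el) => el.textContent.trim())\n"
        "      .filter((text) => text.length > 0);\n"
        "  }\n"
        "  return JSON.stringify(result);\n"
        "}"
    )


# Example selectors; adjust them to the page being inspected.
script = build_extraction_script({"headings": "h1, h2, h3", "links": "a[href]"})
print(script)
```

Returning a JSON string keeps the result easy to parse on the Python side with `json.loads`, whatever the tool's own return type turns out to be.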