Description
crawl4ai version
0.6.3
Expected Behavior
If no page exists in the persistent context, a new page should be created via:
page = await context.new_page()
Current Behavior
When using BrowserConfig(use_persistent_context=True, user_data_dir=...), the crawler crashes during concurrent runs when context.pages is empty.
This happens inside BrowserManager.get_page(), which accesses context.pages[0] directly without checking whether the list is empty.
This becomes especially problematic during concurrent execution (arun_many) where multiple URLs are processed in parallel, and a new page is expected but not guaranteed to exist.
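The crash can be avoided with a get-or-create pattern. A minimal sketch of that pattern follows; FakeContext is a stand-in that mimics only the two members of Playwright's BrowserContext this bug touches (the .pages list and async new_page()), and get_or_create_page is an illustrative helper, not crawl4ai's actual BrowserManager code:

```python
import asyncio

class FakeContext:
    """Stand-in for a Playwright BrowserContext; mimics only the .pages
    list and the async new_page() coroutine used in this sketch."""
    def __init__(self, pages=None):
        self.pages = list(pages or [])

    async def new_page(self):
        page = object()          # a real context would open a browser tab
        self.pages.append(page)
        return page

async def get_or_create_page(context):
    # Reuse an existing page if the persistent context already has one;
    # otherwise create a new page instead of indexing pages[0] blindly.
    if context.pages:
        return context.pages[0]
    return await context.new_page()

ctx = FakeContext()                          # empty context: pages == []
page = asyncio.run(get_or_create_page(ctx))  # no IndexError; a page is created
```

With this guard in place, an empty context.pages no longer raises IndexError; the second call simply returns the already-created page.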
Alternatively, the run fails with:
[ERROR]... × https://www.ft.com/cont...-4879-a9ab-78a576f9f474 | Error: Unexpected error in _crawl_web at line 528
in wrap_api_call (.venv\Lib\site-packages\playwright\_impl\_connection.py):
Error: BrowserContext.new_page: Target page, context or browser has been closed
Code context:
523     parsed_st = _extract_stack_trace_information_from_stack(st, is_internal)
524     self._api_zone.set(parsed_st)
525     try:
526         return await cb()
527     except Exception as error:
528 →       raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
529     finally:
530         self._api_zone.set(None)
531
532     def wrap_api_call_sync(
533         self, cb: Callable[[], Any], is_internal: bool = False
Is this a Playwright limitation, i.e. does identity-based (persistent-context) crawling not support concurrent async requests?
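Playwright's async API does support concurrency in general, but tasks sharing one persistent context can still race between checking context.pages and creating or closing pages. One common mitigation is to serialize page acquisition behind an asyncio.Lock. A hedged sketch follows; PageAllocator and the in-memory _Ctx fake are illustrative names (assuming only a .pages list and async new_page(), as above), not crawl4ai or Playwright API:

```python
import asyncio

class _Ctx:
    """In-memory fake of a shared browser context (assumption: the real
    one exposes a .pages list and an async new_page() coroutine)."""
    def __init__(self):
        self.pages = []

    async def new_page(self):
        await asyncio.sleep(0)   # yield control, as real page creation would
        page = object()
        self.pages.append(page)
        return page

class PageAllocator:
    """Serialize page acquisition so concurrent tasks (as in arun_many)
    never race on the empty .pages list of one shared context."""
    def __init__(self, context):
        self._context = context
        self._lock = asyncio.Lock()

    async def acquire(self):
        async with self._lock:
            if self._context.pages:
                return self._context.pages[0]
            return await self._context.new_page()

async def main():
    alloc = PageAllocator(_Ctx())
    # Ten concurrent acquisitions should all resolve to one shared page,
    # because the lock makes check-then-create atomic.
    return await asyncio.gather(*(alloc.acquire() for _ in range(10)))

pages = asyncio.run(main())
```

Without the lock, two tasks could both observe an empty pages list and both call new_page(), or one could act on a context another task has already closed, which matches the "Target page, context or browser has been closed" error above.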
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
Code snippets
OS
Windows
Python version
3.13.3
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response