RecursiveUrlLoader question #18886
Replies: 4 comments 1 reply
-
Hey there @ArturPevko! Nice to see you diving into another challenge with LangChain. How's it going on your end? Based on the context provided, there could be several reasons why the
In addition, there are two similar issues in the LangChain repository that might be related to your problem:
Please try these solutions and let me know if they help resolve your issue. If not, please provide more information about your setup and the exact error message you're seeing, and I'll be happy to assist further.
-
@ArturPevko

from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader(["https://sotkaonline.ru/", "https://sotkaonline.ru/historia"])
document = loader.load()
print(len(document))

I think that's what you need if you do not want to change the website structure.
-
I tried to crawl a complete website using RecursiveUrlLoader with max_depth=None, but it only crawls the URLs linked from the first page and never goes on to other pages. When I set max_depth to a specific number, it does crawl other pages. Can anyone guide me on how to crawl the complete website?
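To make the effect of a depth limit concrete, here is a minimal sketch of a depth-limited crawl over a hypothetical in-memory link graph. It is not RecursiveUrlLoader's implementation: the `LINKS` mapping and the `crawl` function are invented for illustration, and treating `max_depth=None` as "no limit" is this sketch's own convention. The installed loader may handle `None` differently (for example by falling back to a default depth), which would be worth checking if `None` seems to stop early.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical site structure: each page maps to the pages it links to.
LINKS: Dict[str, List[str]] = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/b": [],
    "/a/1": ["/a/1/x"],
    "/a/1/x": [],
}


def crawl(start: str, max_depth: Optional[int]) -> List[str]:
    """Breadth-first crawl; the start page counts as depth 1.

    In this sketch only, max_depth=None means no depth limit.
    """
    seen = {start}
    order: List[str] = []
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        # Stop following links once the depth limit is reached.
        if max_depth is not None and depth >= max_depth:
            continue
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order


if __name__ == "__main__":
    print(crawl("/", 2))     # start page plus the pages it links to
    print(crawl("/", None))  # every reachable page in this sketch
```

With a limit of 2, only the start page and its direct links are visited, which matches the "first page linked URLs only" behaviour described above. Reaching deeper pages needs a larger explicit limit.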
-
It's not crawling the complete website.
-
Checked other resources
Commit to Help
Example Code
Description
Good afternoon,
I have a code that I would like to use to load the HTML page of the website "https://sotkaonline.ru/", but it is returning an empty document.
Could you please help me with this issue? I would appreciate any assistance you can provide.
Thank you for your time and attention.
System Info
pip install langchain