Commit b1d3804

Merge pull request #585 from ScrapeGraphAI/anthropic-refactoring
Anthropic refactoring
2 parents 88e76ce + 37a4a8a commit b1d3804

50 files changed: 363 additions, 232 deletions

CHANGELOG.md

Lines changed: 31 additions & 2 deletions
@@ -1,15 +1,44 @@
-## [1.14.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.14.0...v1.14.1) (2024-08-24)
+## [1.15.0-beta.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.15.0-beta.2...v1.15.0-beta.3) (2024-08-24)
+
+
+
+### Bug Fixes
+
+* update abstract graph ([86fe5fc](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/86fe5fcaf1a6ba28786678874378f07fba1db40f))
+
+## [1.15.0-beta.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.15.0-beta.1...v1.15.0-beta.2) (2024-08-23)
 
 
 ### Bug Fixes
 
-* add claude3.5 sonnet ([ee8f8b3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ee8f8b31ecfe4ffd311528d2f48cb055e4609d99))
+* abstract graph ([cf1fada](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cf1fada36a6716cb0e24bbc5da7509446a964145))
+
 
 
 ### Docs
 
 * added sponsors ([b3a2d0d](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/b3a2d0d65a41f6e645fac3fc84f702fdf64b951c))
 
+## [1.15.0-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.14.1-beta.1...v1.15.0-beta.1) (2024-08-23)
+
+
+### Features
+
+* ligthweigthing the library ([62f32e9](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/62f32e994bcb748dfef4f7e1b2e5213a989c33cc))
+
+
+### Bug Fixes
+
+* Azure OpenAI issue ([a92b9c6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a92b9c6970049a4ba9dbdf8eff3eeb7f98c6c639))
+
+## [1.14.1-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.14.0...v1.14.1-beta.1) (2024-08-21)
+
+
+### Bug Fixes
+
+* **models_tokens:** add llama2 and llama3 sizes explicitly ([b05ec16](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/b05ec16b252d00c9c9ee7c6d4605b420851c7754))
+
+
 ## [1.14.0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.3...v1.14.0) (2024-08-20)
 

README.md

Lines changed: 22 additions & 0 deletions
@@ -32,6 +32,28 @@ playwright install
 
 **Note**: it is recommended to install the library in a virtual environment to avoid conflicts with other libraries 🐱
 
+By the way if you to use not mandatory modules it is necessary to install by yourself with the following command:
+
+### Installing "Other Language Models"
+
+This group allows you to use additional language models like Fireworks, Groq, Anthropic, Hugging Face, and Nvidia AI Endpoints.
+```bash
+pip install scrapegraphai[other-language-models]
+
+```
+### Installing "More Semantic Options"
+
+This group includes tools for advanced semantic processing, such as Graphviz.
+```bash
+pip install scrapegraphai[more-semantic-options]
+```
+### Installing "More Browser Options"
+
+This group includes additional browser management options, such as BrowserBase.
+```bash
+pip install scrapegraphai[more-browser-options]
+```
+
 ## 💻 Usage
 There are multiple standard scraping pipelines that can be used to extract information from a website (or local file).
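
The optional groups added to the README only help if their packages are actually importable at run time. The following is a minimal sketch of a pre-flight check, not something this commit contains: the module names in `EXTRAS_PROBES` (`langchain_anthropic`, `graphviz`, `browserbase`) are assumptions about what each group installs, and both helper functions are hypothetical.

```python
from importlib.util import find_spec

# Hypothetical mapping from extras group to one top-level module it is
# ASSUMED to provide; the real module names may differ.
EXTRAS_PROBES = {
    "other-language-models": "langchain_anthropic",
    "more-semantic-options": "graphviz",
    "more-browser-options": "browserbase",
}


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def missing_extras(probes: dict) -> list:
    """List the extras groups whose probe module cannot be located."""
    return [group for group, module in probes.items() if not is_importable(module)]


if __name__ == "__main__":
    for group in missing_extras(EXTRAS_PROBES):
        print(f"Missing optional group: pip install scrapegraphai[{group}]")
```

Probing with `find_spec` avoids importing heavy packages just to learn whether they exist, which keeps the check cheap.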

docs/README.md

Lines changed: 0 additions & 11 deletions
@@ -9,12 +9,6 @@ markmap:
 
 ## **Short-Term Goals**
 
-- Integration with more llm APIs
-
-- Test proxy rotation implementation
-
-- Add more search engines inside the SearchInternetNode
-
 - Improve the documentation (ReadTheDocs)
 - [Issue #102](https://github.com/VinciGit00/Scrapegraph-ai/issues/102)
 
@@ -23,9 +17,6 @@ markmap:
 ## **Medium-Term Goals**
 
 - Node for handling API requests
-
-- Improve SearchGraph to look into the first 5 results of the search engine
-
 - Make scraping more deterministic
 - Create DOM tree of the website
 - HTML tag text embeddings with tags metadata
@@ -70,5 +61,3 @@ markmap:
 - Automatic generation of scraping pipelines from a given prompt
 
 - Create API for the library
-
-- Finetune a LLM for html content

docs/source/scrapers/llm.rst

Lines changed: 32 additions & 0 deletions
@@ -194,3 +194,35 @@ We can also pass a model instance for the chat model and the embedding model. Fo
             "model_instance": embedder_model_instance
         }
     }
+
+Other LLM models
+^^^^^^^^^^^^^^^^
+
+We can also pass a model instance for the chat model and the embedding model through the **model_instance** parameter.
+This feature enables you to utilize a Langchain model instance.
+You will discover the model you require within the provided list:
+
+- `chat model list <https://python.langchain.com/v0.2/docs/integrations/chat/#all-chat-models>`_
+- `embedding model list <https://python.langchain.com/v0.2/docs/integrations/text_embedding/#all-embedding-models>`_.
+
+For instance, consider **chat model** Moonshot. We can integrate it in the following manner:
+
+.. code-block:: python
+
+    from langchain_community.chat_models.moonshot import MoonshotChat
+
+    # The configuration parameters are contingent upon the specific model you select
+    llm_instance_config = {
+        "model": "moonshot-v1-8k",
+        "base_url": "https://api.moonshot.cn/v1",
+        "moonshot_api_key": "MOONSHOT_API_KEY",
+    }
+
+    llm_model_instance = MoonshotChat(**llm_instance_config)
+    graph_config = {
+        "llm": {
+            "model_instance": llm_model_instance,
+            "model_tokens": 5000
+        },
+    }
+
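
The documented example pairs `model_instance` with `model_tokens` in the `llm` section. As a minimal sketch, assuming that a token budget should always accompany a prebuilt instance (the diff shows them together but does not state that this is enforced), a script could validate that shape before building a graph. The helper `check_instance_config` is hypothetical and is not part of ScrapeGraphAI.

```python
def check_instance_config(graph_config: dict) -> int:
    """Validate an "llm" section that carries a prebuilt model instance.

    Hypothetical helper for illustration only; returns the declared
    token budget when the section looks well formed.
    """
    llm = graph_config.get("llm", {})
    if llm.get("model_instance") is None:
        raise ValueError('"llm" must contain a "model_instance"')
    tokens = llm.get("model_tokens")
    if not isinstance(tokens, int) or tokens <= 0:
        raise ValueError('"model_tokens" must be a positive integer')
    return tokens


if __name__ == "__main__":
    # object() stands in for a real LangChain chat model instance.
    config = {"llm": {"model_instance": object(), "model_tokens": 5000}}
    print(check_instance_config(config))
```

Failing early with a clear message is usually easier to debug than an error raised deep inside a scraping pipeline.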

examples/anthropic/csv_scraper_haiku.py renamed to examples/anthropic/csv_scraper_anthropic.py

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
 graph_config = {
     "llm": {
         "api_key": os.getenv("ANTHROPIC_API_KEY"),
-        "model": "claude-3-haiku-20240307",
+        "model": "anthropic/claude-3-haiku-20240307",
         "max_tokens": 4000
     },
 }

examples/anthropic/csv_scraper_graph_multi_haiku.py renamed to examples/anthropic/csv_scraper_graph_multi_anthropic.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 graph_config = {
     "llm": {
         "api_key": os.getenv("ANTHROPIC_API_KEY"),
-        "model": "claude-3-haiku-20240307",
+        "model": "anthropic/claude-3-haiku-20240307",
         "max_tokens": 4000},
 }

examples/anthropic/custom_graph_haiku.py renamed to examples/anthropic/custom_graph_anthropic.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
 graph_config = {
     "llm": {
         "api_key": os.getenv("ANTHROPIC_API_KEY"),
-        "model": "claude-3-haiku-20240307",
+        "model": "anthropic/claude-3-haiku-20240307",
         "max_tokens": 4000
     },
 }

examples/anthropic/json_scraper_haiku.py renamed to examples/anthropic/json_scraper_anthropic.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 graph_config = {
     "llm": {
         "api_key": os.getenv("ANTHROPIC_API_KEY"),
-        "model": "claude-3-haiku-20240307",
+        "model": "anthropic/claude-3-haiku-20240307",
         "max_tokens": 4000
     },
 }

examples/anthropic/json_scraper_multi_haiku.py renamed to examples/anthropic/json_scraper_multi_anthropic.py

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
 graph_config = {
     "llm": {
         "api_key": os.getenv("ANTHROPIC_API_KEY"),
-        "model": "claude-3-haiku-20240307",
+        "model": "anthropic/claude-3-haiku-20240307",
         "max_tokens": 4000
     },
 }

examples/anthropic/pdf_scraper_graph_haiku.py renamed to examples/anthropic/pdf_scraper_graph_anthropic.py

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 graph_config = {
     "llm": {
         "api_key": os.getenv("ANTHROPIC_API_KEY"),
-        "model": "claude-3-haiku-20240307",
+        "model": "anthropic/claude-3-haiku-20240307",
         "max_tokens": 4000
     },
 }
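
All six renamed examples now write the model as `provider/model-name` instead of a bare model name. The commit does not show how the library parses this string, so the following is only a sketch of one plausible way to split it; `split_model_id` is a hypothetical function, not a ScrapeGraphAI API.

```python
from typing import Optional, Tuple


def split_model_id(model_id: str) -> Tuple[Optional[str], str]:
    """Split "provider/model" into (provider, model).

    A string without "/" yields (None, model_id). Only the first "/" is
    treated as the separator, so model names containing "/" stay intact.
    """
    provider, sep, name = model_id.partition("/")
    if not sep:
        return None, model_id
    return provider, name


if __name__ == "__main__":
    print(split_model_id("anthropic/claude-3-haiku-20240307"))
    print(split_model_id("claude-3-haiku-20240307"))
```

Splitting on the first separator only is a deliberate choice in this sketch: some providers use model identifiers that themselves contain slashes.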
