-
-
-
\ No newline at end of file
diff --git a/examples/benchmarks/GenerateScraper/inputs/example_2.txt b/examples/benchmarks/GenerateScraper/inputs/example_2.txt
deleted file mode 100644
index b7810eed..00000000
--- a/examples/benchmarks/GenerateScraper/inputs/example_2.txt
+++ /dev/null
@@ -1,400 +0,0 @@
-WIRED - The Latest in Technology, Science, Culture and Business | WIRED
-Cyber Army of Russia Reborn, a group with ties to the Kremlin’s Sandworm unit, is crossing lines even that notorious cyberwarfare unit wouldn’t dare to.
-Originally published February 2015: More than five years before Microsoft invested its first $1 billion in OpenAI, its engineers were hard at work on something that they believed would transform consumer computing, and it wasn’t artificial intelligence.
-The company’s latest gaming furniture is designed to bring you closer to the action when you’re playing, and blend into your living room when you’re not.
-We and our 167 partners store and/or access information on a device, such as unique IDs in cookies to process personal data. You may accept or manage your choices by clicking below or at any time in the privacy policy page. These choices will be signaled to our partners and will not affect browsing data.More Information
-We and our partners process data to provide:
-Use precise geolocation data. Actively scan device characteristics for identification. Store and/or access information on a device. Personalised advertising and content, advertising and content measurement, audience research and services development.
- WIRED - The Latest in Technology, Science, Culture and Business | WIRED
-
-Cyber Army of Russia Reborn, a group with ties to the Kremlin’s Sandworm unit, is crossing lines even that notorious cyberwarfare unit wouldn’t dare to.
-Originally published February 2015: More than five years before Microsoft invested its first $1 billion in OpenAI, its engineers were hard at work on something that they believed would transform consumer computing, and it wasn’t artificial intelligence.
-The company’s latest gaming furniture is designed to bring you closer to the action when you’re playing, and blend into your living room when you’re not.
-We and our 167 partners store and/or access information on a device, such as unique IDs in cookies to process personal data. You may accept or manage your choices by clicking below or at any time in the privacy policy page. These choices will be signaled to our partners and will not affect browsing data.More Information
-We and our partners process data to provide:
-Use precise geolocation data. Actively scan device characteristics for identification. Store and/or access information on a device. Personalised advertising and content, advertising and content measurement, audience research and services development.
-WIRED - The Latest in Technology, Science, Culture and Business | WIRED
-Cyber Army of Russia Reborn, a group with ties to the Kremlin’s Sandworm unit, is crossing lines even that notorious cyberwarfare unit wouldn’t dare to.
-Originally published February 2015: More than five years before Microsoft invested its first $1 billion in OpenAI, its engineers were hard at work on something that they believed would transform consumer computing, and it wasn’t artificial intelligence.
-The company’s latest gaming furniture is designed to bring you closer to the action when you’re playing, and blend into your living room when you’re not.
-We and our 167 partners store and/or access information on a device, such as unique IDs in cookies to process personal data. You may accept or manage your choices by clicking below or at any time in the privacy policy page. These choices will be signaled to our partners and will not affect browsing data.More Information
-We and our partners process data to provide:
-Use precise geolocation data. Actively scan device characteristics for identification. Store and/or access information on a device. Personalised advertising and content, advertising and content measurement, audience research and services development.
\ No newline at end of file
diff --git a/examples/benchmarks/SmartScraper/.env.example b/examples/benchmarks/SmartScraper/.env.example
deleted file mode 100644
index 599a2397..00000000
--- a/examples/benchmarks/SmartScraper/.env.example
+++ /dev/null
@@ -1 +0,0 @@
-OPENAI_APIKEY="your openai key here"
\ No newline at end of file
diff --git a/examples/benchmarks/SmartScraper/Readme.md b/examples/benchmarks/SmartScraper/Readme.md
deleted file mode 100644
index 9c9f9c37..00000000
--- a/examples/benchmarks/SmartScraper/Readme.md
+++ /dev/null
@@ -1,41 +0,0 @@
-# Local models
-The two benchmark websites are:
-- Example 1: https://perinim.github.io/projects
-- Example 2: https://www.wired.com (as of 17/4/2024)
-
-Both are stored locally as .txt files so that the benchmarks do not depend on an internet connection.
-
-| Hardware | Model | Example 1 | Example 2 |
-| ---------------------- | --------------------------------------- | --------- | --------- |
| MacBook 14" M1 Pro     | Mistral on Ollama with nomic-embed-text | 16.291s   | 38.74s    |
| MacBook M2 Max         | Mistral on Ollama with nomic-embed-text |           |           |
| MacBook 14" M1 Pro     | Llama3 on Ollama with nomic-embed-text  | 12.88s    | 13.84s    |
| MacBook M2 Max         | Llama3 on Ollama with nomic-embed-text  |           |           |
-
-**Note**: the Docker examples were run only on the MacBook because performance is too slow (roughly 10 times slower than running Ollama natively). The results are the following:
-
-| Hardware | Example 1 | Example 2 |
-| ------------------ | --------- | --------- |
| MacBook 14" M1 Pro | 139.89s   | Too long  |
-# Performance on API services
-### Example 1: personal portfolio
-**URL**: https://perinim.github.io/projects
-**Task**: List me all the projects with their description.
-
-| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo | 4.132s | 438 | 303 | 135 | 1 | 0.000724 |
-| gpt-4-turbo-preview | 6.965s | 442 | 303 | 139 | 1 | 0.0072 |
| gpt-4o                          | 4.446s                   | 444          | 305           | 139               | 1                   | 0              |
| Groq with nomic-embed-text      | 1.335s                   | 648          | 482           | 166               | 1                   | 0              |
-
-### Example 2: Wired
-**URL**: https://www.wired.com
-**Task**: List me all the articles with their description.
-
-| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo | 8.836s | 1167 | 726 | 441 | 1 | 0.001971 |
-| gpt-4-turbo-preview | 21.53s | 1205 | 726 | 479 | 1 | 0.02163 |
| gpt-4o                          | 15.27s                   | 1400         | 715           | 685               | 1                   | 0              |
| Groq with nomic-embed-text      | 3.82s                    | 2459         | 2192          | 267               | 1                   | 0              |
diff --git a/examples/benchmarks/SmartScraper/benchmark_docker.py b/examples/benchmarks/SmartScraper/benchmark_docker.py
deleted file mode 100644
index e5754c4b..00000000
--- a/examples/benchmarks/SmartScraper/benchmark_docker.py
+++ /dev/null
@@ -1,51 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-files = ["inputs/example_1.txt", "inputs/example_2.txt"]
-tasks = ["List me all the projects with their description.",
- "List me all the articles with their description."]
-
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-graph_config = {
- "llm": {
- "model": "ollama/mistral",
- "temperature": 0,
- "format": "json", # Ollama needs the format to be specified explicitly
- # "model_tokens": 2000, # set context length arbitrarily
- },
- "embeddings": {
- "model": "ollama/nomic-embed-text",
- "temperature": 0,
- }
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-for i in range(0, 2):
- with open(files[i], 'r', encoding="utf-8") as file:
- text = file.read()
-
- smart_scraper_graph = SmartScraperGraph(
- prompt=tasks[i],
- source=text,
- config=graph_config
- )
-
- result = smart_scraper_graph.run()
- print(result)
- # ************************************************
- # Get graph execution info
- # ************************************************
-
- graph_exec_info = smart_scraper_graph.get_execution_info()
- print(prettify_exec_info(graph_exec_info))
diff --git a/examples/benchmarks/SmartScraper/benchmark_groq.py b/examples/benchmarks/SmartScraper/benchmark_groq.py
deleted file mode 100644
index e769ee52..00000000
--- a/examples/benchmarks/SmartScraper/benchmark_groq.py
+++ /dev/null
@@ -1,57 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-files = ["inputs/example_1.txt", "inputs/example_2.txt"]
-tasks = ["List me all the projects with their description.",
- "List me all the articles with their description."]
-
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-groq_key = os.getenv("GROQ_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "groq/gemma-7b-it",
- "api_key": groq_key,
- "temperature": 0
- },
- "embeddings": {
- "model": "ollama/nomic-embed-text",
- "temperature": 0,
- "base_url": "http://localhost:11434", # set ollama URL arbitrarily
- },
- "headless": False
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-for i in range(0, 2):
- with open(files[i], 'r', encoding="utf-8") as file:
- text = file.read()
-
- smart_scraper_graph = SmartScraperGraph(
- prompt=tasks[i],
- source=text,
- config=graph_config
- )
-
- result = smart_scraper_graph.run()
- print(result)
- # ************************************************
- # Get graph execution info
- # ************************************************
-
- graph_exec_info = smart_scraper_graph.get_execution_info()
- print(prettify_exec_info(graph_exec_info))
diff --git a/examples/benchmarks/SmartScraper/benchmark_llama3.py b/examples/benchmarks/SmartScraper/benchmark_llama3.py
deleted file mode 100644
index 2b182f20..00000000
--- a/examples/benchmarks/SmartScraper/benchmark_llama3.py
+++ /dev/null
@@ -1,53 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-files = ["inputs/example_1.txt", "inputs/example_2.txt"]
-tasks = ["List me all the projects with their description.",
- "List me all the articles with their description."]
-
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-graph_config = {
- "llm": {
- "model": "ollama/llama3",
- "temperature": 0,
- "format": "json", # Ollama needs the format to be specified explicitly
- # "model_tokens": 2000, # set context length arbitrarily
- "base_url": "http://localhost:11434",
- },
- "embeddings": {
- "model": "ollama/nomic-embed-text",
- "temperature": 0,
- "base_url": "http://localhost:11434",
- }
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-for i in range(0, 2):
- with open(files[i], 'r', encoding="utf-8") as file:
- text = file.read()
-
- smart_scraper_graph = SmartScraperGraph(
- prompt=tasks[i],
- source=text,
- config=graph_config
- )
-
- result = smart_scraper_graph.run()
- print(result)
- # ************************************************
- # Get graph execution info
- # ************************************************
-
- graph_exec_info = smart_scraper_graph.get_execution_info()
- print(prettify_exec_info(graph_exec_info))
diff --git a/examples/benchmarks/SmartScraper/benchmark_mistral.py b/examples/benchmarks/SmartScraper/benchmark_mistral.py
deleted file mode 100644
index 0e6e53e5..00000000
--- a/examples/benchmarks/SmartScraper/benchmark_mistral.py
+++ /dev/null
@@ -1,54 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-
-import os
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-files = ["inputs/example_1.txt", "inputs/example_2.txt"]
-tasks = ["List me all the projects with their description.",
- "List me all the articles with their description."]
-
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-graph_config = {
- "llm": {
- "model": "ollama/mistral",
- "temperature": 0,
- "format": "json", # Ollama needs the format to be specified explicitly
- # "model_tokens": 2000, # set context length arbitrarily
- "base_url": "http://localhost:11434",
- },
- "embeddings": {
- "model": "ollama/nomic-embed-text",
- "temperature": 0,
- "base_url": "http://localhost:11434",
- }
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-for i in range(0, 2):
- with open(files[i], 'r', encoding="utf-8") as file:
- text = file.read()
-
- smart_scraper_graph = SmartScraperGraph(
- prompt=tasks[i],
- source=text,
- config=graph_config
- )
-
- result = smart_scraper_graph.run()
- print(result)
- # ************************************************
- # Get graph execution info
- # ************************************************
-
- graph_exec_info = smart_scraper_graph.get_execution_info()
- print(prettify_exec_info(graph_exec_info))
diff --git a/examples/benchmarks/SmartScraper/benchmark_openai_gpt35.py b/examples/benchmarks/SmartScraper/benchmark_openai_gpt35.py
deleted file mode 100644
index 659d2c78..00000000
--- a/examples/benchmarks/SmartScraper/benchmark_openai_gpt35.py
+++ /dev/null
@@ -1,52 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-load_dotenv()
-
-# ************************************************
-# Read the text file
-# ************************************************
-files = ["inputs/example_1.txt", "inputs/example_2.txt"]
-tasks = ["List me all the projects with their description.",
- "List me all the articles with their description."]
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-openai_key = os.getenv("OPENAI_APIKEY")
-
-graph_config = {
- "llm": {
- "api_key": openai_key,
- "model": "openai/gpt-3.5-turbo",
- },
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-for i in range(0, 2):
- with open(files[i], 'r', encoding="utf-8") as file:
- text = file.read()
-
- smart_scraper_graph = SmartScraperGraph(
- prompt=tasks[i],
- source=text,
- config=graph_config
- )
-
- result = smart_scraper_graph.run()
- print(result)
- # ************************************************
- # Get graph execution info
- # ************************************************
-
- graph_exec_info = smart_scraper_graph.get_execution_info()
- print(prettify_exec_info(graph_exec_info))
diff --git a/examples/benchmarks/SmartScraper/benchmark_openai_gpt4.py b/examples/benchmarks/SmartScraper/benchmark_openai_gpt4.py
deleted file mode 100644
index a23901a9..00000000
--- a/examples/benchmarks/SmartScraper/benchmark_openai_gpt4.py
+++ /dev/null
@@ -1,53 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-load_dotenv()
-
-# ************************************************
-# Read the text file
-# ************************************************
-files = ["inputs/example_1.txt", "inputs/example_2.txt"]
-tasks = ["List me all the projects with their description.",
- "List me all the articles with their description."]
-
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-openai_key = os.getenv("OPENAI_APIKEY")
-
-graph_config = {
- "llm": {
- "api_key": openai_key,
- "model": "openai/gpt-4-turbo",
- },
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-for i in range(0, 2):
- with open(files[i], 'r', encoding="utf-8") as file:
- text = file.read()
-
- smart_scraper_graph = SmartScraperGraph(
- prompt=tasks[i],
- source=text,
- config=graph_config
- )
-
- result = smart_scraper_graph.run()
- print(result)
- # ************************************************
- # Get graph execution info
- # ************************************************
-
- graph_exec_info = smart_scraper_graph.get_execution_info()
- print(prettify_exec_info(graph_exec_info))
diff --git a/examples/benchmarks/SmartScraper/benchmark_openai_gpt4o.py b/examples/benchmarks/SmartScraper/benchmark_openai_gpt4o.py
deleted file mode 100644
index 8b2da6d7..00000000
--- a/examples/benchmarks/SmartScraper/benchmark_openai_gpt4o.py
+++ /dev/null
@@ -1,53 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-load_dotenv()
-
-# ************************************************
-# Read the text file
-# ************************************************
-files = ["inputs/example_1.txt", "inputs/example_2.txt"]
-tasks = ["List me all the projects with their description.",
- "List me all the articles with their description."]
-
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-openai_key = os.getenv("OPENAI_APIKEY")
-
-graph_config = {
- "llm": {
- "api_key": openai_key,
- "model": "openai/gpt-4o",
- },
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-for i in range(0, 2):
- with open(files[i], 'r', encoding="utf-8") as file:
- text = file.read()
-
- smart_scraper_graph = SmartScraperGraph(
- prompt=tasks[i],
- source=text,
- config=graph_config
- )
-
- result = smart_scraper_graph.run()
- print(result)
- # ************************************************
- # Get graph execution info
- # ************************************************
-
- graph_exec_info = smart_scraper_graph.get_execution_info()
- print(prettify_exec_info(graph_exec_info))
diff --git a/examples/benchmarks/SmartScraper/inputs/example_1.txt b/examples/benchmarks/SmartScraper/inputs/example_1.txt
deleted file mode 100644
index 78f814ae..00000000
--- a/examples/benchmarks/SmartScraper/inputs/example_1.txt
+++ /dev/null
@@ -1,105 +0,0 @@
-
-
-
-
-
-
-
-
-
\ No newline at end of file
diff --git a/examples/benchmarks/SmartScraper/inputs/example_2.txt b/examples/benchmarks/SmartScraper/inputs/example_2.txt
deleted file mode 100644
index b7810eed..00000000
--- a/examples/benchmarks/SmartScraper/inputs/example_2.txt
+++ /dev/null
@@ -1,400 +0,0 @@
-WIRED - The Latest in Technology, Science, Culture and Business | WIRED
-Cyber Army of Russia Reborn, a group with ties to the Kremlin’s Sandworm unit, is crossing lines even that notorious cyberwarfare unit wouldn’t dare to.
-Originally published February 2015: More than five years before Microsoft invested its first $1 billion in OpenAI, its engineers were hard at work on something that they believed would transform consumer computing, and it wasn’t artificial intelligence.
-The company’s latest gaming furniture is designed to bring you closer to the action when you’re playing, and blend into your living room when you’re not.
-We and our 167 partners store and/or access information on a device, such as unique IDs in cookies to process personal data. You may accept or manage your choices by clicking below or at any time in the privacy policy page. These choices will be signaled to our partners and will not affect browsing data.More Information
-We and our partners process data to provide:
-Use precise geolocation data. Actively scan device characteristics for identification. Store and/or access information on a device. Personalised advertising and content, advertising and content measurement, audience research and services development.
- WIRED - The Latest in Technology, Science, Culture and Business | WIRED
-
-Cyber Army of Russia Reborn, a group with ties to the Kremlin’s Sandworm unit, is crossing lines even that notorious cyberwarfare unit wouldn’t dare to.
-Originally published February 2015: More than five years before Microsoft invested its first $1 billion in OpenAI, its engineers were hard at work on something that they believed would transform consumer computing, and it wasn’t artificial intelligence.
-The company’s latest gaming furniture is designed to bring you closer to the action when you’re playing, and blend into your living room when you’re not.
-We and our 167 partners store and/or access information on a device, such as unique IDs in cookies to process personal data. You may accept or manage your choices by clicking below or at any time in the privacy policy page. These choices will be signaled to our partners and will not affect browsing data.More Information
-We and our partners process data to provide:
-Use precise geolocation data. Actively scan device characteristics for identification. Store and/or access information on a device. Personalised advertising and content, advertising and content measurement, audience research and services development.
-WIRED - The Latest in Technology, Science, Culture and Business | WIRED
-Cyber Army of Russia Reborn, a group with ties to the Kremlin’s Sandworm unit, is crossing lines even that notorious cyberwarfare unit wouldn’t dare to.
-Originally published February 2015: More than five years before Microsoft invested its first $1 billion in OpenAI, its engineers were hard at work on something that they believed would transform consumer computing, and it wasn’t artificial intelligence.
-The company’s latest gaming furniture is designed to bring you closer to the action when you’re playing, and blend into your living room when you’re not.
-We and our 167 partners store and/or access information on a device, such as unique IDs in cookies to process personal data. You may accept or manage your choices by clicking below or at any time in the privacy policy page. These choices will be signaled to our partners and will not affect browsing data.More Information
-We and our partners process data to provide:
-Use precise geolocation data. Actively scan device characteristics for identification. Store and/or access information on a device. Personalised advertising and content, advertising and content measurement, audience research and services development.
\ No newline at end of file
diff --git a/examples/benchmarks/readme.md b/examples/benchmarks/readme.md
deleted file mode 100644
index ca672ad0..00000000
--- a/examples/benchmarks/readme.md
+++ /dev/null
@@ -1,4 +0,0 @@
-These two subfolders contain the scripts and performance documents for the two graphs used by the scrapers.
-In particular:
-* __GenerateScraper__: contains the benchmarks for the GenerateScraper class
-* __SmartScraper__: contains the benchmarks for the SmartScraper class
\ No newline at end of file
diff --git a/examples/code_generator_graph/.env.example b/examples/code_generator_graph/.env.example
new file mode 100644
index 00000000..a93912e4
--- /dev/null
+++ b/examples/code_generator_graph/.env.example
@@ -0,0 +1,14 @@
+# OpenAI API Configuration
+OPENAI_API_KEY=your-openai-api-key-here
+
+# Optional Configurations
+MAX_TOKENS=4000
+MODEL_NAME=gpt-4-1106-preview
+TEMPERATURE=0.7
+
+# Code Generator Settings
+DEFAULT_LANGUAGE=python
+GENERATE_TESTS=true
+ADD_DOCUMENTATION=true
+CODE_STYLE=pep8
+TYPE_CHECKING=true
\ No newline at end of file
diff --git a/examples/code_generator_graph/README.md b/examples/code_generator_graph/README.md
new file mode 100644
index 00000000..bc4b5dec
--- /dev/null
+++ b/examples/code_generator_graph/README.md
@@ -0,0 +1,51 @@
+# Code Generator Graph Example
+
+This example demonstrates how to use Scrapegraph-ai to generate code based on specifications and requirements.
+
+## Features
+
+- Code generation from specifications
+- Multiple programming languages support
+- Code documentation
+- Best practices implementation
+
+## Setup
+
+1. Install required dependencies
+2. Copy `.env.example` to `.env`
+3. Configure your API keys in the `.env` file
+
+## Usage
+
+```python
+from scrapegraphai.graphs import CodeGeneratorGraph
+
+graph = CodeGeneratorGraph(
+    prompt="List me all the projects with their description",
+    source="https://example.com/projects",
+    config=graph_config,
+)
+result = graph.run()
+```
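+
+A minimal `graph_config` sketch, assuming the key is loaded from `.env` (the model name is illustrative):
+
+```python
+import os
+
+from dotenv import load_dotenv
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("OPENAI_API_KEY"),
+        "model": "openai/gpt-4o",
+    },
+}
+```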
+
+## Environment Variables
+
+Required environment variables:
+- `OPENAI_API_KEY`: Your OpenAI API key
\ No newline at end of file
diff --git a/examples/local_models/code_generator_graph_ollama.py b/examples/code_generator_graph/ollama/code_generator_graph_ollama.py
similarity index 100%
rename from examples/local_models/code_generator_graph_ollama.py
rename to examples/code_generator_graph/ollama/code_generator_graph_ollama.py
diff --git a/examples/openai/code_generator_graph_openai.py b/examples/code_generator_graph/openai/code_generator_graph_openai.py
similarity index 100%
rename from examples/openai/code_generator_graph_openai.py
rename to examples/code_generator_graph/openai/code_generator_graph_openai.py
diff --git a/examples/csv_scraper_graph/.env.example b/examples/csv_scraper_graph/.env.example
new file mode 100644
index 00000000..1917f9aa
--- /dev/null
+++ b/examples/csv_scraper_graph/.env.example
@@ -0,0 +1,11 @@
+# OpenAI API Configuration
+OPENAI_API_KEY=your-openai-api-key-here
+
+# Optional Configurations
+MAX_TOKENS=4000
+MODEL_NAME=gpt-4-1106-preview
+TEMPERATURE=0.7
+
+# CSV Scraper Settings
+CSV_DELIMITER=,
+MAX_ROWS=1000
\ No newline at end of file
diff --git a/examples/csv_scraper_graph/README.md b/examples/csv_scraper_graph/README.md
new file mode 100644
index 00000000..d39858b0
--- /dev/null
+++ b/examples/csv_scraper_graph/README.md
@@ -0,0 +1,47 @@
+# CSV Scraper Graph Example
+
+This example demonstrates how to use Scrapegraph-ai to extract data from web sources and save it in CSV format.
+
+## Features
+
+- Table data extraction
+- CSV formatting
+- Data cleaning
+- Structured output
+
+## Setup
+
+1. Install required dependencies
+2. Copy `.env.example` to `.env`
+3. Configure your API keys in the `.env` file
+
+## Usage
+
+```python
+from scrapegraphai.graphs import CSVScraperGraph
+
+graph = CSVScraperGraph(
+    prompt="List me all the last names",
+    source=csv_text,  # the CSV content as a string
+    config=graph_config,  # LLM settings, as in the example scripts
+)
+result = graph.run()
+```
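+
+The graph takes the CSV content as a string; a loading sketch mirroring the bundled example scripts:
+
+```python
+import os
+
+FILE_NAME = "inputs/username.csv"
+curr_dir = os.path.dirname(os.path.realpath(__file__))
+file_path = os.path.join(curr_dir, FILE_NAME)
+
+with open(file_path, "r") as file:
+    csv_text = file.read()
+```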
+
+## Environment Variables
+
+Required environment variables:
+- `OPENAI_API_KEY`: Your OpenAI API key
\ No newline at end of file
diff --git a/examples/local_models/csv_scraper_graph_multi_ollama.py b/examples/csv_scraper_graph/ollama/csv_scraper_graph_multi_ollama.py
similarity index 86%
rename from examples/local_models/csv_scraper_graph_multi_ollama.py
rename to examples/csv_scraper_graph/ollama/csv_scraper_graph_multi_ollama.py
index fb6bce51..558a876f 100644
--- a/examples/local_models/csv_scraper_graph_multi_ollama.py
+++ b/examples/csv_scraper_graph/ollama/csv_scraper_graph_multi_ollama.py
@@ -3,9 +3,9 @@
"""
import os
-import pandas as pd
+
from scrapegraphai.graphs import CSVScraperMultiGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
+from scrapegraphai.utils import prettify_exec_info
# ************************************************
# Read the CSV file
@@ -15,7 +15,8 @@
curr_dir = os.path.dirname(os.path.realpath(__file__))
file_path = os.path.join(curr_dir, FILE_NAME)
-text = pd.read_csv(file_path)
+with open(file_path, "r") as file:
+ text = file.read()
# ************************************************
# Define the configuration for the graph
@@ -44,7 +45,7 @@
csv_scraper_graph = CSVScraperMultiGraph(
prompt="List me all the last names",
source=[str(text), str(text)],
- config=graph_config
+ config=graph_config,
)
result = csv_scraper_graph.run()
@@ -56,7 +57,3 @@
graph_exec_info = csv_scraper_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/local_models/csv_scraper_ollama.py b/examples/csv_scraper_graph/ollama/csv_scraper_ollama.py
similarity index 86%
rename from examples/local_models/csv_scraper_ollama.py
rename to examples/csv_scraper_graph/ollama/csv_scraper_ollama.py
index 8d1edbd7..d6e6eab2 100644
--- a/examples/local_models/csv_scraper_ollama.py
+++ b/examples/csv_scraper_graph/ollama/csv_scraper_ollama.py
@@ -3,9 +3,9 @@
"""
import os
-import pandas as pd
+
from scrapegraphai.graphs import CSVScraperGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
+from scrapegraphai.utils import prettify_exec_info
# ************************************************
# Read the CSV file
@@ -15,7 +15,8 @@
curr_dir = os.path.dirname(os.path.realpath(__file__))
file_path = os.path.join(curr_dir, FILE_NAME)
-text = pd.read_csv(file_path)
+with open(file_path, "r") as file:
+ text = file.read()
# ************************************************
# Define the configuration for the graph
@@ -44,7 +45,7 @@
csv_scraper_graph = CSVScraperGraph(
prompt="List me all the last names",
source=str(text), # Pass the content of the file, not the file object
- config=graph_config
+ config=graph_config,
)
result = csv_scraper_graph.run()
@@ -56,7 +57,3 @@
graph_exec_info = csv_scraper_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/anthropic/inputs/username.csv b/examples/csv_scraper_graph/ollama/inputs/username.csv
similarity index 100%
rename from examples/anthropic/inputs/username.csv
rename to examples/csv_scraper_graph/ollama/inputs/username.csv
diff --git a/examples/openai/csv_scraper_graph_multi_openai.py b/examples/csv_scraper_graph/openai/csv_scraper_graph_multi_openai.py
similarity index 83%
rename from examples/openai/csv_scraper_graph_multi_openai.py
rename to examples/csv_scraper_graph/openai/csv_scraper_graph_multi_openai.py
index 6ed33c90..b7bc83ae 100644
--- a/examples/openai/csv_scraper_graph_multi_openai.py
+++ b/examples/csv_scraper_graph/openai/csv_scraper_graph_multi_openai.py
@@ -1,11 +1,13 @@
"""
Basic example of scraping pipeline using CSVScraperMultiGraph from CSV documents
"""
+
import os
+
from dotenv import load_dotenv
-import pandas as pd
+
from scrapegraphai.graphs import CSVScraperMultiGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
+from scrapegraphai.utils import prettify_exec_info
load_dotenv()
# ************************************************
@@ -16,7 +18,8 @@
curr_dir = os.path.dirname(os.path.realpath(__file__))
file_path = os.path.join(curr_dir, FILE_NAME)
-text = pd.read_csv(file_path)
+with open(file_path, "r") as file:
+ text = file.read()
# ************************************************
# Define the configuration for the graph
@@ -24,7 +27,7 @@
openai_key = os.getenv("OPENAI_APIKEY")
graph_config = {
- "llm": {
+ "llm": {
"api_key": openai_key,
"model": "openai/gpt-4o",
},
@@ -37,7 +40,7 @@
csv_scraper_graph = CSVScraperMultiGraph(
prompt="List me all the last names",
source=[str(text), str(text)],
- config=graph_config
+ config=graph_config,
)
result = csv_scraper_graph.run()
@@ -49,7 +52,3 @@
graph_exec_info = csv_scraper_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/openai/csv_scraper_openai.py b/examples/csv_scraper_graph/openai/csv_scraper_openai.py
similarity index 84%
rename from examples/openai/csv_scraper_openai.py
rename to examples/csv_scraper_graph/openai/csv_scraper_openai.py
index d9527b86..a0abd714 100644
--- a/examples/openai/csv_scraper_openai.py
+++ b/examples/csv_scraper_graph/openai/csv_scraper_openai.py
@@ -1,11 +1,13 @@
"""
Basic example of scraping pipeline using CSVScraperGraph from CSV documents
"""
+
import os
+
from dotenv import load_dotenv
-import pandas as pd
+
from scrapegraphai.graphs import CSVScraperGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
+from scrapegraphai.utils import prettify_exec_info
load_dotenv()
@@ -17,7 +19,8 @@
curr_dir = os.path.dirname(os.path.realpath(__file__))
file_path = os.path.join(curr_dir, FILE_NAME)
-text = pd.read_csv(file_path)
+with open(file_path, "r") as file:
+ text = file.read()
# ************************************************
# Define the configuration for the graph
@@ -39,7 +42,7 @@
csv_scraper_graph = CSVScraperGraph(
prompt="List me all the last names",
source=str(text), # Pass the content of the file, not the file object
- config=graph_config
+ config=graph_config,
)
result = csv_scraper_graph.run()
@@ -51,7 +54,3 @@
graph_exec_info = csv_scraper_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/azure/inputs/username.csv b/examples/csv_scraper_graph/openai/inputs/username.csv
similarity index 100%
rename from examples/azure/inputs/username.csv
rename to examples/csv_scraper_graph/openai/inputs/username.csv
diff --git a/examples/custom_graph/.env.example b/examples/custom_graph/.env.example
new file mode 100644
index 00000000..9eac4cb8
--- /dev/null
+++ b/examples/custom_graph/.env.example
@@ -0,0 +1,13 @@
+# OpenAI API Configuration
+OPENAI_API_KEY=your-openai-api-key-here
+
+# Optional Configurations
+MAX_TOKENS=4000
+MODEL_NAME=gpt-4-1106-preview
+TEMPERATURE=0.7
+
+# Custom Graph Settings
+CUSTOM_NODE_TIMEOUT=30
+MAX_NODES=10
+DEBUG_MODE=false
+LOG_LEVEL=info
\ No newline at end of file
diff --git a/examples/custom_graph/README.md b/examples/custom_graph/README.md
new file mode 100644
index 00000000..e6d3b88a
--- /dev/null
+++ b/examples/custom_graph/README.md
@@ -0,0 +1,49 @@
+# Custom Graph Example
+
+This example demonstrates how to create and implement custom graphs using Scrapegraph-ai.
+
+## Features
+
+- Custom node creation
+- Graph customization
+- Pipeline configuration
+- Custom data processing
+
+## Setup
+
+1. Install required dependencies
+2. Copy `.env.example` to `.env`
+3. Configure your API keys in the `.env` file
+
+## Usage
+
+```python
+from scrapegraphai.graphs import BaseGraph
+
+# assemble the pipeline from your own nodes (created as sketched below)
+graph = BaseGraph(
+    nodes=[fetch_node, parse_node, generate_answer_node],
+    edges=[(fetch_node, parse_node), (parse_node, generate_answer_node)],
+    entry_point=fetch_node,
+)
+result, execution_info = graph.execute({"user_prompt": "...", "url": "..."})
+```
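+
+The nodes referenced above come from the `scrapegraphai.nodes` package; a sketch assuming the node types used in the bundled ollama example (the `llm_model` object must be created first):
+
+```python
+from scrapegraphai.nodes import FetchNode, GenerateAnswerNode, ParseNode
+
+fetch_node = FetchNode(input="url | local_dir", output=["doc"])
+parse_node = ParseNode(input="doc", output=["parsed_doc"])
+generate_answer_node = GenerateAnswerNode(
+    input="user_prompt & parsed_doc",
+    output=["answer"],
+    node_config={"llm_model": llm_model},
+)
+```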
+
+## Environment Variables
+
+Required environment variables:
+- `OPENAI_API_KEY`: Your OpenAI API key
\ No newline at end of file
diff --git a/examples/local_models/custom_graph_ollama.py b/examples/custom_graph/ollama/custom_graph_ollama.py
similarity index 100%
rename from examples/local_models/custom_graph_ollama.py
rename to examples/custom_graph/ollama/custom_graph_ollama.py
diff --git a/examples/openai/custom_graph_openai.py b/examples/custom_graph/openai/custom_graph_openai.py
similarity index 100%
rename from examples/openai/custom_graph_openai.py
rename to examples/custom_graph/openai/custom_graph_openai.py
diff --git a/examples/deepseek/.env.example b/examples/deepseek/.env.example
deleted file mode 100644
index 37511138..00000000
--- a/examples/deepseek/.env.example
+++ /dev/null
@@ -1 +0,0 @@
-DEEPSEEK_APIKEY="your api key"
\ No newline at end of file
diff --git a/examples/deepseek/code_generator_graph_deepseek.py b/examples/deepseek/code_generator_graph_deepseek.py
deleted file mode 100644
index f78a42b6..00000000
--- a/examples/deepseek/code_generator_graph_deepseek.py
+++ /dev/null
@@ -1,59 +0,0 @@
-"""
-Basic example of scraping pipeline using Code Generator with schema
-"""
-import os
-from typing import List
-from dotenv import load_dotenv
-from pydantic import BaseModel, Field
-from scrapegraphai.graphs import CodeGeneratorGraph
-
-load_dotenv()
-
-# ************************************************
-# Define the output schema for the graph
-# ************************************************
-
-class Project(BaseModel):
- title: str = Field(description="The title of the project")
- description: str = Field(description="The description of the project")
-
-class Projects(BaseModel):
- projects: List[Project]
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
- "headless": False,
- "reduction": 2,
- "max_iterations": {
- "overall": 10,
- "syntax": 3,
- "execution": 3,
- "validation": 3,
- "semantic": 3
- },
- "output_file_name": "extracted_data.py"
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-code_generator_graph = CodeGeneratorGraph(
- prompt="List me all the projects with their description",
- source="https://perinim.github.io/projects/",
- schema=Projects,
- config=graph_config
-)
-
-result = code_generator_graph.run()
-print(result)
diff --git a/examples/deepseek/csv_scraper_deepseek.py b/examples/deepseek/csv_scraper_deepseek.py
deleted file mode 100644
index 6ef0ac92..00000000
--- a/examples/deepseek/csv_scraper_deepseek.py
+++ /dev/null
@@ -1,56 +0,0 @@
-"""
-Basic example of scraping pipeline using CSVScraperGraph from CSV documents
-"""
-import os
-from dotenv import load_dotenv
-import pandas as pd
-from scrapegraphai.graphs import CSVScraperGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
-load_dotenv()
-
-# ************************************************
-# Read the CSV file
-# ************************************************
-
-FILE_NAME = "inputs/username.csv"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
-text = pd.read_csv(file_path)
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-# ************************************************
-# Create the CSVScraperGraph instance and run it
-# ************************************************
-
-csv_scraper_graph = CSVScraperGraph(
- prompt="List me all the last names",
- source=str(text), # Pass the content of the file, not the file object
- config=graph_config
-)
-
-result = csv_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = csv_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/deepseek/csv_scraper_graph_multi_deepseek.py b/examples/deepseek/csv_scraper_graph_multi_deepseek.py
deleted file mode 100644
index 95474360..00000000
--- a/examples/deepseek/csv_scraper_graph_multi_deepseek.py
+++ /dev/null
@@ -1,56 +0,0 @@
-"""
-Basic example of scraping pipeline using CSVScraperMultiGraph from CSV documents
-"""
-import os
-from dotenv import load_dotenv
-import pandas as pd
-from scrapegraphai.graphs import CSVScraperMultiGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
-
-load_dotenv()
-# ************************************************
-# Read the CSV file
-# ************************************************
-
-FILE_NAME = "inputs/username.csv"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
-text = pd.read_csv(file_path)
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-# ************************************************
-# Create the CSVScraperMultiGraph instance and run it
-# ************************************************
-
-csv_scraper_graph = CSVScraperMultiGraph(
- prompt="List me all the last names",
- source=[str(text), str(text)],
- config=graph_config
-)
-
-result = csv_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = csv_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/deepseek/depth_search_graph_deepseek.py b/examples/deepseek/depth_search_graph_deepseek.py
deleted file mode 100644
index 064690a5..00000000
--- a/examples/deepseek/depth_search_graph_deepseek.py
+++ /dev/null
@@ -1,30 +0,0 @@
-"""
-depth_search_graph_deepseek example
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import DepthSearchGraph
-
-load_dotenv()
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
- "headless": False,
- "depth": 2,
- "only_inside_links": False,
-}
-
-search_graph = DepthSearchGraph(
- prompt="List me all the projects with their description",
- source="https://perinim.github.io",
- config=graph_config
-)
-
-result = search_graph.run()
-print(result)
diff --git a/examples/deepseek/document_scraper_deepseek.py b/examples/deepseek/document_scraper_deepseek.py
deleted file mode 100644
index e94826d3..00000000
--- a/examples/deepseek/document_scraper_deepseek.py
+++ /dev/null
@@ -1,44 +0,0 @@
-"""
-document_scraper example
-"""
-import os
-import json
-from dotenv import load_dotenv
-from scrapegraphai.graphs import DocumentScraperGraph
-
-load_dotenv()
-
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-
-source = """
- The Divine Comedy, Italian La Divina Commedia, original name La commedia, long narrative poem written in Italian
-    circa 1308/21 by Dante. It is usually held to be one of the world's great works of literature.
- Divided into three major sections—Inferno, Purgatorio, and Paradiso—the narrative traces the journey of Dante
- from darkness and error to the revelation of the divine light, culminating in the Beatific Vision of God.
- Dante is guided by the Roman poet Virgil, who represents the epitome of human knowledge, from the dark wood
- through the descending circles of the pit of Hell (Inferno). He then climbs the mountain of Purgatory, guided
- by the Roman poet Statius, who represents the fulfilment of human knowledge, and is finally led by his lifelong love,
- the Beatrice of his earlier poetry, through the celestial spheres of Paradise.
-"""
-
-pdf_scraper_graph = DocumentScraperGraph(
- prompt="Summarize the text and find the main topics",
- source=source,
- config=graph_config,
-)
-result = pdf_scraper_graph.run()
-
-print(json.dumps(result, indent=4))
\ No newline at end of file
diff --git a/examples/deepseek/inputs/books.xml b/examples/deepseek/inputs/books.xml
deleted file mode 100644
index e3d1fe87..00000000
--- a/examples/deepseek/inputs/books.xml
+++ /dev/null
@@ -1,120 +0,0 @@
-
-
-
- Gambardella, Matthew
- XML Developer's Guide
- Computer
- 44.95
- 2000-10-01
- An in-depth look at creating applications
- with XML.
-
-
- Ralls, Kim
- Midnight Rain
- Fantasy
- 5.95
- 2000-12-16
- A former architect battles corporate zombies,
- an evil sorceress, and her own childhood to become queen
- of the world.
-
-
- Corets, Eva
- Maeve Ascendant
- Fantasy
- 5.95
- 2000-11-17
- After the collapse of a nanotechnology
- society in England, the young survivors lay the
- foundation for a new society.
-
-
- Corets, Eva
- Oberon's Legacy
- Fantasy
- 5.95
- 2001-03-10
- In post-apocalypse England, the mysterious
- agent known only as Oberon helps to create a new life
- for the inhabitants of London. Sequel to Maeve
- Ascendant.
-
-
- Corets, Eva
- The Sundered Grail
- Fantasy
- 5.95
- 2001-09-10
- The two daughters of Maeve, half-sisters,
- battle one another for control of England. Sequel to
- Oberon's Legacy.
-
-
- Randall, Cynthia
- Lover Birds
- Romance
- 4.95
- 2000-09-02
- When Carla meets Paul at an ornithology
- conference, tempers fly as feathers get ruffled.
-
-
- Thurman, Paula
- Splish Splash
- Romance
- 4.95
- 2000-11-02
- A deep sea diver finds true love twenty
- thousand leagues beneath the sea.
-
-
- Knorr, Stefan
- Creepy Crawlies
- Horror
- 4.95
- 2000-12-06
- An anthology of horror stories about roaches,
- centipedes, scorpions and other insects.
-
-
- Kress, Peter
- Paradox Lost
- Science Fiction
- 6.95
- 2000-11-02
- After an inadvertant trip through a Heisenberg
- Uncertainty Device, James Salway discovers the problems
- of being quantum.
-
-
- O'Brien, Tim
- Microsoft .NET: The Programming Bible
- Computer
- 36.95
- 2000-12-09
- Microsoft's .NET initiative is explored in
- detail in this deep programmer's reference.
-
-
- O'Brien, Tim
- MSXML3: A Comprehensive Guide
- Computer
- 36.95
- 2000-12-01
- The Microsoft MSXML3 parser is covered in
- detail, with attention to XML DOM interfaces, XSLT processing,
- SAX and more.
-
-
- Galos, Mike
- Visual Studio 7: A Comprehensive Guide
- Computer
- 49.95
- 2001-04-16
- Microsoft Visual Studio 7 is explored in depth,
- looking at how Visual Basic, Visual C++, C#, and ASP+ are
- integrated into a comprehensive development
- environment.
-
-
\ No newline at end of file
diff --git a/examples/deepseek/inputs/example.json b/examples/deepseek/inputs/example.json
deleted file mode 100644
index 2263184c..00000000
--- a/examples/deepseek/inputs/example.json
+++ /dev/null
@@ -1,182 +0,0 @@
-{
- "kind":"youtube#searchListResponse",
- "etag":"q4ibjmYp1KA3RqMF4jFLl6PBwOg",
- "nextPageToken":"CAUQAA",
- "regionCode":"NL",
- "pageInfo":{
- "totalResults":1000000,
- "resultsPerPage":5
- },
- "items":[
- {
- "kind":"youtube#searchResult",
- "etag":"QCsHBifbaernVCbLv8Cu6rAeaDQ",
- "id":{
- "kind":"youtube#video",
- "videoId":"TvWDY4Mm5GM"
- },
- "snippet":{
- "publishedAt":"2023-07-24T14:15:01Z",
- "channelId":"UCwozCpFp9g9x0wAzuFh0hwQ",
- "title":"3 Football Clubs Kylian Mbappe Should Avoid Signing ✍️❌⚽️ #football #mbappe #shorts",
- "description":"",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/TvWDY4Mm5GM/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/TvWDY4Mm5GM/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/TvWDY4Mm5GM/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"FC Motivate",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T14:15:01Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"0NG5QHdtIQM_V-DBJDEf-jK_Y9k",
- "id":{
- "kind":"youtube#video",
- "videoId":"aZM_42CcNZ4"
- },
- "snippet":{
- "publishedAt":"2023-07-24T16:09:27Z",
- "channelId":"UCM5gMM_HqfKHYIEJ3lstMUA",
- "title":"Which Football Club Could Cristiano Ronaldo Afford To Buy? 💰",
- "description":"Sign up to Sorare and get a FREE card: https://sorare.pxf.io/NellisShorts Give Soraredata a go for FREE: ...",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/aZM_42CcNZ4/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/aZM_42CcNZ4/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/aZM_42CcNZ4/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"John Nellis",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T16:09:27Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"WbBz4oh9I5VaYj91LjeJvffrBVY",
- "id":{
- "kind":"youtube#video",
- "videoId":"wkP3XS3aNAY"
- },
- "snippet":{
- "publishedAt":"2023-07-24T16:00:50Z",
- "channelId":"UC4EP1dxFDPup_aFLt0ElsDw",
- "title":"PAULO DYBALA vs THE WORLD'S LONGEST FREEKICK WALL",
- "description":"Can Paulo Dybala curl a football around the World's longest free kick wall? We met up with the World Cup winner and put him to ...",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/wkP3XS3aNAY/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/wkP3XS3aNAY/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/wkP3XS3aNAY/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"Shoot for Love",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T16:00:50Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"juxv_FhT_l4qrR05S1QTrb4CGh8",
- "id":{
- "kind":"youtube#video",
- "videoId":"rJkDZ0WvfT8"
- },
- "snippet":{
- "publishedAt":"2023-07-24T10:00:39Z",
- "channelId":"UCO8qj5u80Ga7N_tP3BZWWhQ",
- "title":"TOP 10 DEFENDERS 2023",
- "description":"SoccerKingz https://soccerkingz.nl Use code: 'ILOVEHOF' to get 10% off. TOP 10 DEFENDERS 2023 Follow us! • Instagram ...",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/rJkDZ0WvfT8/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/rJkDZ0WvfT8/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/rJkDZ0WvfT8/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"Home of Football",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T10:00:39Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"wtuknXTmI1txoULeH3aWaOuXOow",
- "id":{
- "kind":"youtube#video",
- "videoId":"XH0rtu4U6SE"
- },
- "snippet":{
- "publishedAt":"2023-07-21T16:30:05Z",
- "channelId":"UCwozCpFp9g9x0wAzuFh0hwQ",
- "title":"3 Things You Didn't Know About Erling Haaland ⚽️🇳🇴 #football #haaland #shorts",
- "description":"",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/XH0rtu4U6SE/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/XH0rtu4U6SE/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/XH0rtu4U6SE/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"FC Motivate",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-21T16:30:05Z"
- }
- }
- ]
-}
\ No newline at end of file
diff --git a/examples/deepseek/inputs/username.csv b/examples/deepseek/inputs/username.csv
deleted file mode 100644
index 006ac8e6..00000000
--- a/examples/deepseek/inputs/username.csv
+++ /dev/null
@@ -1,7 +0,0 @@
-Username; Identifier;First name;Last name
-booker12;9012;Rachel;Booker
-grey07;2070;Laura;Grey
-johnson81;4081;Craig;Johnson
-jenkins46;9346;Mary;Jenkins
-smith79;5079;Jamie;Smith
-
diff --git a/examples/deepseek/json_scraper_deepseek.py b/examples/deepseek/json_scraper_deepseek.py
deleted file mode 100644
index d714c1db..00000000
--- a/examples/deepseek/json_scraper_deepseek.py
+++ /dev/null
@@ -1,46 +0,0 @@
-"""
-Basic example of scraping pipeline using JSONScraperGraph from JSON documents
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import JSONScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Read the JSON file
-# ************************************************
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-FILE_NAME = "inputs/example.json"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
-with open(file_path, 'r', encoding="utf-8") as file:
- text = file.read()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-# ************************************************
-# Create the JSONScraperGraph instance and run it
-# ************************************************
-
-json_scraper_graph = JSONScraperGraph(
- prompt="List me all the authors, title and genres of the books",
- source=text, # Pass the content of the file, not the file object
- config=graph_config
-)
-
-result = json_scraper_graph.run()
-print(result)
diff --git a/examples/deepseek/json_scraper_multi_deepseek.py b/examples/deepseek/json_scraper_multi_deepseek.py
deleted file mode 100644
index 893937cd..00000000
--- a/examples/deepseek/json_scraper_multi_deepseek.py
+++ /dev/null
@@ -1,37 +0,0 @@
-"""
-Module for showing how JSONScraperMultiGraph multi works
-"""
-import os
-import json
-from dotenv import load_dotenv
-from scrapegraphai.graphs import JSONScraperMultiGraph
-
-load_dotenv()
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-FILE_NAME = "inputs/example.json"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
-with open(file_path, 'r', encoding="utf-8") as file:
- text = file.read()
-
-sources = [text, text]
-
-multiple_search_graph = JSONScraperMultiGraph(
- prompt= "List me all the authors, title and genres of the books",
- source= sources,
- schema=None,
- config=graph_config
-)
-
-result = multiple_search_graph.run()
-print(json.dumps(result, indent=4))
diff --git a/examples/deepseek/rate_limit_deepseek.py b/examples/deepseek/rate_limit_deepseek.py
deleted file mode 100644
index 16781f39..00000000
--- a/examples/deepseek/rate_limit_deepseek.py
+++ /dev/null
@@ -1,47 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper with a custom rate limit
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- "rate_limit": {
- "requests_per_second": 1
- }
- },
- "verbose": True,
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-smart_scraper_graph = SmartScraperGraph(
- prompt="List me all the projects with their description.",
- # also accepts a string with the already downloaded HTML code
- source="https://perinim.github.io/projects/",
- config=graph_config
-)
-
-result = smart_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = smart_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
diff --git a/examples/deepseek/scrape_plain_text_deepseek.py b/examples/deepseek/scrape_plain_text_deepseek.py
deleted file mode 100644
index 2b243d35..00000000
--- a/examples/deepseek/scrape_plain_text_deepseek.py
+++ /dev/null
@@ -1,54 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper from text
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Read the text file
-# ************************************************
-
-FILE_NAME = "inputs/plain_html_example.txt"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
- # This could also be an HTTP request using the requests module
-with open(file_path, 'r', encoding="utf-8") as file:
- text = file.read()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-smart_scraper_graph = SmartScraperGraph(
- prompt="List me all the news with their description.",
- source=text,
- config=graph_config
-)
-
-result = smart_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = smart_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
diff --git a/examples/deepseek/script_generator_deepseek.py b/examples/deepseek/script_generator_deepseek.py
deleted file mode 100644
index 899c7a35..00000000
--- a/examples/deepseek/script_generator_deepseek.py
+++ /dev/null
@@ -1,44 +0,0 @@
-"""
-Basic example of scraping pipeline using ScriptCreatorGraph
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import ScriptCreatorGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "library": "beautifulsoup"
-}
-
-# ************************************************
-# Create the ScriptCreatorGraph instance and run it
-# ************************************************
-
-script_creator_graph = ScriptCreatorGraph(
- prompt="List me all the projects with their description.",
- # also accepts a string with the already downloaded HTML code
- source="https://perinim.github.io/projects",
- config=graph_config
-)
-
-result = script_creator_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = script_creator_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
diff --git a/examples/deepseek/script_multi_generator_deepseek.py b/examples/deepseek/script_multi_generator_deepseek.py
deleted file mode 100644
index 48ca2d20..00000000
--- a/examples/deepseek/script_multi_generator_deepseek.py
+++ /dev/null
@@ -1,53 +0,0 @@
-"""
-Basic example of scraping pipeline using ScriptCreatorGraph
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import ScriptCreatorMultiGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "library": "beautifulsoup"
-}
-
-# ************************************************
-# Create the ScriptCreatorGraph instance and run it
-# ************************************************
-
-urls=[
- "https://schultzbergagency.com/emil-raste-karlsen/",
- "https://schultzbergagency.com/johanna-hedberg/",
-]
-
-# ************************************************
-# Create the ScriptCreatorGraph instance and run it
-# ************************************************
-
-script_creator_graph = ScriptCreatorMultiGraph(
- prompt="Find information about actors",
- # also accepts a string with the already downloaded HTML code
- source=urls,
- config=graph_config
-)
-
-result = script_creator_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = script_creator_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
diff --git a/examples/deepseek/search_graph_deepseek.py b/examples/deepseek/search_graph_deepseek.py
deleted file mode 100644
index 7a3baf0d..00000000
--- a/examples/deepseek/search_graph_deepseek.py
+++ /dev/null
@@ -1,35 +0,0 @@
-"""
-Example of Search Graph
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SearchGraph
-
-load_dotenv()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "max_results": 2,
- "verbose": True,
-}
-
-# ************************************************
-# Create the SearchGraph instance and run it
-# ************************************************
-
-search_graph = SearchGraph(
- prompt="List me the best escursions near Trento",
- config=graph_config
-)
-
-result = search_graph.run()
-print(result)
diff --git a/examples/deepseek/search_graph_schema_deepseek.py b/examples/deepseek/search_graph_schema_deepseek.py
deleted file mode 100644
index f5f20e25..00000000
--- a/examples/deepseek/search_graph_schema_deepseek.py
+++ /dev/null
@@ -1,60 +0,0 @@
-"""
-Example of Search Graph
-"""
-import os
-from typing import List
-from dotenv import load_dotenv
-from pydantic import BaseModel, Field
-from scrapegraphai.graphs import SearchGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Define the output schema for the graph
-# ************************************************
-
-class Dish(BaseModel):
- name: str = Field(description="The name of the dish")
- description: str = Field(description="The description of the dish")
-
-class Dishes(BaseModel):
- dishes: List[Dish]
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-# ************************************************
-# Create the SearchGraph instance and run it
-# ************************************************
-
-search_graph = SearchGraph(
- prompt="List me Chioggia's famous dishes",
- config=graph_config,
- schema=Dishes
-)
-
-result = search_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = search_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
-# Save to json and csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/deepseek/search_link_graph_deepseek.py b/examples/deepseek/search_link_graph_deepseek.py
deleted file mode 100644
index dac13737..00000000
--- a/examples/deepseek/search_link_graph_deepseek.py
+++ /dev/null
@@ -1,46 +0,0 @@
-"""
-Example of Search Graph
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SearchGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-load_dotenv()
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-# ************************************************
-# Create the SearchGraph instance and run it
-# ************************************************
-
-search_graph = SearchGraph(
- prompt="List me the best escursions near Trento",
- config=graph_config
-)
-
-result = search_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = search_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
-# Save to json and csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/deepseek/smart_scraper_deepseek.py b/examples/deepseek/smart_scraper_deepseek.py
deleted file mode 100644
index 0eac94e8..00000000
--- a/examples/deepseek/smart_scraper_deepseek.py
+++ /dev/null
@@ -1,44 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-smart_scraper_graph = SmartScraperGraph(
- prompt="List me all the projects with their description.",
- # also accepts a string with the already downloaded HTML code
- source="https://perinim.github.io/projects/",
- config=graph_config
-)
-
-result = smart_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = smart_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
diff --git a/examples/deepseek/smart_scraper_lite_deepseek.py b/examples/deepseek/smart_scraper_lite_deepseek.py
deleted file mode 100644
index a70d76b0..00000000
--- a/examples/deepseek/smart_scraper_lite_deepseek.py
+++ /dev/null
@@ -1,31 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper
-"""
-import os
-import json
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperLiteGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-graph_config = {
- "llm": {
- "api_key": os.getenv("DEEPSEEK_API_KEY"),
- "model": "deepseek/deepseek-coder-33b-instruct",
- },
- "verbose": True,
- "headless": False,
-}
-
-smart_scraper_lite_graph = SmartScraperLiteGraph(
- prompt="Who is Marco Perini?",
- source="https://perinim.github.io/",
- config=graph_config
-)
-
-result = smart_scraper_lite_graph.run()
-print(json.dumps(result, indent=4))
-
-graph_exec_info = smart_scraper_lite_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
diff --git a/examples/deepseek/smart_scraper_multi_concat_deepseek.py b/examples/deepseek/smart_scraper_multi_concat_deepseek.py
deleted file mode 100644
index eeb1816c..00000000
--- a/examples/deepseek/smart_scraper_multi_concat_deepseek.py
+++ /dev/null
@@ -1,41 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper
-"""
-import os
-import json
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperMultiConcatGraph
-
-load_dotenv()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-
-# *******************************************************
-# Create the SmartScraperMultiGraph instance and run it
-# *******************************************************
-
-multiple_search_graph = SmartScraperMultiConcatGraph(
- prompt="Who is Marco Perini?",
- source=[
- "https://perinim.github.io/",
- "https://perinim.github.io/cv/"
- ],
- schema=None,
- config=graph_config
-)
-
-result = multiple_search_graph.run()
-print(json.dumps(result, indent=4))
diff --git a/examples/deepseek/smart_scraper_multi_deepseek.py b/examples/deepseek/smart_scraper_multi_deepseek.py
deleted file mode 100644
index 5923e302..00000000
--- a/examples/deepseek/smart_scraper_multi_deepseek.py
+++ /dev/null
@@ -1,41 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper
-"""
-import os
-import json
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperMultiGraph
-
-load_dotenv()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-
-# *******************************************************
-# Create the SmartScraperMultiGraph instance and run it
-# *******************************************************
-
-multiple_search_graph = SmartScraperMultiGraph(
- prompt="Who is Marco Perini?",
- source=[
- "https://perinim.github.io/",
- "https://perinim.github.io/cv/"
- ],
- schema=None,
- config=graph_config
-)
-
-result = multiple_search_graph.run()
-print(json.dumps(result, indent=4))
diff --git a/examples/deepseek/smart_scraper_multi_lite_deepseek.py b/examples/deepseek/smart_scraper_multi_lite_deepseek.py
deleted file mode 100644
index eb5eea01..00000000
--- a/examples/deepseek/smart_scraper_multi_lite_deepseek.py
+++ /dev/null
@@ -1,35 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper
-"""
-import os
-import json
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperMultiLiteGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-graph_config = {
- "llm": {
- "api_key": os.getenv("DEEPSEEK_API_KEY"),
- "model": "deepseek/deepseek-coder-33b-instruct",
- },
- "verbose": True,
- "headless": False,
-}
-
-smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
- prompt="Who is Marco Perini?",
- source=[
- "https://perinim.github.io/",
- "https://perinim.github.io/cv/"
- ],
- config=graph_config
-)
-
-result = smart_scraper_multi_lite_graph.run()
-print(json.dumps(result, indent=4))
-
-graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
diff --git a/examples/deepseek/smart_scraper_schema_deepseek.py b/examples/deepseek/smart_scraper_schema_deepseek.py
deleted file mode 100644
index fd87fbdc..00000000
--- a/examples/deepseek/smart_scraper_schema_deepseek.py
+++ /dev/null
@@ -1,58 +0,0 @@
-"""
-Basic example of scraping pipeline using SmartScraper
-"""
-import os
-from typing import List
-from pydantic import BaseModel, Field
-from dotenv import load_dotenv
-from scrapegraphai.graphs import SmartScraperGraph
-from scrapegraphai.utils import prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Define the output schema for the graph
-# ************************************************
-
-class Project(BaseModel):
- title: str = Field(description="The title of the project")
- description: str = Field(description="The description of the project")
-
-class Projects(BaseModel):
- projects: List[Project]
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-smart_scraper_graph = SmartScraperGraph(
- prompt="List me all the projects with their description.",
- # also accepts a string with the already downloaded HTML code
- source="https://perinim.github.io/projects/",
- schema=Projects,
- config=graph_config
-)
-
-result = smart_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = smart_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
diff --git a/examples/deepseek/xml_scraper_deepseek.py b/examples/deepseek/xml_scraper_deepseek.py
deleted file mode 100644
index d66b0eab..00000000
--- a/examples/deepseek/xml_scraper_deepseek.py
+++ /dev/null
@@ -1,59 +0,0 @@
-"""
-Basic example of scraping pipeline using XMLScraperGraph from XML documents
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import XMLScraperGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Read the XML file
-# ************************************************
-
-FILE_NAME = "inputs/books.xml"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
-with open(file_path, 'r', encoding="utf-8") as file:
- text = file.read()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-
-# ************************************************
-# Create the XMLScraperGraph instance and run it
-# ************************************************
-
-xml_scraper_graph = XMLScraperGraph(
- prompt="List me all the authors, title and genres of the books",
- source=text, # Pass the content of the file, not the file object
- config=graph_config
-)
-
-result = xml_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = xml_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/deepseek/xml_scraper_graph_multi_deepseek.py b/examples/deepseek/xml_scraper_graph_multi_deepseek.py
deleted file mode 100644
index 2d190926..00000000
--- a/examples/deepseek/xml_scraper_graph_multi_deepseek.py
+++ /dev/null
@@ -1,57 +0,0 @@
-"""
-Basic example of scraping pipeline using XMLScraperMultiGraph from XML documents
-"""
-import os
-from dotenv import load_dotenv
-from scrapegraphai.graphs import XMLScraperMultiGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Read the XML file
-# ************************************************
-
-FILE_NAME = "inputs/books.xml"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
-with open(file_path, 'r', encoding="utf-8") as file:
- text = file.read()
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-deepseek_key = os.getenv("DEEPSEEK_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "deepseek/deepseek-chat",
- "api_key": deepseek_key,
- },
- "verbose": True,
-}
-# ************************************************
-# Create the XMLScraperMultiGraph instance and run it
-# ************************************************
-
-xml_scraper_graph = XMLScraperMultiGraph(
- prompt="List me all the authors, title and genres of the books",
- source=[text, text], # Pass the content of the file, not the file object
- config=graph_config
-)
-
-result = xml_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = xml_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/depth_search_graph/.env.example b/examples/depth_search_graph/.env.example
new file mode 100644
index 00000000..8c10cfbb
--- /dev/null
+++ b/examples/depth_search_graph/.env.example
@@ -0,0 +1,14 @@
+# OpenAI API Configuration
+OPENAI_API_KEY=your-openai-api-key-here
+
+# Optional Configurations
+MAX_TOKENS=4000
+MODEL_NAME=gpt-4-1106-preview
+TEMPERATURE=0.7
+
+# Depth Search Settings
+MAX_DEPTH=5
+CRAWL_DELAY=1
+RESPECT_ROBOTS_TXT=true
+MAX_PAGES_PER_DOMAIN=100
+USER_AGENT=Mozilla/5.0
\ No newline at end of file
diff --git a/examples/depth_search_graph/README.md b/examples/depth_search_graph/README.md
new file mode 100644
index 00000000..c4ce05df
--- /dev/null
+++ b/examples/depth_search_graph/README.md
@@ -0,0 +1,30 @@
+# Depth Search Graph Example
+
+This example demonstrates how to use Scrapegraph-ai for deep web crawling and content exploration.
+
+## Features
+
+- Deep web crawling
+- Content discovery
+- Link analysis
+- Recursive search
+
+## Setup
+
+1. Install required dependencies
+2. Copy `.env.example` to `.env`
+3. Configure your API keys in the `.env` file
+
+## Usage
+
+```python
+from scrapegraphai.graphs import DepthSearchGraph
+
+graph = DepthSearchGraph(
+    prompt="List me all the projects with their description",
+    source="https://example.com",
+    config=graph_config,  # LLM configuration dict; see the sketch below
+)
+results = graph.run()
+```
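+
+A fuller sketch, following the conventions of the other examples in this repository: the provider-prefixed model name is a placeholder (it mirrors `MODEL_NAME` in `.env.example`), and the `depth`/`only_inside_links` options match the DepthSearchGraph examples elsewhere in the codebase.
+
+```python
+import os
+
+from dotenv import load_dotenv
+from scrapegraphai.graphs import DepthSearchGraph
+
+load_dotenv()  # reads OPENAI_API_KEY from the .env file created in Setup
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("OPENAI_API_KEY"),
+        "model": "openai/gpt-4-1106-preview",  # placeholder; use a model your account supports
+    },
+    "verbose": True,
+    "depth": 2,                  # how many levels of links to follow
+    "only_inside_links": False,  # False also follows links to external domains
+}
+
+search_graph = DepthSearchGraph(
+    prompt="List me all the projects with their description",
+    source="https://example.com",
+    config=graph_config,
+)
+
+result = search_graph.run()
+print(result)
+```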
+
+## Environment Variables
+
+Required environment variables:
+- `OPENAI_API_KEY`: Your OpenAI API key
\ No newline at end of file
diff --git a/examples/local_models/depth_search_graph_ollama.py b/examples/depth_search_graph/ollama/depth_search_graph_ollama.py
similarity index 100%
rename from examples/local_models/depth_search_graph_ollama.py
rename to examples/depth_search_graph/ollama/depth_search_graph_ollama.py
diff --git a/examples/openai/depth_search_graph_openai.py b/examples/depth_search_graph/openai/depth_search_graph_openai.py
similarity index 100%
rename from examples/openai/depth_search_graph_openai.py
rename to examples/depth_search_graph/openai/depth_search_graph_openai.py
diff --git a/examples/document_scraper_graph/.env.example b/examples/document_scraper_graph/.env.example
new file mode 100644
index 00000000..2e7bab46
--- /dev/null
+++ b/examples/document_scraper_graph/.env.example
@@ -0,0 +1,13 @@
+# OpenAI API Configuration
+OPENAI_API_KEY=your-openai-api-key-here
+
+# Optional Configurations
+MAX_TOKENS=4000
+MODEL_NAME=gpt-4-1106-preview
+TEMPERATURE=0.7
+
+# Document Scraper Settings
+OCR_ENABLED=true
+EXTRACT_METADATA=true
+MAX_FILE_SIZE=10485760 # 10MB
+SUPPORTED_FORMATS=pdf,doc,docx,txt
\ No newline at end of file
diff --git a/examples/document_scraper_graph/README.md b/examples/document_scraper_graph/README.md
new file mode 100644
index 00000000..f8561ee7
--- /dev/null
+++ b/examples/document_scraper_graph/README.md
@@ -0,0 +1,30 @@
+# Document Scraper Graph Example
+
+This example demonstrates how to use Scrapegraph-ai to extract data from various document formats (PDF, DOC, DOCX, etc.).
+
+## Features
+
+- Multi-format document support
+- Text extraction
+- Document parsing
+- Metadata extraction
+
+## Setup
+
+1. Install required dependencies
+2. Copy `.env.example` to `.env`
+3. Configure your API keys in the `.env` file
+
+## Usage
+
+```python
+from scrapegraphai.graphs import DocumentScraperGraph
+
+graph = DocumentScraperGraph(
+    prompt="Summarize the document and find the main topics",
+    source=document_text,  # the document's text content; see the sketch below
+    config=graph_config,
+)
+content = graph.run()
+```
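+
+A fuller sketch, assuming the plain-text input style used by the DocumentScraperGraph examples elsewhere in this repository (the graph receives the document's text; the model name is a placeholder mirroring `MODEL_NAME` in `.env.example`):
+
+```python
+import os
+
+from dotenv import load_dotenv
+from scrapegraphai.graphs import DocumentScraperGraph
+
+load_dotenv()  # reads OPENAI_API_KEY from the .env file created in Setup
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("OPENAI_API_KEY"),
+        "model": "openai/gpt-4-1106-preview",  # placeholder; use a model your account supports
+    },
+}
+
+# read the document's text; for PDFs or DOCs, extract the text first
+with open("document.txt", "r", encoding="utf-8") as f:
+    document_text = f.read()
+
+graph = DocumentScraperGraph(
+    prompt="Summarize the document and find the main topics",
+    source=document_text,
+    config=graph_config,
+)
+
+print(graph.run())
+```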
+
+## Environment Variables
+
+Required environment variables:
+- `OPENAI_API_KEY`: Your OpenAI API key
\ No newline at end of file
diff --git a/examples/local_models/document_scraper_ollama.py b/examples/document_scraper_graph/ollama/document_scraper_ollama.py
similarity index 100%
rename from examples/local_models/document_scraper_ollama.py
rename to examples/document_scraper_graph/ollama/document_scraper_ollama.py
diff --git a/examples/anthropic/inputs/plain_html_example.txt b/examples/document_scraper_graph/ollama/inputs/plain_html_example.txt
similarity index 100%
rename from examples/anthropic/inputs/plain_html_example.txt
rename to examples/document_scraper_graph/ollama/inputs/plain_html_example.txt
diff --git a/examples/openai/document_scraper_openai.py b/examples/document_scraper_graph/openai/document_scraper_openai.py
similarity index 100%
rename from examples/openai/document_scraper_openai.py
rename to examples/document_scraper_graph/openai/document_scraper_openai.py
diff --git a/examples/mistral/inputs/markdown_example.md b/examples/document_scraper_graph/openai/inputs/markdown_example.md
similarity index 100%
rename from examples/mistral/inputs/markdown_example.md
rename to examples/document_scraper_graph/openai/inputs/markdown_example.md
diff --git a/examples/bedrock/inputs/plain_html_example.txt b/examples/document_scraper_graph/openai/inputs/plain_html_example.txt
similarity index 100%
rename from examples/bedrock/inputs/plain_html_example.txt
rename to examples/document_scraper_graph/openai/inputs/plain_html_example.txt
diff --git a/examples/ernie/code_generator_graph_ernie.py b/examples/ernie/code_generator_graph_ernie.py
deleted file mode 100644
index 65b8e4b9..00000000
--- a/examples/ernie/code_generator_graph_ernie.py
+++ /dev/null
@@ -1,61 +0,0 @@
-"""
-Basic example of scraping pipeline using Code Generator with schema
-"""
-import os
-from typing import List
-from dotenv import load_dotenv
-from pydantic import BaseModel, Field
-from scrapegraphai.graphs import CodeGeneratorGraph
-
-load_dotenv()
-
-# ************************************************
-# Define the output schema for the graph
-# ************************************************
-
-class Project(BaseModel):
- title: str = Field(description="The title of the project")
- description: str = Field(description="The description of the project")
-
-class Projects(BaseModel):
- projects: List[Project]
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-openai_key = os.getenv("OPENAI_APIKEY")
-
-graph_config = {
- "llm": {
- "model": "ernie/ernie-bot-turbo",
- "ernie_client_id": "",
- "ernie_client_secret": "",
- "temperature": 0.1
- },
- "verbose": True,
- "headless": False,
- "reduction": 2,
- "max_iterations": {
- "overall": 10,
- "syntax": 3,
- "execution": 3,
- "validation": 3,
- "semantic": 3
- },
- "output_file_name": "extracted_data.py"
-}
-
-# ************************************************
-# Create the SmartScraperGraph instance and run it
-# ************************************************
-
-code_generator_graph = CodeGeneratorGraph(
- prompt="List me all the projects with their description",
- source="https://perinim.github.io/projects/",
- schema=Projects,
- config=graph_config
-)
-
-result = code_generator_graph.run()
-print(result)
\ No newline at end of file
diff --git a/examples/ernie/csv_scraper_ernie.py b/examples/ernie/csv_scraper_ernie.py
deleted file mode 100644
index 6f4335b6..00000000
--- a/examples/ernie/csv_scraper_ernie.py
+++ /dev/null
@@ -1,57 +0,0 @@
-"""
-Basic example of scraping pipeline using CSVScraperGraph from CSV documents
-"""
-import os
-from dotenv import load_dotenv
-import pandas as pd
-from scrapegraphai.graphs import CSVScraperGraph
-from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
-
-load_dotenv()
-
-# ************************************************
-# Read the CSV file
-# ************************************************
-
-FILE_NAME = "inputs/username.csv"
-curr_dir = os.path.dirname(os.path.realpath(__file__))
-file_path = os.path.join(curr_dir, FILE_NAME)
-
-text = pd.read_csv(file_path)
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-graph_config = {
- "llm": {
- "model": "ernie/ernie-bot-turbo",
- "ernie_client_id": "",
- "ernie_client_secret": "",
- "temperature": 0.1
- }
-}
-
-# ************************************************
-# Create the CSVScraperGraph instance and run it
-# ************************************************
-
-csv_scraper_graph = CSVScraperGraph(
- prompt="List me all the last names",
- source=str(text), # Pass the content of the file, not the file object
- config=graph_config
-)
-
-result = csv_scraper_graph.run()
-print(result)
-
-# ************************************************
-# Get graph execution info
-# ************************************************
-
-graph_exec_info = csv_scraper_graph.get_execution_info()
-print(prettify_exec_info(graph_exec_info))
-
-# Save to json or csv
-convert_to_csv(result, "result")
-convert_to_json(result, "result")
diff --git a/examples/ernie/custom_graph_ernie.py b/examples/ernie/custom_graph_ernie.py
deleted file mode 100644
index a987560e..00000000
--- a/examples/ernie/custom_graph_ernie.py
+++ /dev/null
@@ -1,106 +0,0 @@
-"""
-Example of custom graph using existing nodes
-"""
-from langchain_openai import OpenAIEmbeddings
-from langchain_openai import ChatOpenAI
-from scrapegraphai.graphs import BaseGraph
-from scrapegraphai.nodes import FetchNode, ParseNode, RAGNode, GenerateAnswerNode, RobotsNode
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-
-graph_config = {
- "llm": {
- "model": "ernie/ernie-bot-turbo",
- "ernie_client_id": "",
- "ernie_client_secret": "",
- "temperature": 0.1
- }
-}
-
-# ************************************************
-# Define the graph nodes
-# ************************************************
-
-llm_model = ChatOpenAI(**graph_config["llm"])  # unpack the config dict as keyword arguments
-embedder = OpenAIEmbeddings(api_key=llm_model.openai_api_key)
-
-# define the nodes for the graph
-robot_node = RobotsNode(
- input="url",
- output=["is_scrapable"],
- node_config={
- "llm_model": llm_model,
- "force_scraping": True,
- "verbose": True,
- }
-)
-
-fetch_node = FetchNode(
- input="url | local_dir",
- output=["doc"],
- node_config={
- "verbose": True,
- "headless": True,
- }
-)
-parse_node = ParseNode(
- input="doc",
- output=["parsed_doc"],
- node_config={
- "chunk_size": 4096,
- "verbose": True,
- }
-)
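-
-# in node input expressions, "&" means all inputs are required and
-# "|" means use the first available alternative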
-rag_node = RAGNode(
- input="user_prompt & (parsed_doc | doc)",
- output=["relevant_chunks"],
- node_config={
- "llm_model": llm_model,
- "embedder_model": embedder,
- "verbose": True,
- }
-)
-generate_answer_node = GenerateAnswerNode(
- input="user_prompt & (relevant_chunks | parsed_doc | doc)",
- output=["answer"],
- node_config={
- "llm_model": llm_model,
- "verbose": True,
- }
-)
-
-# ************************************************
-# Create the graph by defining the connections
-# ************************************************
-
-graph = BaseGraph(
- nodes=[
- robot_node,
- fetch_node,
- parse_node,
- rag_node,
- generate_answer_node,
- ],
- edges=[
- (robot_node, fetch_node),
- (fetch_node, parse_node),
- (parse_node, rag_node),
- (rag_node, generate_answer_node)
- ],
- entry_point=robot_node
-)
-
-# ************************************************
-# Execute the graph
-# ************************************************
-
-result, execution_info = graph.execute({
- "user_prompt": "Describe the content",
- "url": "https://example.com/"
-})
-
-# get the answer from the result
-result = result.get("answer", "No answer found.")
-print(result)
diff --git a/examples/ernie/depth_search_graph_ernie.py b/examples/ernie/depth_search_graph_ernie.py
deleted file mode 100644
index 99470d8d..00000000
--- a/examples/ernie/depth_search_graph_ernie.py
+++ /dev/null
@@ -1,26 +0,0 @@
-"""
-depth_search_graph_ernie example
-"""
-from scrapegraphai.graphs import DepthSearchGraph
-
-graph_config = {
- "llm": {
- "model": "ernie/ernie-bot-turbo",
- "ernie_client_id": "",
- "ernie_client_secret": "",
- "temperature": 0.1
- },
- "verbose": True,
- "headless": False,
- "depth": 2,
- "only_inside_links": False,
-}
-
-search_graph = DepthSearchGraph(
- prompt="List me all the projects with their description",
- source="https://perinim.github.io",
- config=graph_config
-)
-
-result = search_graph.run()
-print(result)
diff --git a/examples/ernie/document_scraper_anthropic_ernie.py b/examples/ernie/document_scraper_anthropic_ernie.py
deleted file mode 100644
index 74d91be1..00000000
--- a/examples/ernie/document_scraper_anthropic_ernie.py
+++ /dev/null
@@ -1,39 +0,0 @@
-"""
-document_scraper example
-"""
-import os
-import json
-from scrapegraphai.graphs import DocumentScraperGraph
-
-# ************************************************
-# Define the configuration for the graph
-# ************************************************
-graph_config = {
- "llm": {
- "model": "ernie/ernie-bot-turbo",
- "ernie_client_id": "",
- "ernie_client_secret": "",
- "temperature": 0.1
- }
-}
-
-
-source = """
- The Divine Comedy, Italian La Divina Commedia, original name La commedia, long narrative poem written in Italian
- circa 1308/21 by Dante. It is usually held to be one of the world's great works of literature.
- Divided into three major sections—Inferno, Purgatorio, and Paradiso—the narrative traces the journey of Dante
- from darkness and error to the revelation of the divine light, culminating in the Beatific Vision of God.
- Dante is guided by the Roman poet Virgil, who represents the epitome of human knowledge, from the dark wood
- through the descending circles of the pit of Hell (Inferno). He then climbs the mountain of Purgatory, guided
- by the Roman poet Statius, who represents the fulfilment of human knowledge, and is finally led by his lifelong love,
- the Beatrice of his earlier poetry, through the celestial spheres of Paradise.
-"""
-
-pdf_scraper_graph = DocumentScraperGraph(
- prompt="Summarize the text and find the main topics",
- source=source,
- config=graph_config,
-)
-result = pdf_scraper_graph.run()
-
-print(json.dumps(result, indent=4))
\ No newline at end of file
diff --git a/examples/ernie/inputs/books.xml b/examples/ernie/inputs/books.xml
deleted file mode 100644
index e3d1fe87..00000000
--- a/examples/ernie/inputs/books.xml
+++ /dev/null
@@ -1,120 +0,0 @@
-<?xml version="1.0"?>
-<catalog>
-   <book id="bk101">
-      <author>Gambardella, Matthew</author>
-      <title>XML Developer's Guide</title>
-      <genre>Computer</genre>
-      <price>44.95</price>
-      <publish_date>2000-10-01</publish_date>
-      <description>An in-depth look at creating applications
-      with XML.</description>
-   </book>
-   <book id="bk102">
-      <author>Ralls, Kim</author>
-      <title>Midnight Rain</title>
-      <genre>Fantasy</genre>
-      <price>5.95</price>
-      <publish_date>2000-12-16</publish_date>
-      <description>A former architect battles corporate zombies,
-      an evil sorceress, and her own childhood to become queen
-      of the world.</description>
-   </book>
-   <book id="bk103">
-      <author>Corets, Eva</author>
-      <title>Maeve Ascendant</title>
-      <genre>Fantasy</genre>
-      <price>5.95</price>
-      <publish_date>2000-11-17</publish_date>
-      <description>After the collapse of a nanotechnology
-      society in England, the young survivors lay the
-      foundation for a new society.</description>
-   </book>
-   <book id="bk104">
-      <author>Corets, Eva</author>
-      <title>Oberon's Legacy</title>
-      <genre>Fantasy</genre>
-      <price>5.95</price>
-      <publish_date>2001-03-10</publish_date>
-      <description>In post-apocalypse England, the mysterious
-      agent known only as Oberon helps to create a new life
-      for the inhabitants of London. Sequel to Maeve
-      Ascendant.</description>
-   </book>
-   <book id="bk105">
-      <author>Corets, Eva</author>
-      <title>The Sundered Grail</title>
-      <genre>Fantasy</genre>
-      <price>5.95</price>
-      <publish_date>2001-09-10</publish_date>
-      <description>The two daughters of Maeve, half-sisters,
-      battle one another for control of England. Sequel to
-      Oberon's Legacy.</description>
-   </book>
-   <book id="bk106">
-      <author>Randall, Cynthia</author>
-      <title>Lover Birds</title>
-      <genre>Romance</genre>
-      <price>4.95</price>
-      <publish_date>2000-09-02</publish_date>
-      <description>When Carla meets Paul at an ornithology
-      conference, tempers fly as feathers get ruffled.</description>
-   </book>
-   <book id="bk107">
-      <author>Thurman, Paula</author>
-      <title>Splish Splash</title>
-      <genre>Romance</genre>
-      <price>4.95</price>
-      <publish_date>2000-11-02</publish_date>
-      <description>A deep sea diver finds true love twenty
-      thousand leagues beneath the sea.</description>
-   </book>
-   <book id="bk108">
-      <author>Knorr, Stefan</author>
-      <title>Creepy Crawlies</title>
-      <genre>Horror</genre>
-      <price>4.95</price>
-      <publish_date>2000-12-06</publish_date>
-      <description>An anthology of horror stories about roaches,
-      centipedes, scorpions and other insects.</description>
-   </book>
-   <book id="bk109">
-      <author>Kress, Peter</author>
-      <title>Paradox Lost</title>
-      <genre>Science Fiction</genre>
-      <price>6.95</price>
-      <publish_date>2000-11-02</publish_date>
-      <description>After an inadvertent trip through a Heisenberg
-      Uncertainty Device, James Salway discovers the problems
-      of being quantum.</description>
-   </book>
-   <book id="bk110">
-      <author>O'Brien, Tim</author>
-      <title>Microsoft .NET: The Programming Bible</title>
-      <genre>Computer</genre>
-      <price>36.95</price>
-      <publish_date>2000-12-09</publish_date>
-      <description>Microsoft's .NET initiative is explored in
-      detail in this deep programmer's reference.</description>
-   </book>
-   <book id="bk111">
-      <author>O'Brien, Tim</author>
-      <title>MSXML3: A Comprehensive Guide</title>
-      <genre>Computer</genre>
-      <price>36.95</price>
-      <publish_date>2000-12-01</publish_date>
-      <description>The Microsoft MSXML3 parser is covered in
-      detail, with attention to XML DOM interfaces, XSLT processing,
-      SAX and more.</description>
-   </book>
-   <book id="bk112">
-      <author>Galos, Mike</author>
-      <title>Visual Studio 7: A Comprehensive Guide</title>
-      <genre>Computer</genre>
-      <price>49.95</price>
-      <publish_date>2001-04-16</publish_date>
-      <description>Microsoft Visual Studio 7 is explored in depth,
-      looking at how Visual Basic, Visual C++, C#, and ASP+ are
-      integrated into a comprehensive development
-      environment.</description>
-   </book>
-</catalog>
\ No newline at end of file
diff --git a/examples/ernie/inputs/example.json b/examples/ernie/inputs/example.json
deleted file mode 100644
index 2263184c..00000000
--- a/examples/ernie/inputs/example.json
+++ /dev/null
@@ -1,182 +0,0 @@
-{
- "kind":"youtube#searchListResponse",
- "etag":"q4ibjmYp1KA3RqMF4jFLl6PBwOg",
- "nextPageToken":"CAUQAA",
- "regionCode":"NL",
- "pageInfo":{
- "totalResults":1000000,
- "resultsPerPage":5
- },
- "items":[
- {
- "kind":"youtube#searchResult",
- "etag":"QCsHBifbaernVCbLv8Cu6rAeaDQ",
- "id":{
- "kind":"youtube#video",
- "videoId":"TvWDY4Mm5GM"
- },
- "snippet":{
- "publishedAt":"2023-07-24T14:15:01Z",
- "channelId":"UCwozCpFp9g9x0wAzuFh0hwQ",
- "title":"3 Football Clubs Kylian Mbappe Should Avoid Signing ✍️❌⚽️ #football #mbappe #shorts",
- "description":"",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/TvWDY4Mm5GM/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/TvWDY4Mm5GM/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/TvWDY4Mm5GM/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"FC Motivate",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T14:15:01Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"0NG5QHdtIQM_V-DBJDEf-jK_Y9k",
- "id":{
- "kind":"youtube#video",
- "videoId":"aZM_42CcNZ4"
- },
- "snippet":{
- "publishedAt":"2023-07-24T16:09:27Z",
- "channelId":"UCM5gMM_HqfKHYIEJ3lstMUA",
- "title":"Which Football Club Could Cristiano Ronaldo Afford To Buy? 💰",
- "description":"Sign up to Sorare and get a FREE card: https://sorare.pxf.io/NellisShorts Give Soraredata a go for FREE: ...",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/aZM_42CcNZ4/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/aZM_42CcNZ4/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/aZM_42CcNZ4/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"John Nellis",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T16:09:27Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"WbBz4oh9I5VaYj91LjeJvffrBVY",
- "id":{
- "kind":"youtube#video",
- "videoId":"wkP3XS3aNAY"
- },
- "snippet":{
- "publishedAt":"2023-07-24T16:00:50Z",
- "channelId":"UC4EP1dxFDPup_aFLt0ElsDw",
- "title":"PAULO DYBALA vs THE WORLD'S LONGEST FREEKICK WALL",
- "description":"Can Paulo Dybala curl a football around the World's longest free kick wall? We met up with the World Cup winner and put him to ...",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/wkP3XS3aNAY/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/wkP3XS3aNAY/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/wkP3XS3aNAY/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"Shoot for Love",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T16:00:50Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"juxv_FhT_l4qrR05S1QTrb4CGh8",
- "id":{
- "kind":"youtube#video",
- "videoId":"rJkDZ0WvfT8"
- },
- "snippet":{
- "publishedAt":"2023-07-24T10:00:39Z",
- "channelId":"UCO8qj5u80Ga7N_tP3BZWWhQ",
- "title":"TOP 10 DEFENDERS 2023",
- "description":"SoccerKingz https://soccerkingz.nl Use code: 'ILOVEHOF' to get 10% off. TOP 10 DEFENDERS 2023 Follow us! • Instagram ...",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/rJkDZ0WvfT8/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/rJkDZ0WvfT8/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/rJkDZ0WvfT8/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"Home of Football",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-24T10:00:39Z"
- }
- },
- {
- "kind":"youtube#searchResult",
- "etag":"wtuknXTmI1txoULeH3aWaOuXOow",
- "id":{
- "kind":"youtube#video",
- "videoId":"XH0rtu4U6SE"
- },
- "snippet":{
- "publishedAt":"2023-07-21T16:30:05Z",
- "channelId":"UCwozCpFp9g9x0wAzuFh0hwQ",
- "title":"3 Things You Didn't Know About Erling Haaland ⚽️🇳🇴 #football #haaland #shorts",
- "description":"",
- "thumbnails":{
- "default":{
- "url":"https://i.ytimg.com/vi/XH0rtu4U6SE/default.jpg",
- "width":120,
- "height":90
- },
- "medium":{
- "url":"https://i.ytimg.com/vi/XH0rtu4U6SE/mqdefault.jpg",
- "width":320,
- "height":180
- },
- "high":{
- "url":"https://i.ytimg.com/vi/XH0rtu4U6SE/hqdefault.jpg",
- "width":480,
- "height":360
- }
- },
- "channelTitle":"FC Motivate",
- "liveBroadcastContent":"none",
- "publishTime":"2023-07-21T16:30:05Z"
- }
- }
- ]
-}
\ No newline at end of file
diff --git a/examples/ernie/inputs/plain_html_example.txt b/examples/ernie/inputs/plain_html_example.txt
deleted file mode 100644
index 78f814ae..00000000
--- a/examples/ernie/inputs/plain_html_example.txt
+++ /dev/null
@@ -1,105 +0,0 @@
-
-
-
-
-
-