A web crawler designed to scrape Pokémon card prices from TCGPlayer.com and export them to .csv files.
## Installation

1. Install Python 3, if you do not have it already.
2. Create a new virtual environment: `python -m venv .venv`
3. Enter the virtual environment:
   - PowerShell: `. .venv\Scripts\Activate.ps1`
   - cmd.exe: `.venv\Scripts\activate.bat`
   - Linux: `source .venv/bin/activate`
4. Install the dependencies and the Playwright browsers: `pip install -r requirements.txt`, then `playwright install`
## Usage

- Enter the virtual environment, if you are not in it already. (See step 3 of the installation instructions.)
- Run the crawler with the following command: `scrapy crawl main`
- A window will pop up with a list of sets that can be scraped. Check the ones that you want and then close the window.
- Wait for the crawl to finish; the CSV files are written when it completes.
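Each selected set is exported to its own .csv file. The exact columns are defined in `items.py` and `pipelines.py`; purely as an illustration (the column names here are assumptions), a file might look like:

```text
card_name,price
Charizard,412.50
Blastoise,120.00
```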
## Project Structure

| File | Purpose |
|---|---|
| settings.py | Settings for Scrapy and the spider. |
| pipelines.py | Pipeline that takes items and outputs them to CSV files. |
| items.py | The data structure for the scraped data. |
| spiders/main_spider.py | The spider code that handles requesting and parsing data. |
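To illustrate the pipeline's role, here is a minimal, hypothetical sketch of a CSV-exporting item pipeline using only the standard library. The field names (`set_name`, `card_name`, `price`) and the output layout are assumptions; see `pipelines.py` for the actual implementation.

```python
import csv
from pathlib import Path


class CsvExportPipeline:
    """Sketch of a Scrapy-style item pipeline: each scraped item is
    appended as a row to a CSV file named after its set."""

    def __init__(self, out_dir="output"):
        self.out_dir = Path(out_dir)
        self.out_dir.mkdir(exist_ok=True)
        self._writers = {}  # set_name -> (file handle, csv.writer)

    def process_item(self, item, spider=None):
        set_name = item["set_name"]
        if set_name not in self._writers:
            # First item for this set: open its file and write a header.
            f = open(self.out_dir / f"{set_name}.csv", "w", newline="")
            w = csv.writer(f)
            w.writerow(["card_name", "price"])
            self._writers[set_name] = (f, w)
        self._writers[set_name][1].writerow([item["card_name"], item["price"]])
        return item  # Scrapy pipelines return the item for later stages

    def close_spider(self, spider=None):
        # Called once the crawl ends; flush and close every open file.
        for f, _ in self._writers.values():
            f.close()
```

Scrapy calls `process_item` once per scraped item and `close_spider` when the crawl ends, which is why the file handles are kept open for the duration of the run.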
## Dependencies

| Dependency | Min Version | Reason Used | Notes |
|---|---|---|---|
| scrapy | 2.11.0 | Framework that orchestrates the scraping process and provides a CLI tool for running the scraper. | |
| playwright | 1.15 | Runs a headless browser that downloads dynamic content. | |
| scrapy-playwright | Special | Implements a Scrapy download handler that lets Scrapy download pages using Playwright. | This project uses a fork of scrapy-playwright that lets it run on Windows rather than just Linux. It is included in source form in this project rather than as a submodule. |
| wxPython | 4.2.1 | Used to implement the set selector window. | |
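For reference, scrapy-playwright is normally wired into a Scrapy project through two settings, as documented upstream; whether this project's `settings.py` matches this excerpt exactly is an assumption.

```python
# settings.py (excerpt) -- sketch of the standard scrapy-playwright setup;
# the fork used by this project may configure things slightly differently.

# Route HTTP(S) downloads through the Playwright-backed download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```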