Azlx/crawl_book

crawl_book

Crawls a specified number of pages of Douban book lists.

1. Initialize the database

python manage.py rebuild_db
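
A rebuild step like this usually drops and recreates the tables. The following is only a minimal sketch of that idea with SQLAlchemy, not the repository's actual code: the `Book` model, its columns, and the `rebuild_db` function body are hypothetical, and an in-memory SQLite engine stands in for PostgreSQL so the example runs on its own.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Book(Base):
    """Hypothetical table; the real schema is not shown in this README."""
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    title = Column(String(255), nullable=False)


def rebuild_db(engine):
    """Drop every table registered on Base, then recreate them empty."""
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)


# Assumption: in-memory SQLite keeps the sketch self-contained. The project
# itself targets PostgreSQL, which would need a postgresql:// URL instead.
engine = create_engine("sqlite://")
rebuild_db(engine)
print(inspect(engine).get_table_names())
```

Note that rebuilding this way discards any rows already stored, so it is best run only before the first crawl or when a clean start is wanted.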

2. Crawl

python manage.py crawl [num]

num: (optional) the number of pages to crawl; defaults to 100.
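
The optional `num` argument and its default of 100 could be handled roughly as follows. This is a sketch using the standard library's `argparse`; the real `manage.py` may use a different command framework, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser with `rebuild_db` and `crawl` subcommands (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="manage.py")
    subcommands = parser.add_subparsers(dest="command", required=True)

    subcommands.add_parser("rebuild_db", help="initialize the database")

    crawl = subcommands.add_parser("crawl", help="crawl Douban book lists")
    # nargs="?" makes the positional argument optional; the default mirrors the README.
    crawl.add_argument("num", nargs="?", type=int, default=100,
                       help="number of pages to crawl (default: 100)")
    return parser


parser = build_parser()
print(parser.parse_args(["crawl"]).num)       # 100, the default
print(parser.parse_args(["crawl", "5"]).num)  # 5, the explicit page count
```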

3. Environment

XPath, lxml, SQLAlchemy, PostgreSQL
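
To illustrate how lxml and XPath fit together when extracting book entries, here is a small sketch. The HTML snippet, the XPath expressions, and the `parse_books` helper are invented for illustration; they are not Douban's actual markup or this repository's selectors.

```python
from lxml import html

# Made-up listing markup, used only so the example runs without network access.
SAMPLE_PAGE = """
<ul>
  <li class="book"><a href="/book/1">First Title</a><span class="rating">8.9</span></li>
  <li class="book"><a href="/book/2">Second Title</a><span class="rating">7.5</span></li>
</ul>
""".strip()


def parse_books(page_source):
    """Extract title, url, and rating for each book item in a listing page."""
    tree = html.fromstring(page_source)
    books = []
    # ".//" searches relative to the parsed root, whatever element that is.
    for item in tree.xpath('.//li[@class="book"]'):
        books.append({
            "title": item.xpath("./a/text()")[0].strip(),
            "url": item.xpath("./a/@href")[0],
            "rating": float(item.xpath('./span[@class="rating"]/text()')[0]),
        })
    return books


for book in parse_books(SAMPLE_PAGE):
    print(book)
```

In a real crawl the page source would come from an HTTP response, and each parsed dict would be mapped onto a SQLAlchemy model before being written to PostgreSQL.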

Note: the crawler relies on a script that automatically obtains proxies, available at: ProxyPool
