Description
Last month, wayback_machine_downloader ran without problems. Since yesterday, however, every domain I have tried produces "Connection refused" errors.
The command I ran: wayback_machine_downloader http://huzhan.com --concurrency 3 -t 20220525005404 -a
The output looks like this:
https://www.huzhan.com/code/goods377071.html -> websites/huzhan.com/code/goods377071.html (280/112619)
https://www.huzhan.com/serve/goods14529.html -> websites/huzhan.com/serve/goods14529.html (281/112619)
https://www.huzhan.com/serve/goods12899.html # Connection refused - connect(2)
https://www.huzhan.com/serve/goods12899.html -> websites/huzhan.com/serve/goods12899.html (282/112619)
https://www.huzhan.com/ishop42980/ # Connection refused - connect(2)
https://www.huzhan.com/ishop42980/ -> websites/huzhan.com/ishop42980/index.html (283/112619)
https://www.huzhan.com/code/goods421671.html # Connection refused - connect(2)
https://www.huzhan.com/code/goods421671.html -> websites/huzhan.com/code/goods421671.html (284/112619)
https://www.huzhan.com/serve/goods15588.html # Connection refused - connect(2)
https://www.huzhan.com/serve/goods15588.html -> websites/huzhan.com/serve/goods15588.html (285/112619)
https://www.huzhan.com/serve/goods15287.html # Connection refused - connect(2)
https://www.huzhan.com/serve/goods15287.html -> websites/huzhan.com/serve/goods15287.html (286/112619)
https://www.huzhan.com/code/goods420832.html # Connection refused - connect(2)
https://www.huzhan.com/code/goods420832.html -> websites/huzhan.com/code/goods420832.html (287/112619)
https://www.huzhan.com/ishop37725/ # Connection refused - connect(2)
https://www.huzhan.com/ishop37725/ -> websites/huzhan.com/ishop37725/index.html (288/112619)
https://www.huzhan.com/code/goods372252.html # Connection refused - connect(2)
https://www.huzhan.com/code/goods372252.html -> websites/huzhan.com/code/goods372252.html (289/112619)
https://www.huzhan.com/code/goods418192.html # Connection refused - connect(2)
https://www.huzhan.com/ishop21789/ # Connection refused - connect(2)
https://www.huzhan.com/code/goods418192.html -> websites/huzhan.com/code/goods418192.html (290/112619)
https://www.huzhan.com/ishop21789/ -> websites/huzhan.com/ishop21789/index.html (291/112619)
https://www.huzhan.com/code/goods354759.html # Connection refused - connect(2)
https://www.huzhan.com/code/goods354759.html -> websites/huzhan.com/code/goods354759.html (292/112619)
https://www.huzhan.com/code/goods421676.html # Connection refused - connect(2)
https://www.huzhan.com/code/goods421676.html -> websites/huzhan.com/code/goods421676.html (293/112619)
https://www.huzhan.com/code/goods412576.html # Connection refused - connect(2)
https://www.huzhan.com/ishop40294/ # Connection refused - connect(2)
https://www.huzhan.com/code/goods412576.html -> websites/huzhan.com/code/goods412576.html (294/112619)
https://www.huzhan.com/ishop40294/ -> websites/huzhan.com/ishop40294/index.html (295/112619)
https://www.huzhan.com/ishop40283/ # Connection refused - connect(2)
https://www.huzhan.com/ishop40283/ -> websites/huzhan.com/ishop40283/index.html (296/112619)
https://www.huzhan.com/serve/goods15226.html # Connection refused - connect(2)
https://www.huzhan.com/serve/goods15226.html -> websites/huzhan.com/serve/goods15226.html (297/112619)
https://www.huzhan.com/ishop44505/ # Connection refused - connect(2)
https://www.huzhan.com/ishop44505/ -> websites/huzhan.com/ishop44505/index.html (298/112619)
https://www.huzhan.com/code/goods410194.html # Connection refused - connect(2)
https://www.huzhan.com/code/goods410194.html -> websites/huzhan.com/code/goods410194.html (299/112619)
https://www.huzhan.com/ishop41272/ # Connection refused - connect(2)
https://www.huzhan.com/serve/goods15735.html # Connection refused - connect(2)
https://www.huzhan.com/ishop41272/ -> websites/huzhan.com/ishop41272/index.html (300/112619)
https://www.huzhan.com/serve/goods15735.html -> websites/huzhan.com/serve/goods15735.html (301/112619)
https://www.huzhan.com/code/goods420725.html # Connection refused - connect(2)
https://www.huzhan.com/code/goods420725.html -> websites/huzhan.com/code/goods420725.html (302/112619)
https://www.huzhan.com/ishop43261/ # Connection refused - connect(2)
https://www.huzhan.com/ishop43261/ -> websites/huzhan.com/ishop43261/index.html (303/112619)
https://www.huzhan.com/serve/goods15565.html # Connection refused - connect(2)
https://www.huzhan.com/serve/goods15565.html -> websites/huzhan.com/serve/goods15565.html (304/112619)
https://www.huzhan.com/ishop44358/ # Connection refused - connect(2)
https://www.huzhan.com/ishop44358/ -> websites/huzhan.com/ishop44358/index.html (305/112619)
https://www.huzhan.com/code/page/4 # Connection refused - connect(2)
https://www.huzhan.com/ishop7456/ # Connection refused - connect(2)
As a result, many of the downloaded files are empty. Has the archive site implemented controls to prevent crawling? I can access it normally in a browser, and I can also fetch the files with Wget. Thank you for looking into this issue!
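
If the refusals do come from some form of throttling (that is only a guess, nothing above confirms it), one possible workaround is to retry each refused request after a growing pause instead of moving straight on to the next file. The sketch below is a minimal illustration in Python, not part of wayback_machine_downloader; `fetch`, `fetch_with_retry`, and their parameters are hypothetical names chosen for this example.

```python
import time
import urllib.error
import urllib.request


def fetch(url, timeout=30.0):
    """Download one URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_retry(url, fetcher=fetch, attempts=5, base_delay=1.0,
                     sleep=time.sleep):
    """Call fetcher(url), retrying connection-level failures with backoff.

    The pause doubles after each failed attempt: base_delay, 2 * base_delay,
    4 * base_delay, and so on. The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return fetcher(url)
        except urllib.error.HTTPError:
            # The server answered (e.g. 404), so this is not a refused
            # connection; do not retry it here.
            raise
        except OSError:
            # ConnectionRefusedError and urllib.error.URLError are both
            # subclasses of OSError.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Calling `fetch_with_retry(url)` for each URL that reported "Connection refused" would give the server time to accept the connection again before the file is written. Whether this helps depends on why the connections are refused in the first place, which is still an open question here.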