README.md
2 additions & 5 deletions
@@ -31,9 +31,7 @@ Internet-archive is a nice source for several OSINT-information. This script is
 
 This script allows you to download content from the Wayback Machine (archive.org). You can use it to download either the latest version or all versions of web page snapshots within a specified range.
 
-## Info
-
-- The script will only request status code 200 snapshots (for now) - but this can differ from the status code when downloading the file.
+<!-- ## Info -->
 
 ### Arguments
 
@@ -65,8 +63,7 @@ Specify the range in years or a specific timestamp either start, end or both. If
 
 #### Additional
 
-- `--redirect`: Follow redirects of snapshots. Default is False. If a source has not statuscode 200, archive.org will redirect to the closest snapshot. So when setting this to `true`, parts of a timestamp-folder may not truly belong to the given timestamp.
-<!-- - `--harvest`: The downloaded files are scanned for locations on the same domain. These locations (mostly resources) are then tried to be accessed within the same timestamp. Setting this to `true` may result in identical files in different timestamps but you may get a more complete snapshot of the website. -->
+- `--no-redirect`: Do not follow redirects of snapshots. Archive.org sometimes redirects to a different snapshot for several reasons. Downloading redirects may lead to timestamp-folders which contain some files with a different timestamp. This does not matter if you only want to download the latest version (`-c`).
 - `--verbosity [LEVEL]`: Set the verbosity: json (print json response), progress (show progress bar) or standard (default).
 - `--retry [RETRY_FAILED]`: Retry failed downloads. You can specify the number of retry attempts as an integer.
 - `--worker [AMOUNT]`: The number of worker to use for downloading (simultaneous downloads). Default is 1. Beware: Using too many worker will lead into refused connections from the Wayback Machine. Duration about 1.5 minutes.
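
The interplay of `--retry` and `--worker` described above can be pictured with a minimal Python sketch. This is not the script's actual implementation: `fetch_with_retry`, `download_all`, and the injected `fetch` callable are hypothetical names chosen for illustration. The sketch only assumes that each download is retried a fixed number of times and that the number of simultaneous downloads is capped.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_with_retry(fetch, url, retries=0):
    """Call fetch(url); on failure, try again up to `retries` more times.

    Returns (url, result, None) on success, or (url, None, last_error)
    once every attempt has failed. `fetch` is a hypothetical callable.
    """
    last_error = None
    for _ in range(retries + 1):
        try:
            return url, fetch(url), None
        except Exception as exc:  # a real tool would catch narrower errors
            last_error = exc
    return url, None, last_error


def download_all(urls, fetch, workers=1, retries=0):
    """Run fetch over all urls with at most `workers` simultaneous calls.

    Results are returned in the same order as `urls`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: fetch_with_retry(fetch, u, retries), urls))
```

Passing the fetch function in as an argument keeps the sketch independent of any HTTP library. A small `workers` value mirrors the warning above that too many simultaneous downloads lead to refused connections.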