Commit 5712945

tw4l and Shrinks99 authored
Update usage docs section on creating web archives (#899)
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
1 parent 2fd6190 commit 5712945

1 file changed: +14 -6 lines changed

docs/manual/usage.rst

Lines changed: 14 additions & 6 deletions
@@ -154,20 +154,20 @@ To enable auto-indexing, run with ``wayback -a`` or ``wayback -a --auto-interval
 Creating a Web Archive
 ----------------------
 
-Using Webrecorder
-^^^^^^^^^^^^^^^^^
+Using ArchiveWeb.page
+^^^^^^^^^^^^^^^^^^^^^
 
-If you do not have a web archive to test, one easy way to create one is to use `Webrecorder <https://webrecorder.io>`_
+If you do not have a web archive to test, one easy way to create one is to use the `ArchiveWeb.page <https://archiveweb.page>`_ browser extension for Chrome and other Chromium-based browsers such as Brave Browser. ArchiveWeb.page records pages visited during an archiving session in the browser, and provides means of both replaying and downloading the archived items created.
 
-After recording, you can click **Stop** and then click `Download Collection` to receive a WARC (`.warc.gz`) file.
+Follow the instructions in `How To Create Web Archives with ArchiveWeb.page <https://archiveweb.page/en/usage/>`_. After recording, press **Stop** and then `download your collection <https://archiveweb.page/en/download/>`_ to receive a WARC (`.warc.gz`) file. If you choose to download your collection in the WACZ format, the WARC files can be found inside the zipped WACZ in the ``archive/`` directory.
 
-You can then use this with work with pywb.
+You can then use your WARCs to work with pywb.
 
 
 Using pywb Recorder
 ^^^^^^^^^^^^^^^^^^^
 
-The core recording functionality in Webrecorder is also part of :mod:`pywb`. If you want to create a WARC locally, this can be
+Recording functionality is also part of :mod:`pywb`. If you want to create a WARC locally, this can be
 done by directly recording into your pywb collection:
 
 1. Create a collection: ``wb-manager init my-web-archive`` (if you haven't already created a web archive collection)
@@ -180,6 +180,14 @@ In this configuration, the indexing happens every 10 seconds.. After 10 seconds,
 ``http://localhost:8080/my-web-archive/http://example.com/``
 
 
+Using Browsertrix
+^^^^^^^^^^^^^^^^^
+
+For a more automated browser-based web archiving experience, `Browsertrix <https://browsertrix.com/>`_ provides a web interface for configuring, scheduling, running, reviewing, and curating crawls of web content. Crawl activity is shown in a live screencast of the browsers used for crawling and all web archives created in Browsertrix can be easily downloaded from the application in the WACZ format.
+
+`Browsertrix Crawler <https://crawler.docs.browsertrix.com/>`_, which provides the underlying crawling functionality of Browsertrix, can also be run standalone in a Docker container on your local computer.
+
+
 HTTP/S Proxy Mode Access
 ------------------------

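The updated docs note that a WACZ download keeps its WARC files inside the zip under ``archive/``. As a rough illustration of that step (not part of this commit), the sketch below pulls those WARCs out with Python's standard ``zipfile`` module. The function name ``extract_warcs`` and the example paths are hypothetical, and the code assumes only what the docs state: that a WACZ is a zip file with WARCs under ``archive/``.

```python
import shutil
import zipfile
from pathlib import Path


def extract_warcs(wacz_path, dest_dir):
    """Copy every WARC stored under archive/ in a WACZ into dest_dir.

    Hypothetical helper: assumes the WACZ is a regular zip file whose
    WARCs live in its top-level archive/ directory.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(wacz_path) as wacz:
        for info in wacz.infolist():
            name = info.filename
            # Skip directories and anything outside archive/.
            if info.is_dir() or not name.startswith("archive/"):
                continue
            # Keep only WARC files, compressed or not.
            if not name.endswith((".warc", ".warc.gz")):
                continue
            target = dest / Path(name).name
            # Stream the member to disk so large WARCs are not held in memory.
            with wacz.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(target)
    return extracted
```

Calling ``extract_warcs("my-collection.wacz", "warcs")`` (both names are placeholders) returns the list of extracted paths, which can then be used with pywb like any other WARC files.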