From cbfb8c7a862c9af5391eee910b50fc66baca0740 Mon Sep 17 00:00:00 2001
From: DaleLore
Date: Wed, 28 May 2025 22:58:22 -0400
Subject: [PATCH 01/10] Draft doc for Pausing and Resuming Crawl section

---
 frontend/docs/docs/user-guide/running-crawl.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index df68c37137..15da53e41c 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -18,7 +18,7 @@ A crawl workflow that is in progress can be in one of the following states:
 
 ## Watch Crawl
 
-You can watch the current state of the browser windows as the crawler visit pages in the **Watch** tab of **Latest Crawl**. A list of queued URLs are displayed below in the **Upcoming Pages** section.
+You can watch the current state of the browser windows as the crawler visits pages in the **Watch** tab of **Latest Crawl**. A list of queued URLs are displayed below in the **Upcoming Pages** section.
 
 ## Live Exclusion Editing
 
@@ -34,6 +34,16 @@ Like exclusions, the number of [browser windows](workflow-setup.md#browser-windo
 
 Unlike exclusions, this change will not be applied to future workflow runs.
 
+## Pausing and Resuming Crawls
+
+If you need to pause your crawl at any point after it has started, whether to reassess, rescope, or for any other reason, you now have the option to do so.
+
+Pausing a crawl will halt the timer and prevent any increase in the number of pages crawled or the overall size. Your Status will change from Running to Pausing to eventually Paused to signify that the crawler is no longer crawling.
+
+When you're ready to resume, simply click the resume button. Your status will update from Resuming to Running to indicate that the crawler has started crawling again.
+
+NOTE: While your crawl is paused, you are unable to replay the crawl until it is completed.
+
 ## End a Crawl
 
 If a crawl workflow is not crawling websites as intended it may be preferable to end crawling operations and update the crawl workflow's settings before trying again. There are two operations to end crawls, available both on the workflow's details page, or as part of the actions menu in the workflow list.

From 391143a78d5f4e725be06d6ebb25ee232a7a310b Mon Sep 17 00:00:00 2001
From: DaleLore
Date: Mon, 2 Jun 2025 14:09:40 -0400
Subject: [PATCH 02/10] Adding Admonitions

---
 frontend/docs/docs/user-guide/running-crawl.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index 15da53e41c..2fff53310e 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -42,7 +42,8 @@ Pausing a crawl will halt the timer and prevent any increase in the number of pa
 
 When you're ready to resume, simply click the resume button. Your status will update from Resuming to Running to indicate that the crawler has started crawling again.
 
-NOTE: While your crawl is paused, you are unable to replay the crawl until it is completed.
+???+ Note
+    While your crawl is paused, you are unable to replay the crawl until it is completed.
 
 ## End a Crawl
 

From f24b4a45b5396ef76c17a0eefbb0cb6aa18ce0ce Mon Sep 17 00:00:00 2001
From: DaleLore
Date: Tue, 10 Jun 2025 12:51:45 -0400
Subject: [PATCH 03/10] Paused crawls can still be viewed

---
 frontend/docs/docs/user-guide/running-crawl.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index 2fff53310e..ea3bd16459 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -43,7 +43,7 @@ Pausing a crawl will halt the timer and prevent any increase in the number of pa
 When you're ready to resume, simply click the resume button. Your status will update from Resuming to Running to indicate that the crawler has started crawling again.
 
 ???+ Note
-    While your crawl is paused, you are unable to replay the crawl until it is completed.
+    While your crawl is paused, you can replay and view the data captured so far. However, if the crawl isn’t resumed within 7 days of being paused, it will automatically switch to a stopped state. Once stopped, replay functionality will be unavailable until the crawl is fully completed.
 
 ## End a Crawl
 

From cb544f5896d60b6799a398d2539904f76a3a0e5a Mon Sep 17 00:00:00 2001
From: Ilya Kreymer
Date: Tue, 10 Jun 2025 10:21:59 -0700
Subject: [PATCH 04/10] Minor tweaks

---
 frontend/docs/docs/user-guide/running-crawl.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index ea3bd16459..cca04f32b4 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -38,12 +38,12 @@ Unlike exclusions, this change will not be applied to future workflow runs.
 
 If you need to pause your crawl at any point after it has started, whether to reassess, rescope, or for any other reason, you now have the option to do so.
 
-Pausing a crawl will halt the timer and prevent any increase in the number of pages crawled or the overall size. Your Status will change from Running to Pausing to eventually Paused to signify that the crawler is no longer crawling.
+Pausing a crawl will halt the timer and prevent any increase in the number of pages crawled or the overall size. The crawl status will change from Running to Pausing to eventually Paused to signify that the crawler is no longer crawling.
 
-When you're ready to resume, simply click the resume button. Your status will update from Resuming to Running to indicate that the crawler has started crawling again.
+When you're ready to resume, simply click the Resume button. Your status will update from Resuming to Running to indicate that the crawler has started crawling again.
 
 ???+ Note
-    While your crawl is paused, you can replay and view the data captured so far. However, if the crawl isn’t resumed within 7 days of being paused, it will automatically switch to a stopped state. Once stopped, replay functionality will be unavailable until the crawl is fully completed.
+    While your crawl is paused, you can replay and view the data captured so far. However, if the crawl isn’t resumed within 7 days of being paused, it will automatically switch to a stopped state. Once stopped, the crawl is finished and can no longer be resumed.
 
 ## End a Crawl
 

From 8544c00149f8b35a80bc30f22ac5fd465ef0a07e Mon Sep 17 00:00:00 2001
From: Tessa Walsh
Date: Tue, 10 Jun 2025 13:46:10 -0400
Subject: [PATCH 05/10] Apply changes to match docs style guide

---
 frontend/docs/docs/user-guide/running-crawl.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index cca04f32b4..c09c7f1bab 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -36,14 +36,14 @@ Unlike exclusions, this change will not be applied to future workflow runs.
 
 ## Pausing and Resuming Crawls
 
-If you need to pause your crawl at any point after it has started, whether to reassess, rescope, or for any other reason, you now have the option to do so.
 
-Pausing a crawl will halt the timer and prevent any increase in the number of pages crawled or the overall size. The crawl status will change from Running to Pausing to eventually Paused to signify that the crawler is no longer crawling.
+To pause a running crawl, click the *Pause* button. The crawl status will change from Running to Pausing as in-progress pages are completed, and then to Paused once the crawler is successful paused. Paused crawls do not continue to accrue execution time.
+While a crawl is paused, it is possible to replay the pages crawled up to that point and to download the WACZ files from the *Latest Crawl* tab.
 
-When you're ready to resume, simply click the Resume button. Your status will update from Resuming to Running to indicate that the crawler has started crawling again.
+To resume a paused crawl, simply click the *Resume* button. The crawl status will update from Resuming to Running to indicate that the crawler has started crawling again. Any changes to the workflow settings will be applied in the the resumed crawl.
 
 ???+ Note
-    While your crawl is paused, you can replay and view the data captured so far. However, if the crawl isn’t resumed within 7 days of being paused, it will automatically switch to a stopped state. Once stopped, the crawl is finished and can no longer be resumed.
+    Paused crawls that are not resumed within 7 days of being paused are automatically stopped. Once stopped, the crawl is finished and can no longer be resumed.
 
 ## End a Crawl
 

From 803c540b94baefdfdedb6c209b4d0d28b574e70c Mon Sep 17 00:00:00 2001
From: Tessa Walsh
Date: Tue, 10 Jun 2025 13:46:50 -0400
Subject: [PATCH 06/10] Fix spacing

---
 frontend/docs/docs/user-guide/running-crawl.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index c09c7f1bab..176fdb888a 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -35,9 +35,8 @@ Like exclusions, the number of [browser windows](workflow-setup.md#browser-windo
 Unlike exclusions, this change will not be applied to future workflow runs.
 
 ## Pausing and Resuming Crawls
-
-
 To pause a running crawl, click the *Pause* button. The crawl status will change from Running to Pausing as in-progress pages are completed, and then to Paused once the crawler is successful paused. Paused crawls do not continue to accrue execution time.
+
 While a crawl is paused, it is possible to replay the pages crawled up to that point and to download the WACZ files from the *Latest Crawl* tab.
 
 To resume a paused crawl, simply click the *Resume* button. The crawl status will update from Resuming to Running to indicate that the crawler has started crawling again. Any changes to the workflow settings will be applied in the the resumed crawl.

From b73573e1a6c26b196283158a0daeae51d434d464 Mon Sep 17 00:00:00 2001
From: Tessa Walsh
Date: Tue, 10 Jun 2025 13:47:16 -0400
Subject: [PATCH 07/10] Add line break

---
 frontend/docs/docs/user-guide/running-crawl.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index 176fdb888a..60ed7d463c 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -35,6 +35,7 @@ Like exclusions, the number of [browser windows](workflow-setup.md#browser-windo
 Unlike exclusions, this change will not be applied to future workflow runs.
 
 ## Pausing and Resuming Crawls
+
 To pause a running crawl, click the *Pause* button. The crawl status will change from Running to Pausing as in-progress pages are completed, and then to Paused once the crawler is successful paused. Paused crawls do not continue to accrue execution time.
 
 While a crawl is paused, it is possible to replay the pages crawled up to that point and to download the WACZ files from the *Latest Crawl* tab.

From 6e52f17c2a2a2386e7f0930e21bcfdf751aed4c4 Mon Sep 17 00:00:00 2001
From: Tessa Walsh
Date: Tue, 10 Jun 2025 15:13:28 -0400
Subject: [PATCH 08/10] Add back simplified first sentence

---
 frontend/docs/docs/user-guide/running-crawl.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index 60ed7d463c..b53cb7a66a 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -36,6 +36,8 @@ Unlike exclusions, this change will not be applied to future workflow runs.
 
 ## Pausing and Resuming Crawls
 
+If you need to reassess or rescope your crawl at any point after it has started, you can pause the running crawl.
+
 To pause a running crawl, click the *Pause* button. The crawl status will change from Running to Pausing as in-progress pages are completed, and then to Paused once the crawler is successful paused. Paused crawls do not continue to accrue execution time.
 
 While a crawl is paused, it is possible to replay the pages crawled up to that point and to download the WACZ files from the *Latest Crawl* tab.

From 3ff374fe310fd2f4ec0b2593f32a7dd3d4150cd4 Mon Sep 17 00:00:00 2001
From: Ilya Kreymer
Date: Tue, 10 Jun 2025 13:32:30 -0700
Subject: [PATCH 09/10] Apply formatting suggestions from code review

---
 frontend/docs/docs/user-guide/running-crawl.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index b53cb7a66a..a9bb501080 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -38,14 +38,14 @@ Unlike exclusions, this change will not be applied to future workflow runs.
 
 If you need to reassess or rescope your crawl at any point after it has started, you can pause the running crawl.
 
-To pause a running crawl, click the *Pause* button. The crawl status will change from Running to Pausing as in-progress pages are completed, and then to Paused once the crawler is successful paused. Paused crawls do not continue to accrue execution time.
+To pause a running crawl, click the *Pause* button. The crawl status will change from *Running* to *Pausing* as in-progress pages are completed, and then to *Paused* once the crawler is successful paused. Paused crawls do not continue to accrue execution time.
 
 While a crawl is paused, it is possible to replay the pages crawled up to that point and to download the WACZ files from the *Latest Crawl* tab.
 
-To resume a paused crawl, simply click the *Resume* button. The crawl status will update from Resuming to Running to indicate that the crawler has started crawling again. Any changes to the workflow settings will be applied in the the resumed crawl.
+To resume a paused crawl, simply click the *Resume* button. The crawl status will update from *Resuming* to *Running* to indicate that the crawler has started crawling again. Any changes to the workflow settings will be applied in the the resumed crawl.
 
 ???+ Note
-    Paused crawls that are not resumed within 7 days of being paused are automatically stopped. Once stopped, the crawl is finished and can no longer be resumed.
+    Paused crawls that are not resumed within 7 days of being paused are automatically updated to *Stopped*. Once stopped, the crawl is finished and can no longer be resumed.
 
 ## End a Crawl
 

From 43fc89585c2f0eb04f6162f1c0b5c5334ce0bb1f Mon Sep 17 00:00:00 2001
From: Ilya Kreymer
Date: Tue, 10 Jun 2025 17:18:46 -0700
Subject: [PATCH 10/10] icons: add pausing/paused/resuming icons + states to crawl states rename 'Modifying Running Crawls' -> Running Crawls

---
 .../.icons/bootstrap/pause-circle.svg      |  4 ++++
 .../.icons/bootstrap/play-circle.svg       |  4 ++++
 frontend/docs/docs/stylesheets/extra.css   | 13 ++++++++-----
 .../docs/docs/user-guide/running-crawl.md  | 19 +++++++++++--------
 4 files changed, 27 insertions(+), 13 deletions(-)
 create mode 100644 frontend/docs/docs/overrides/.icons/bootstrap/pause-circle.svg
 create mode 100644 frontend/docs/docs/overrides/.icons/bootstrap/play-circle.svg

diff --git a/frontend/docs/docs/overrides/.icons/bootstrap/pause-circle.svg b/frontend/docs/docs/overrides/.icons/bootstrap/pause-circle.svg
new file mode 100644
index 0000000000..6d3aeff01d
--- /dev/null
+++ b/frontend/docs/docs/overrides/.icons/bootstrap/pause-circle.svg
@@ -0,0 +1,4 @@
+
+
+
+
\ No newline at end of file
diff --git a/frontend/docs/docs/overrides/.icons/bootstrap/play-circle.svg b/frontend/docs/docs/overrides/.icons/bootstrap/play-circle.svg
new file mode 100644
index 0000000000..a1d742e00f
--- /dev/null
+++ b/frontend/docs/docs/overrides/.icons/bootstrap/play-circle.svg
@@ -0,0 +1,4 @@
+
+
+
+
\ No newline at end of file
diff --git a/frontend/docs/docs/stylesheets/extra.css b/frontend/docs/docs/stylesheets/extra.css
index 264c0f2e87..57bf9e7914 100644
--- a/frontend/docs/docs/stylesheets/extra.css
+++ b/frontend/docs/docs/stylesheets/extra.css
@@ -1,4 +1,4 @@
-@import './theme.css';
+@import "./theme.css";
 
 /* Font style definitions */
 @font-face {
@@ -8,9 +8,9 @@
   font-display: swap;
   src: url("https://cdn.webrecorder.net/fonts/recursive/recursive-latin.woff2")
     format("woff2");
-  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6,
-    U+02DA, U+02DC, U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC,
-    U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
+  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA,
+    U+02DC, U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC, U+2122, U+2191,
+    U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
 }
 
 @font-face {
@@ -141,7 +141,10 @@ h3 {
 }
 
 .md-typeset {
-  font-feature-settings: "ss04" off,"ss07" on,"ss08" on;
+  font-feature-settings:
+    "ss04" off,
+    "ss07" on,
+    "ss08" on;
 }
 
 /* Custom badge classes, applies custom overrides to inline-code blocks */
diff --git a/frontend/docs/docs/user-guide/running-crawl.md b/frontend/docs/docs/user-guide/running-crawl.md
index a9bb501080..6c11658034 100644
--- a/frontend/docs/docs/user-guide/running-crawl.md
+++ b/frontend/docs/docs/user-guide/running-crawl.md
@@ -1,4 +1,4 @@
-# Modifying Running Crawls
+# Running Crawls
 
 Running crawls can be modified from the crawl workflow **Latest Crawl** tab. You may want to modify a running crawl if you find that the workflow is crawling pages that you didn't intend to archive, or if you want a boost of speed.
 
@@ -8,13 +8,16 @@ A crawl workflow that is in progress can be in one of the following states:
 
 | Status | Description |
 | ---- | ---- |
-| :bootstrap-hourglass-split: Waiting | The workflow can't start running yet but it is queued to run when resources are available. |
-| :btrix-status-dot: Starting | New resources are starting up. Crawling should begin shortly.|
-| :btrix-status-dot: Running | The crawler is finding and capturing pages! |
-| :btrix-status-dot: Stopping | A user has instructed this workflow to stop. Finishing capture of the current pages.|
-| :btrix-status-dot: Finishing Downloads | The workflow has finished crawling and is finalizing downloads.|
-| :btrix-status-dot: Generating WACZ | Data is being packaged into WACZ files.|
-| :btrix-status-dot: Uploading WACZ | WACZ files have been created and are being transferred to storage.|
+| :bootstrap-hourglass-split: Waiting | The workflow can't start running yet but it is queued to run when resources are available. |
+| :btrix-status-dot: Starting | New resources are starting up. Crawling should begin shortly.|
+| :btrix-status-dot: Running | The crawler is finding and capturing pages! |
+| :bootstrap-pause-circle: Pausing | The workflow is in the process of being paused. |
+| :bootstrap-pause-circle: Paused | The workflow is currently paused. |
+| :bootstrap-play-circle: Resuming | The workflow is in the process of resuming after being paused. |
+| :btrix-status-dot: Stopping | A user has instructed this workflow to stop. Finishing capture of the current pages.|
+| :btrix-status-dot: Finishing Downloads | The workflow has finished crawling and is finalizing downloads.|
+| :btrix-status-dot: Generating WACZ | Data is being packaged into WACZ files.|
+| :btrix-status-dot: Uploading WACZ | WACZ files have been created and are being transferred to storage.|
 
 ## Watch Crawl