Description
I am trying to recover a website that was created for a non-profit organization with WordPress. It was hosted on a third-party site but the organization has lost its admin access and somehow broke the site. I'm trying to recover the site as it was in January of 2024 when the site was working. When trying to recover the website from archive.org, I ran the CLI utility, but it didn't download all the pages I was expecting.
I ran `wayback_machine_downloader http://sorensonlegacyfoundation.org --to 20240101`. It downloaded 250 files, but many HTML pages are still missing. For example, the entry file `index.html` is there, but `/what-we-fund`, `/how-to-apply`, and about 10 other pages are not.
Looking over the raw text files in VS Code, I confirmed that these pages are really missing, and not just nested away somewhere, by searching for text unique to each page.
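For anyone who wants to repeat that check outside an editor, here is a minimal Python sketch of the same idea: walk the download folder and list every file whose text contains a given phrase. The function name `find_files_containing` is made up for illustration, and the folder and phrase in the usage note are assumptions rather than values taken from my actual run.

```python
from pathlib import Path


def find_files_containing(root, needle):
    """Return relative paths of files under `root` whose text contains `needle`.

    The match is case-insensitive. Files are read as UTF-8 and undecodable
    bytes are ignored, so binary files (images, fonts) do not raise errors.
    """
    root = Path(root)
    needle = needle.lower()
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():
            matches.append(path.relative_to(root).as_posix())
    return matches
```

A call such as `find_files_containing("websites/sorensonlegacyfoundation.org", "some phrase from the page")` returns an empty list when no downloaded file contains the phrase. The `websites/<domain>` folder is what I understand to be the tool's default output location, so adjust it if a different directory was used.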
Is there something I'm missing or should I just download each page individually from archive.org?
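If downloading pages individually turns out to be the answer, a minimal sketch of building the per-page snapshot URLs could look like the following. It assumes the usual Wayback Machine address pattern of `https://web.archive.org/web/<timestamp>/<original URL>`, where the archive serves the capture closest to the timestamp, and the `id_` modifier that asks for the page as originally captured without the archive toolbar. The helper names are hypothetical, and the page list only includes the two paths named above.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(page_url, timestamp, raw=True):
    """Build a Wayback Machine URL for one page at (or near) a timestamp.

    With raw=True the `id_` modifier is appended to the timestamp, which
    requests the original captured content without the archive's banner.
    """
    modifier = "id_" if raw else ""
    return f"{WAYBACK_PREFIX}/{timestamp}{modifier}/{page_url}"


def snapshot_urls(base_url, paths, timestamp):
    """Build snapshot URLs for several paths under one site."""
    base = base_url.rstrip("/")
    return [snapshot_url(f"{base}/{p.strip('/')}", timestamp) for p in paths]


if __name__ == "__main__":
    # Only the two missing pages named in this issue; the rest would be added by hand.
    missing = ["/what-we-fund", "how-to-apply"]
    for url in snapshot_urls("http://sorensonlegacyfoundation.org", missing, "20240101"):
        print(url)
```

Each printed URL can then be opened in a browser or fetched with any HTTP client. Whether a capture of a given page actually exists near that date is something I have not verified.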