Maybe save each page to disk as it is fetched, then merge them at the end, with resume support? How would this behave if the data changed midway through the run? Can the REST API list endpoints be filtered to only return items from before a certain date?
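To make the idea concrete, here is a rough sketch of what I have in mind. It is not tied to any real API: `fetch_page(page, before)` is a hypothetical callable you would supply, and whether the list endpoints accept a "before" date filter at all is exactly the open question above. It also assumes every item carries a unique `"id"` field. The cutoff is written to disk on the first run, so a resumed run reuses the same boundary rather than a newer one.

```python
import json
from pathlib import Path


def fetch_all(fetch_page, out_dir, before):
    """Fetch pages one at a time, saving each to disk, then merge them.

    fetch_page(page, before) is a hypothetical callable returning a list of
    dicts for that page, or an empty list once the pages run out.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Pin the cutoff once so that a resumed run sees the same boundary.
    cutoff_file = out_dir / "cutoff.txt"
    if cutoff_file.exists():
        before = cutoff_file.read_text().strip()
    else:
        cutoff_file.write_text(before)

    page = 1
    while True:
        path = out_dir / f"page-{page:05d}.json"
        if path.exists():
            # Already saved by an earlier run: skip the network call.
            items = json.loads(path.read_text())
        else:
            items = fetch_page(page, before)
            # Write to a temp file and rename, so an interrupted run never
            # leaves a half-written page that looks complete.
            tmp = path.with_suffix(".tmp")
            tmp.write_text(json.dumps(items))
            tmp.replace(path)
        if not items:
            break
        page += 1

    # Merge, de-duplicating by "id" (assumed field): if rows shifted between
    # pages mid-run, the same item can show up on two pages.
    merged = {}
    for saved in sorted(out_dir.glob("page-*.json")):
        for item in json.loads(saved.read_text()):
            merged[item["id"]] = item
    return list(merged.values())
```

The de-duplication only covers items that appear twice. If something is deleted mid-run and later rows shift up a page, an item can still be skipped entirely, which is why a fixed "before" cutoff (or a stable sort order) would matter here.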