Skip to content

Recovering from tablespace(s) being accidentally deleted

Yaotian Feng edited this page Jul 30, 2024 · 7 revisions

First things First

When you realize you've accidentally deleted important data, it's crucial to act quickly and correctly to maximize your chances of successful recovery. Follow these steps immediately:

  1. Stop All Database Activity

    Immediately stop all write operations to the database. Shut down the MySQL server if possible.

  2. Unmount the File System

    If the database files are on a separate partition or drive, unmount it immediately. This prevents any further writes to the disk that could overwrite deleted data.

  3. Create a Disk Image

    Create a bit-by-bit copy (disk image) of the entire drive or partition containing the database files. Use tools like dd on Unix-like systems or disk imaging software.

One can do this with dd (be extremely careful with this):

make sure /path/to/disk_image.img is not on the same drive as the one you are trying to recover from...

sudo dd if=/dev/sdb of=/path/to/database_image.img bs=1M status=progress

Scanning for pages

Let's first recover any pages that still have a valid checksum from the disk. This can be done using page_extractor from this package.

page_extractor --by-tablespace [-o <Output Directory>] <DiskImage>

It might take a while to scan through the entire disk image, it assumes the underlying file system has a 4K or smaller block size, and the 16K InnoDB pages are stored physically contiguously on the disk. Note that this may become not true in some circumstances, but recovering those might be extremely difficult and or time consuming.

After this, you should see <Output Directory>/BY_TABLESAPCE folder has a bunch of pages file inside. each file contain the set of pages that belong to that tablespace.

Scanning for index within each page

If you have file_per_table=on in your MySQL settings, you should have exactly 1 table on each per-table tablespace. We now want to recover the primary index from each tablespace. This information can be obtained by parsing the system tablespace, but if that information is unfortunately lost, we can do some "guess work": usually the largest index in a table space would be the primary index as it stores all the columns.

For each tablespace, you should run page_extractor again, using the default option of extracting different kinds of pages (mostly the index pages)

page_extractor [-o <index-tablespace_1>] <00001.pages>

Parsing Tables

Once the extraction is done, you should have multiple indices in index-tablespace_XXXX folder for each table space. You can now use page_explorer along with a SQL file with a single create table statement to explore the data as if it comes from that table.

To inspect a couple pages, try running with the following

page_explorer --limit 1 -t <path/to/create_table.sql> -v <INDEX PAGE FILE>

Once it looks correct, you want to extract all entries to a single json file:

page_explorer -o <OUTPUT.json> -t <path/to/create_table.sql> <INDEX PAGE FILE>
Clone this wiki locally