Description
If a table `t` has a single column `id bigint(20)` (also the PK) with records -1, 0, 1, 2, 3, Ghostferry will silently not copy data where id <= 0. There are two related causes:
- For the negative number case, this is because Ghostferry assumes that the PK is a `uint64`.
- For the 0 case, the `DefaultBuildSelect` selects rows where PK > 0 in the first iteration, which misses the record for 0.
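To make the two causes concrete, here is a minimal sketch of keyset pagination over an in-memory table. It is not Ghostferry code: `selectBatch` and `copyAll` are hypothetical helpers that stand in for a `WHERE id > ? ORDER BY id LIMIT ?` query and the loop that drives it. Starting the cursor at 0 reproduces the missed rows, while leaving the first batch without a lower bound (one possible approach, assumed here purely for illustration) picks them up.

```go
package main

import (
	"fmt"
	"sort"
)

// selectBatch mimics "SELECT id FROM t WHERE id > cursor ORDER BY id LIMIT n"
// over an in-memory slice. A nil cursor means "no lower bound".
// Hypothetical helper for illustration only.
func selectBatch(ids []int64, cursor *int64, limit int) []int64 {
	sorted := append([]int64(nil), ids...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	var batch []int64
	for _, id := range sorted {
		if cursor != nil && id <= *cursor {
			continue // excluded by the strict "id > cursor" comparison
		}
		if len(batch) == limit {
			break
		}
		batch = append(batch, id)
	}
	return batch
}

// copyAll pages through the table, advancing the cursor to the last id
// of each batch, and returns every id it saw.
func copyAll(ids []int64, start *int64, limit int) []int64 {
	var copied []int64
	cursor := start
	for {
		batch := selectBatch(ids, cursor, limit)
		if len(batch) == 0 {
			return copied
		}
		copied = append(copied, batch...)
		last := batch[len(batch)-1]
		cursor = &last
	}
}

func main() {
	ids := []int64{-1, 0, 1, 2, 3}

	// Cursor starts at 0 with a strict ">" comparison: -1 and 0 are skipped.
	zero := int64(0)
	fmt.Println(copyAll(ids, &zero, 2)) // [1 2 3]

	// No lower bound on the first batch: every row is visited.
	fmt.Println(copyAll(ids, nil, 2)) // [-1 0 1 2 3]
}
```

The sketch uses a signed cursor throughout; with an unsigned cursor, as described above, negative ids cannot be represented at all, so the second variant alone would not be enough.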
This only happens if these records already exist in the database when Ghostferry is started, as only the DataIterator misses them. The BinlogStreamer will stream these records if they are inserted during the run.

This also means the InlineVerifier will miss these records and will not emit an error unless they are also UPDATEd during the course of the transfer. When such a record is updated, it doesn't exist on the target, so the BinlogWriter won't be able to update it (the previous state on the target database doesn't match what the binlog event suggests). The row is then added to a reverify queue, and each subsequent fingerprinting attempt will yield a different result for the source and the target. So there is a chance for the InlineVerifier to catch this (which is how we found it), but it is not guaranteed.
The first case is hard to fix. Perhaps we should detect whether such records exist and raise an error during table loading. For the 0 case, we need to review the BuildSelect functions to make sure that 0 can be included.
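A rough sketch of what the table-loading check could look like, assuming the minimum PK has already been read from the source (for example with `SELECT MIN(id) FROM t`). The function name `checkMinPK` and its signature are hypothetical and not part of Ghostferry.

```go
package main

import (
	"fmt"
)

// checkMinPK returns an error when the smallest primary key in a table is
// negative, since such rows cannot be represented by an unsigned cursor.
// Hypothetical helper: minPK is assumed to come from something like
// "SELECT MIN(id) FROM t" run against the source during table loading.
func checkMinPK(table string, minPK int64) error {
	if minPK < 0 {
		return fmt.Errorf("table %s has negative primary key %d, which cannot be copied", table, minPK)
	}
	return nil
}

func main() {
	// The example table from this issue has -1 as its smallest id.
	if err := checkMinPK("t", -1); err != nil {
		fmt.Println("refusing to start:", err)
	}

	// A table whose smallest id is 0 passes this particular check;
	// the 0 case still depends on the BuildSelect fix.
	fmt.Println(checkMinPK("t", 0) == nil)
}
```

Failing early like this trades a silent data gap for a loud startup error, which seems preferable until signed PKs are actually supported.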