Tolerate b1 shard index inconsistencies during TSM conversion #5647
These commits address data loss during TSM conversion caused by a shard index inconsistency. The inconsistency was introduced by a defect that was fixed (3348dab) in version 0.9.3; the resulting data loss was reported in #5606.
The changes have been tested against a 47GB database with 32 shards and 1.1B points. Prior to the fix, around 600M points (4GB) were unexpectedly dropped from the converted database. With the fix, the full 1.1B points were preserved, resulting in a converted database size of 8.8GB.
Feedback is welcome on whether the list of orphaned series should be dumped to the output and whether the tracker should be used to dump the warning messages.
Another area for improvement might be how the RepairIndex configuration is associated with the b1 reader. Should I pass it as a parameter to the reader constructor, or pass a reader-specific options/configuration object?
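To make the two options concrete, here is a minimal sketch of what each might look like. Everything except the `RepairIndex` name is hypothetical: `Reader`, `ReaderOptions`, `NewReaderWithFlag`, and `NewReaderWithOptions` are stand-ins for illustration, not the actual b1 reader API.

```go
package main

import "fmt"

// ReaderOptions is a hypothetical reader-specific options object.
// Only the RepairIndex name comes from this PR; the type is an assumption.
type ReaderOptions struct {
	RepairIndex bool
}

// Reader is a stand-in for the b1 reader, not the real type.
type Reader struct {
	path string
	opts ReaderOptions
}

// NewReaderWithFlag illustrates option A: the setting is passed as a
// plain constructor parameter.
func NewReaderWithFlag(path string, repairIndex bool) *Reader {
	return &Reader{path: path, opts: ReaderOptions{RepairIndex: repairIndex}}
}

// NewReaderWithOptions illustrates option B: the setting travels inside
// an options struct, so later settings can be added without changing
// the constructor signature.
func NewReaderWithOptions(path string, opts ReaderOptions) *Reader {
	return &Reader{path: path, opts: opts}
}

func main() {
	a := NewReaderWithFlag("/path/to/shard", true)
	b := NewReaderWithOptions("/path/to/shard", ReaderOptions{RepairIndex: true})
	fmt.Println(a.opts.RepairIndex, b.opts.RepairIndex)
}
```

The flag parameter is the smaller change; the options struct is probably easier to extend if the bz1 reader later needs the same setting.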
A similar change should probably also be made to the bz1 reader, but I didn't have a test case I could use to verify the integrity of such a change. I am happy to extend the change to the bz1 reader if that is desired.
It is probably worth noting that any InfluxDB database containing data created with a version earlier than 0.9.3 may be susceptible to silent data loss during TSM conversion until this fix is applied.