Skip over unallocated spaces during send #228

tasket · 2025-01-02T01:19:46Z

Wyng send currently examines all unallocated portions of a volume under certain conditions, such as during the volume's initial send. It also examines and compares all portions that have been de-allocated since the previous send, so incremental backups are affected as well. As a result, sends are slower than they need to be.

Cases where this has an impact:

Adding large volumes to an archive
Deleting large amounts of data from a volume
Increasing a volume's size

Optimization could be achieved by creating a twin of the delta map, a zero map, during one of the early stages of the send process, which include get_delta_digest(). The zero mapping code would have to conform to each storage type, and the reflink version may be able to consume a 'tee' of the fiemap data. (An alternative would be to use SEEK_HOLE and SEEK_DATA, although they're unlikely to work with tlvm.)
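As a rough illustration of the SEEK_HOLE / SEEK_DATA alternative, here is a minimal Python sketch for a regular (sparse) file. The names data_extents() and zero_map() are hypothetical and not part of Wyng, and the list-of-booleans map is only an assumption about what a zero map might look like. It relies on the filesystem reporting holes; where it doesn't, the whole file is simply reported as data.

```python
import errno
import os

def data_extents(path):
    """Return (start, end) byte ranges the filesystem reports as data.

    Hypothetical helper; works on regular files, not block devices.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                start = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as err:
                if err.errno == errno.ENXIO:  # no data at or after pos
                    break
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)  # EOF counts as a hole
            extents.append((start, end))
            pos = end
    finally:
        os.close(fd)
    return extents

def zero_map(extents, size, chunk_size):
    """Per-chunk flags: True where a chunk overlaps no data extent."""
    nchunks = -(-size // chunk_size)  # ceiling division
    zmap = [True] * nchunks
    for start, end in extents:
        for i in range(start // chunk_size, -(-end // chunk_size)):
            zmap[i] = False
    return zmap
```

A chunk flagged True here is one the filesystem says was never allocated, so it reads back as zeros and would not need a buffer comparison.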

The tlvm version might collect any "left-only" references in the case of an incremental send, or else do an extra metadata extraction step using a tlvm command other than thin_delta.

Assuming the result of zero mapping is a per-chunk bitmap like the delta map, the send_volume() function could attempt to skip through 8-bit or larger segments similar to how it handles the delta bmap_list.
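A minimal sketch of the byte-at-a-time skipping idea, assuming the zero map is a bytes object with one bit per chunk, a set bit meaning "known zero", and least-significant-bit-first ordering. All of these are assumptions for illustration, and chunks_to_scan() is a hypothetical name rather than anything in send_volume().

```python
def chunks_to_scan(zero_bmap: bytes, nchunks: int):
    """Yield indices of chunks that are NOT marked zero and must be read."""
    for byte_idx, byte in enumerate(zero_bmap):
        if byte == 0xFF:
            continue  # all 8 chunks are known-zero: skip the whole segment
        base = byte_idx * 8
        for bit in range(8):
            chunk = base + bit
            if chunk >= nchunks:
                return  # padding bits past the end of the volume
            if not byte & (1 << bit):
                yield chunk

# Chunks 0-7 and 16-23 are all zero; in the middle byte only bits 0 and 2 are set.
print(list(chunks_to_scan(bytes([0xFF, 0b00000101, 0xFF]), 24)))
# → [9, 11, 12, 13, 14, 15]
```

The same test could be made on wider words (for example 64 bits at a time) so that long zero runs are skipped with fewer comparisons.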

One desired result would be the ability to add a mostly empty, terabyte-sized volume to an archive in a matter of seconds or a few minutes. Another would be that an incremental send of a volume that had a vast amount of data deleted takes only a fraction of the time it does in the current worst-case scenario.

To illustrate the large difference that delta mapping vs (lack of) unallocated mapping makes:

Adding a new, mostly empty 1TB volume (1.5MB of data) to an archive took over 14 minutes.

Adding 48MB to that volume and doing an incremental (mapped) send took 9 seconds. So a backup of 32X the data finished in 1/93 the time. (The incremental send didn't have to compare large amounts of zeros because data had not been deleted from the volume, only added.)
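The ratios quoted above can be checked with a few lines of arithmetic. The 14-minute figure is treated as exactly 840 seconds here, although the measurement was "over 14 minutes", so the time ratio is a lower bound.

```python
initial_mb, initial_s = 1.5, 14 * 60   # initial send: 1.5MB in at least 840 s
incr_mb, incr_s = 48, 9                # incremental send: 48MB in 9 s

data_ratio = incr_mb / initial_mb      # how much more data the incremental send moved
time_ratio = initial_s / incr_s        # how much longer the initial send took

print(data_ratio, round(time_ratio, 1))
# → 32.0 93.3
```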


tasket · 2025-01-03T19:37:44Z

For incremental send:

It should be possible to make a segmented bmap_list for 'zero' areas (as with the delta map), and then break up or adjust those segments to align with the delta bmap segments. Portions of the zero bmap segments that don't align with the delta segments can be discarded/moved, which results in those areas being scanned normally. When an aligned zero map segment is encountered, zero-chunk entries can be emitted quickly into the new session's manifest without making buffer comparisons.

If a zero bmap segment starts after the start of a delta segment, it may be necessary to prepend null bits to the zero segment so that the two match, and possibly to append null bits to the end as well. Null bits indicate where normal buffer comparisons should occur, so doing this poses no risk to accuracy.
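A minimal sketch of the alignment step, assuming each segment is described by a starting chunk index plus a list of 0/1 bits. This representation and the name align_zero_segment() are hypothetical, chosen only to show the padding and trimming, and do not reflect how bmap_list is actually stored.

```python
def align_zero_segment(zero_start, zero_bits, delta_start, delta_len):
    """Return zero bits covering exactly the delta segment's chunk range.

    Positions the zero segment does not cover become null (0) bits, meaning
    "compare buffers normally". Zero bits that fall outside the delta segment
    are dropped, so those chunks are also scanned normally.
    """
    aligned = [0] * delta_len
    for i, bit in enumerate(zero_bits):
        pos = zero_start + i - delta_start
        if 0 <= pos < delta_len:
            aligned[pos] = bit
    return aligned

# Zero segment starts 2 chunks after the delta segment and ends early:
# two null bits are prepended and three are appended.
print(align_zero_segment(10, [1, 1, 1], 8, 8))
# → [0, 0, 1, 1, 1, 0, 0, 0]

# Zero segment starts before the delta segment: the leading bits are dropped.
print(align_zero_segment(6, [1, 1, 1, 1], 8, 4))
# → [1, 1, 0, 0]
```

Because padding only ever adds null bits, a misaligned zero segment can cost some speed but never causes a chunk to be wrongly recorded as zero.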

tlaurion · 2025-01-04T02:40:11Z

Parallelization would help here too, no?

tasket · 2025-01-04T16:34:23Z

Probably. Simple parallelizing could easily saturate memory bandwidth with buffer comparisons, though. I'm sure it would run in parallel better with the proposed changes.

tasket added the enhancement (New feature or request) and optimization labels on Jan 2, 2025
