Use memory map to speedup the load and untar of the snapshot archives #24798

HaoranYi · 2022-04-28T14:37:25Z

Problem

When reading large files, memory-mapped files can give much better I/O
performance. They rely on the OS virtual memory manager and avoid copying
data from kernel-space buffers to user-space buffers. Full snapshot
archive files are around 80 GB and incremental snapshot archives are around
10 GB. Both should easily fit into memory on the machine. Also, these
files are read sequentially during decompression and untar, so they should
benefit greatly from disk read-ahead. By memory-mapping the snapshot
archives, we should get much better read performance when unpacking
the snapshot files during validator startup.

Proposed Solution

Use the memmap2 crate to map the snapshot file into memory and read its
contents from the mapping. memmap2 is the Rust counterpart of Boost's
memory-mapped file classes in C++ (for example boost::iostreams::mapped_file).

HaoranYi · 2022-05-17T14:34:20Z

#25259

It turns out that memory mapping only shows an improvement for uncompressed snapshot files. For compressed snapshot files, it does not give any meaningful improvement.

HaoranYi · 2022-05-17T14:50:31Z

https://gist.github.com/HaoranYi/7613f005d9f14a47772fdeeeca0b7d19

The breakdown timing for untar is as follows:

decompress: 16% (8% of it is spent on page faults)
writing file to disk: 49%
opening file: 6%
copying: 23%

Memory mapping may help with the 8% of decompress time that is not page faults, but that is not significant.

This run is from a GCE cluster. It looks like the CPU only supports AVX (256-bit). With a better CPU that supports wider vector instructions (AVX-512 is the 512-bit extension; AVX2 is still 256-bit), we may see an improvement in the 23% copying time.
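To put the breakdown in perspective, here is a rough Amdahl's-law estimate. It assumes the percentages are shares of total untar wall time and reads the 8% as 8 points of that total; both are interpretations, and the 2x factor in the last function's example is purely hypothetical. The function names are made up for this sketch.

```rust
/// Upper bound on the overall speedup if a stage that takes `fraction`
/// of the total time were eliminated entirely.
pub fn max_speedup(fraction: f64) -> f64 {
    assert!((0.0..1.0).contains(&fraction), "fraction must be in [0, 1)");
    1.0 / (1.0 - fraction)
}

/// Overall speedup if a stage that takes `fraction` of the total time
/// becomes `factor` times faster (Amdahl's law).
pub fn speedup(fraction: f64, factor: f64) -> f64 {
    assert!((0.0..=1.0).contains(&fraction), "fraction must be in [0, 1]");
    assert!(factor > 0.0, "factor must be positive");
    1.0 / ((1.0 - fraction) + fraction / factor)
}
```

Under these assumptions, `max_speedup(0.08)` is about 1.09x, so even removing the 8% share completely would barely move the total. `max_speedup(0.23)` is about 1.30x for the copying share, and a hypothetical 2x faster copy, `speedup(0.23, 2.0)`, gives about 1.13x. Writing files to disk (49%) remains the dominant cost either way.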

HaoranYi self-assigned this Apr 28, 2022

This was referenced Apr 29, 2022

Persist AccountDB Index along with snapshot #24643 (Closed)

Use memory map to speed up snapshot untar #24889 (Merged)

HaoranYi closed this as completed May 17, 2022
