I noticed that on "real Linux" both the MAP_POPULATE flag to mmap and madvise(MADV_WILLNEED) pre-fault the entire file into the mapped memory region. The only difference between the two is that MAP_POPULATE operates at the time the mapping is established, while MADV_WILLNEED operates at the time of the first memory access into the mapped region.
In effect this is roughly equivalent to reading the entire file into memory (e.g. with fread) but without an intermediate buffer to hold the contents: the page table is completely pre-faulted and a pointer to the mapped region is returned.
On Windows, the blocking pre-fault behavior is not seen. This causes atrocious disk I/O performance for programs that rely on this behavior, especially on spindle devices, because pages are faulted in only in the accessed region (that is, the "vanilla" mmap behavior sans pre-faulting). I saw that a few million binary searches on a 9 GB mmap'd file took hours on WSL but seconds on Docker and "real Ubuntu 16.04" on the same machine.
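To illustrate why this access pattern suffers without pre-faulting, here is a minimal sketch of a binary search that counts its probes. The function name `lower_bound_u64` and the assumption that the file holds sorted 64-bit keys are mine, not from the report. Under that assumption a 9 GB file holds a little over a billion keys, so each search makes roughly 30 probes, and most of the early ones land on widely separated pages that each need their own disk read if nothing is resident yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the index of the first element >= key in the
 * sorted array a[0..n), and stores the number of elements inspected in
 * *probes (if probes is non-NULL). */
size_t lower_bound_u64(const uint64_t *a, size_t n, uint64_t key, size_t *probes) {
    assert(a != NULL || n == 0);
    size_t lo = 0, hi = n, count = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        count++; /* on a cold mapping, each probe may fault in a different page */
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (probes)
        *probes = count;
    return lo;
}
```

With a pre-faulted mapping every probe is a plain memory read; with on-demand faulting the first probes of each search can each cost a seek on a spinning disk.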
A naive workaround is to pre-populate the file into pagecache sequentially via other means (such as calling md5 or wc -l on the file before trying to mmap it). This doesn't work for anonymous maps with no file backing, and the costly workaround is to write the whole data structure with 1's or something in one or two threads before using the code in multi-threaded regions (where otherwise it will block HARD.)
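The two workarounds above can also be done from inside the program. The following is only a sketch under my own assumptions: the helper names `touch_pages` and `alloc_prefaulted` are hypothetical, and whether a sequential touch pass helps on WSL as much as an external `md5` or `wc -l` run is not something I have measured.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= flags */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: read one byte from every page of [base, base+len),
 * in address order, so each page is faulted in sequentially.
 * Returns the number of pages touched. */
size_t touch_pages(const void *base, size_t len) {
    assert(base != NULL);
    const volatile unsigned char *p = base; /* volatile: keep the reads */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t touched = 0;
    for (size_t off = 0; off < len; off += page) {
        (void)p[off];
        touched++;
    }
    return touched;
}

/* Hypothetical helper: create an anonymous mapping and fill it with 1s from
 * a single thread before any worker threads use it. Returns NULL on failure. */
void *alloc_prefaulted(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memset(p, 1, len); /* writing forces every page to be allocated */
    return p;
}
```

Calling `touch_pages(Raw, size)` once right after mmap, before the random accesses start, stands in for the external pre-read; `alloc_prefaulted` covers the anonymous case at the cost of a full write pass.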
Example:
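// fp must be an open file descriptor (from open(2)), not a FILE*
char *Raw = mmap(NULL, size, PROT_READ, MAP_SHARED | MAP_POPULATE, fp, 0);
if (Raw == MAP_FAILED) { perror("mmap"); exit(1); }
madvise(Raw, size, MADV_WILLNEED);
// access values of Raw at random (or in a binary search pattern)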
(Either the MAP_POPULATE flag or the MADV_WILLNEED hint should suffice on its own, but both are shown here.)
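For anyone who wants to try the example as a compilable unit, here is a sketch that wraps it in a function with error checking. The name `map_file_populated` and its signature are my own invention; the mmap and madvise calls are the same as in the example.

```c
#define _DEFAULT_SOURCE /* for madvise and MAP_POPULATE with strict -std= flags */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef MAP_POPULATE
#define MAP_POPULATE 0 /* Linux-only flag; becomes a no-op elsewhere */
#endif

/* Hypothetical helper: map the whole file read-only, requesting pre-faulting
 * with both MAP_POPULATE and MADV_WILLNEED. On success returns the mapping
 * and stores its length in *out_len; returns NULL on any failure (including
 * an empty file, which cannot be mapped). */
const char *map_file_populated(const char *path, size_t *out_len) {
    assert(path != NULL && out_len != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED | MAP_POPULATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    (void)madvise(p, len, MADV_WILLNEED); /* advisory only; failure is ignored */
    *out_len = len;
    return p;
}
```

Timing the call itself and then the first pass of random reads, on a file that is not already in the page cache, should show where the blocking happens on each platform.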
Be sure the file (I use files >8GB but anything big will do) isn't already paged into RAM or you won't see the behavior. I use RAMMap to drop page caches in Windows to reproduce what you'd see if you're reading the file for the first time from disk. Alternatively, use an anonymous map and try writing the values sequentially in a highly multithreaded environment (it will be slow, even slower than 1 thread).
Although MAP_POPULATE is linux-exclusive, the roughly equivalent MADV_WILLNEED works across kernel versions and distros, as well as BSD and OSX, so it's pretty POSIX-y.
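On the portability point: madvise itself is not in POSIX, but the standard does define posix_madvise with a POSIX_MADV_WILLNEED hint. A minimal sketch of a wrapper using it follows; the names `page_floor` and `advise_willneed` are hypothetical, and whether any given platform (WSL included) acts on the hint is an open question, since the hint is advisory.

```c
#define _DEFAULT_SOURCE /* exposes posix_madvise with strict -std= flags */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: round an address down to the start of its page.
 * The page size must be a power of two. */
uintptr_t page_floor(uintptr_t addr, size_t page) {
    assert(page != 0 && (page & (page - 1)) == 0);
    return addr & ~((uintptr_t)page - 1);
}

/* Hypothetical helper: hint that [addr, addr+len) will be needed soon.
 * The start is aligned down to a page boundary because some implementations
 * reject unaligned addresses. Returns 0 on success or an error number
 * (posix_madvise reports errors via its return value, not errno). */
int advise_willneed(void *addr, size_t len) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = page_floor((uintptr_t)addr, page);
    size_t span = len + (size_t)((uintptr_t)addr - start);
    return posix_madvise((void *)start, span, POSIX_MADV_WILLNEED);
}
```

In the example above, `advise_willneed(Raw, size)` would take the place of the madvise call.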
(This was the first part of a double-bug post, #2732, but that report was closed after only the second part was addressed.)
GabeAl changed the title from "mmap(2) with MAP_POPULATE and madvise(MAPV_WILLNEED) fail" to "mmap(2) with MAP_POPULATE and madvise(MADV_WILLNEED) fail" on Sep 10, 2018.
GabeAl changed the title from "mmap(2) with MAP_POPULATE and madvise(MADV_WILLNEED) fail" to "mmap: MAP_POPULATE and madvise(MADV_WILLNEED) fail" on Sep 10, 2018.
Do you know of any packaged software that reproduces these issues, or are you observing this in private projects?
A: looks like influxdb, at least (another foo-db with 14k github stars). Haven't tried, but it is pretty safe to take as a given you'd see the "took hours on WSL" behavior with a big db and a spinny disk.
This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.