mmap: MAP_POPULATE and madvise(MADV_WILLNEED) fail #3535

Closed
GabeAl opened this issue Sep 10, 2018 · 2 comments

GabeAl commented Sep 10, 2018

I noticed that on "real Linux", both the MAP_POPULATE flag to mmap and madvise(MADV_WILLNEED) bring the entire file into the mapped memory region ahead of use. The only difference between the two is timing: MAP_POPULATE prefaults the pages when the mapping is established, whereas MADV_WILLNEED schedules the read-ahead when madvise is called on the existing mapping.

In effect this is roughly equivalent to reading the entire file into memory (e.g. with fread), but without a separate user-space buffer to hold the contents: the pages are already resident and mapped by the time the pointer to the region is used.
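One way to check whether a prefault actually happened is to ask the kernel which pages of a mapping are resident. The sketch below is only an illustration and not part of the original report: `count_resident_pages` is a hypothetical helper built on the Linux `mincore(2)` call, and it assumes a page-aligned address such as one returned by `mmap`.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns how many pages of [addr, addr+len) are
 * resident in memory, or -1 on error. addr must be page-aligned, which
 * holds for any pointer returned by mmap. */
long count_resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    long resident = -1;
    if (mincore(addr, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* low bit set: page is resident */
    }
    free(vec);
    return resident;
}
```

Calling this right after an `mmap` with MAP_POPULATE, and comparing the result with the total page count, should show whether the platform honored the flag. How WSL answers `mincore` is an open question here, so treat the number as a hint rather than proof.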

On WSL, this blocking prefault behavior is not seen. The result is atrocious disk I/O performance for programs that rely on it, especially on spinning disks, because pages are faulted in only as they are accessed (that is, the "vanilla" mmap behavior without prefaulting). A few million binary searches on a 9 GB mmap'd file took hours on WSL but seconds under Docker and on "real Ubuntu 16.04" on the same machine.

A naive workaround is to pull the file into the page cache sequentially by other means (such as running md5 or wc -l on it before mmap'ing it). This doesn't work for anonymous maps with no file backing. There the costly workaround is to write over the whole data structure (with 1's or something) from one or two threads before entering the multi-threaded regions of the code, where it will otherwise block HARD.
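Both workarounds can be done in-process instead of shelling out. This is a rough sketch under my own assumptions: `warm_page_cache` and `prefault_anonymous` are hypothetical names, the 1 MiB chunk size is arbitrary, and nothing here has been measured on WSL.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Read the file front to back so its pages land in the page cache,
 * much like running wc -l on it. Returns bytes read, or -1 on error. */
long long warm_page_cache(const char *path)
{
    enum { CHUNK = 1 << 20 };          /* 1 MiB scratch buffer (arbitrary) */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char *buf = malloc(CHUNK);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, CHUNK)) > 0)
        total += n;

    free(buf);
    close(fd);
    return n < 0 ? -1 : total;
}

/* Write every byte of an anonymous mapping from a single thread so that
 * later multi-threaded accesses do not take first-touch page faults. */
void prefault_anonymous(void *addr, size_t len)
{
    memset(addr, 1, len);
}
```

The sequential read costs one full pass over the file and only helps if the page cache can hold it afterwards, which is why it is a workaround and not a fix.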

Example:

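// fd is a descriptor from open(2); size is the file length in bytes
char *Raw = mmap(NULL, size, PROT_READ, MAP_SHARED | MAP_POPULATE, fd, 0);
if (Raw == MAP_FAILED) { perror("mmap"); exit(1); }
madvise(Raw, size, MADV_WILLNEED);
// access values of Raw at random (or in a binary search pattern)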

(Either the MAP_POPULATE flag or the MADV_WILLNEED hint should suffice on its own, but both are shown here.)
Be sure the file isn't already paged into RAM or you won't see the behavior (I use files >8GB, but anything big will do). I use RAMMap to drop the page caches in Windows, which reproduces what you'd see when reading the file from disk for the first time. Alternatively, use an anonymous map and try writing the values sequentially in a highly multithreaded environment (it will be slow, even slower than with 1 thread).
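For anyone who wants a fuller reproduction than the fragment above, here is a minimal sketch of the access pattern. It assumes the file is a flat array of sorted `uint64_t` keys, which is my assumption for illustration and not something the report specifies; `map_sorted_keys` and `contains_key` are hypothetical names.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only with both prefetch hints from the report.
 * Returns NULL on failure; *size_out receives the file size in bytes. */
const uint64_t *map_sorted_keys(const char *path, size_t *size_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    size_t size = (size_t)st.st_size;
    void *raw = mmap(NULL, size, PROT_READ, MAP_SHARED | MAP_POPULATE, fd, 0);
    close(fd);                          /* the mapping stays valid after close */
    if (raw == MAP_FAILED)
        return NULL;

    madvise(raw, size, MADV_WILLNEED);  /* advisory; failure is not fatal */
    *size_out = size;
    return raw;
}

/* Plain binary search; each probe may land on a different page. */
int contains_key(const uint64_t *keys, size_t count, uint64_t target)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (keys[mid] == target)
            return 1;
        if (keys[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

Timing a few million `contains_key` calls with random targets against a cold multi-gigabyte file should expose the difference between platforms, since without prefaulting every probe can turn into a separate disk seek.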

Although MAP_POPULATE is Linux-specific, the roughly equivalent MADV_WILLNEED works across kernel versions and distros, as well as on BSD and OS X, so it's pretty POSIX-y.
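Code that has to build on more than one of those systems could guard the Linux-only flag and fall back to the advisory call. In this sketch I have swapped in `posix_madvise` with `POSIX_MADV_WILLNEED`, since that is the variant POSIX actually specifies, in place of the `madvise(MADV_WILLNEED)` call named above; `map_with_prefetch` is a hypothetical wrapper name.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: map fd read-only and request prefetching with
 * whatever the platform offers. Returns MAP_FAILED on error. */
void *map_with_prefetch(int fd, size_t size)
{
    int flags = MAP_SHARED;
#ifdef MAP_POPULATE
    flags |= MAP_POPULATE;          /* Linux only: prefault at mmap time */
#endif
    void *p = mmap(NULL, size, PROT_READ, flags, fd, 0);
    if (p == MAP_FAILED)
        return MAP_FAILED;

    /* Advisory everywhere: a failure here leaves the mapping usable. */
    (void)posix_madvise(p, size, POSIX_MADV_WILLNEED);
    return p;
}
```

The hint is only advice, so a platform that ignores it (as this issue suggests WSL does) still returns a working mapping, just a slow one.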

(This was the first part of a double-bug post in #2732, but that report was closed after only the second part was addressed.)

@therealkenc
Collaborator

Left unanswered from #2732:

Do you know of any packaged software that reproduces these issues, or are you observing this in private projects?

A: looks like influxdb, at least (another foo-db with 14k github stars). Haven't tried, but it is pretty safe to take as a given you'd see the "took hours on WSL" behavior with a big db and a spinny disk.

Contributor

This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.

Thank you!
