Implement mmapDataFacade #1947
Benchmarks revealed a 10% slowdown with respect to internal memory. We are putting this on ice for the moment.
Did some quick benchmarking on OSX while on the plane this afternoon. Using this test:

On an OSX ramdisk:
On the regular OSX filesystem:
I'm not quite sure what this is telling me; I suspect I need to run more samples. I played with a few different … My machine has 16GB of RAM and I have plenty free, so I'm fairly confident that filesystem caching was in full effect and nothing got swapped out. OSX also performs memory compression when things get tight, but I didn't see that kick in either. /cc @daniel-j-h
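The test and numbers referenced above aren't reproduced here. As a rough sketch of that kind of heap-versus-mmap comparison (the file name and the touch-every-byte workload below are assumptions for illustration, not the actual test used):

```cpp
// Hypothetical micro-benchmark: touch every byte of a data file once via a heap
// buffer and once via mmap, and compare wall-clock times. Not the test used above.
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

static std::uint64_t sum_bytes(const char *data, std::size_t size)
{
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < size; ++i)
        total += static_cast<unsigned char>(data[i]);
    return total;
}

int main()
{
    const char *path = "data.osrm"; // hypothetical input file
    struct stat st;
    if (stat(path, &st) != 0)
        return 1;
    const std::size_t size = st.st_size;

    // Heap: read the whole file into a buffer, then scan it.
    const auto t0 = std::chrono::steady_clock::now();
    std::vector<char> buffer(size);
    std::ifstream{path, std::ios::binary}.read(buffer.data(), buffer.size());
    const auto heap_sum = sum_bytes(buffer.data(), size);
    const auto t1 = std::chrono::steady_clock::now();

    // mmap: map the file and scan it through the mapping.
    const int fd = open(path, O_RDONLY);
    void *map = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    const auto mmap_sum = sum_bytes(static_cast<const char *>(map), size);
    const auto t2 = std::chrono::steady_clock::now();
    munmap(map, size);
    close(fd);

    using ms = std::chrono::duration<double, std::milli>;
    std::cout << "heap: " << ms(t1 - t0).count() << " ms (checksum " << heap_sum << ")\n"
              << "mmap: " << ms(t2 - t1).count() << " ms (checksum " << mmap_sum << ")\n";
    return 0;
}
```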
I took a look at some logs from my previous tests, and I think I might've been paging some stuff to swap after all. I halved the data size (4GB to 2GB) and shrank the ramdisk a bit. Results now look like this:
Tests on the regular filesystem:
Overall, ¯\_(ツ)_/¯. Seems like … @daniel-j-h do you have details of how you tested this on Linux?
@danpat can you have a look - you refactored the data facades. Is this ticket still relevant and actionable?
We could still do this - in fact, things are slowly getting easier as we refactor the I/O handling. Let's keep this open as a feature request - one day, down the road, somebody might implement it :-) Keeping this history will be useful.
First step towards this was done in #4881. For further gains we would need to `mmap` every input file separately.
The only thing that PR doesn't complete from our original list is:
OSRM currently supports reading data from files into heap memory (`InternalDataFacade`), or pre-loading data into shared memory using IPC shared memory blocks (`SharedDataFacade` + `osrm-datastore`).

We can consolidate the behaviour of both of these by using `mmap`. Instead of reading files into memory explicitly, we should be able to `mmap` the data files and immediately begin using them.
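As a rough illustration of "`mmap` the data files and immediately begin using them" (not OSRM's actual code; the file name and record layout below are made up):

```cpp
// Hypothetical sketch: map a fixed-layout data file and use it in place, instead
// of reading it into heap memory first. File name and record type are made up.
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct NodeEntry // made-up fixed-size record, for illustration only
{
    std::int32_t lon;
    std::int32_t lat;
};

int main()
{
    const int fd = open("nodes.osrm", O_RDONLY); // hypothetical data file
    if (fd < 0)
        return 1;
    struct stat st;
    fstat(fd, &st);

    // The mapping *is* the data structure: nothing is copied onto the heap, and
    // pages are faulted in from the page cache only as they are touched.
    void *base = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        return 1;

    const auto *nodes = static_cast<const NodeEntry *>(base);
    const std::size_t count = st.st_size / sizeof(NodeEntry);
    // ... hand `nodes` and `count` to a data facade and start answering queries ...
    (void)nodes;
    (void)count;

    munmap(base, st.st_size);
    close(fd);
    return 0;
}
```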
There are a few changes that need to be made to get us there:

- Benchmark `mmap`d data access vs heap - what, if any, penalty is there? How does this change when the file we `mmap` is on a ramdisk?
- Find the data structures that can't be `mmap`ed and fix them - basically anything in `osrm-datastore` (`src/storage/storage.cpp`) that isn't just loaded into memory in one big blob. The problem here is `vector<bool>` and its proxy behavior; we need a contiguous container we can `memcpy` to (see the bit-container sketch after this list).
- … `SharedDataFacade` and perform similar `.swap` operations against `mmap`ed memory addresses rather than `shm` addresses.
- … `mmap`ed files on-the-fly.
- Use `mmap` instead of explicit `read` calls on disk files for leaf nodes in the StaticRTree to boost performance (coordinate lookups represent the largest part of any given routing query because of the I/O in the rtree); see the leaf-node sketch after this list.
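On the `vector<bool>` item above, a minimal sketch of the kind of contiguous bit container that could replace it; the backing storage is one flat word array, so it can be `memcpy`d or written out as a single blob (illustrative only, not the container OSRM uses):

```cpp
// Hypothetical packed bit vector over a plain word array. Unlike std::vector<bool>,
// the storage is contiguous, so it can be memcpy'd into shared memory or overlaid
// directly onto an mmap'ed file region.
#include <cstddef>
#include <cstdint>
#include <vector>

class PackedBoolVector
{
  public:
    explicit PackedBoolVector(std::size_t size)
        : size_(size), words_((size + 63) / 64, 0) {}

    void set(std::size_t i, bool value)
    {
        const std::uint64_t mask = std::uint64_t{1} << (i % 64);
        if (value)
            words_[i / 64] |= mask;
        else
            words_[i / 64] &= ~mask;
    }

    bool operator[](std::size_t i) const { return (words_[i / 64] >> (i % 64)) & 1u; }

    // Raw, contiguous backing storage: safe to memcpy or dump to disk as one blob.
    const std::uint64_t *data() const { return words_.data(); }
    std::size_t word_count() const { return words_.size(); }
    std::size_t size() const { return size_; }

  private:
    std::size_t size_;
    std::vector<std::uint64_t> words_;
};
```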
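And for the StaticRTree leaf-node item, the change amounts to replacing an explicit read of a leaf block with pointer arithmetic into an existing mapping. A hypothetical before/after (not the actual StaticRTree API; the leaf layout is made up):

```cpp
// Hypothetical before/after for one leaf lookup: an explicit pread (one syscall
// and one copy per lookup) versus a pointer into an already-mmap'ed region.
#include <cstddef>
#include <cstdint>
#include <unistd.h>

struct LeafNode // made-up fixed-size leaf record
{
    std::uint32_t object_count;
    std::uint32_t objects[64];
};

// Before: explicitly read the leaf block out of the file.
bool ReadLeaf(int fd, std::size_t index, LeafNode &out)
{
    const off_t offset = static_cast<off_t>(index * sizeof(LeafNode));
    return pread(fd, &out, sizeof(LeafNode), offset) ==
           static_cast<ssize_t>(sizeof(LeafNode));
}

// After: no syscall and no copy per lookup; the page cache backs the pointer.
const LeafNode *MappedLeaf(const char *mapped_base, std::size_t index)
{
    return reinterpret_cast<const LeafNode *>(mapped_base + index * sizeof(LeafNode));
}
```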
The main goal here is to minimize double-reads of data. In situations where we are constantly cycling out data sets (in the case of traffic updates), we want to minimize I/O and the number of times any single bit of data gets touched. By using `mmap` and `tmpfs`, we can emulate the current shared-memory behavior, but avoid an extra pass over the data.
For normal `osrm-routed` use, we would essentially get lazy-loading of data - `osrm-routed` would start up faster, but queries would be slower since pages are loaded from disk on demand until data is touched and lives in the filesystem cache. This initial slowness could be avoided by pre-seeding the data files into the filesystem cache or via `MAP_POPULATE` (Linux 2.5.46+), and this could be done in parallel to `osrm-routed` already starting up and answering queries.
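A minimal sketch of the `MAP_POPULATE` pre-seeding idea mentioned above (the helper name is made up; the flag itself is Linux-only, hence the fallback):

```cpp
// Hypothetical helper: map a data file read-only and, where available, ask the
// kernel to pre-fault every page so first queries don't pay on-demand fault costs.
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

const char *MapWithPrefault(const char *path, std::size_t &size_out)
{
    const int fd = open(path, O_RDONLY);
    if (fd < 0)
        return nullptr;
    struct stat st;
    fstat(fd, &st);
    size_out = st.st_size;

#ifdef MAP_POPULATE
    const int flags = MAP_SHARED | MAP_POPULATE; // Linux 2.5.46+: pre-fault pages up front
#else
    const int flags = MAP_SHARED;                // elsewhere: fall back to lazy faulting
#endif
    void *base = mmap(nullptr, st.st_size, PROT_READ, flags, fd, 0);
    close(fd); // the mapping stays valid after the descriptor is closed
    return base == MAP_FAILED ? nullptr : static_cast<const char *>(base);
}
```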
/cc @daniel-j-h @TheMarex