Implement mmapDataFacade #1947

Closed · 4 of 6 tasks
danpat opened this issue Feb 2, 2016 · 8 comments
danpat (Member) commented Feb 2, 2016

OSRM currently supports reading data from files into heap memory (InternalDataFacade), or pre-loading data into shared memory using IPC shared memory blocks (SharedDataFacade+osrm-datastore).

We can consolidate the behaviour of both of these by using mmap. Instead of reading files into memory explicitly, we should be able to mmap the data files, and immediately begin using them.
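For reference, a minimal sketch of what "mmap the data files and immediately begin using them" could look like with plain POSIX calls (not OSRM code; the EdgeData layout and file name are made up for illustration):

```cpp
// Minimal sketch, assuming a flat binary file of fixed-size records:
// map it read-only and use it as an array without an explicit read() pass.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>

struct EdgeData // hypothetical record layout, not OSRM's
{
    std::uint32_t target;
    std::int32_t weight;
};

int main()
{
    const char *path = "edges.bin"; // hypothetical file name
    const int fd = ::open(path, O_RDONLY);
    if (fd < 0)
        throw std::runtime_error("open failed");

    struct stat sb;
    if (::fstat(fd, &sb) < 0)
        throw std::runtime_error("fstat failed");

    void *base = ::mmap(nullptr, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        throw std::runtime_error("mmap failed");
    ::close(fd); // the mapping stays valid after the descriptor is closed

    // Pages are faulted in lazily as they are touched.
    const auto *edges = static_cast<const EdgeData *>(base);
    const std::size_t count = sb.st_size / sizeof(EdgeData);
    long long total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += edges[i].weight;
    std::printf("sum of weights: %lld\n", total);

    ::munmap(base, sb.st_size);
    return 0;
}
```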

There are a few changes that need to be made to get us there:

  • Benchmark mmap'd data access vs heap - what, if any, penalty is there? How does this change when the file we mmap is on a ramdisk?
  • Identify data structures that can't be mmap'd and fix them - basically anything in osrm-datastore (src/storage/storage.cpp) that isn't just loaded into memory in one big blob. The problem here is vector<bool> and its proxy behavior; we need a contiguous container we can memcpy to (see the sketch after this list).
  • Clone the SharedDataFacade and perform similar .swap operations against mmap'd memory addresses rather than shm addresses.
  • Figure out IPC signalling for swapping out mmap'd files on-the-fly.
  • Investigate using mmap instead of explicit disk reads for leaf nodes in the StaticRTree to boost performance (coordinate lookups represent the largest part of any given routing query because of the I/O in the rtree).
  • Make sure this works on Windows too.
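On the vector<bool> point: a rough sketch of the kind of contiguous, memcpy-able bit container meant above (illustrative names only, not an actual OSRM type):

```cpp
// Sketch of a memcpy-able stand-in for std::vector<bool>: bits packed into a
// flat std::vector<std::uint64_t>, so the whole buffer can be copied into a
// shared-memory or mmap'd block with a single memcpy.
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

class PackedBoolVector
{
  public:
    explicit PackedBoolVector(std::size_t size) : size_(size), words_((size + 63) / 64, 0) {}

    void set(std::size_t i, bool value)
    {
        const std::uint64_t mask = std::uint64_t{1} << (i % 64);
        if (value)
            words_[i / 64] |= mask;
        else
            words_[i / 64] &= ~mask;
    }

    bool get(std::size_t i) const { return ((words_[i / 64] >> (i % 64)) & 1u) != 0; }

    std::size_t size() const { return size_; }

    // Contiguous storage: safe to memcpy into a pre-allocated raw block.
    const void *data() const { return words_.data(); }
    std::size_t byte_size() const { return words_.size() * sizeof(std::uint64_t); }

  private:
    std::size_t size_;
    std::vector<std::uint64_t> words_;
};

// Usage: copy the packed bits into a memory block handed out by the datastore.
inline void copy_into(void *dest, const PackedBoolVector &v)
{
    std::memcpy(dest, v.data(), v.byte_size());
}
```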

The main goal here is to minimize double-reads of data. In situations where we are constantly cycling out data sets (as in the case of traffic updates), we want to minimize I/O and the number of times any single bit of data gets touched. By using mmap and tmpfs, we can emulate the current shared-memory behavior, but avoid an extra pass over the data.
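To make the "cycling out data sets" part concrete, one possible shape for swapping in a freshly written file - purely illustrative, not how OSRM implements it, and coordinating with in-flight queries is glossed over:

```cpp
// Illustrative sketch only: publish a new mmap'd data file by atomically
// swapping the mapping pointer, then unmap the old region. Waiting for
// in-flight readers (reference counting, RCU, ...) is omitted for brevity.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <atomic>
#include <cstddef>

struct Mapping
{
    void *base;
    std::size_t size;
};

std::atomic<Mapping *> current_mapping{nullptr};

bool swap_in(const char *path)
{
    const int fd = ::open(path, O_RDONLY);
    if (fd < 0)
        return false;
    struct stat sb;
    if (::fstat(fd, &sb) < 0)
    {
        ::close(fd);
        return false;
    }

    void *base = ::mmap(nullptr, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    ::close(fd);
    if (base == MAP_FAILED)
        return false;

    auto *next = new Mapping{base, static_cast<std::size_t>(sb.st_size)};
    Mapping *old = current_mapping.exchange(next, std::memory_order_acq_rel);
    if (old != nullptr)
    {
        // In real code: wait until no query still reads from `old` before unmapping.
        ::munmap(old->base, old->size);
        delete old;
    }
    return true;
}
```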

For normal osrm-routed use, we would essentially get lazy loading of data - osrm-routed would start up faster, but queries would initially be slower, since pages are loaded from disk on demand until the data has been touched and lives in the filesystem cache. This initial slowness could be avoided by pre-seeding the data files into the filesystem cache or via MAP_POPULATE (Linux 2.5.46+), and this could be done in parallel with osrm-routed already starting up and answering queries.
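A hedged sketch of that pre-seeding idea: MAP_POPULATE is Linux-only, so it is guarded, and the fallback simply touches one byte per page (function name and structure are illustrative):

```cpp
// Sketch of pre-faulting a mapping so the first queries don't pay the
// page-fault cost. MAP_POPULATE exists on Linux (2.5.46+); elsewhere we
// fall back to touching every page once to pull it into the page cache.
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>

void *map_prefaulted(int fd, std::size_t size)
{
    int flags = MAP_SHARED;
#ifdef MAP_POPULATE
    flags |= MAP_POPULATE; // ask the kernel to read the whole file ahead
#endif
    void *base = ::mmap(nullptr, size, PROT_READ, flags, fd, 0);
    if (base == MAP_FAILED)
        return nullptr;
#ifndef MAP_POPULATE
    // Fallback: touch one byte per page (could run on a background thread
    // while osrm-routed is already answering queries).
    const std::size_t page = static_cast<std::size_t>(::sysconf(_SC_PAGESIZE));
    const auto *bytes = static_cast<const volatile std::uint8_t *>(base);
    std::uint8_t sink = 0;
    for (std::size_t off = 0; off < size; off += page)
        sink += bytes[off];
    (void)sink;
#endif
    return base;
}
```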

/cc @daniel-j-h @TheMarex

TheMarex (Member) commented:

> Benchmark mmap'd data access vs heap - what, if any, penalty is there? How does this change when the file we mmap is on a ramdisk?

Benchmarks revealed a 10% slowdown w.r.t. internal memory. We are putting this on ice for the moment.

danpat (Member, Author) commented May 28, 2016

Did some quick benchmarking on OSX while on the plane this afternoon. Using this test:
https://gist.github.com/danpat/67e6ab63836ffbcc4d832e7db509a5b5

On an OSX ramdisk:

RAM access Run1: 21.026866s wall, 20.580000s user + 0.150000s system = 20.730000s CPU (98.6%)
RAM access Run2: 18.135529s wall, 18.070000s user + 0.040000s system = 18.110000s CPU (99.9%)
RAMdisk mmap Run1: 20.520104s wall, 19.460000s user + 0.790000s system = 20.250000s CPU (98.7%)
RAMdisk mmap Run2: 19.265660s wall, 18.490000s user + 0.730000s system = 19.220000s CPU (99.8%)

On the regular OSX filesystem:

RAM access Run1: 17.700162s wall, 17.650000s user + 0.030000s system = 17.680000s CPU (99.9%)
RAM access Run2: 17.893318s wall, 17.820000s user + 0.040000s system = 17.860000s CPU (99.8%)
Disk mmap Run1: 19.178829s wall, 18.200000s user + 0.740000s system = 18.940000s CPU (98.8%)
Disk mmap Run2: 19.359454s wall, 18.440000s user + 0.780000s system = 19.220000s CPU (99.3%)

I'm not quite sure what this is telling me; I suspect I need to run more samples. I played with a few different madvise values. MADV_RANDOM added about a 25% slowdown to the mmap calls when enabled, but had no effect on the direct RAM access.
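For context, that experiment boils down to a single hint on the mapped region (MADV_RANDOM disables read-ahead, which is a plausible reason for the slowdown here):

```cpp
// The madvise experiment is a one-liner on the mapped region; MADV_RANDOM
// tells the kernel not to bother with read-ahead for this mapping.
#include <sys/mman.h>
#include <cstddef>

void hint_random_access(void *base, std::size_t length)
{
    ::madvise(base, length, MADV_RANDOM); // default behaviour is MADV_NORMAL
}
```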

My machine has 16GB of RAM and I have plenty free, so I'm fairly confident that filesystem caching was in full effect and nothing got swapped out. OSX also performs memory compression when things get tight, but I didn't see that kick in either.

/cc @daniel-j-h

danpat (Member, Author) commented May 28, 2016

I took a look at some logs from my previous tests, and I think I might've been paging some stuff to swap after all. I halved the data size (4GB to 2GB) and shrank the ramdisk a bit.
I also removed std::rand() and just used i * BIGPRIME % ARRAYSIZE to access elements during the loop. While I was seeding with std::srand(), and std::rand() should be deterministic for a given seed, I'm not 100% clear what's happening under the covers, so I removed it as a possible variable.
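Roughly what that access pattern looks like (the actual gist may differ; the concrete BIGPRIME value and array size here are made up for illustration):

```cpp
// Rough shape of the benchmark's inner loop after dropping std::rand(): a
// deterministic stride over the array using a large prime, applied the same
// way to the heap copy and to the mmap'd file (assumes 64-bit std::size_t so
// the multiply does not overflow).
#include <cstddef>
#include <cstdint>

constexpr std::size_t ARRAYSIZE = (2ull << 30) / sizeof(std::uint32_t); // 2 GiB of uint32_t
constexpr std::size_t BIGPRIME = 2147483647ull;                         // 2^31 - 1; any large prime works

std::uint64_t sum_in_pseudo_random_order(const std::uint32_t *data)
{
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < ARRAYSIZE; ++i)
        total += data[(i * BIGPRIME) % ARRAYSIZE];
    return total;
}
```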

Results now look like this:
Tests on the ramdisk volume:

RAM access Run1: 11.017670s wall, 10.960000s user + 0.030000s system = 10.990000s CPU (99.7%)
RAM access Run2: 11.398677s wall, 11.330000s user + 0.030000s system = 11.360000s CPU (99.7%)
RAMdisk mmap Run1: 11.630367s wall, 11.240000s user + 0.360000s system = 11.600000s CPU (99.7%)
RAMdisk mmap Run2: 11.878009s wall, 11.480000s user + 0.370000s system = 11.850000s CPU (99.8%)

Tests on the regular filesystem:

RAM access Run1: 11.302447s wall, 11.080000s user + 0.050000s system = 11.130000s CPU (98.5%)
RAM access Run2: 10.781652s wall, 10.730000s user + 0.030000s system = 10.760000s CPU (99.8%)
Disk mmap Run1: 12.049692s wall, 11.430000s user + 0.460000s system = 11.890000s CPU (98.7%)
Disk mmap Run2: 12.164826s wall, 11.710000s user + 0.380000s system = 12.090000s CPU (99.4%)

Overall, ¯\_(ツ)_/¯. Seems like mmap on the regular filesystem on OSX is a bit slower (~10%). On OSX's ramdisk (e.g. `diskutil erasevolume HFS+ 'RAM Disk' $(hdiutil attach -nomount ram://8485760)` for a 4GB disk), we do see some speedup that brings it pretty close to direct RAM access.

@daniel-j-h do you have details of how you tested this on Linux?

daniel-j-h (Member) commented:

@danpat can you have a look - you refactored the data facades.

Is this ticket still relevant and actionable?

danpat (Member, Author) commented Dec 13, 2016

We could still do this - in fact, things are slowly getting easier as we refactor the I/O handling.

Let's keep this open as a feature request - one day, down the road, somebody might implement it :-) Keeping this history will be useful.

TheMarex (Member) commented:

A first step towards this was done in #4881. For further gains we would need to mmap every input file separately.

danpat (Member, Author) commented Oct 20, 2018

mmap-ing individual files has been done in #5242

The only things that PR doesn't complete from our original list are:

  • implement mmap-based hot-swapping
  • Windows support

danpat assigned danpat and unassigned TheMarex on Oct 26, 2018