Move shared memory block swapping control to osrm-datastore #2570
Comments
Related: #1221
As a first step I would like to implement this without named memory segments, since those would definitely require an API change. I believe that to get this right we cannot operate at process granularity but need to consider every thread. Currently we have a piece of code that waits until all requests are finished before it swaps the data. That is unnecessary: what we really want is to wait until all requests that reference the old dataset are finished before we deallocate that shared memory region. To allow this we would need to keep multiple versions of the shared-memory data facade in memory. Every request holds shared ownership of its data facade. Once all ownerships are given up, the facade is deallocated and we can signal for the release of the shared memory region. Things that need to change to enable this:
@daniel-j-h do you think this sounds sane at all?
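The shared-ownership idea in the comment above could look roughly like the following. This is only a minimal in-process sketch: `DataFacade`, `FacadeHolder` and the `on_release` callback are hypothetical names made up for illustration, not OSRM's actual classes, and it assumes that `std::shared_ptr` with a custom deleter is an acceptable way to detect when the last request has let go of an old dataset.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a shared-memory data facade (not OSRM's real class).
struct DataFacade
{
    int region_id; // identifies the shared memory region this facade reads from
};

// Hands out the current facade to requests and swaps in new ones.
class FacadeHolder
{
  public:
    // on_release is invoked with a region id once the last owner of the
    // facade for that region has given up ownership.
    explicit FacadeHolder(std::function<void(int)> on_release)
        : on_release_(std::move(on_release))
    {
        assert(on_release_ && "a release callback is required");
    }

    // Called at the start of a request: take shared ownership of the
    // facade that is current right now.
    std::shared_ptr<const DataFacade> Get() const
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return current_;
    }

    // Called when a new dataset is available: publish a facade for it.
    // Requests that already hold the previous facade keep it alive.
    void Swap(int new_region_id)
    {
        auto release = on_release_;
        std::shared_ptr<const DataFacade> next(
            new DataFacade{new_region_id}, [release](const DataFacade *facade) {
                const int id = facade->region_id;
                delete facade;
                release(id); // last owner is gone, the region can be released
            });

        std::shared_ptr<const DataFacade> previous;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            previous = std::move(current_);
            current_ = std::move(next);
        }
        // previous goes out of scope here, outside the lock; if no request
        // still holds it, the deleter runs now.
    }

  private:
    mutable std::mutex mutex_;
    std::shared_ptr<const DataFacade> current_;
    std::function<void(int)> on_release_;
};
```

In this sketch a request calls `Get()` once and keeps the returned pointer until it is done, so a `Swap()` in the middle of the request neither blocks it nor pulls the data out from under it. The old region is only reported as releasable when the last such pointer is dropped.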
This is the first step to having fine grained locking on data updates, see issue #2570.
First of all, I agree with the assessment in #2570 (comment). Still, I'd like to raise the question of whether we are implementing a complex solution here that is driven by not changing the API and/or is unnecessarily complicated.
Is definitely a bottleneck, but given our usual request time not a big problem to me. We don't lose much here. What I am more concerned about is the complexity of the solution. Could we do better if we target this for a 6.0 release and make it an API change? Are we doing a major change here to work around an API change?
@TheMarex sounds good overall, shared ownership makes sense.
The issue I'm seeing here is the following: at the moment we only ever have one dataset loaded. With your proposed design we potentially have multiple datasets loaded into memory for as long as there are requests holding onto them. This multiplies memory requirements, or am I missing something here? For the record, I played around with shared memory reader / writer locks for hotswapping over at
Is this not already the case? Loading a dataset takes quite some time. So I guess the |
We sometimes hit outliers on our production machines where requests take several seconds and clog the pipeline. This makes dataset switching times quite non-deterministic. Combine this with the fact that datastore doesn't actually wait until the data is completely switched over, but just exits after the shared memory region is written, and we end up with hacks in place just to ensure that the data is correctly loaded.
This is actually less complex than our current locking setup, because we needed to add several locks to work around edge cases (concurrent datastore runs, uncommitted data changes, several race conditions between the two). And multiple clients are just not supported at all, which is a major downside for a shared-memory based system. We can replace that tangle of mutexes using established multi-processing idioms. This will make the code both easier to understand and less error-prone. And as a cherry on top we will get support for multiple clients.
As I outlined above, this is a strict feature subset of something that would require an API change. This approach can be extended to use named datastores, but that is actually completely independent of how we do the locking.
Both yes and no. It multiplies the memory requirement for storing a |
I found a limitation with the design:
Even though there are no queries referencing the old dataset, This of course breaks our test setup, where we start an We could side-step that problem by not needing any client intervention to switch datasets. That would mean we don't keep state (e.g. save the current I'm going to benchmark this to see what we would be looking at here in terms of overhead. We could potentially save some overhead by keeping the shared memory mapped at constant locations (hence one syscall less). The downside there would probably be that we have a constant requirement of datasize*2 (not just during updates). That could be improved by just switching the memory lock on/off, hence letting the OS do its job and page it out.
I did a quick benchmark on this and creating
I finally have this design working. The only remaining issues are around hardening: how are we going to detect and recover from dangling locks? This might happen if either
OSRM supports a hot-swap feature - you can use osrm-datastore to replace an old dataset with a new one while osrm-routed (or node-osrm) remains running, with only a very brief hitch in request handling performance.
However, the design is a bit ugly - osrm-datastore simply loads the new data, but it's actually up to the reader process (osrm-routed) to perform the data exchange, mark the new block in use and release the old memory.
This has a few problems:
We should change this. I picture it working as follows:
- osrm-datastore loads new data, then signals all registered listeners that new data is available, then waits
- osrm-datastore removes the old data once all listeners have moved, then exits
There would be some additional locking required to avoid problems with running additional osrm-datastore during an exchange, and to handle the startup/shutdown of listeners during an exchange.
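The bookkeeping behind the steps above could be sketched as follows. This is a single-process illustration under loud assumptions: `ExchangeState` and all of its member names are hypothetical, region ids are plain integers, and the real thing would have to live in shared memory and use inter-process synchronization, which is deliberately left out here. It does not handle a second osrm-datastore publishing during an exchange.

```cpp
#include <cassert>
#include <map>

// Hypothetical bookkeeping for the proposed exchange (illustrative names only).
class ExchangeState
{
  public:
    // A listener (for example a reader process) starts up and begins on
    // whatever region is current at that moment.
    void Register(int listener_id) { region_of_[listener_id] = current_region_; }

    // A listener that shuts down must no longer block removal of old data.
    void Unregister(int listener_id) { region_of_.erase(listener_id); }

    // Datastore side: the new data is loaded, announce it to the listeners.
    // Returns the id of the region that is being replaced.
    int Publish(int new_region)
    {
        assert(new_region != current_region_);
        const int old_region = current_region_;
        current_region_ = new_region;
        return old_region;
    }

    // Listener side: acknowledge that this listener now uses the current region.
    void Acknowledge(int listener_id)
    {
        auto it = region_of_.find(listener_id);
        if (it != region_of_.end())
            it->second = current_region_;
    }

    // Datastore side: true once no registered listener is still on an older
    // region, i.e. the old data can be removed and the datastore can exit.
    bool AllListenersMoved() const
    {
        for (const auto &entry : region_of_)
            if (entry.second != current_region_)
                return false;
        return true;
    }

  private:
    int current_region_ = 0;       // region listeners should be using
    std::map<int, int> region_of_; // listener id -> region it currently uses
};
```

The datastore would call `Publish()`, wait until `AllListenersMoved()` holds, and only then remove the region returned by `Publish()`. Treating `Unregister()` as an implicit acknowledgement is one possible way to handle listeners that shut down during an exchange.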