osrm-extract with europe.osm.pbf and 64GB RAM failed due to memory limit ?! #6146
@kimjohans The data processing that OSRM does is extremely memory intensive. Processing the planet requires 300-400GB of RAM. The world is a big place, as is Europe. The way many people tackle this is to rent a temporary server from an online virtual server provider - process the data, then shut the machine down. This will cost real $$$, but that's the reality of this software.
@danpat Any idea what the best place for such a server is? Amazon Cloud? Maybe you have some experience with those somewhat special servers. Europe is a big place, and it grew slowly, so there are lots of ways and paths. But implementing some kind of memory usage limit would be a good idea. Of course the process would run longer - but better longer than not at all.
@kimjohans The speedup algorithms OSRM uses need access to the whole graph at the same time - this is where the large memory footprint comes from. It's probably possible to re-engineer things to use less RAM, but the trade-off will likely be time (it already takes many, many hours to process Europe), and there is also a complete lack of people working on that problem right now (feel free to submit a PR!). It would be a significant rewrite of many of the core parts of the extraction pipeline - not something to undertake lightly. Any of the large providers have machine types rentable with more than enough capacity: AWS, GCP, etc. They're not cheap ($8+ USD/hour), but as long as you don't need to constantly update your map, it's a one-time cost. Just make sure you use the same operating system/CPU family for processing the data as you do for running route calculations on it later on.
You (and everybody who finds this issue) could alleviate the memory pressure using zram. zram takes a portion of your RAM and uses it as compressed swap space, which lets you get a bit more out of your available capacity. I have done the same using a 600G swap partition on my NVMe in addition to zram, since I only have 16GB of RAM available for planet.osm.pbf. This makes things unbelievably slow, because osrm-extract reads and writes random flash pages, and it wears out the NVMe very quickly. Slow as in: I started the extract 7 days ago and I expect it to continue for at least 7 more days (but I'll only have to do it once). However, you have a much better RAM-size-to-source-file-size ratio (64GB vs my 16GB, Europe vs the whole planet), so perhaps allocating 16GB as zram (high-priority swap) and 128GB on the NVMe as a swapfile (to get the wear-leveling benefits of the filesystem) might do the trick for you. The more edges/nodes are kept in RAM the faster the process is, and I reckon you already have most of Europe in RAM and are failing because of a very small memory overrun.
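For anyone who wants to try this, a minimal sketch of the zram setup described above, assuming a Linux kernel with the zram module and util-linux's zramctl available (the 16G size is just an example):

# load the zram module and size a compressed swap device
sudo modprobe zram
sudo zramctl --size 16G /dev/zram0
sudo mkswap /dev/zram0
# give it a higher priority than any disk-backed swap so it is used first
sudo swapon --priority 100 /dev/zram0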
I mean, you don't need to use zram - you can just configure additional swap space. But the trade-off is the same: time. Without fast random access across the entire data structure, processing will be glacial. OSRM for a long time used https://github.com/stxxl/stxxl - but support for that was removed in 2020 (#5760) because STXXL had seen no development at all for a long time, and available memory continues to increase while OSRM's memory needs have stayed fairly stable (i.e. they're not growing as fast as available servers are). You could try reverting that PR and doing the processing with STXXL - I think it would outperform the generic algorithms that back kernel-swapped memory pages, but I don't think anyone has really measured it.
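If zram isn't an option, a minimal sketch of adding a plain swapfile on Linux (the path and size are illustrative; copy-on-write filesystems such as btrfs need extra steps):

# create and enable a 128GB swapfile
sudo fallocate -l 128G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# verify the new swap space is active
swapon --show
free -h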
Just for reference, I was able to extract the full Europe dataset using the Docker steps described in the guide.
To extract the data I used an AWS EC2 instance of type r6i.16xlarge: 64 vCPUs, 512GB of RAM, 256GB SSD (probably really oversized).
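For context, the Docker steps referenced above roughly look like the following sketch (assuming the osrm/osrm-backend image, the MLD pipeline, and the Europe extract sitting in the current directory):

# extract, partition and customize the graph - these are the memory-heavy steps
docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-extract -p /opt/car.lua /data/europe-latest.osm.pbf
docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-partition /data/europe-latest.osrm
docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-customize /data/europe-latest.osrm
# serve routing requests on port 5000
docker run -t -i -p 5000:5000 -v "${PWD}:/data" osrm/osrm-backend osrm-routed --algorithm mld /data/europe-latest.osrm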
@AlessandroSalvetti thanks for your input. Do you know if there is a repository of extracted files free to download? I'd be interested in downloading Europe to avoid redoing these steps myself.
@laem I'm sorry but I didn't find any repository |
Hello all,
I downloaded the current Europe file from geofabrik.de to update my existing routing data.
So I ran
osrm-extract europe-latest.osm.pbf
to extract the files, but:
...
[info] Using profile api version 4
[info] RAM: peak bytes used: 38270754816
[error] [exception] std::bad_alloc
[error] Please provide more memory or consider using a larger swapfile
On a server with 64GB RAM ??!
I am not able to create a swap file (because it is a VServer), and as far as I can see there is no way to limit the RAM usage of osrm-extract. That VServer is powerful enough to do all my routing calculations, and I'm not able to rent an extra server just for extracting the PBF data. Honestly, I think 64GB of RAM should be more than enough... or there should be an option to limit the RAM usage. I use open source projects as often as I can and help them as well as I can, but (just my two cents) an open source project can only be used successfully and widely when a lot of people (here: developers) can use it. I know nobody in my circle of colleagues with such a powerful server (128GB of RAM?!). And on the widely used VServers, swapping is deactivated.
Any solution or idea (helpful ones please, not "buy a root server", "use a smaller map" or "buy a good server")?