Move external shapefiles to tables in the DB #4092
Conversation
Force-pushed from a340b8c to c17e901.
I'm going to run some benchmarking on this.
Averaged over zoom 0 to zoom 12, shapefiles in the DB take 514 minutes, shapefiles on disk take 521 minutes. That's a 1% difference, but I'm not sure it's above my margin of error. In any case, performance isn't a concern with this PR.
Is that the time to render all metatiles at these 13 zoom levels, or just to render the oceans and icesheets?
Everything.
For Docker, will ogr2ogr already be included in the container? No changes are needed other than the switch to
I added gdal-bin to the dockerfile; that should be enough. If someone wants to test it, that would be good.
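A minimal sketch of what that Dockerfile addition might look like on a Debian-based image (the actual file in this repo may differ):

```dockerfile
# ogr2ogr is provided by the gdal-bin package
RUN apt-get update \
 && apt-get install -y --no-install-recommends gdal-bin \
 && rm -rf /var/lib/apt/lists/*
```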
I believe having the coastlines as ways in the database would make it easier to solve #712 (hide admin boundary lines on the coastline) |
So I will need to remove the docker image and then rebuild the image to test this? Unfortunately that takes up a large amount of bandwidth, so I won't be able to do it until our airport is reopened again (hopefully sometime in mid-April, or at least by May?). @Adamant36 or @sommerluk - would you be able to test this PR with Docker?
Sure. I'll give it a try over the weekend.
No, I can't test it. At the moment my bandwidth here is often 0.0 Mbit/s, and at best sometimes 0.9 Mbit/s. Sorry.
I'd like to target this for after 5.1.
@Adamant36 - would you have a chance to test this over the weekend?
Unfortunately not. My VM setup broke and I'm probably going to wait until 20.04 comes out to set it up again, so I don't have to deal with it twice. I can test it after that though.
Note that this would give us more options for fixing problems like #621, so it is beneficial for more flexibility in rendering options, in addition to the improved performance.
I'm back in the USA, so I can test this once the merge conflict is resolved.
This adds a script that loads files into the DB based on a YAML file listing the data sources. The script can be run while rendering is going on, as it swaps old tables with new ones in a transaction. Loading is done by using ogr2ogr to load into a temporary schema, clustering, and then swapping inside a transaction. The status of the tables is tracked in the `external_data` table, which lists the last modified date of each table. This allows the loading script to use conditional GETs and only download and update sources which have changed.
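As an illustration of the swap step described above, the statements could look roughly like this (a simplified sketch, not the script's actual code; the schema and table names are invented):

```python
def swap_table_sql(table, temp_schema="loading", schema="public"):
    """Build the statements that atomically replace a rendering table
    with its freshly loaded copy.  Running them in a single transaction
    means renderers always see either the old table or the new one,
    never a missing table."""
    return [
        "BEGIN;",
        f'DROP TABLE IF EXISTS "{schema}"."{table}";',
        f'ALTER TABLE "{temp_schema}"."{table}" SET SCHEMA "{schema}";',
        "COMMIT;",
    ]
```

Because PostgreSQL DDL is transactional, the drop and the schema move either both happen or neither does.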
Force-pushed from 0994b5f to 4111063.
Rebased.
I've added the new dependencies to the import dockerfile and readme.
Great, this now works as expected. I was a little surprised that the shapefiles are no longer downloaded as separate files at all. Will updating the shapefiles require manually running
```diff
-scripts/get-shapefiles.py
+scripts/get-external-data.py
```
This script downloads necessary files, generates and populates the *data* directory with all needed shapefiles, including indexing them through *shapeindex*. |
This line does not seem to be correct now
Typically you'd automate this with a cron job.
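A hedged example of such a cron entry (the path, schedule, and log location are placeholders, not documented values):

```
# Re-check the external data sources every Monday at 04:30; the script's
# conditional GETs mean unchanged sources are not re-downloaded.
30 4 * * 1  cd /path/to/openstreetmap-carto && scripts/get-external-data.py >> /var/log/external-data.log 2>&1
```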
Ideally we should document how to do this, though that doesn't need to be included in this PR. I've tested this PR in docker, and would be ready to approve it, but I see a review by @imagico is pending.
I had requested a review because I hadn't gotten any other response. I'd go ahead and merge this now, but I want to do it at the start of a release cycle.
Sorry for the delay, it seems to work fine.
What is a bit annoying is that with the coastlines in the database you have to duplicate the coastline data if you have several test databases (like for real data and abstract tests) and switch between them via localconfig.
Hi, I followed all the instructions for Docker, but I still have the issues reported here, and I don't have a table called
The shape files never loaded for me when I did a clean install and ran Kosmtik. It just errored out after the map loaded and said that the shapefiles were missing.
So what could the error be? Here is the log file.
Here's the relevant part of the log:
@LorenzoStucchi was this a new installation, or an update to a previous version which was working before?
@Adamant36 re: "The shape files never loaded for me when I did a clean install and ran Kosmtik": When did this happen? Was it with the latest release?
@jeisenbe the last update is here.
It is a new installation with Docker on Windows. But I realise that I have to run the command twice
@LorenzoStucchi did you check out the latest commit of the
@jeisenbe I used the
OK, I've updated to the latest version of Docker, and now I am seeing the same errors as reported by @LorenzoStucchi and @Adamant36:
The import process was otherwise fine: if I switch back to the latest release (v5.2.0) there is no problem, even without re-importing the data.
However, when on the last commit in this PR or on the latest commit on the master branch, I now notice this error after the import process completes:
This is noted after the data is imported:
OK, I killed and removed all the docker processes and images, and tried starting again from the last commit in this PR. This worked fine. To do this, run

Then you can use

You might need to enter "y" at some point to say that you want to recreate the container. During the import process, the shapefiles should be downloaded and imported, if this has not been done already. Then run
I have also tried this with the latest commit on the

@Adamant36 and @LorenzoStucchi could you try this and see if it works?
I tried but did not succeed, using the
You mean that you checked out the tag

If you are switching between v5.2.0 and the current master branch, I'm afraid you will need to take the steps outlined above in #4092 (comment): kill docker-compose, remove the relevant docker image(s), and rebuild.
I cancelled everything with the procedure explained, and using
Reloading everything worked for me. Hopefully there will be a better solution at some point though.
I'll retry with
Fixes #4089
Rendering tested with a planet DB in Antarctica and a local coastline.
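The conditional-GET bookkeeping described in the PR description can be sketched as follows (a simplified illustration; these helper names are invented and are not the script's actual code):

```python
def conditional_headers(last_modified):
    """Build request headers from the date recorded in the external_data
    table, so the server only sends the file if it has changed since the
    last successful load."""
    if last_modified is None:
        # Nothing recorded yet: do an unconditional download.
        return {}
    return {"If-Modified-Since": last_modified}


def table_needs_update(status_code):
    """HTTP 304 Not Modified means the table loaded last time is still
    current, so the download and table swap can be skipped."""
    return status_code != 304
```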
If I were rewriting the python script I'd probably do it differently now that I'm a better programmer, but it's existing code that works. I'm happy with its functionality; it's just internal refactoring I would do.