Cache parent address for rank 29 and 30 (#529) #547

Merged
merged 2 commits into from
Apr 4, 2021

alfmarcua

Parent addresses are cached and reused across rank 29 and rank 30 objects that share the same parent, as suggested in #529.
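
To illustrate the idea, here is a minimal sketch of such a cache. It is not the code from this PR: the class name `ParentAddressCache`, the `loader` callback and the map-based address are hypothetical, and it assumes rows arrive grouped by parent so that remembering only the last parent is enough.

```java
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical sketch: keeps the address of the most recently seen parent. */
public class ParentAddressCache {
    private final LongFunction<Map<String, String>> loader; // stands in for the database lookup
    private long cachedParentId;
    private Map<String, String> cachedAddress; // null until the first lookup
    private int lookups;

    public ParentAddressCache(LongFunction<Map<String, String>> loader) {
        this.loader = loader;
    }

    /** Returns the parent's address, querying only when the parent changes. */
    public Map<String, String> addressFor(long parentId) {
        if (cachedAddress == null || parentId != cachedParentId) {
            cachedAddress = loader.apply(parentId);
            cachedParentId = parentId;
            lookups++;
        }
        return cachedAddress;
    }

    /** Number of times the loader was actually called. */
    public int lookups() {
        return lookups;
    }
}
```

With input sorted by parent, consecutive house numbers on the same street would trigger a single lookup instead of one per object.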

@alfmarcua
Author

I ran a few tests against a Nominatim database containing Portugal data. For both versions I ran java -jar photon-0.3.4.jar -nominatim-import -host localhost -port 5432 -database nominatim_portugal -user nominatim -password XXXX -languages es,fr.

The import with the new version seems somewhat faster (3 minutes 31 seconds) than the one with the master branch (4 minutes 54 seconds).

I also logged every query sent to the database. The logs for the new version are smaller: 988 MB versus 1.8 GB for the master branch. Counting the queries gives 439585 for the new version versus 796212 for the master branch.
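
For reference, the relative savings implied by these figures can be worked out as below. This is only a sketch; the class and method names are made up for illustration. It comes out at roughly 45% fewer queries and roughly 28% less import time for the Portugal extract.

```java
public class ImportComparison {
    /** Percentage by which {@code after} is smaller than {@code before}. */
    public static double reductionPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Figures reported above for the Portugal import.
        long masterQueries = 796_212;
        long prQueries = 439_585;
        int masterSeconds = 4 * 60 + 54;
        int prSeconds = 3 * 60 + 31;

        System.out.printf("queries: %.1f%% fewer%n",
                reductionPercent(masterQueries, prQueries));
        System.out.printf("time: %.1f%% shorter%n",
                reductionPercent(masterSeconds, prSeconds));
    }
}
```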

@lonvia
Collaborator

lonvia commented Mar 17, 2021

Looks good. I'm testing this on a planet, which will take a couple of days.

In the meantime, could you please rebase on master? I've fixed the CI issue there. So after rebasing the failing check should be gone.

@alfmarcua
Author

Sure, rebase done!
We are working on speeding up the import and, in our results, we do not see a clear improvement in the overall processing time, even though the number of queries has been significantly reduced.
Could you share the times and query counts you measured before and after the pull request?

Collaborator

@lonvia lonvia left a comment

For dumping the planet I get 8h 44min on master, 8h 04min with this PR, and 7h 27min with this PR and a reversed ORDER BY. Compared with master, that last run is an improvement of about 15%. I don't have figures for the number of queries.
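
The percentages behind these three runs can be checked with a few lines of Java. This is a sketch with made-up class and method names, using only the durations quoted above: the PR alone saves a little under 8% against master, and the PR with the reversed ORDER BY a little under 15%.

```java
import java.time.Duration;

public class PlanetDumpTimes {
    /** Percentage of the baseline run time saved by the candidate run. */
    public static double savedPercent(Duration baseline, Duration candidate) {
        long base = baseline.toMinutes();
        return 100.0 * (base - candidate.toMinutes()) / base;
    }

    public static void main(String[] args) {
        Duration master = Duration.ofHours(8).plusMinutes(44);
        Duration pr = Duration.ofHours(8).plusMinutes(4);
        Duration prReversedOrder = Duration.ofHours(7).plusMinutes(27);

        System.out.printf("PR only: %.1f%% saved%n", savedPercent(master, pr));
        System.out.printf("PR + reversed ORDER BY: %.1f%% saved%n",
                savedPercent(master, prReversedOrder));
    }
}
```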

@lonvia
Collaborator

lonvia commented Apr 4, 2021

I've run another test on the planet after fixing the issue of the missing housenumbers (#558). Now the photon database is back at the expected 140GB. Master needs 21h 18min and this PR 12h 25min. That comes very close to the theoretical maximum of what the PR could achieve. Nice.
