Outputting metadata only #626
Comments
First of all, I'm posting the results of your script from a local test run for your reference: https://gist.github.com/mmd-osm/5327e534807b8c45ed015c7b2956cac9 (it only took a few minutes to process). In theory, the contents of a file called relations_meta_attic.bin would be sufficient for your use case. It's only 259 MB in size and is available from https://dev.overpass-api.de/clone/. It requires some custom C++ code, though.
I am not entirely sure how to interpret your reply and think there may be a misunderstanding. I don't necessarily need help for this particular case. I worked around the biggest problems (timeouts and request rate limiting) and, as you could see, it works fairly OK. It takes only a few minutes because it skips over the really big relations. I don't necessarily need them here, but each of them alone takes several minutes, and some don't even finish within 10 minutes (each!). The intent of my report was rather to spark a discussion of how such use cases could be improved in general within Overpass. If I understand correctly, all the necessary metadata is contained in independent files (plus the indices, I guess). That means one could actually interact with them locally in a custom application without needing any of the others (specifically without the non-attic version). But that also means that the OP server has access to these data without the need to merge a lot of different "tables", right?
would have already printed all relevant details, except for the user id and the changeset. Adding both fields is a two-line change; the data is available anyway at that point.
Just to make it clear... I am satisfied, and the script will only be executed maybe another dozen times over the next weeks or so. It is kind of a one-time hack. The purpose of the issue was really just to show an actual use case for this kind of query in case you get bored. ;) But it's great to know, and to have it publicly documented, that there is a much faster alternative in any case. Thanks, bye.
I am trying to do some statistics on the history of route relations. I am not interested in the relation members at all, just the history of the metadata, e.g., which users edited which version at which point in time. I am interested in reducing the server load - to be able to gather data from some relations
Some of the relations have a long history with many members (think of (inter)national routes). In "simple" queries one has to use `out meta` to fetch the necessary information. The standard API's `/history` returns basically the same as Overpass in that case: the complete history including the complete data of all members. It is thus no alternative either.

An example query I am using is depicted below.
One way to trim that down significantly is to select exactly what's output by using `stat`, e.g.,

This helps a lot on the client side (if the library would otherwise deserialize the whole dataset into objects); however, AFAICT it does not reduce the load on the server at all.
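To illustrate the client-side half of this, here is a minimal sketch of keeping only the metadata from a full response. It assumes the response is OSM XML in which every relation version carries the usual `id`, `version`, `timestamp`, `changeset`, `uid` and `user` attributes. The sample data, `META_KEYS` and `relation_metadata` are made up for illustration and are not taken from the linked script.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample response; real histories are far larger.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <relation id="1" version="1" timestamp="2020-01-01T00:00:00Z" changeset="10" uid="100" user="alice">
    <member type="way" ref="5" role=""/>
    <tag k="type" v="route"/>
  </relation>
  <relation id="1" version="2" timestamp="2021-06-01T12:00:00Z" changeset="20" uid="200" user="bob">
    <member type="way" ref="5" role=""/>
    <member type="way" ref="6" role=""/>
    <tag k="type" v="route"/>
  </relation>
</osm>
"""

# Attribute names assumed to be present on each <relation> element.
META_KEYS = ("id", "version", "timestamp", "changeset", "uid", "user")


def relation_metadata(stream):
    """Yield one metadata dict per relation version, ignoring members and tags."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "relation":
            # Copy the attributes first, then drop the element's children.
            yield {key: elem.get(key) for key in META_KEYS}
            elem.clear()


rows = list(relation_metadata(io.BytesIO(SAMPLE.encode("utf-8"))))
for row in rows:
    print(row["version"], row["user"], row["timestamp"])
```

Streaming like this only keeps the client from building objects for every member and tag. The server still has to assemble and send the complete history, so it does nothing for the server load discussed here.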
Would it make sense to add another output "modifier" that returns the bare minimum of metadata information, to allow the server to do less work, or can the query be optimized somehow?
For reference, this is what I did so far: https://github.com/stefanct/osm_refhistorymeta/blob/main/ref_contributors.py
PS: I found the documentation concerning `retro` a bit lacking. For example, I have no idea why the examples sometimes use `foreach` vs. `for` (like the two above). Using the full name of `u` and `t` would also have made things a bit easier to grasp for me.

PPS: In the official documentation there is a hint to `out noids`, and this does indeed remove the IDs of the relation and its members, but that's only a fraction of what is returned (e.g., all tags are still there) and probably does not reduce the server's load either.

PPPS: This is somewhat related to #189 but only very loosely.