match and ignore certain characters in start & end dates #304
Comments
@rwelty1889 thanks for the issue! We should absolutely aim to support approximation / uncertainty in dates and at least capture that data, even if we are unlikely to be able to process those approximate dates correctly in the vector tiles pipeline, etc. right now. That said, I am very wary of adding this kind of logic to the vector tile pipeline (cc @geohacker). My proposal here would be to add separate We do need a pathway to be able to record date approximations, so this is an important conversation to have. Unfortunately, we will also need to balance the "needs of the renderer" at this point, and currently it is hard to support anything other than fixed start and end dates. Will let @geohacker comment a bit more on the feasibility of doing these string replacements - I'm not a huge fan of adding more complexity to the already slightly brittle SQL functions. @rwelty1889 would you be open to considering the idea of separate tags for EDTF times? Going to chew on this a bit more and discuss with @geohacker - we should definitely do something about this, just not certain what the best way forward here is. Thanks again for the ticket @rwelty1889 and happy to talk about this soon. |
for now, i would consider using the _edtf versions of the tag side by side with the original versions, in order to preserve the symbols. in the long run, we need a coordinated approach that doesn't require duplication (or quasi duplication) of data in the tags. providing some sort of date filter between the date tags and the renderer is probably going to need to be done. |
just for reference, the start_date page in the wiki documents a lot of stuff about uncertainty and imprecision which, so far as i know, is rarely if ever supported in OSM. it's also mostly ad hoc, whereas EDTF is an open standard. this link contains my current thoughts and commentary on the issues: start_date issues |
related to #303 |
@batpad I wonder if this is something we could add to our Also noting this is related to #15 |
what i asked for in this ticket is minimal. another project i have started but have limited time for is a parser for levels 0 and 1 of EDTF, which i'm writing in ANTLR, meaning that it can target several different implementation languages. that's something that i'll put out there under a reasonably permissive open source license when i get it working, probably a 3 clause BSD or something like that. |
I see on your Wiki that you list existing tags. Is that from an export or other systematic search, or just anecdotal? Might not matter. Just curious. What I am wondering is whether, instead of targeting specific strings, we could use a regex to keep only numbers and dashes, ignore everything else, and get better results. This would at least not lead to a whole series of string replacements in our vector tile code. But it would also apply to everything, whether or not it improved the output. I looked at your list and classified them by that:
So there are a few places where this helps and many more where it does not, which means case-by-case filtering. Ideally informed by TagInfo (forthcoming), we could prioritize parsing those formats that are most common in the existing data. @batpad I'm keen to hear more about your concerns on brittleness in the date functions. My hope has been that we could incrementally add intelligence to our date functions that might start simple with |
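To make the trade-off of the keep-only-numbers-and-dashes idea concrete, here is a minimal sketch in Python rather than the project's SQL functions. The function name `digits_and_dashes` and the sample values are illustrative assumptions; they are not taken from the classified list referred to above.

```python
import re

def digits_and_dashes(value: str) -> str:
    """Hypothetical filter: keep only ASCII digits and dashes, drop everything else."""
    return re.sub(r"[^0-9-]", "", value)

# Illustrative inputs only (assumed, not from the wiki table).
samples = ["1834~", "/1834", "1834-05?", "before 1855", "1834/1850"]
results = {s: digits_and_dashes(s) for s in samples}

for original, filtered in results.items():
    print(f"{original!r} -> {filtered!r}")
```

Trailing qualifiers and a leading `/` come out as usable dates, but `before 1855` silently loses its meaning and the two-ended interval `1834/1850` collapses into a single meaningless number, which is the "applies to everything" problem in miniature.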
you did this test on the current tagging; i'm proposing abandoning that part of start_date in favor of EDTF, which is the second column in that table. current tagging is mostly ad hoc when it comes to uncertainty, whereas EDTF is standards based. |
The wiki had been recommending |
I don’t think we should implement the workaround proposed in #304 (comment). The In the future, if we want to fold these subkeys back into /ref #15 (comment) |
What's your idea for a cool feature that would help you use OHM better?
this is a request for very limited support for some EDTF features.
ignore trailing %, ~, and ? characters after dates (they represent approximation and uncertainty, and the timeslider would probably do this anyway with full EDTF support)
ignore leading and trailing / on dates - they represent intervals where one end of the range is unknown, and just ignoring them is fine for now.
this is part of experimentation with lifecycle stuff.
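The two rules requested above can be sketched as follows. This is a minimal illustration in Python, not the SQL used in the vector tile pipeline, and the function name `strip_edtf_markers` is hypothetical. It only trims the edges of the value, so everything between them is left exactly as tagged.

```python
def strip_edtf_markers(value: str) -> str:
    """Hypothetical helper: drop the EDTF markers named in this request.

    - a leading "/" (interval whose start is unknown)
    - trailing "/", "%", "~" and "?" (unknown end, approximation, uncertainty)
    """
    cleaned = value.strip()          # tolerate stray whitespace around the tag value
    cleaned = cleaned.lstrip("/")    # leading slash only
    cleaned = cleaned.rstrip("/%~?") # any mix of trailing markers
    return cleaned

print(strip_edtf_markers("1834~"))     # approximate year
print(strip_edtf_markers("/1834-05"))  # interval with unknown start
print(strip_edtf_markers("1834?/"))    # uncertain start, unknown end
```

Because only the edges are trimmed, a plain date such as `1834-05-01` and a two-ended interval such as `1834/1850` pass through unchanged. The original tag value would still need to be stored untouched so that the symbols are not lost.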
Current workarounds
i could just delete the characters but they are part of the data. i would have to put FIXME as reminders that this needs to be addressed.
Additional info
the three buildings that appear here in OHM did not exist in 1834, but are displayed because OHM does not currently know what to do with the start_date value