TG2-VALIDATION_COORDINATES_NOTZERO #87
Comments
Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: |
Likeliness in Data Quality Dimension changed to Likelihood |
Agreed at TDWG 2018 DQIG meeting that the name TG2-VALIDATION_COORDINATES_ZERO is satisfactory. |
I would make a modification to this one to avoid one particular false trigger of a failed validation. I would replace the Expected Response "INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or both of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT" with "INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude is EMPTY or both of the values are not interpretable as numbers; COMPLIANT if either the numeric value of dwc:decimalLatitude is not = 0 or the numeric value of dwc:decimalLongitude is not = 0 or if (the numeric values of both coordinates are equal to 0 and (the dwc:coordinateUncertaintyInMeters can be interpreted as a number and the numeric value of dwc:coordinateUncertaintyInMeters is >= 1) or (the dwc:coordinatePrecision can be interpreted as a number and the numeric value of dwc:coordinatePrecision is not = 0)); otherwise NOT_COMPLIANT" To the Information Elements I would add dwc:coordinateUncertaintyInMeters and dwc:coordinatePrecision. I would change the Examples from dwc:decimalLatitude="0", dwc:decimalLongitude="0" to dwc:decimalLatitude="0", dwc:decimalLongitude="0", dwc:coordinateUncertaintyInMeters = "20037509" To the Notes I would add "Valid values of uncertainty or precision can indicate real occurrences at the geographic coordinates 0, 0. A georeference indicating that the location is only known to be from Earth would likely have coordinates 0,0 and coordinateUncertaintyInMeters equal to half the equatorial circumference." |
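A minimal sketch of the exception proposed above, under two stated assumptions: that the example value 20037509 derives from the WGS84 equatorial radius (6378137 m) rounded up to a whole metre, and that the exception reads as "uncertainty is numeric and >= 1, or precision is numeric and not 0". The names `to_number` and `zero_zero_allowed` are hypothetical and do not come from any existing implementation.

```python
import math

# Assumption: WGS84 semi-major axis in metres; the comment above does not
# say which ellipsoid the example value is based on.
WGS84_EQUATORIAL_RADIUS_M = 6378137.0

# Half the equatorial circumference is pi * r, rounded up to whole metres.
half_equatorial_circumference_m = math.ceil(math.pi * WGS84_EQUATORIAL_RADIUS_M)


def to_number(value):
    """Return value as a finite float, or None if it cannot be interpreted."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def zero_zero_allowed(uncertainty_in_meters, precision):
    """Hypothetical helper: would the proposed wording excuse a 0,0 coordinate?"""
    uncertainty = to_number(uncertainty_in_meters)
    prec = to_number(precision)
    return (uncertainty is not None and uncertainty >= 1) or (
        prec is not None and prec != 0
    )
```

Under this reading, any plausible uncertainty (for example `"30"`) excuses 0,0, not only the whole-Earth value.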
It would be interesting to know how many true recordings there are at exactly 0.000000, 0.000000 in the middle of the ocean. I would expect it is very low, if not none. Is it worth making the test a lot more complicated so that you don't flag those few - rather than flag them anyway and if someone is interested in that area, checking those few? |
@tucotuco not sure I understand the change, it doesn't seem to agree with the notes, which imply only that a coordinate uncertainty equal to half the equatorial circumference is an allowed 0,0 value. As stated, 0,0 without both coordinate uncertainty and coordinate precision is flagged, but any value in either uncertainty or precision makes 0,0 compliant. That doesn't make sense to me as I expect the number of error cases where a coordinate uncertainty was given but latitude and longitude weren't would be much much larger than the number of cases of 0,0 that are real observations. I'd much rather leave out the edge case, leave the specification as is, and flag any case where latitude and longitude are zero. |
I capitulate. :-)
|
I tend to agree with @ArthurChapman. The test would make more sense to me if EITHER dwc:decimalLatitude or dwc:decimalLongitude were zero. It just makes it a more useful test as the lat=lon=0 is going to be rarer. At one stage, I chased up a suite of what were badly processed records in the ALA where you got a 45 degree line heading southwest from 0,0. The test name would still make sense if we did the OR approach. My ongoing philosophy of being ok with some false positives still holds. |
I don't like telling people their data are wrong when they are not, no matter how many there are, especially because the test would continue to tell them the same thing every time. That would annoy me to no end if I was trying to use the test to improve my data. I have heard this sentiment among the folks we deal with, and that is one of the reasons they like VertNet so much - we don't continue to pester unnecessarily. I checked in my GBIF snapshot from a year ago. There are 68373 occurrences with one or the other zero, but not both. Of these, it looks like about 75% are real with the zero, and the rest are errors. |
I am not surprised that there are many records with one of Latitude or Longitude as Zero - many of these are terrestrial and even the marine ones could be good. Where both are 0 - there are not many, if any, that are valid records. Many have arisen where the data is EMPTY and certain databases converted the NULL value to 0. From memory (I could be wrong) but the old Advanced Revelation database software (used in South Africa at one stage) converted Null values to 0. I think that GBIF may be removing the 0,0 records - hence you getting no records. I would not touch records where one of Latitude or Longitude are 0. But where both are 0 we should identify. |
My data are based on my copy of all GBIF. Nothing is filtered, so there
aren't any missing. But good, when both zero, trigger.
|
As usual, I bow to the experts. I found 22 records in ALA of lat/lon=0,0. These are rendered as 'spatially invalid' on test 'coordinates don't match country (error)' and also warnings on lat=0, long=0, lat=long=0. |
Suggest Description: 'Are the values of either dwc:decimalLatitude or dwc:decimalLongitude numbers that are not equal to 0?' in place of: 'Are the values of either dwc:decimalLatitude or dwc:decimalLongitude numbers that are not = 0?' |
… Making method name consistent with test name.
Specification is inconsistent with dataID 707 in the validation data, which has data values indicative of a phrasing: INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers; As we are trying to isolate just 0,0 coordinates with a Response.result of NOT_COMPLIANT in this test, we probably do want to change the specification to use "either" instead of "both". |
…st current (2023-06-12) test descriptions. Addressed implementation of tdwg/bdq#87 VALIDATION_COORDINATES_NOTZERO Adding ProvidesVersion annotations. Removing now empty file stubs for checked methods. Adding to unit test.
In this test we are trying to exclude 0, 0 - not 0, 145.7 etc. @chicoreus - your wording above is saying that a 0 in either latitude or longitude is empty, but this wasn't what was intended originally. There is a greater likelihood that 0, 147.5 is a good record than 0,0. |
This was a test for lat/lon 0,0 so reflecting it to the 'positive' probably stuffed the logic. The intent is: INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or not interpretable as numbers; COMPLIANT if the value of dwc:decimalLatitude and dwc:decimalLongitude are not zero; otherwise NOT_COMPLIANT. This makes DataID 707 COMPLIANT. |
@ArthurChapman I'm confusing things by just changing one clause. The intent is indeed that 0,26.445 is COMPLIANT, the question is how to handle 0,"foo", is that COMPLIANT (because it is 0 something, unlike the logic for the main clause), or should we explicitly exclude it as the "foo" might be other than zero. Thus in full: INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT These cases are the same using "both" or "either"
These cases differ: Both:
Either:
If we use "both" in the INTERNAL_PREREQUISITES_NOT_MET clause, then both decimalLatitude and decimalLongitude must be non-numeric for the INTERNAL_PREREQUISITES_NOT_MET to be met. Otherwise, we pass on to the compliant/non compliant clauses and ask if both values are zero; if both are, then NOT_COMPLIANT. If we use "either" in the INTERNAL_PREREQUISITES_NOT_MET clause, then a non-numeric value in either decimalLatitude or decimalLongitude prevents us from being able to tell if the asserted coordinate is 0,0, and we assert that we can't run the test instead. |
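The difference between the two readings can be sketched as follows. This is an illustration only: the function name `check_notzero` and its `prerequisite` parameter are hypothetical, and it assumes that under the "both" reading a non-numeric value counts as "not = 0" in the COMPLIANT clause, which is how the 0,"foo" case is described above.

```python
import math


def is_empty(value):
    """Treat None and whitespace-only strings as EMPTY (an assumption)."""
    return value is None or str(value).strip() == ""


def to_number(value):
    """Return value as a finite float, or None if it cannot be interpreted."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def check_notzero(latitude, longitude, prerequisite="either"):
    """Evaluate the test under the "both" or "either" prerequisite wording."""
    if is_empty(latitude) or is_empty(longitude):
        return "INTERNAL_PREREQUISITES_NOT_MET"
    lat, lon = to_number(latitude), to_number(longitude)
    unparsable = [lat is None, lon is None]
    # "both": only two non-numeric values block the test.
    # "either": a single non-numeric value is enough to block it.
    blocked = all(unparsable) if prerequisite == "both" else any(unparsable)
    if blocked:
        return "INTERNAL_PREREQUISITES_NOT_MET"
    # A non-numeric value (None here) compares as "not = 0".
    if lat != 0 or lon != 0:
        return "COMPLIANT"
    return "NOT_COMPLIANT"
```

For example, `check_notzero("0", "foo", "both")` returns COMPLIANT, while `check_notzero("0", "foo", "either")` returns INTERNAL_PREREQUISITES_NOT_MET; inputs where both values are numeric give the same result under both readings.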
@Tasilee "INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or not interpretable as numbers; " isn't explicit about what happens when one of dwc:decimalLatitude or dwc:decimalLongitude is not interpretable as a number. I could implement either way from that phrasing, but, given the "or" at the beginning of the clause, I would tend to say that it carries on to the second part of the clause, meaning "INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or either is not interpretable as a number; " |
So, what you are suggesting for the ER is- INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or either value is not interpretable as a number; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT ? |
@Tasilee in essence, yes. The specific proposal in #87 (comment) is: INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT |
Sorry to be pedantic, but surely you only need any one of the input values to be EMPTY to trigger INTERNAL_PREREQUISITES_NOT_MET? As in INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT |
@Tasilee pedantic is good. "Or" could be interpreted as an exclusive or (where one being empty satisfies the condition, but both being empty does not). We could be more explicit by adding the phrase "at least one of": INTERNAL_PREREQUISITES_NOT_MET if at least one of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or at least one of either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT |
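As a rough sketch of the "at least one of" wording, assuming that EMPTY means a missing or whitespace-only value and that "interpretable as a number" means parseable as a finite float; the function name `validation_coordinates_notzero` and the helper are hypothetical, not taken from any existing implementation:

```python
import math


def parse_coordinate(value):
    """Return (is_empty, number); number is None when not interpretable."""
    if value is None or str(value).strip() == "":
        return True, None
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False, None
    return False, number if math.isfinite(number) else None


def validation_coordinates_notzero(decimal_latitude, decimal_longitude):
    """Return the Response.result string for the "at least one of" wording."""
    lat_empty, lat = parse_coordinate(decimal_latitude)
    lon_empty, lon = parse_coordinate(decimal_longitude)
    # Inclusive "or": one or both values being EMPTY or non-numeric
    # means the test cannot be run.
    if lat_empty or lon_empty or lat is None or lon is None:
        return "INTERNAL_PREREQUISITES_NOT_MET"
    # Only the exact pair 0,0 is flagged; 0,26.445 stays COMPLIANT.
    if lat != 0 or lon != 0:
        return "COMPLIANT"
    return "NOT_COMPLIANT"
```

Using Python's `or` keeps the prerequisite inclusive, so two EMPTY values give the same result as one.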
Thanks @chicoreus - I can live with that. |
Updated INTERNAL_PREREQUISITES_NOT_MET in the Expected Response in line with discussion on #43 and updated Specification Last Updated. Removed NEEDS WORK |
…-06-28) specifications. Addressing tdwg/bdq#87 VALIDATION_COORDINATES_NOTZERO updating metadata, implementation, and unit test to reflect change in specification.
Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated" |