TG2-VALIDATION_COORDINATES_NOTZERO #87

Open
iDigBioBot opened this issue Jan 5, 2018 · 25 comments
Labels
CORE TG2 CORE tests Likeliness SPACE Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2 Validation

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

| TestField | Value |
|---|---|
| GUID | 1bf0e210-6792-4128-b8cc-ab6828aa4871 |
| Label | VALIDATION_COORDINATES_NOTZERO |
| Description | Are the values of either dwc:decimalLatitude or dwc:decimalLongitude numbers that are not equal to 0? |
| TestType | Validation |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:decimalLatitude, dwc:decimalLongitude |
| Information Elements Consulted | |
| Expected Response | INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude is bdq:Empty or is not interpretable as a number, or dwc:decimalLongitude is bdq:Empty or is not interpretable as a number; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Likeliness |
| Term-Actions | COORDINATES_NOTZERO |
| Parameter(s) | |
| Source Authority | |
| Specification Last Updated | 2023-06-20 |
| Examples | [dwc:decimalLatitude="21.0534", dwc:decimalLongitude="81.0554": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:decimalLatitude and dwc:decimalLongitude are not zero"] [dwc:decimalLatitude="0", dwc:decimalLongitude="0": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:decimalLatitude and dwc:decimalLongitude are zero"] |
| Source | ALA, GBIF, OBIS |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | A record with 0.0 is interpreted as the string "0" |

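
The Expected Response above can be read as a small decision procedure. The following is a minimal sketch in Python, not the reference implementation: the names `Response`, `_to_number` and `validation_coordinates_notzero` are hypothetical, and treating whitespace-only strings as bdq:Empty and rejecting NaN/infinity as "not interpretable as a number" are assumptions made for this illustration.

```python
import math
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: str
    result: Optional[str]
    comment: str


def _to_number(value: Optional[str]) -> Optional[float]:
    """Return the value as a float, or None if it is empty or not numeric."""
    if value is None or value.strip() == "":
        return None  # assumption: None and blank strings count as bdq:Empty
    try:
        number = float(value)
    except ValueError:
        return None
    # float() also accepts "nan" and "inf"; this sketch rejects them.
    return number if math.isfinite(number) else None


def validation_coordinates_notzero(
    decimal_latitude: Optional[str], decimal_longitude: Optional[str]
) -> Response:
    lat = _to_number(decimal_latitude)
    lon = _to_number(decimal_longitude)
    if lat is None or lon is None:
        return Response(
            "INTERNAL_PREREQUISITES_NOT_MET",
            None,
            "dwc:decimalLatitude or dwc:decimalLongitude is empty or not a number",
        )
    if lat != 0 or lon != 0:
        return Response(
            "RUN_HAS_RESULT",
            "COMPLIANT",
            "dwc:decimalLatitude and dwc:decimalLongitude are not both zero",
        )
    return Response(
        "RUN_HAS_RESULT",
        "NOT_COMPLIANT",
        "dwc:decimalLatitude and dwc:decimalLongitude are zero",
    )
```

Because the comparison is numeric, "0", "0.0" and "-0" are all treated as zero, which is in line with the Notes row.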
@iDigBioBot
Collaborator Author

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet:
Suggest we split this into two tests

@ArthurChapman ArthurChapman added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Jan 17, 2018
@ArthurChapman
Collaborator

Likeliness in Data Quality Dimension changed to Likelihood

@tucotuco
Member

Agreed at TDWG 2018 DQIG meeting that the name TG2-VALIDATION_COORDINATES_ZERO is satisfactory.

@tucotuco
Member

tucotuco commented Apr 9, 2020

I would make a modification to this one to avoid one particular false trigger of a failed validation. I would replace the Expected Response

"INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or both of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT"

with

"INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude is EMPTY or both of the values are not interpretable as numbers; COMPLIANT if either the numeric value of dwc:decimalLatitude is not = 0 or the numeric value of dwc:decimalLongitude is not = 0 or if (the numeric values of both coordinates are equal to 0 and (the dwc:coordinateUncertaintyInMeters can be interpreted as a number and the numeric value of dwc:coordinateUncertaintyInMeters is >= 1) or (the dwc:coordinatePrecision can be interpreted as a number and the numeric value of dwc:coordinatePrecision is not = 0)); otherwise NOT_COMPLIANT"

To the Information Elements I would add dwc:coordinateUncertaintyInMeters and dwc:coordinatePrecision.

I would change the Examples from

dwc:decimalLatitude="0", dwc:decimalLongitude="0"

to

dwc:decimalLatitude="0", dwc:decimalLongitude="0", dwc:coordinateUncertaintyInMeters = "20037509"

To the Notes I would add

"Valid values of uncertainty or precision can indicate real occurrences at the geographic coordinates 0, 0. A georeference indicating that the location is only known to be from Earth would likely have coordinates 0,0 and coordinateUncertaintyInMeters equal to half the equatorial circumference."
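
For orientation, the uncertainty value in the proposed example is consistent with half the equatorial circumference of the WGS84 ellipsoid, rounded up to whole metres. The sketch below checks that arithmetic and shows one possible reading of the proposed COMPLIANT clause. It is not an agreed specification: the grouping of the "and"/"or" terms (both coordinates zero AND (uncertainty >= 1 OR precision not = 0)) is an assumption, treating any non-numeric coordinate as a failed prerequisite is a simplification, the WGS84 semi-major axis of 6378137 m comes from outside this thread, and all function names are hypothetical.

```python
import math
from typing import Optional

WGS84_SEMI_MAJOR_AXIS_M = 6378137.0  # WGS84 equatorial radius in metres

# Half the equatorial circumference is pi * a; round up to whole metres.
HALF_EQUATORIAL_CIRCUMFERENCE_M = math.ceil(math.pi * WGS84_SEMI_MAJOR_AXIS_M)


def _to_number(value: Optional[str]) -> Optional[float]:
    """Return the value as a finite float, or None if empty or not numeric."""
    if value is None or value.strip() == "":
        return None
    try:
        number = float(value)
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def proposed_notzero_compliant(
    lat: Optional[str],
    lon: Optional[str],
    uncertainty_m: Optional[str] = None,
    precision: Optional[str] = None,
) -> Optional[bool]:
    """True = COMPLIANT, False = NOT_COMPLIANT, None = prerequisites not met."""
    lat_n, lon_n = _to_number(lat), _to_number(lon)
    if lat_n is None or lon_n is None:
        return None  # simplification: either coordinate unusable
    if lat_n != 0 or lon_n != 0:
        return True
    # Both coordinates are zero: accept only with a usable uncertainty or precision.
    unc_n, prec_n = _to_number(uncertainty_m), _to_number(precision)
    has_uncertainty = unc_n is not None and unc_n >= 1
    has_precision = prec_n is not None and prec_n != 0
    return has_uncertainty or has_precision
```

Under this reading, the proposed example (0, 0 with dwc:coordinateUncertaintyInMeters = "20037509") is COMPLIANT, while a bare 0, 0 is still NOT_COMPLIANT.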

@ArthurChapman
Collaborator

It would be interesting to know how many true recordings there are at exactly 0.000000, 0.000000 in the middle of the ocean. I would expect it is very low, if not none. Is it worth making the test a lot more complicated so that you don't flag those few - rather than flag them anyway and if someone is interested in that area, checking those few?

@chicoreus
Collaborator

@tucotuco I'm not sure I understand the change. It doesn't seem to agree with the notes, which imply only that a coordinate uncertainty equal to half the equatorial circumference is an allowed 0,0 value. As stated, 0,0 without both coordinate uncertainty and coordinate precision is flagged, but any value in either uncertainty or precision makes 0,0 compliant. That doesn't make sense to me, as I expect the number of error cases where a coordinate uncertainty was given but latitude and longitude weren't would be much, much larger than the number of cases of 0,0 that are real observations.

I'd much rather leave out the edge case, leave the specification as is, and flag any case where latitude and longitude are zero.

@tucotuco
Member

tucotuco commented Apr 9, 2020 via email

@Tasilee
Collaborator

Tasilee commented Apr 9, 2020

I tend to agree with @ArthurChapman. The test would make more sense to me if EITHER dwc:decimalLatitude or dwc:decimalLongitude were zero. It just makes it a more useful test as the lat=lon=0 is going to be rarer.

At one stage, I chased up a suite of what were badly processed records in the ALA where you got a 45 degree line heading southwest from 0,0.

The test name would still make sense if we did the OR approach. My ongoing philosophy of being ok with some false positives still holds.

@tucotuco
Member

tucotuco commented Apr 9, 2020

I don't like telling people their data are wrong when they are not, no matter how many there are, especially because the test would continue to tell them the same thing every time. That would annoy me to no end if I were trying to use the test to improve my data. I have heard this sentiment among the folks we deal with, and that is one of the reasons they like VertNet so much: we don't continue to pester unnecessarily.

I checked in my GBIF snapshot from a year ago. There are 68373 occurrences with one or the other zero, but not both. Of these, it looks like about 75% are real with the zero, and the rest are errors.
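
Taken at face value, those figures suggest roughly 51,000 real records and 17,000 errors among the 68373. A count like this could be reproduced along the following lines. This is only a sketch: the function name `zero_pattern` is hypothetical, the sample records are made up for illustration, and it assumes the coordinates have already been parsed as numbers.

```python
from collections import Counter


def zero_pattern(lat: float, lon: float) -> str:
    """Classify a coordinate pair by how many of its values are zero."""
    if lat == 0 and lon == 0:
        return "both_zero"
    if lat == 0 or lon == 0:
        return "one_zero"  # exactly one is zero, since both_zero was handled
    return "neither_zero"


# Made-up sample records, not taken from GBIF.
records = [(0.0, 0.0), (0.0, 35.634), (1.45, 0.0), (6.4255, 35.634)]
counts = Counter(zero_pattern(lat, lon) for lat, lon in records)

# Rough split of the reported 68373 records, assuming 75% are real.
reported_one_zero = 68373
estimated_real = round(reported_one_zero * 0.75)
estimated_errors = reported_one_zero - estimated_real
```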

@ArthurChapman
Collaborator

I am not surprised that there are many records with one of latitude or longitude as zero: many of these are terrestrial, and even the marine ones could be good. Where both are 0, there are not many, if any, that are valid records. Many have arisen where the data is EMPTY and certain databases converted the NULL value to 0. From memory (I could be wrong), the old Advanced Revelation database software (used in South Africa at one stage) converted NULL values to 0. I think that GBIF may be removing the 0,0 records, hence you getting no records.

I would not touch records where one of latitude or longitude is 0. But where both are 0, we should identify them.

@tucotuco
Member

tucotuco commented Apr 9, 2020 via email

@Tasilee
Collaborator

Tasilee commented Apr 10, 2020

As usual, I bow to the experts.

I found 22 records in ALA of lat/lon=0,0. These are rendered as 'spatially invalid' on test 'coordinates don't match country (error)' and also warnings on lat=0, long=0, lat=long=0.

@Tasilee Tasilee changed the title TG2-VALIDATION_COORDINATES_ZERO TG2-VALIDATION_COORDINATES_NOTZERO Mar 22, 2022
@tucotuco
Member

Suggest Description:

'Are the values of either dwc:decimalLatitude or dwc:decimalLongitude numbers that are not equal to 0?'

in place of:

'Are the values of either dwc:decimalLatitude or dwc:decimalLongitude numbers that are not = 0?'

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Sep 2, 2022
… Making method name consistent with test name.
chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Sep 2, 2022
@chicoreus
Collaborator

Specification is inconsistent with dataID 707 in the validation data, which has data values indicative of a phrasing:

INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers;

As we are trying to isolate just 0,0 coordinates with a Response.result of NOT_COMPLIANT in this test, we probably do want to change the specification to use either instead of both.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jun 18, 2023
…st current (2023-06-12) test descriptions. Addressed implementation of tdwg/bdq#87 VALIDATION_COORDINATES_NOTZERO   Adding ProvidesVersion annotations.   Removing now empty file stubs for checked methods.  Adding to unit test.
@ArthurChapman
Collaborator

In this test we are trying to exclude 0, 0, not 0, 145.7 etc. @chicoreus, your wording above is saying that a 0 in either latitude or longitude is empty, but this wasn't what was intended originally. There is a greater likelihood that 0, 147.5 is a good record than 0,0.

@Tasilee
Collaborator

Tasilee commented Jun 18, 2023

This was a test for lat/lon 0,0 so reflecting it to the 'positive' probably stuffed the logic. The intent is

INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or not interpretable as numbers; COMPLIANT if the value of dwc:decimalLatitude and dwc:decimalLongitude are not zero; otherwise NOT_COMPLIANT

This makes DataID 707 COMPLIANT

@chicoreus
Collaborator

@ArthurChapman I'm confusing things by just changing one clause. The intent is indeed that 0,26.445 is COMPLIANT, the question is how to handle 0,"foo", is that COMPLIANT (because it is 0 something, unlike the logic for the main clause), or should we explicitly exclude it as the "foo" might be other than zero. Thus in full:

INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT

These cases are the same using "both" or "either"

| dwc:decimalLatitude | dwc:decimalLongitude | Response.status | Response.value |
|---|---|---|---|
| 0 | 0 | RUN_HAS_RESULT | NOT_COMPLIANT |
| A | B | INTERNAL_PREREQUISITES_NOT_MET | |
| 1.45 | 0 | RUN_HAS_RESULT | COMPLIANT |
| 6.4255 | 35.634 | RUN_HAS_RESULT | COMPLIANT |
| 0 | 35.634 | RUN_HAS_RESULT | COMPLIANT |

These cases differ:

Both:

| dwc:decimalLatitude | dwc:decimalLongitude | Response.status | Response.value |
|---|---|---|---|
| 1.45 | A | RUN_HAS_RESULT | COMPLIANT |
| Foo | 0 | RUN_HAS_RESULT | COMPLIANT |

Either:

| dwc:decimalLatitude | dwc:decimalLongitude | Response.status | Response.value |
|---|---|---|---|
| 1.45 | A | INTERNAL_PREREQUISITES_NOT_MET | |
| Foo | 0 | INTERNAL_PREREQUISITES_NOT_MET | |

If we use "both" in the INTERNAL_PREREQUISITES_NOT_MET clause, then both decimalLatitude and decimalLongitude must be non-numeric for INTERNAL_PREREQUISITES_NOT_MET to apply; otherwise, we pass on to the COMPLIANT/NOT_COMPLIANT clauses and ask whether both values are zero, and if both are, the result is NOT_COMPLIANT.

If we use "either" in the INTERNAL_PREREQUISITES_NOT_MET clause, then a non-numeric value in either decimalLatitude or decimalLongitude prevents us from being able to tell if the asserted coordinate is 0,0, and we assert that we can't run the test instead.
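
The difference between the two wordings can be sketched in Python as follows. This is an illustration under stated assumptions rather than the geo_ref_qc implementation: the function `notzero` and its `prerequisite` argument are hypothetical, an EMPTY value in either field always fails the prerequisites (as in both wordings), and under "both" a single non-numeric value is treated as "not zero", which is what the table for that wording implies.

```python
import math
from typing import Optional, Tuple


def _is_empty(value: Optional[str]) -> bool:
    return value is None or value.strip() == ""


def _parse(value: str) -> Optional[float]:
    """Return a finite float, or None if the value is not numeric."""
    try:
        number = float(value)
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def notzero(
    lat: Optional[str], lon: Optional[str], prerequisite: str
) -> Tuple[str, Optional[str]]:
    """Return (Response.status, Response.value) for the 'both' or 'either' wording."""
    if prerequisite not in ("both", "either"):
        raise ValueError("prerequisite must be 'both' or 'either'")
    if _is_empty(lat) or _is_empty(lon):
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    lat_n, lon_n = _parse(lat), _parse(lon)
    if prerequisite == "either":
        unmet = lat_n is None or lon_n is None
    else:
        unmet = lat_n is None and lon_n is None
    if unmet:
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    # Under "both", one value may still be None here; None != 0 is True,
    # so a lone non-numeric value counts as "not zero" in this sketch.
    if lat_n != 0 or lon_n != 0:
        return ("RUN_HAS_RESULT", "COMPLIANT")
    return ("RUN_HAS_RESULT", "NOT_COMPLIANT")
```

For example, `notzero("Foo", "0", "both")` yields COMPLIANT even though the asserted coordinate cannot be known, whereas `notzero("Foo", "0", "either")` reports that the test could not be run.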

@chicoreus
Collaborator

@Tasilee "INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or not interpretable as numbers; " isn't explicit about what happens when only one of dwc:decimalLatitude or dwc:decimalLongitude is not interpretable as a number. I could implement it either way from that phrasing, but, given the "or" at the beginning of the clause, I would tend to say that it carries on to the second part of the clause, meaning "INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or either is not interpretable as a number; "

@Tasilee
Collaborator

Tasilee commented Jun 19, 2023

So, what you are suggesting for the ER is-

INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or either value is not interpretable as a number; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT

?

@chicoreus
Collaborator

@Tasilee in essence, yes. The specific proposal in #87 (comment) is:

INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude and/or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT

@Tasilee
Collaborator

Tasilee commented Jun 19, 2023

Sorry to be pedantic, but surely you only need any one of the input values to be EMPTY to trigger INTERNAL_PREREQUISITES_NOT_MET? As in

INTERNAL_PREREQUISITES_NOT_MET if dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT

@chicoreus
Collaborator

@Tasilee pedantic is good. Or could be interpreted as an exclusive or (where one being empty satisfies the condition, but both being empty does not). We could be more explicit by adding the phrase "at least one of"

INTERNAL_PREREQUISITES_NOT_MET if at least one of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or at least one of either of the values are not interpretable as numbers; COMPLIANT if either the value of dwc:decimalLatitude is not = 0 or the value of dwc:decimalLongitude is not = 0; otherwise NOT_COMPLIANT
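
The ambiguity that "at least one of" removes is the one between an inclusive and an exclusive "or". A small Python sketch of the two readings, with hypothetical function names and the assumption that None and blank strings count as EMPTY:

```python
from typing import Optional


def is_empty(value: Optional[str]) -> bool:
    return value is None or value.strip() == ""


def at_least_one_empty(lat: Optional[str], lon: Optional[str]) -> bool:
    """Inclusive or: true when one or both values are empty."""
    return is_empty(lat) or is_empty(lon)


def exactly_one_empty(lat: Optional[str], lon: Optional[str]) -> bool:
    """Exclusive or: true when one value is empty but not both."""
    return is_empty(lat) != is_empty(lon)
```

The two readings only disagree when both values are empty: the inclusive reading fails the prerequisites there, which is the intended behaviour, while the exclusive reading would let the record through.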

@Tasilee
Collaborator

Tasilee commented Jun 19, 2023

Thanks @chicoreus - I can live with that.

@ArthurChapman
Collaborator

Updated INTERNAL_PREREQUISITES_NOT_MET in the Expected Response in line with discussion on #43 and updated Specification Last Updated. Removed NEEDS WORK

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jun 28, 2023
…-06-28) specifications. Addressing tdwg/bdq#87 VALIDATION_COORDINATES_NOTZERO updating metadata, implementation, and unit test to reflect change in specification.
@Tasilee
Collaborator

Tasilee commented Sep 18, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023

5 participants