TG2-VALIDATION_COORDINATESSTATEPROVINCE_CONSISTENT #56

Open
iDigBioBot opened this issue Jan 5, 2018 · 70 comments
Labels
Consistency CORE TG2 CORE tests SPACE Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2 Validation VOCABULARY

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

| TestField | Value |
|---|---|
| GUID | f18a470b-3fe1-4aae-9c65-a6d3db6b550c |
| Label | VALIDATION_COORDINATESSTATEPROVINCE_CONSISTENT |
| Description | Do the geographic coordinates fall on or within the boundary from the bdq:sourceAuthority for the given dwc:stateProvince or within the distance given by bdq:spatialBufferInMeters outside that boundary? |
| TestType | Validation |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:stateProvince, dwc:decimalLatitude, dwc:decimalLongitude |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the values of dwc:decimalLatitude or dwc:decimalLongitude are bdq:Empty or invalid, or dwc:stateProvince is bdq:Empty or not found in the bdq:sourceAuthority; COMPLIANT if the geographic coordinates fall on or within the boundary in the bdq:sourceAuthority for the given dwc:stateProvince (after coordinate reference system transformations, if any, have been accounted for), or within the distance given by bdq:spatialBufferInMeters outside that boundary; otherwise NOT_COMPLIANT. |
| Data Quality Dimension | Consistency |
| Term-Actions | COORDINATESSTATEPROVINCE_CONSISTENT |
| Parameter(s) | bdq:sourceAuthority, bdq:spatialBufferInMeters |
| Source Authority | bdq:sourceAuthority default = "10m-admin-1 boundaries" {[https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-1-states-provinces/]}, bdq:spatialBufferInMeters default = "3000" |
| Specification Last Updated | 2024-08-30 |
| Examples | [dwc:stateProvince="Tasmania", dwc:decimalLatitude="-42.85", dwc:decimalLongitude="146.75": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="Input fields contain interpretable values"], [dwc:stateProvince="Córdoba", dwc:decimalLatitude="-41.0525925872862", dwc:decimalLongitude="-71.5310546742521": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="Input fields contain interpretable values but coordinates don't match dwc:stateProvince with buffer"] |
| Source | ALA |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | The geographic determination service is expected to return a list of names of first-level administrative divisions for geometries that the geographic point falls on or within, including a 3 km buffer around the administrative geometry. A match on any of those names should constitute a consistency, and dwc:countryCode should not be needed to make this determination, that is, this test does not attempt to disambiguate potential duplicate first-level administrative division names. The level of buffering may be related to the scale of the underlying GIS layer being used. At a global scale, typical map scales used for borders and coastal areas are either 1:3M or 1:1M (Dooley 2005, Chapter 4). Horizontal accuracy at those scales is around 1.5-2.5 km and 0.5-0.85 km respectively (Chapman & Wieczorek 2020). |
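The Expected Response above can be read as a short decision sequence. The following is a minimal sketch of that sequence, not the reference implementation: the `Authority` type, the `validate` and `box` names, and the rectangles in `TOY_AUTHORITY` are all hypothetical stand-ins for a real bdq:sourceAuthority backed by a GIS layer, and the toy geometries ignore the buffer entirely.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a bdq:sourceAuthority: (division name, predicate) pairs.
# Each predicate answers "is (lat, lon) on or within this division, allowing
# buffer_m metres outside its boundary?". A real one would wrap a GIS layer.
Contains = Callable[[float, float, float], bool]
Authority = List[Tuple[str, Contains]]


def to_float(value: Optional[str], limit: float) -> Optional[float]:
    """Parse a coordinate string; None if empty, non-numeric or out of range."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if -limit <= number <= limit else None


def validate(state_province, decimal_latitude, decimal_longitude,
             authority: Optional[Authority],
             spatial_buffer_in_meters: float = 3000.0):
    """Return a (Response.status, Response.result) pair."""
    if authority is None:  # the source authority is not available
        return ("EXTERNAL_PREREQUISITES_NOT_MET", None)
    lat = to_float(decimal_latitude, 90.0)
    lon = to_float(decimal_longitude, 180.0)
    name = (state_province or "").strip()
    # Every geometry registered under this name is a candidate; duplicate
    # names are deliberately not disambiguated (see Notes).
    candidates = [contains for n, contains in authority if n == name]
    if lat is None or lon is None or not candidates:
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    hit = any(contains(lat, lon, spatial_buffer_in_meters) for contains in candidates)
    return ("RUN_HAS_RESULT", "COMPLIANT" if hit else "NOT_COMPLIANT")


def box(south, north, west, east) -> Contains:
    """Toy geometry: a lat/lon rectangle that ignores the buffer."""
    return lambda lat, lon, buffer_m: south <= lat <= north and west <= lon <= east


# Made-up rectangles for illustration only, not real boundaries.
TOY_AUTHORITY: Authority = [
    ("Tasmania", box(-44.0, -39.0, 143.0, 149.0)),
    ("Córdoba", box(-35.0, -29.5, -66.0, -61.7)),
]
```

With these toy rectangles, the two examples in the table give COMPLIANT and NOT_COMPLIANT respectively, and an empty or unknown dwc:stateProvince falls into INTERNAL_PREREQUISITES_NOT_MET because no candidate geometry is found.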
@iDigBioBot
Collaborator Author

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet:
Unsure what spatial scale we should go down to

@iDigBioBot
Collaborator Author

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet:
Not a matter of resolution - some countries use Provinces (e.g. Canada) others States.

@iDigBioBot
Collaborator Author

Comment by John Wieczorek (@tucotuco) migrated from spreadsheet:
Why not just stick with dwc:stateProvince, since that is unambiguously defined as the first administrative unit smaller than country and there are over a hundred distinct names for first level divisions in the world?

@iDigBioBot
Collaborator Author

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet:
What about cases where no decimalLat or decimalLong are supplied but we have verbatimLat,Long or coords? In those cases, should this test be applied AFTER interpreting decimalLat and decimalLong?

@iDigBioBot
Collaborator Author

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet:
There is definitely an implied order (and perhaps we need to make an explicit order) for the tests. For example, if a record fails COUNTRY_COORDINATE_MISMATCH (VALIDATION_COORDINATE_COUNTRY_INCONSISTENT), then it will definitely fail this one as well, so this test is redundant whenever the first one fails.

@ArthurChapman
Collaborator

Difficult to get a standard vocabulary for StateProvince names and boundaries that work.

@ArthurChapman ArthurChapman added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Jan 17, 2018
@cgendreau
Contributor

cgendreau commented Jan 17, 2018

Just to add to the difficulty of getting a standard vocabulary: data from some countries may be in another language and could also use another alphabet, and I don't think such records should be flagged as INCONSISTENT.

@chicoreus
Collaborator

@cgendreau if the stateProvince name string cannot be found in either the GIS data source or a thesaurus used to find variant and internationalized forms of the names, then the expectation would be that this test would return a result status of data/internal prerequisites not met, with no value for result value, rather than a result value of compliant or not compliant (noting that validation result values under the framework are only CONSISTENT or INCONSISTENT).
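As a rough illustration of that lookup order (GIS source names first, then a thesaurus of variants, otherwise no match), here is a small sketch. The function names, the thesaurus contents and the choice to fold case and strip diacritics before comparing are assumptions made for the example; accent folding in particular could conflate names that a real authority keeps distinct.

```python
import unicodedata
from typing import Dict, Iterable, Optional


def fold(name: str) -> str:
    """Casefold and drop combining marks, so 'Córdoba' and 'cordoba' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name.strip())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def resolve_state_province(name: Optional[str],
                           authority_names: Iterable[str],
                           thesaurus: Dict[str, str]) -> Optional[str]:
    """Return the authority's spelling of the name, or None if it cannot be found.

    None corresponds to the INTERNAL_PREREQUISITES_NOT_MET path: the test
    reports neither compliant nor not compliant.
    """
    key = fold(name or "")
    if not key:
        return None
    by_key = {fold(n): n for n in authority_names}
    if key in by_key:  # direct hit in the GIS source
        return by_key[key]
    # Otherwise try variant and internationalized forms from the thesaurus.
    variants = {fold(variant): canonical for variant, canonical in thesaurus.items()}
    canonical = variants.get(key)
    return by_key.get(fold(canonical)) if canonical else None


# Hypothetical data for illustration only.
AUTHORITY_NAMES = ["Córdoba", "Tasmania"]
THESAURUS = {"Tasmanie": "Tasmania", "Кордова": "Córdoba"}
```

A name written in another language or alphabet then either resolves through the thesaurus or yields `None`, rather than being reported as inconsistent.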

@Tasilee Tasilee changed the title TG2-VALIDATION_STATEPROVINCE_COORDINATE_INCONSISTENT TG2-VALIDATION_COORDINATE_STATEPROVINCE_INCONSISTENT Aug 21, 2018
@ArthurChapman ArthurChapman changed the title TG2-VALIDATION_COORDINATE_STATEPROVINCE_INCONSISTENT TG2-VALIDATION_COORDINATES_STATEPROVINCE_INCONSISTENT Aug 21, 2018
@Tasilee Tasilee changed the title TG2-VALIDATION_COORDINATES_STATEPROVINCE_INCONSISTENT TG2-VALIDATION_COORDINATES-STATE-PROVINCE_INCONSISTENT Oct 1, 2018
@tucotuco tucotuco added the Parameterized Test requires a parameter label Nov 5, 2018
@ArthurChapman
Collaborator

I wonder - rather than using TGN - we can use the ISO subdivision codes ISO 3166-2 (https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes) as the Default authority. What do you think here @tucotuco ?

@tucotuco
Member

tucotuco commented Jul 6, 2019

I think name matching is not the actual issue here, and not the authority we are after for this test. This test should use a standardized stateProvince to do the lookup, and so should ideally pass through geography standardization first. TGN is probably not the right authority, because it can't do the spatial intersection needed. An authority service based on GADM would be great. I do not know of a production level one.
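To make "spatial intersection" concrete, the core operation is a point-in-polygon test against the division's boundary. Below is a minimal ray-casting sketch over a single ring of (longitude, latitude) vertices. It is an illustration only: the function name is made up, it treats coordinates as planar, it ignores holes, multipolygons, the antimeridian and the buffer, and points lying exactly on the boundary (which the test description counts as inside) are not handled reliably. A production service would use a GIS library instead.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(lon: float, lat: float, ring: List[Point]) -> bool:
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# A made-up 10 x 10 degree square, not a real boundary.
SQUARE: List[Point] = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
```

An odd number of crossings means the point is inside the ring, an even number means it is outside.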

@ArthurChapman
Collaborator

ArthurChapman commented Jul 6, 2019

Level 1 layers seem to be available, but Level 2 and lower not apparently. However the UN is apparently preparing a Level 2 DB at 1:1 million (https://www.unsalb.org/)

The FAO GAUL is available (http://www.fao.org/geonetwork/srv/en/metadata.show%3Fid%3D12691), but I understand that licensing is a problem with its use: "The GAUL always maintains global layers with a unified coding system at country, first (e.g. departments) and second administrative levels (e.g. districts). Where data is available, it provides layers on a country by country basis down to third, fourth and lowers levels".

ESRI has a World Administrative Divisions layer (to first level) at https://www.arcgis.com/home/item.html?id=f0ceb8af000a4ffbae75d742538c548b. There are also some OpenStreetMap layers that I haven't looked at, but they appear to be vector layers only and in Mercator projections.

@Tasilee
Collaborator

Tasilee commented Jul 14, 2019

Thanks @tucotuco and @ArthurChapman: If there isn't a current service that can provide spatial intersection at Level 2, this test is not operable?

@tucotuco
Member

tucotuco commented Aug 9, 2019

The Google Maps API can return geocoding information at administrative levels more specific than country - administrative_area_level_1 is the equivalent for dwc:stateProvince. So, this is theoretically operable.
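A sketch of how the administrative_area_level_1 names might be pulled out of a reverse-geocoding response follows. It assumes the response has already been fetched and parsed into a dictionary, and that it follows the Geocoding API JSON layout as I understand it (`results`, `address_components`, `types`, `long_name`, `short_name`). The function name is hypothetical, and the sample response is hand-written for illustration, not captured from the service.

```python
from typing import Any, Dict, Set


def admin_level_1_names(geocode_response: Dict[str, Any]) -> Set[str]:
    """Collect long and short names of every administrative_area_level_1 component."""
    names: Set[str] = set()
    for result in geocode_response.get("results", []):
        for component in result.get("address_components", []):
            if "administrative_area_level_1" in component.get("types", []):
                for key in ("long_name", "short_name"):
                    if component.get(key):
                        names.add(component[key])
    return names


# Hand-written sample in the assumed response shape (not real service output).
SAMPLE_RESPONSE = {
    "status": "OK",
    "results": [
        {
            "address_components": [
                {"long_name": "Tasmania", "short_name": "TAS",
                 "types": ["administrative_area_level_1", "political"]},
                {"long_name": "Australia", "short_name": "AU",
                 "types": ["country", "political"]},
            ]
        }
    ],
}
```

The returned set could then be compared with dwc:stateProvince, with a match on any name counting as consistent.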

@ArthurChapman
Collaborator

Level 1 doesn't seem a big problem. Level 2 is still a long way off globally. Looking at the datasets available (https://www.unsalb.org/data?page=3) so far only about 27 out of 197 countries are covered. Google Maps seems a good option for Level 1 (I haven't checked - but do they include Level 2 at all (for example for the 27 countries that SALB have?))

@tucotuco
Member

tucotuco commented Aug 9, 2019 via email

@ArthurChapman
Collaborator

Wouldn't that take another test though. One would need to test against something to decide if it was valid or not. I don't think this is what we were intending. INTERNAL_PREREQUISITES_NOT_MET is EMPTY is just testing if something is there or not - if nothing there the test can't be run. To determine if it is valid or not would require some sort of further testing

@tucotuco
Member

> Wouldn't that take another test though. One would need to test against something to decide if it was valid or not. I don't think this is what we were intending. INTERNAL_PREREQUISITES_NOT_MET is EMPTY is just testing if something is there or not - if nothing there the test can't be run. To determine if it is valid or not would require some sort of further testing

It would require multiple other tests, and I don't think this is an isolated example. The coordinates might have to be interpreted first. The point is that the test can not be run meaningfully unless all of the right conditions are met, and having real coordinates is definitely a requirement for running the test.

@ArthurChapman
Collaborator

Can't go through all the tests at the moment - but don't we have another test that tests for validity of Coordinates?

@tucotuco
Member

tucotuco commented Jan 27, 2023 via email

@ArthurChapman
Collaborator

I agree that it would be good if something that is not a valid coordinate resulted in INTERNAL_PREREQUISITES_NOT_MET.

I suppose my initial reluctance was looking at things in order - i.e. EXTERNAL_PREREQUISITES_NOT_MET - can't go any further. If this isn't the case - move to the next step - i.e. EMPTY so INTERNAL_PREREQUISITES_NOT_MET - go to the test.

But - this is probably not necessarily how it works, as while running the test you determine that the values can't be interpreted as coordinates for the test - so it fails. In this case it is because of INTERNAL_PREREQUISITES_NOT_MET, and that would be noted in the Response.status as such rather than "RUN_HAS_RESULT with Response.result as NOT_COMPLIANT"

Bottom Line: I agree that "INTERNAL_PREREQUISITES_NOT_MET if the values of dwc:decimalLatitude, dwc:decimalLongitude are EMPTY or not valid ..." is good (also in #50). Perhaps we need a third example - something like

[dwc:stateProvince="Neuquén", dwc:decimalLatitude="-141.0525925872862", dwc:decimalLongitude="-71.5310546742521", dwc:geodeticDatum="": Response.status=INTERNAL_PREREQUISITES_NOT_MET, Response.comment="Input fields contain invalid values"]
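A small sketch of the validity check behind that example: it reports which coordinate terms cannot be used, so the response comment can name them. The function name is hypothetical, and the ranges (latitude within ±90, longitude within ±180, both finite decimal numbers) are the usual bounds for decimal degrees rather than something stated in this issue. Empty values are reported alongside invalid ones, since both lead to INTERNAL_PREREQUISITES_NOT_MET.

```python
import math
from typing import List, Optional


def unusable_coordinate_terms(decimal_latitude: Optional[str],
                              decimal_longitude: Optional[str]) -> List[str]:
    """Return the Darwin Core terms whose values are empty, non-numeric or out of range."""
    checks = {
        "dwc:decimalLatitude": (decimal_latitude, 90.0),
        "dwc:decimalLongitude": (decimal_longitude, 180.0),
    }
    unusable = []
    for term, (value, limit) in checks.items():
        try:
            number = float(value)
        except (TypeError, ValueError):  # None, empty string or non-numeric text
            unusable.append(term)
            continue
        if not math.isfinite(number) or abs(number) > limit:
            unusable.append(term)
    return unusable
```

For the Neuquén example only dwc:decimalLatitude is returned, because -141.05 lies outside the latitude range while the longitude is fine.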

@Tasilee
Collaborator

Tasilee commented Feb 26, 2023

Thanks @ArthurChapman and @tucotuco. Do I take it that there is agreement to my suggestion above (#56 (comment)) that we do need to add "or not valid" to this and similar tests?

@tucotuco
Member

@Tasilee From me, yes.

@chicoreus
Collaborator

@Tasilee Yes, I'll concur as well. For this test, data have quality if the coordinates match the state/province. Data lack a particular kind of property if the coordinates are not consistent with the state/province. By considering validity of the coordinates (and state/province) within the path for internal prerequisites not met, a failure (NOT_COMPLIANT) of this test isolates a class of problematic data - those where the coordinates exist, the state/province is known, and the coordinates fall outside of the state/province. For data quality control, isolating this class of problem has value, and other tests can highlight problematic data where the coordinates are invalid or the state/province is unknown.

@ArthurChapman
Collaborator

Yes from me too

@Tasilee
Collaborator

Tasilee commented Feb 26, 2023

Changed ER and added 3rd example

@ArthurChapman
Collaborator

I changed the third example - in the response from "fields" to "field" as only one is invalid.

@Tasilee
Collaborator

Tasilee commented Mar 19, 2023

I changed "not valid" in the Expected Response to "invalid" to standardize the phrasing.

@Tasilee
Collaborator

Tasilee commented Jun 13, 2023

Restructured Parameter(s) and Source authority

@Tasilee
Collaborator

Tasilee commented Jul 11, 2023

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "ADM1 boundaries" {[https://gadm.org]}

@Tasilee
Collaborator

Tasilee commented Sep 16, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField" and "Output Type" to "TestType".

This one I am also not sure about. Please check.

@tucotuco
Member

@Tasilee If it is a validation, nothing is acted upon, no? Or am I missing the point?

@Tasilee
Collaborator

Tasilee commented Sep 17, 2023

@tucotuco - If I interpreted the Zoom conversation correctly with @chicoreus last week, my thought was what Information Element/s were the FOCUS of the 'test' (=ActedUpon). Originally, I also wondered if all the Information Elements associated with VALIDATIONs and ISSUEs would be 'Consulted'.

@chicoreus ? It would be good to get this nailed down as I would like to finish all the edits ASAP this week.

@chicoreus
Collaborator

chicoreus commented Sep 18, 2023 via email

@Tasilee
Collaborator

Tasilee commented Sep 18, 2023

Changed all Information Elements to "ActedUpon" as per Paul's Java Code

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023
@Tasilee
Collaborator

Tasilee commented Apr 16, 2024

Standardized reference to bdq:sourceAuthority in Expected Response to "EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available"

@Tasilee
Collaborator

Tasilee commented Apr 20, 2024

Amended TERM-ACTION

COORDINATES_STATE-PROVINCE_CONSISTENT

to

COORDINATES_STATEPROVINCE_CONSISTENT

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 16, 2024
chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 25, 2024
@chicoreus chicoreus changed the title TG2-VALIDATION_COORDINATES-STATEPROVINCE_CONSISTENT TG2-VALIDATION_COORDINATESSTATEPROVINCE_CONSISTENT Aug 30, 2024
@chicoreus
Collaborator

Removing hyphen or underscore from names/labels to make TERM_ACTION consistent

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Nov 10, 2024
…ations: tdwg/bdq#56 tdwg/bdq#59 tdwg/bdq#187, noting in 59 potential of change needed as specification does not conform to Darwin Core value (which may just need rationale management in the issue, or may need a change).
8 participants