TG2-VALIDATION_COUNTRYCOUNTRYCODE_CONSISTENT #62

Open
iDigBioBot opened this issue Jan 5, 2018 · 24 comments
Labels
Consistency, CORE (TG2 CORE tests), ISO/DCMI STANDARD, SPACE, Test (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT), TG2, Validation, VOCABULARY

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

| TestField | Value |
| --- | --- |
| GUID | b23110e7-1be7-444a-a677-cdee0cf4330c |
| Label | VALIDATION_COUNTRYCOUNTRYCODE_CONSISTENT |
| Description | Does the ISO country code, determined from the value of dwc:country, equal the value of dwc:countryCode? |
| TestType | Validation |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:country, dwc:countryCode |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either of the terms dwc:country or dwc:countryCode are bdq:Empty; COMPLIANT if the values of dwc:country and dwc:countryCode match national-level country name and matching country code respectively in the bdq:sourceAuthority |
| Data Quality Dimension | Consistency |
| Term-Actions | COUNTRYCOUNTRYCODE_CONSISTENT |
| Parameter(s) | |
| Source Authority | bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]} |
| Specification Last Updated | 2024-09-25 |
| Examples | [dwc:country="Australia", dwc:countryCode="AU": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:country matches dwc:countryCode"]<br>[dwc:country="United States Minor Outlying Islands", dwc:countryCode="US": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:country does not match dwc:countryCode"] |
| Source | GBIF |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | The country code determination service should be able to match the name of a country in the original or any language in the source authority. When dwc:countryCode="XZ" to mark the high seas, country should be empty until a time when a dwc:country="High seas" or similar is adopted. This test must return NOT_COMPLIANT if there is leading or trailing whitespace or there are leading or trailing non-printing characters. |
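
The Expected Response above can be read as a short decision procedure. The following is a minimal sketch of that reading, not the reference implementation: `SAMPLE_AUTHORITY`, `Response`, `is_empty` and `country_countrycode_consistent` are hypothetical names, the dictionary is a tiny stand-in for the bdq:sourceAuthority, and the whitespace-only check is only an approximation of bdq:Empty.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the bdq:sourceAuthority:
# national-level country name -> ISO 3166-1 alpha-2 code.
SAMPLE_AUTHORITY = {
    "Australia": "AU",
    "United States": "US",
    "United States Minor Outlying Islands": "UM",
}


@dataclass
class Response:
    status: str
    result: Optional[str]
    comment: str


def is_empty(value: Optional[str]) -> bool:
    """Approximation of bdq:Empty (assumption): None or whitespace only."""
    return value is None or value.strip() == ""


def country_countrycode_consistent(country, country_code, authority=SAMPLE_AUTHORITY):
    # Passing authority=None simulates an unavailable source authority.
    if authority is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET", None,
                        "bdq:sourceAuthority is not available")
    if is_empty(country) or is_empty(country_code):
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None,
                        "dwc:country or dwc:countryCode is empty")
    # Exact comparison with no trimming, so leading or trailing
    # whitespace in either value yields NOT_COMPLIANT (see Notes).
    if authority.get(country) == country_code:
        return Response("RUN_HAS_RESULT", "COMPLIANT",
                        "dwc:country matches dwc:countryCode")
    return Response("RUN_HAS_RESULT", "NOT_COMPLIANT",
                    "dwc:country does not match dwc:countryCode")
```

With this stand-in data, the two Examples rows give COMPLIANT for ("Australia", "AU") and NOT_COMPLIANT for ("United States Minor Outlying Islands", "US"); "Australia " with a trailing space is also NOT_COMPLIANT.
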
@iDigBioBot
Collaborator Author

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet:
In the cases when country was derived from coordinates, this would only make sense AFTER that step.

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
In the example given, country=Australia, countryCode=4, I would expect this validation to return a result status of INTERNAL_PREREQUISITES_NOT_MET, as 4 is not a valid ISO two-letter, three-letter, or three-digit country code (004 would be), and thus can't be compared with "Australia". A better example might be country=Australia, countryCode=GM. The specification should note the specific acceptable controlled vocabularies for countryCode.
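
The distinction drawn in this comment (4 is not shaped like any ISO 3166-1 code, while 004 is) can be sketched as a format check. This is illustrative only: `classify_country_code` is a hypothetical helper, it tests the shape of the value rather than whether the code is actually assigned, and the current specification does not require such a step.

```python
import re
from typing import Optional


def classify_country_code(value: str) -> Optional[str]:
    """Return the ISO 3166-1 code form the value is shaped like, or None.

    Shape check only (assumption): "ZZ" is reported as alpha-2 even
    though it is not an assigned country code.
    """
    if re.fullmatch(r"[A-Z]{2}", value):
        return "alpha-2"
    if re.fullmatch(r"[A-Z]{3}", value):
        return "alpha-3"
    if re.fullmatch(r"[0-9]{3}", value):
        return "numeric"
    return None
```

Under this check, "4" matches no form, "004" is numeric, and "GM" is alpha-2, which is why "GM" makes a better NOT_COMPLIANT example for country=Australia.
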

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
Rename: COUNTRY_COUNTRYCODE_CONSISTENT

@ArthurChapman added the VOCABULARY and Test (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT) labels Jan 17, 2018
@tucotuco added the Parameterized (Test requires a parameter) label Nov 5, 2018
@Tasilee changed the title from TG2-VALIDATION_COUNTRY_COUNTRYCODE_INCONSISTENT to TG2-VALIDATION_COUNTRY_COUNTRYCODE_CONSISTENT Mar 22, 2022
@Tasilee removed the Parameterized (Test requires a parameter) label May 1, 2022
@Tasilee
Collaborator

Tasilee commented Sep 12, 2022

Added to Notes: "This test will fail if there are leading or trailing white space or non-printing characters."

@ArthurChapman
Collaborator

Should the Expected Response make a reference to bdq:sourceAuthority or to "ISO 3166-1-alpha-2"? Most similar tests do. @chicoreus @Tasilee

@chicoreus
Collaborator

@ArthurChapman yes, it should.

@ArthurChapman
Collaborator

Updated Expected Response (added bolded text) and updated Specification Last Updated

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either of the terms dwc:country or dwc:countryCode are EMPTY; COMPLIANT if the value of the country code determined from the value of dwc:country from the bdq:sourceAuthority is equal to the value of dwc:countryCode; otherwise NOT_COMPLIANT

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jun 30, 2023
…-06-28) specifications. Addressing tdwg/bdq#62 VALIDATION_COUNTRY_COUNTRYCODE_CONSISTENT updating metadata, adding an implementation, and a minimal unit test.
chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jun 30, 2023
…-06-28) specifications. Enhancements to tdwg/bdq#62 VALIDATION_COUNTRY_COUNTRYCODE_CONSISTENT adding a lookup of alternative country names in getty, along with cache of results for call on service.  Adding test cases to cover cases in notes.  Reorganizing unit tests to put those that make external service calls into the IT integration test class.
@Tasilee
Collaborator

Tasilee commented Jul 4, 2023

Changed Source Authority value from

bdq:sourceAuthority is "ISO 3166-1-alpha-2" [https://restcountries.eu/#api-endpoints-list-of-codes, https://www.iso.org/obp/ui/#search]

to

{bdq:sourceAuthority = ISO 3166-1-alpha-2} { Country codes [https://www.iso.org/obp/ui/#search]}

to align syntax and provide a better link.

@Tasilee
Collaborator

Tasilee commented Jul 4, 2023

Amended Source Authority to align with @chicoreus suggested syntax

{bdq:sourceAuthority = ISO 3166-1-alpha-2} { Country codes [https://www.iso.org/obp/ui/#search]}

to

bdq:sourceAuthority default = "ISO 3166-1-alpha-2 country codes" { [https://www.iso.org/obp/ui/#search]}

@Tasilee
Collaborator

Tasilee commented Jul 10, 2023

Corrected syntax on Source Authority

@Tasilee
Collaborator

Tasilee commented Jul 16, 2023

Changed Source Authority from

bdq:sourceAuthority default = "ISO 3166-1-alpha-2 country codes" {[https://www.iso.org/obp/ui/#search]}

to

bdq:sourceAuthority default = "ISO 3166 Country Codes" {[https://www.iso.org/iso-3166-country-codes.html]} {ISO 3166-1-alpha-2 Country Code search [https://www.iso.org/obp/ui/#search]}

@Tasilee
Collaborator

Tasilee commented Sep 18, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

@chicoreus added the CORE (TG2 CORE tests) label Sep 18, 2023
@chicoreus
Collaborator

Updated Notes to change the "fail" text to the more explicit "This test must return NOT_COMPLIANT if there is leading or trailing whitespace or there are leading or trailing non-printing characters." The Expected Response is clear about exact matching, so this applies.

@chicoreus
Collaborator

As noted in #21, this one is limited and probably problematic. The notes indicate the problem, specifying that the country code should be matchable to the country name in a local language, something the sourceAuthority doesn't actually provide.

@chicoreus
Collaborator

As noted in #21, this test probably needs to consult both Getty (for country name synonyms) and the ISO list. Getty TGN has at least some ISO two-letter code labels; we need to check whether they are reachable through the API. The logic would be: check the country code and country against the ISO list; if no match is found, look up the country value in Getty and check whether it has a label matching the name the ISO list gives for the country code.
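
The two-stage logic described in this comment could be sketched as follows. Everything here is hypothetical: `ISO_NAMES` and `TGN_RECORDS` are small in-memory stand-ins for the ISO list and for Getty TGN label sets, and no real Getty API call is shown.

```python
# Hypothetical excerpt of the ISO list: alpha-2 code -> English short name.
ISO_NAMES = {"DE": "Germany", "AU": "Australia"}

# Hypothetical stand-in for Getty TGN: one set of name labels per place.
TGN_RECORDS = [
    {"Germany", "Deutschland", "Allemagne"},
    {"Australia", "Australie"},
]


def tgn_labels_for(country: str) -> set:
    """Return the labels of the first record containing this name."""
    for labels in TGN_RECORDS:
        if country in labels:
            return labels
    return set()


def names_agree(country: str, country_code: str) -> bool:
    iso_name = ISO_NAMES.get(country_code)
    if iso_name is None:
        return False  # code not found in the ISO list
    if country == iso_name:
        return True  # stage 1: direct match against the ISO list
    # Stage 2: look the country value up in the synonym source and
    # check for a label equal to the ISO name for the code.
    return iso_name in tgn_labels_for(country)
```

Here ("Deutschland", "DE") agrees through the synonym lookup, while ("Deutschland", "AU") does not.
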

@Tasilee
Collaborator

Tasilee commented Aug 13, 2024

@tucotuco knows TGN far better than I.

I did a quick scan of a range of "nation" level country names and they all had values for 2-letter, 3-letter and numeric ISO 3166 codes. Therefore, could we use TGN to resolve consistency?

COMPLIANT if the values of dwc:country and dwc:countryCode match national-level country name and matching country code respectively in the bdq:sourceAuthority.

?

@chicoreus changed the title from TG2-VALIDATION_COUNTRY_COUNTRYCODE_CONSISTENT to TG2-VALIDATION_COUNTRYCOUNTRYCODE_CONSISTENT Aug 30, 2024
@chicoreus
Collaborator

Removing underscores in names/labels to make TERM_ACTION consistent

@tucotuco
Member

> @tucotuco knows TGN far better than I.
>
> I did a quick scan of a range of "nation" level country names and they all had values for 2-letter, 3-letter and numeric ISO 3166 codes. Therefore, could we use TGN to resolve consistency?
>
> COMPLIANT if the values of dwc:country and dwc:countryCode match national-level country name and matching country code respectively in the bdq:sourceAuthority.
>
> ?

This requires two calls, one each for country and for countryCode. If they both resolve to the same entity, then the COMPLIANT result can be asserted.
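
The two-call approach could look roughly like the sketch below. The resolver functions and the identifiers they return are hypothetical placeholders for lookups against the source authority; they are not real TGN identifiers or API calls.

```python
from typing import Optional

# Hypothetical lookup tables standing in for two separate service calls.
ENTITY_BY_NAME = {"Australia": "entity-A", "Australie": "entity-A", "Germany": "entity-B"}
ENTITY_BY_CODE = {"AU": "entity-A", "DE": "entity-B"}


def resolve_country(name: str) -> Optional[str]:
    """First call: resolve a country name to an entity identifier."""
    return ENTITY_BY_NAME.get(name)


def resolve_country_code(code: str) -> Optional[str]:
    """Second call: resolve a country code to an entity identifier."""
    return ENTITY_BY_CODE.get(code)


def same_entity(country: str, country_code: str) -> bool:
    by_name = resolve_country(country)
    by_code = resolve_country_code(country_code)
    # COMPLIANT can be asserted only when both calls resolve,
    # and resolve to the same entity.
    return by_name is not None and by_name == by_code
```

Comparing entity identifiers rather than strings means any name the authority recognises for a country, such as "Australie" here, is consistent with that country's code.
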

@Tasilee
Collaborator

Tasilee commented Sep 25, 2024

I have amended the Expected Response and Source Authority to simplify the evaluation, hopefully. The 'high-seas' issue is Noted (in Notes).

@ArthurChapman
Collaborator

See the discussions under #98 regarding XZ for the high seas. As the source authorities don't include XZ, we could change the Expected Response to

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either of the terms dwc:country or dwc:countryCode are bdq:Empty; COMPLIANT if the values of dwc:country and dwc:countryCode match national-level country name and matching country code respectively in the bdq:sourceAuthority or if dwc:countryCode="XZ" and dwc:country="High Seas"
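
As a rough illustration of this proposed wording only, the XZ case would be checked before consulting the source authority. `consistent_with_high_seas` and the `authority` mapping (country name to code) are hypothetical, and the sketch assumes both values are already known to be non-empty.

```python
def consistent_with_high_seas(country: str, country_code: str, authority: dict) -> bool:
    """Proposed variant: accept XZ / "High Seas" without an authority lookup."""
    # Special case from the proposal: XZ is absent from the source authority.
    if country_code == "XZ" and country == "High Seas":
        return True
    # Otherwise fall back to the ordinary name-to-code comparison.
    return authority.get(country) == country_code
```

For example, with `authority = {"Australia": "AU"}`, the pair ("High Seas", "XZ") is accepted, while ("Australia", "XZ") is not.
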

@Tasilee
Collaborator

Tasilee commented Sep 26, 2024

As with my comment on #50, I think this test is OK as it stands with regard to 'High seas' locations.

Dave Watts has made a case that we should be actively promoting dwc:country="High seas" (or similar) and dwc:countryCode="XZ". Until that project progresses well down the track, we are very likely to get a bdq:Empty for either or both dwc:country and dwc:countryCode. In this case, we end up with INTERNAL_PREREQUISITES_NOT_MET, which is an appropriate assertion.

As the second paragraph of @chicoreus's comment at #42 (comment) suggests, we need to make some (strong) recommendations. We also need to document the issue consistently in Notes, as we are doing with #42 and #98. Maybe adding something about this issue in Supplementary would also be a good idea?

@tucotuco
Member

The strong recommendation should actually go to Darwin Core as an issue on countryCode.

@chicoreus
Collaborator

chicoreus commented Sep 27, 2024 via email

@tucotuco
Member

The Darwin Core issue for countryCode is already open at tdwg/dwc#520.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Nov 10, 2024
…s for additional tests: tdwg/bdq#62 tdwg/bdq#68 cleaining up comments, removing an extraneous @param.