TG2-AMENDMENT_SCIENTIFICNAMEID_FROM_TAXON #57

Open
iDigBioBot opened this issue Jan 5, 2018 · 46 comments
Labels
Amendment Conformance CORE TG2 CORE tests NAME Parameterized Test requires a parameter Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2 VOCABULARY

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

TestField Value
GUID 431467d6-9b4b-48fa-a197-cd5379f5e889
Label AMENDMENT_SCIENTIFICNAMEID_FROM_TAXON
Description Proposes an amendment to the value of dwc:scientificNameID if it can be unambiguously resolved from bdq:sourceAuthority using the available taxon terms.
TestType Amendment
Darwin Core Class dwc:Taxon
Information Elements ActedUpon dwc:scientificNameID
Information Elements Consulted dwc:taxonID
dwc:acceptedNameUsageID
dwc:originalNameUsageID
dwc:taxonConceptID
dwc:scientificName
dwc:higherClassification
dwc:kingdom
dwc:phylum
dwc:class
dwc:order
dwc:superfamily
dwc:family
dwc:subfamily
dwc:tribe
dwc:subtribe
dwc:genus
dwc:genericName
dwc:subgenus
dwc:specificEpithet
dwc:infraspecificEpithet
dwc:cultivarEpithet
dwc:vernacularName
dwc:scientificNameAuthorship
dwc:taxonRank
Expected Response EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificNameID is bdq:NotEmpty, or if all of dwc:scientificName, dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship, and dwc:cultivarEpithet are bdq:Empty; FILLED_IN the value of dwc:scientificNameID for an unambiguously resolved single taxon record in the bdq:sourceAuthority through (1) the value of dwc:scientificName, or (2) if dwc:scientificName is bdq:Empty, through values of the terms dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship and dwc:cultivarEpithet, or (3) if ambiguity produced by multiple matches in (1) or (2) can be disambiguated to a single Taxon using the values of dwc:subtribe, dwc:tribe, dwc:subgenus, dwc:genus, dwc:subfamily, dwc:family, dwc:superfamily, dwc:order, dwc:class, dwc:phylum, dwc:kingdom, dwc:higherClassification, dwc:taxonID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID, dwc:taxonRank, and dwc:vernacularName; otherwise NOT_AMENDED
Data Quality Dimension Conformance
Term-Actions SCIENTIFICNAMEID_FROM_TAXON
Parameter(s) bdq:sourceAuthority
Source Authority bdq:sourceAuthority default = "GBIF Backbone Taxonomy" {[https://doi.org/10.15468/39omei]} {API endpoint [https://api.gbif.org/v1/species?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name=]}
Specification Last Updated 2023-09-17
Examples [dwc:taxonID="", dwc:scientificNameID="", dwc:acceptedNameUsageID="", dwc:originalNameUsageID="", dwc:taxonConceptID="", dwc:scientificName="Chicoreus palmarosae (Lamarck, 1822)", dwc:higherClassification="", dwc:kingdom="Animalia", dwc:phylum="Mollusca", dwc:class="Gastropoda", dwc:order="", dwc:family="Muricidae", dwc:subfamily="", dwc:genus="Chicoreus", dwc:genericName="Chicoreus", dwc:subgenus="", dwc:infragenericEpithet="", dwc:specificEpithet="palmarosae", dwc:infraspecificEpithet="", dwc:cultivarEpithet="", dwc:vernacularName="", dwc:scientificNameAuthorship="(Lamarck, 1822)", dwc:taxonRank="", bdq:sourceAuthority="marinespecies.org": Response.status=FILLED_IN, Response.result=dwc:scientificNameID="urn:lsid:marinespecies.org:taxname:208134", Response.comment="dwc:scientificName matched to unique taxon record in WoRMS, exact match on name and authorship. Resolvable at https://marinespecies.org/aphia.php?p=taxdetails&id=208134"]
[dwc:scientificNameID="", dwc:taxonID="", dwc:acceptedNameUsageID="", dwc:originalNameUsageID="", dwc:taxonConceptID="", dwc:scientificName="Graphis", dwc:higherClassification="", dwc:kingdom="", dwc:phylum="", dwc:class="", dwc:order="", dwc:family="", dwc:subfamily="", dwc:genus="", dwc:genericName="", dwc:subgenus="", dwc:infragenericEpithet="", dwc:specificEpithet="", dwc:infraspecificEpithet="", dwc:cultivarEpithet="", dwc:vernacularName="", dwc:scientificNameAuthorship="", dwc:taxonRank="": Response.status=NOT_AMENDED, Response.result=, Response.comment="dwc:scientificName="Graphis" is ambiguous as could be either a lichen or a gastropod."]
Source FP-Akka
References
Example Implementations (Mechanisms) Kurator/FilteredPush sci_name_qc Library, FP-KurationServices, Arctos, MCZbase, Symbiota
Link to Specification Source Code https://github.com/FilteredPush/sci_name_qc/blob/v1.1.2/src/main/java/org/filteredpush/qc/sciname/DwCSciNameDQ.java#L397 https://github.com/FilteredPush/sci_name_qc/blob/v1.1.2/src/main/java/org/filteredpush/qc/sciname/DwCSciNameDQ.java#L476
Notes Return a result with no value, a Response.status of NOT_AMENDED, and a Response.comment of ambiguous if the information provided does not resolve to a unique result (e.g. if homonyms exist and there is insufficient information in the provided data, for example using the lowest ranking taxa in conjunction with dwc:scientificNameAuthorship, to resolve them). When referencing a GBIF taxon by GBIF's identifier for that taxon, use the pseudo-namespace "gbif:" and the form "gbif:{integer}" as the value for dwc:scientificNameID.
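
The order of checks in the Expected Response can be sketched roughly as follows. This is only an illustration, not the sci_name_qc implementation: the function name, the `lookup` callable (which stands in for steps (1) to (3) against the bdq:sourceAuthority), the `authority_available` flag, and the treatment of whitespace-only values as bdq:Empty are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

# Name terms from the INTERNAL_PREREQUISITES_NOT_MET clause of the Expected Response.
NAME_TERMS = (
    "dwc:scientificName",
    "dwc:genericName",
    "dwc:specificEpithet",
    "dwc:infraspecificEpithet",
    "dwc:scientificNameAuthorship",
    "dwc:cultivarEpithet",
)


def is_empty(value: Optional[str]) -> bool:
    # Assumption: a missing or whitespace-only value counts as bdq:Empty.
    return value is None or value.strip() == ""


def amend_scientific_name_id(
    record: Dict[str, str],
    lookup: Callable[[Dict[str, str]], List[str]],
    authority_available: bool = True,
) -> Dict[str, object]:
    """Hypothetical sketch: `lookup` returns the identifiers of all taxon
    records in the source authority that still match after disambiguation."""
    if not authority_available:
        return {"status": "EXTERNAL_PREREQUISITES_NOT_MET", "result": None}
    if not is_empty(record.get("dwc:scientificNameID")):
        return {"status": "INTERNAL_PREREQUISITES_NOT_MET", "result": None}
    if all(is_empty(record.get(term)) for term in NAME_TERMS):
        return {"status": "INTERNAL_PREREQUISITES_NOT_MET", "result": None}
    candidate_ids = lookup(record)
    if len(candidate_ids) == 1:
        # Exactly one taxon record remains, so the match is unambiguous.
        return {
            "status": "FILLED_IN",
            "result": {"dwc:scientificNameID": candidate_ids[0]},
        }
    # Zero matches or unresolved homonyms: propose nothing.
    return {"status": "NOT_AMENDED", "result": None}
```

Keeping the source authority behind a plain callable means the response logic can be exercised with canned match lists, without network access.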
@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
Moving from scientificName as a string to a link to a guid in a taxonomic or nomenclatural authority is key for moving towards linked open data and other semantic delivery of biodiversity data. There is almost never enough data in flat Darwin Core to fill in any of the other ID terms in the Taxon class, but it is often possible to link scientific name strings to nomenclatural or taxonomic records.

@pzermoglio pzermoglio changed the title TG2-AMENDMENT_TAXONID_FROM_SCIENTIFICNAME TG2-AMENDMENT_TAXONID_FROM_TAXON Jan 17, 2018
@ArthurChapman ArthurChapman added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Jan 17, 2018
@godfoder
Contributor

We should add taxonRank to the list of fields for this and #70. It is especially important for the interpretation of monomials in scientific name absent other supporting data.

@chicoreus
Collaborator

@godfoder I concur. Do we need to specify a more complex set of prerequisites?

@chicoreus
Collaborator

A couple of issues for implementation:

Action to take when taxonID is NOT_EMPTY: The specification is silent on what action to take when dwc:taxonID has a value. Since other tests specify CHANGED only if the term that is proposed to be amended is NOT_EMPTY, the implication is that an amendment is to be proposed, for purposes such as conforming taxonID values to a national authority. This should probably be spelled out in the notes section.

Extraneous terms in the list of Information Elements: The specification states that a proposed amendment should be made "on the basis of the value of the lowest ranking not EMPTY taxon classification terms dwc:scientificName, dwc:scientificNameAuthorship, dwc:kingdom, dwc:phylum, dwc:class, etc.", with @godfoder's comment clearly indicating that taxonRank should be included in this list. The notes imply that none of the other ID terms (dwc:scientificNameID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID) should be included in this analysis, so it seems that they shouldn't be included in the informationElements, unless there is a clear specification of how to include them to infer a value of taxonID. Also, neither dwc:higherClassification nor dwc:vernacularName is included in the specification, and thus they don't seem to fit in the list of information elements.

@ArthurChapman
Collaborator

Further to the logic of @chicoreus - I don't understand the inclusion of dwc:scientificNameAuthorship, as it isn't a taxon classification term in the hierarchy, and that field alone could not supply a taxonID.

@chicoreus
Collaborator

@ArthurChapman I see scientificNameAuthorship as an essential term for identifying which taxonID to use. It can often disambiguate homonyms, and if the authorship string associated with the source record for taxonID isn't the same as the authorship string in a record under consideration, then something likely isn't correct and an assertion of a taxonID match is not a good one to make.
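
As a rough illustration of using authorship to separate homonyms, the sketch below keeps only candidates whose authorship string matches the record's. The candidate structure (`id`, `authorship`), the function names, and the normalization (whitespace collapsed, case ignored) are assumptions for the example; real authorship comparison is considerably more nuanced.

```python
from typing import Dict, List, Optional


def normalize_authorship(authorship: str) -> str:
    # Assumption: collapse runs of whitespace and ignore case before comparing.
    return " ".join(authorship.split()).casefold()


def disambiguate_by_authorship(
    candidates: List[Dict[str, str]], authorship: str
) -> Optional[str]:
    """Return the id of the single candidate whose authorship matches,
    or None when authorship is blank or does not single out one candidate."""
    if not authorship or not authorship.strip():
        return None
    wanted = normalize_authorship(authorship)
    matches = [
        c for c in candidates
        if normalize_authorship(c.get("authorship", "")) == wanted
    ]
    return matches[0]["id"] if len(matches) == 1 else None
```

Returning None for a mismatch reflects the point above: if the authorship strings disagree, asserting a match is not a good idea.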

@tucotuco
Member

Not yet! There are some questions still to be answered (the email I sent around on #57 and #70) - for example

  1. on treatment of dwc:cultivarEpithet different to dwc:infraspecificEpithet (I believe they shouldn't be differently treated - @tucotuco to clarify use of dwc:cultivarEpithet) - see my comment of two days ago

My understanding is that a cultivarEpithet should be as determinant of a Taxon as an infraspecificEpithet is and treated in the same way.

  1. and whether or not the higher categories have TAXONIDs. From my email
    "I am still not fully convinced re TAXONID and higher level taxa. Does the sourceAuthority (GBIF?) give a TAXONID for a family name? I am not familiar enough with TAXON ID to know. If they don't then I accept @chicoreus arguments. But if they do, and a record has only a name at the Family level with no information at a lower level (i.e. I have only been able to identify this record to Family), and the sourceAuthority gives a Taxon ID for the Family, then why would we not use that TAXONID for the record?
    This is particularly relevant as the Botanical Code defines a taxon as
    "Taxonomic groups at any rank will, in this Code, be referred to as taxa (singular: taxon)."
    In the Zoological Code:
    "A taxonomic unit, whether named or not: i.e. a population, or group of populations of organisms which are usually inferred to be phylogenetically related and which have characters in common which differentiate (q.v.) the unit (e.g. a geographic population, a genus, a family, an order) from other such units. A taxon encompasses all included taxa of lower rank (q.v.) and individual organisms. The Code fully regulates the names of taxa only between and including the ranks of superfamily and subspecies"
    The Zoological Code treats a family name as a taxon
    "family name or name of a family
    A scientific name of a taxon at the rank of family."
    Darwin Core definition
    "A group of organisms (sensu http://purl.obolibrary.org/obo/OBI_0100026) considered by taxonomists to form a homogeneous unit." It gives the example of "The genus Truncorotaloides as published by Brönnimann et al. in 1953 in the Journal of Paleontology Vol. 27(6) p. 817-820."

I agree with @chicoreus about the case where dwc:family (and no lower rank) is populated and dwc:scientificName is not, for the simple fact that the Taxon is ambiguous. Specifically, it MIGHT be the family, but it might be something in the family. Probably way too subtle for most people to worry about, but I think it's correct.

@ArthurChapman
Collaborator

ArthurChapman commented Mar 14, 2022

OK - if we accept the reasoning of @chicoreus and @tucotuco

INTERNAL_PREREQUISITES_NOT_MET if dwc:taxonID is not EMPTY or if all of, dwc:scientificName, dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship, and dwc:cultivarEpithet are EMPTY, AMENDED the value of taxonID for an unambiguously resolved single taxon record in the specified source authority service through (1) the value of dwc:scientificName or (2) if dwc:scientificName is EMPTY through values of the terms dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship and dwc:cultivarEpithet), or (3) if ambiguity produced by multiple matches in (1) or (2) can be disambiguated to a single Taxon using the values of dwc:subgenus, dwc:genus, dwc:subfamily, dwc:family, dwc:order, dwc:class, dwc:phylum, dwc:kingdom, dwc:higherClassification, dwc:scientificNameID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID, dwc:taxonomicRank, and dwc:vernacularName); otherwise NOT_AMENDED

If accepted it appears that we can take dwc:genericName and dwc:infragenericEpithet out of Information Elements

Note:

  1. I have taken dwc:cultivarEpithet and dwc:taxonRank out of the INTERNAL_PREREQUISITES_NOT_MET
  2. I have changed the wording of (1) and (2) around dwc:cultivarEpithet in the AMENDED area to treat it the same as dwc:infraspecificEpithet
  3. I have taken dwc:taxonRank out of (2)
  4. I have added dwc:subfamily to (3)
  5. Some other minor wording and spelling corrections

@Tasilee
Collaborator

Tasilee commented Mar 15, 2022

I have to defer to @chicoreus, @ArthurChapman and @tucotuco on this. I will apply @ArthurChapman's latest Expected Response, with a few more tweaks.

@Tasilee
Collaborator

Tasilee commented Apr 3, 2022

  • I have corrected a stray bracket in the ER.
  • Quoting @ArthurChapman "If accepted it appears that we can take dwc:genericName and dwc:infragenericEpithet out of Information Elements". True for dwc:infragenericEpithet (done) but not so for dwc:genericName as it is in the current ER.

Are we all happy with the specifications on this one now?

@Tasilee Tasilee removed the NEEDS WORK label Apr 3, 2022
@Tasilee
Collaborator

Tasilee commented Apr 18, 2022

Changed "AMENDED" to "FILLED_IN" in accordance with discussions April 16.

@Tasilee
Collaborator

Tasilee commented Jun 19, 2022

Amended Example to align with @chicoreus comments in email 17th June 2022.

chicoreus added a commit to FilteredPush/sci_name_qc that referenced this issue Aug 26, 2022
…IF backbone has more than one record for the same taxon with identical spelling of name and authorship. Excluding name matches that have an accepted key that is the same as a key already matched.
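
One possible reading of the exclusion described in this commit message is sketched below; it is not taken from the sci_name_qc source. It assumes each match is a dict with a `key` and an optional `acceptedKey` (field names modeled on GBIF name usage records), and drops any match whose accepted key points at another match in the same result set, so that a synonym and its accepted name do not count as two distinct taxa. The function name is hypothetical.

```python
from typing import Dict, List


def drop_redundant_matches(matches: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Remove exact duplicates and matches whose acceptedKey refers to
    another match already present in the list."""
    all_keys = {m["key"] for m in matches}
    kept: List[Dict[str, int]] = []
    seen = set()
    for m in matches:
        accepted = m.get("acceptedKey")
        if m["key"] in seen:
            continue  # same record returned twice
        if accepted is not None and accepted != m["key"] and accepted in all_keys:
            continue  # resolves to a taxon that is itself among the matches
        seen.add(m["key"])
        kept.append(m)
    return kept
```

Collecting all keys first keeps the result independent of the order in which the authority returns its matches.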
chicoreus added a commit to FilteredPush/sci_name_qc that referenced this issue Sep 2, 2022
…y as a string retaining the responsibility of interpreting a user provided string in the source authority object, but constructing that object after the call to the test method instead of before, allowing, as with darwin core term inputs, the method APIs for the tests to use just strings.
@chicoreus
Collaborator

chicoreus commented Oct 11, 2022 via email

@tucotuco
Member

So the text of cultivarEpithet should also be found in scientificName?

Yes, I think it should. But for a definitive answer it is best to ask someone such as @mdoering and @ nielsklazenga.

@ArthurChapman
Collaborator

ArthurChapman commented Oct 11, 2022

@nielsklazenga - any comments? [space inadvertently included in last post by @tucotuco]

@nielsklazenga
Member

Regarding cultivarEpithet, yes, that is part of the scientificName string.
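
A simple consistency check follows from this: if dwc:cultivarEpithet is populated, its text should appear somewhere in dwc:scientificName. The sketch below is a minimal, case-insensitive substring test; the function name is hypothetical and the check deliberately ignores quoting conventions for cultivar names.

```python
def cultivar_in_scientific_name(scientific_name: str, cultivar_epithet: str) -> bool:
    """True when the cultivar epithet is blank or occurs in the scientific name."""
    epithet = cultivar_epithet.strip()
    if not epithet:
        return True  # nothing to check
    return epithet.casefold() in scientific_name.casefold()
```

For example, `cultivar_in_scientific_name("Rosa 'Peace'", "Peace")` holds, while the same epithet against the bare name "Rosa" does not.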

@Tasilee
Collaborator

Tasilee commented Jan 27, 2023

Why don't we have an "EXTERNAL_PREREQUISITES_NOT_MET" if we reference bdq:sourceAuthority?!

I've added it as otherwise it will stuff up the test data work.

@Tasilee
Collaborator

Tasilee commented Jun 12, 2023

Changed Parameter(s) to "bdq:sourceAuthority" as per discussions 12th June 2023

@ArthurChapman
Collaborator

I have added to the Notes to be consistent with #71:

"When referencing a GBIF taxon by GBIF's identifier for that taxon, use the pseudo-namespace "gbif:" and the form "gbif:{integer}" as the value for dwc:taxonID."

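The "gbif:{integer}" form from this note can be produced and checked with a couple of small helpers. This is a sketch only: the function names are hypothetical, and the assumption that the integer is non-negative with no sign or padding is mine rather than something stated in the Notes.

```python
import re

# Assumption: the identifier is "gbif:" followed by one or more decimal digits.
GBIF_ID_PATTERN = re.compile(r"gbif:\d+")


def to_gbif_id(key: int) -> str:
    """Format a GBIF integer key in the "gbif:{integer}" form."""
    if isinstance(key, bool) or not isinstance(key, int) or key < 0:
        raise ValueError("GBIF key must be a non-negative integer")
    return f"gbif:{key}"


def is_gbif_id(value: str) -> bool:
    """True when the whole value has the "gbif:{integer}" form."""
    return GBIF_ID_PATTERN.fullmatch(value) is not None
```
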
@chicoreus
Collaborator

Will need to include the new terms dwc:superfamily, dwc:tribe, dwc:subtribe tdwg/dwc#65 tdwg/dwc#45 tdwg/dwc#46

@Tasilee
Collaborator

Tasilee commented Jul 4, 2023

Added the terms dwc:superfamily, dwc:tribe, dwc:subtribe to the Information elements and Expected response, and updated Specification Last Updated.

On this one, please check my Expected response.

@Tasilee Tasilee removed the NEEDS WORK label Jul 4, 2023
@Tasilee
Collaborator

Tasilee commented Jul 4, 2023

Amended Source Authority values to align with @chicoreus syntax

From

bdq:sourceAuthority default = "GBIF Backbone Taxonomy" [https://doi.org/10.15468/39omei] |
| | API endpoint [https://api.gbif.org/v1/species?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name=]

to

bdq:sourceAuthority default = "GBIF Backbone Taxonomy" {[https://doi.org/10.15468/39omei]} {API endpoint [https://api.gbif.org/v1/species?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name=]}
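
Because the default API endpoint ends in `&name=`, a query URL can be formed by appending a percent-encoded name. The sketch below only builds the URL and makes no request; the helper name is hypothetical, and encoding the name with `urllib.parse.quote` is an assumption about how the value should be escaped.

```python
from urllib.parse import quote

# Default bdq:sourceAuthority API endpoint as given in the Source Authority field.
GBIF_NAME_ENDPOINT = (
    "https://api.gbif.org/v1/species"
    "?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name="
)


def gbif_name_query_url(scientific_name: str) -> str:
    """Append the percent-encoded name to the endpoint's trailing name= parameter."""
    return GBIF_NAME_ENDPOINT + quote(scientific_name.strip(), safe="")
```
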

chicoreus added a commit to FilteredPush/sci_name_qc that referenced this issue Jul 14, 2023
…pecifications. Addressed tdwg/bdq#57 AMENDMENT_TAXONID_FROM_TAXON Updated metadata, ProvidesVersion and Specification annotations.  Further updates to implementation for superfamily, tribe, and subtribe.  Removed reviewed stub method.
@Tasilee
Collaborator

Tasilee commented Sep 16, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField" and "Output Type" to "TestType".

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023
@Tasilee Tasilee changed the title TG2-AMENDMENT_TAXONID_FROM_TAXON TG2-AMENDMENT_SCIENTIFICNAMEID_FROM_TAXON Dec 13, 2023
chicoreus added a commit to FilteredPush/sci_name_qc that referenced this issue Jul 19, 2024
…est to reflect change from taxonID to scientificNameID as the expected external identifier reference for a taxon.