Add category to dataset #247

timrobertson100 · 2020-11-03T11:35:10Z

The current Dataset has type and subtype which is slightly problematic. Type is really indicating the row format used in the DwC-A and causes problems since a checklist can have occurrences, and an occurrence dataset can in fact be the output of sampling event data.

Better use of SubType may help, but I feel it could add more confusion because of the overlap (e.g. an occurrence dataset with subtype sampling event).

Since the API is now so well used and changing this is disruptive, I propose to introduce a new multi-value field named category to categorize datasets. In time we can deprecate type and subtype.

The categories would include the likes of (edited to include suggestions that came in from chat below):

1. Citizen science data
2. Observation data
3. Natural history collection
a. Consider separating out fossils as a separate category, to avoid accidental misuse
4. Single organism sequenced (i.e. tissue from an NHM specimen)
a. Consider adding tissue sample as well (which may or may not be sequenced) to aid discovery of preserved tissue without drawing on ambiguous other terms
5. Environmental DNA and/or metagenomics (e.g. soil sample, water, insect soup etc)
6. Targeted species detection (PCR-based assays)
7. Long term monitoring data
8. Sampling event (where some protocol has been used)
9. Checklist data
10. Material citations (e.g. taxonomic treatments in literature)
11. Private sector data
a. Consider splitting this into finer categories (e.g. proponent data for environmental impact assessment prior to development) versus other categories (to be defined)
12. Tracking data (i.e. recaptures or GPS tracking of individual organisms)
13. Machine observation (e.g. camera trap)

The multiple categories would be added to each occurrence record at indexing, allowing an intuitive filter to be added in GBIF.org so people can select on/off the dataset categories that interest them.
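
A minimal sketch of the idea, assuming hypothetical in-memory structures rather than GBIF's actual indexing pipeline: each dataset carries a set of categories, these are copied onto every occurrence record at indexing time, and a filter then includes or excludes records by category. The names `DATASET_CATEGORIES`, `index_occurrence` and `filter_by_category`, the `datasetCategory` field, and the dataset keys are all illustrative assumptions.

```python
from typing import Iterable, Optional

# Hypothetical dataset-level tags; keys and category names are illustrative only.
DATASET_CATEGORIES: dict[str, set[str]] = {
    "dataset-a": {"CitizenScience", "Observation"},
    "dataset-b": {"EnvironmentalDNA", "SamplingEvent"},
}


def index_occurrence(record: dict) -> dict:
    """Return a copy of the record with its dataset's categories attached."""
    categories = DATASET_CATEGORIES.get(record["datasetKey"], set())
    return {**record, "datasetCategory": sorted(categories)}


def filter_by_category(
    records: Iterable[dict],
    include: Optional[set[str]] = None,
    exclude: Optional[set[str]] = None,
) -> list[dict]:
    """Keep records matching any included category and no excluded category."""
    kept = []
    for rec in records:
        cats = set(rec.get("datasetCategory", []))
        if include and not (cats & include):
            continue
        if exclude and (cats & exclude):
            continue
        kept.append(rec)
    return kept


occurrences = [
    index_occurrence({"id": 1, "datasetKey": "dataset-a"}),
    index_occurrence({"id": 2, "datasetKey": "dataset-b"}),
    index_occurrence({"id": 3, "datasetKey": "dataset-unknown"}),
]
# Switch "off" one category, as a user might do with a filter in the portal.
without_edna = filter_by_category(occurrences, exclude={"EnvironmentalDNA"})
```

In this sketch, records from untagged datasets simply carry an empty list, so they survive an exclude filter but never match an include filter. Whether that is the desired behaviour is an open design choice.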

CC @ahahn-gbif @MortenHofft for comments in particular

ahahn-gbif · 2020-11-03T11:40:57Z

Thanks!

~~Assuming this will also support metrics (and understanding that multivalue means that a dataset can belong to more than one category), I would like to add~~
~~9. private sector data~~
~~10. tracking data (i.e. recaptures or GPS tracking of individual organisms)~~

[Tim: Thanks - Added above!]

ahahn-gbif · 2020-11-03T11:44:03Z

Question: should 4. metagenomic (eDNA) be two separate categories? There is quite a difference in interpretation of these data, even though they are both "sequence based" @ManonGros, would you comment?

[Tim Edited to add: I've split them above now, but will change again based on more comments]

jlegind · 2020-11-03T12:00:42Z

Machine observation seems like a sub category of Sampling Event.

timrobertson100 · 2020-11-03T12:25:18Z

> Machine observation seems like a sub category of Sampling Event.

That's ok isn't it? Because it's multivalue a dataset can be marked as both or just sampling event, or perhaps there are cases where a machine observation would be appropriate where no real sampling protocol is used.

jhnwllr · 2020-11-03T12:27:10Z

This new category would be free text using the vocab server? Or are we trying to have all the categories defined?

timrobertson100 · 2020-11-03T12:28:09Z

> This new category would be free text using the vocab server? Or are we trying to have all the categories defined?

~~Undecided, but at this point we're proposing the categories~~

Revised: I'd now suggest the vocabulary server, as detailed later in this thread.

ManonGros · 2020-11-03T12:57:34Z

Great! I love the idea!

~~Just one comment:~~
~~> 4. Single organism metagenomic (i.e. tissue from an NHM specimen)~~
~~> 5. Environmental eDNA (e.g. soil sample, water, insect soup etc)~~

Number 4 doesn't seem right. What I understand when reading "Single organism metagenomic" is that someone took a gut sample of a cow (for example) and sequenced it, resulting in a bunch of occurrences for the gut microbiome. I guess this isn't the idea, is it?
If you mean that tissues from a specimen were sequenced, then I would write something more along the lines of "Single organism sequenced". And actually, we could group metagenomics with eDNA (often eDNA is metagenomics). So in the end, I think we could do something like:

~~4. Single organism sequenced (i.e. tissue from an NHM specimen)~~
~~5. Environmental eDNA and/or metagenomics (e.g. soil sample, water, insect soup etc)~~

[Tim: Edited with suggestions expressed here - thanks, you indeed understood what I intended!]

Perhaps @thomasstjerne has some thoughts on this?

thomasstjerne · 2020-11-03T15:12:49Z

Added Targeted species detection (PCR-based assays)

dschigel · 2020-11-05T15:43:40Z

Thanks @timrobertson100 for making me aware of the thread, very exciting. So far, I found eight likely independent variables that may determine the evidence / dataset type in GBIF. I need to meditate a bit more before presenting my views here, and happy to brainstorm / whiteboard a bit if people are available?

emeyke · 2020-11-05T15:47:14Z

Keeping track of this as well

dschigel · 2020-11-12T14:10:12Z

Hello all, I like the idea of sorting datasets and types of evidence, but I am not sure it is most attractive for users to do so using a single filter / vocabulary (though I see the feasibility argument as put by Tim). I drew some mind maps but don't have time to add pictures here, so I will just type them up for your consideration. I started from thinking about why users would need to sort datasets / types of evidence. It is a quick way to in/exclude types of data that matter for your cases based on how the evidence was generated and its properties. I came up with 8 independent variables that cut across the suggested categorization of the dataset and the basisOfRecord vocabulary as we have it today. Note that I think the word independent is important here, though some of the combinations of 1-8 below are impossible in real life.

I am using loose words to describe my thinking, this is not a vocabulary I am suggesting, and there are some unresolved overlaps:

1. Preservation status of evidence: virtual only or physical: fossil, dead, living (zoos, cultures, gardens, aquaria). Note some things like amber are not easy to place, as one can get DNA from amber, there are subfossils, etc. Question: Can I re-examine the physical material? What and where is it?
2. Integrity / N species: Single & whole (e.g. insect, i.e. contains all its genet within one individual), partial (tissue sample, leaf, fruit body) or mixed specimen (common in moss and lichen collections, when collecting individual species is not possible; this is not intentional sampling like e.g. plankton, see 6). Question: Can I study full morphology, or only some traits, or only link museum specimen to DNA sequence?
3. DNA: not explored, sequence, PCR. Note: this is in between virtual and physical, as DNA or PCR products can be stored for a long time (physical), but DNA evidence for species presence, often a sequence, is machine-generated virtual evidence not much different from a digital image or a sound. Question: Can I re-examine the identification, do phylogeny, or all I have is a label name?
4. Dynamic / Static data. Dynamic: tracking, time series, mark-recapture. Question: Can I study processes, or only patterns?
5. The way the evidence is generated: literature processing, collection digitization, personal observations, systematic sampling. Question: Can I sort the data by reliability of its generation?
6. For sampling event data, but maybe occurrences, too: presence-only (sampling effort unknown / undocumented), presence-absence, abundance (quantitative). Question: What kinds of statistical analyses are possible?
7. The way data is packed in GBIF: metadata only, checklist, occurrences only, sampling event. Might include filter by extension used, esp. if we are getting more of those in TDWG. Question: What do I get in my GBIF download, verbatim and GBIF interpreted?
8. Community generating the data (perhaps this is more relevant to tagging publishers, but one may need to filter occurrences and datasets by it): (groups of) individuals, natural history collections, private sector, marine, citizen science, machine. Some of these are not mutually exclusive: can be "natural history collection" + "citizen science", or "machine". Question: Can I study data trends in a particular demographic sector?

Once again, this is just a capture of unfinished thoughts; it would be nice to brainstorm / whiteboard what good categorization would look like. I was thinking of slicing it up this way because e.g. 1, 7, and 13 in the original post can be simultaneously true. If these are tags and overlap is no problem, then fine. But if this is a strict filter, we may need more than one field to capture types of preservation vs. generating community vs. ways of generating vs. quantitativeness etc. Feel free to discard if out of scope. I also did not find the collection of BoR discussions, which is partly applicable here.

ManonGros · 2020-11-13T13:34:21Z

I assume the categorisations would come from us (at least that's how it is at the moment for citizen science datasets) but it would be great if other people could help with the curation as well. Just something to keep in mind.

For example, let's say that we ask Node managers to check the datasets tagged "citizen science". We want:

An easy way for them to see all the citizen science datasets for their node.
If a Node manager noticed a dataset tagged erroneously, we want to keep track of that so that we don't re-tag it next time.

ManonGros · 2021-04-21T14:07:05Z

Looking at this issue: gbif/portal-feedback#3381, ~~we would be missing the data extracted from taxonomic literature (i.e., Plazi) category.~~ You are right, I missed it!

timrobertson100 · 2021-04-21T14:16:54Z

Thanks @ManonGros

> Looking at this issue: gbif/portal-feedback#3381, we would be missing the data extracted from taxonomic literature (i.e., Plazi) category.

That is what this was intended to be:

> Material citations (e.g. taxonomic treatments in literature)

(Related: Plazi has just proposed Material citation as an addition to the basisOfRecord vocabulary in the Darwin Core issues, for public commentary.)

dagendresen · 2021-05-31T14:39:26Z

+1 @dmitry for one to many and using keyword tags (instead of a 1:1 core record to category)
+1 @marie for thinking of enabling Node staff to curate categories --> and can also add a feature request for enabling anybody to annotate a datapoint/set with category information (with provenance intact)

Remember also that a "dataset" (as in Darwin-Core-archive-dataset) can be a mixed bag of "evidence records" (aka core record, eg. aka occurrences) of different categories -- if a category "tag" is designed to apply to all core records in a DwC-A

And that the de-normalization of the "evidence records" (core records) means that one cannot be certain of which class that a given property linked to a core record is intended to be linked to

elywallis · 2021-06-03T13:07:56Z

I really like this idea. Certainly the ALA has users who want a very simple way to select groupings of records across data providers. The group I hear this request from most are curators/researchers who ‘just’ want museum or herbarium specimens.

A couple of suggestions:
3. Natural history collection - might still be useful to also have a category for Fossil specimens so these can easily be separated out.
The reason for separating Fossils out is that subfossils (or any fossil species still extant) often show up outside the extant distribution and can easily be mistaken for errors and flagged as such, when they’re perfectly legitimate.

Single organism sequenced (i.e. tissue from an NHM specimen)
Having an additional category for Tissue sample would be very useful, whether sequences have been derived or not.
Users of this category might be researchers seeking tissues for loan/destructive sampling who currently have to search BasisOfRecord = material sample plus Preparations pot luck.
Private sector data - do you mean data gathered by companies undertaking environmental impact assessments prior to approval of development/mining projects? If so, in Australia this would commonly be called “Proponent data” (being data from proponents of a development). If Private sector data means something else, perhaps could have both?

timrobertson100 · 2021-06-03T13:46:52Z

> Remember also that a "dataset" (as in Darwin-Core-archive-dataset) can be a mixed bag of "evidence records" (aka core record, eg. aka occurrences) of different categories -- if a category "tag" is designed to apply to all core records in a DwC-A

Thanks, @dagendresen. My thinking here was to try and decouple this from the class/basisOfRecord issues in Darwin Core to be able to react to reporting/user needs quickly (e.g. introduce a new tag for datasets). Acknowledging that there can be "mixed bag" datasets, my intuition is that most users would appreciate broad filtering to e.g. "omit records that originate from datasets tagged as eDNA" even if there were a few entries in there that might be of some interest, or to produce reports (e.g. growth charts) based on e.g. data originating from datasets tagged as private-sector related. Does this seem reasonable, please?

> really like this idea

Thanks, @elywallis - I'll add your input to the list at the top now.

> Private sector data - do you mean data gathered by companies undertaking environmental impact assessments prior to approval of development/mining projects?

I believe that was the intention, yes. I don't know the details, but I'm aware the data management team is increasingly running reports on trends using categories like this. I'll add your comments in the top list, without proposing a final decision.
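
As a rough illustration of the reporting use case mentioned above (growth charts for records originating from datasets with a given tag), the sketch below counts matching records cumulatively per year. The record layout (`year`, `datasetCategory`), the `PrivateSector` tag name, and the sample values are hypothetical and made up purely for the example; they are not real GBIF figures or field names.

```python
from collections import Counter
from itertools import accumulate


def cumulative_growth(records: list[dict], category: str) -> dict[int, int]:
    """Cumulative count per year of records tagged with `category`."""
    per_year = Counter(
        rec["year"] for rec in records if category in rec["datasetCategory"]
    )
    years = sorted(per_year)
    totals = accumulate(per_year[y] for y in years)
    return dict(zip(years, totals))


# Made-up sample records, for illustration only.
sample = [
    {"year": 2019, "datasetCategory": ["PrivateSector"]},
    {"year": 2020, "datasetCategory": ["PrivateSector", "SamplingEvent"]},
    {"year": 2020, "datasetCategory": ["CitizenScience"]},
    {"year": 2021, "datasetCategory": ["PrivateSector"]},
]
growth = cumulative_growth(sample, "PrivateSector")
print(growth)  # {2019: 1, 2020: 2, 2021: 3}
```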

timrobertson100 · 2021-06-03T13:57:07Z

Slightly off-topic, but perhaps useful:

It may not be known to many, but GBIF is progressively moving vocabularies like this into our integrated vocabulary server. This will allow data managers (e.g. including node managers @dagendresen ) to be involved in defining the concepts. Concepts can be hierarchical (e.g. finer categorizations of private data) and once a vocabulary version is released, it is picked up in the data processing pipelines. This is still evolving, but LifeStage is in production now.

What this means relating to this issue, is that as we find new requirements to categorise datasets for a new report or community we see emerging, we'll have the tools in place to accommodate that without needing software developer involvement (only requires a vocabulary to be changed, and then proceed with tagging datasets).

dagendresen · 2021-06-03T14:43:49Z

"mixed bag" datasets

@timrobertson100 I would (if asked) completely agree that best practice is to avoid "mixed bag" datasets and that a "tag" to enable filter for a "purpose-of-reuse" would be very useful and welcome! And believe we could live well with such functionality not applying 100% to "mixed bag" datasets :-)

(apropos -- GBIF Norway is "negotiating" with Norwegian data publishers to "break" up "mixed bag" datasets into smaller datasets that would be more homogenous)

debpaul · 2021-06-03T14:47:59Z

@timrobertson100 wrote:

> Slightly off-topic, but perhaps useful:
>
> It may not be known to many, but GBIF is progressively moving vocabularies like this into our integrated vocabulary server. This will allow data managers (e.g. including node managers @dagendresen ) to be involved in defining the concepts. Concepts can be hierarchical (e.g. finer categorizations of private data) and once a vocabulary version is released, it is picked up in the data processing pipelines. This is still evolving, but LifeStage is in production now.
>
> What this means relating to this issue, is that as we find new requirements to categorise datasets for a new report or community we see emerging, we'll have the tools in place to accommodate that without needing software developer involvement (only requires a vocabulary to be changed, and then proceed with tagging datasets).

Tim, can you see my <happy dance!>? At some point, we need something, a talk from GBIF, a TDWG Webinar, about this effort. I think the broader community will find it very enlightening about how we can use the data we have to improve and understand the data.

CecSve · 2022-06-16T09:25:15Z

> 13. Machine observation (e.g. camera trap)

Maybe this relates to this category and could potentially be a subcategory, but it would be great to be able to categorize datasets from e.g. drones. Other remote sensing data, e.g. radar, sonar etc. could be subcategories as well. However, drones, for example, can have subcategories of their own, e.g. UAV, UAS and ROV etc.

To keep it simple, should tracking data perhaps be a subcategory of machine observations?

timrobertson100 · 2022-06-16T09:32:41Z

> should tracking data perhaps be a subcategory of machine observations?

Are catch and release style data (e.g. bird ringing) considered to be "tracking", or identifying an individual by sight (e.g. whale fin)? I genuinely don't know if that is tracking or not, but they wouldn't be machine observation.

ahahn-gbif · 2022-06-16T09:35:41Z

Alternatively: should we consider a breakdown like this (sub-categories of machine observations, or others) rather as a separate controlled/proposed vocabulary to be used under "methodology"? I do not have a full understanding of user needs here, but there seems to be a difference in purpose between setting simple, intuitive filters ("not eDNA" or "just tracking data"), and the more specialized breakdowns that serve a user being particularly interested in, say, data collected via drones.

In the first case, categorizing at ingestion to serve search filters would be supporting most cases adequately, where more specific queries may be better served by supporting structured keywording of methods used in data collection (including publisher / user guidance on tagging datasets for more detailed methodological approaches).

ahahn-gbif · 2022-06-16T09:38:54Z

> To keep it simple, should tracking data perhaps be a subcategory of machine observations?

The purpose here, if I understand correctly, is to support users to include/exclude particular content, based on how it was derived. In that sense: I would value the fact that some users may want to exclude known, repeated observations / loggings of one and the same individual over time higher than how these data were collected "technically".

CecSve · 2022-06-16T11:48:24Z

> Are catch and release style data (e.g. bird ringing) considered to be "tracking", or identifying an individual by sight (e.g. whale fin)? I genuinely don't know if that is tracking or not, but they wouldn't be machine observation.

True, they would not be machine observations so there would need to be a separation of the two.

Jegelewicz · 2022-11-22T15:17:07Z

At what point is GBIF diverging from TDWG standards? How can we do things as a community if we are developing vocabularies in silos? How will this fit with LatimerCore and eventually whatever MaterialSample standards come out of TDWG? Sigh.

timrobertson100 · 2022-11-22T16:38:16Z

I've left a comment on tdwg/material-sample#29 but will also note here.

I'm not sure there is a TDWG standard that would cover this, but terms from various vocabularies could be used (relating to LatimerCore, Darwin Core etc). It's really intended to provide the means to codify datasets to allow easy filtering of data and driving reports on data seen in GBIF. We're asked to report on counts by e.g. private sector data etc, which is probably more specific to the GBIF network than the kind of problems TDWG's current task groups cover.

There is of course a large overlap between the GBIF and TDWG communities, and GBIF (staff and network) promotes, implements, and contributes to standards so it could be that one might emerge from this, but it's not immediately obvious.

MattBlissett · 2023-08-11T11:50:08Z

Also relevant for publishers, e.g. private sector publishers: https://docs.gbif.org/private-sector-data-publishing/2.0/en/#table-01

CecSve · 2024-04-09T09:54:40Z

I have added the vocabulary now as DatasetCategory on UAT with the following changes:

1. Citizen science ~~data~~
2. Observation ~~data~~
3. Natural history collection
a. Consider separating out fossils as a separate category, to avoid accidental misuse - added Fossil as a child of NaturalHistoryCollection
4. Single organism sequenced (i.e. tissue from an NHM specimen) - added Tissue as child of SingleOrganismSequenced
a. Consider adding tissue sample as well (which may or may not be sequenced) to aid discovery of preserved tissue without drawing on ambiguous other terms
5. Environmental DNA and ~~/or~~ metagenomics (e.g. soil sample, water, insect soup etc)
6. Targeted species detection (PCR-based assays)
7. Long term monitoring ~~data~~
8. Sampling event (where some protocol has been used)
9. Checklist ~~data~~
10. Material citations (e.g. taxonomic treatments in literature)
11. private sector ~~data~~ - added as BusinessSector instead
a. Consider splitting this into finer categories (e.g. proponent data for environmental impact assessment prior to development) versus other categories (to be defined)
12. Tracking ~~data~~ (i.e. recaptures or GPS tracking of individual organisms)
13. Machine observation (e.g. camera trap)

I have added comments in brackets as Description, when possible, but several concepts could benefit from a Description and ideally also an External description
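
To illustrate how the parent/child concepts listed above could behave in a filter, here is a minimal sketch. It assumes a hypothetical in-memory `PARENT` mapping, not the vocabulary server's actual data model or API; only the concept names (`NaturalHistoryCollection`, `Fossil`, `SingleOrganismSequenced`, `Tissue`) come from the list above. That a filter on a parent also matches its children is itself an assumption.

```python
# Child -> parent links taken from the list above; the structure is illustrative.
PARENT: dict[str, str] = {
    "Fossil": "NaturalHistoryCollection",
    "Tissue": "SingleOrganismSequenced",
}


def ancestors(concept: str) -> list[str]:
    """Return the chain of parents for a concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def matches(record_categories: set[str], wanted: str) -> bool:
    """True if any category on the record is `wanted` or a descendant of it."""
    return any(
        cat == wanted or wanted in ancestors(cat) for cat in record_categories
    )


print(matches({"Fossil"}, "NaturalHistoryCollection"))  # True
print(matches({"NaturalHistoryCollection"}, "Fossil"))  # False
```

Under this reading, a user filtering on the parent would still see fossils unless `Fossil` is excluded explicitly, which is the kind of separation asked for earlier in the thread.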

tobiasgf · 2024-04-09T12:30:12Z

The issue's name is "Add category to dataset" and the vocabulary is called "DatasetCategory", but as I read it, it is a multi-value field at occurrence level. Maybe we should consider renaming the field and issue to reflect that?

tobiasgf · 2024-04-09T12:31:20Z

I read it as the main aim is to be able to provide intuitive filters for the users of the data. That is important to keep in mind, so we do not make it over-complicated. I believe Data Products / Helpdesk must have an intuitive feeling (at least) on which types of data data users most often wish to focus on / exclude, and that those categories are the ones now finding their way into the vocabulary. I have some suggestions/comments on those suggested (later...).

tobiasgf · 2024-04-09T12:34:20Z

private sector serves a user need much like the wish to be able to filter on thematic types of data like fresh water, health, marine (see this issue), where the wish is to either produce reports/growth charts OR delimit classical data types of e.g. habitat relevance. I believe it is wise to think about these needs in the same work here (not sure if they should be included in the same overall field).

tobiasgf · 2024-04-09T13:04:36Z

If I understand it correctly, the consensus is that this field (at occurrence level) eventually contains values that are being assigned based on some rules upon ingestion, minimizing the need for manual interaction/curation.

Some thoughts on this:

Should we have a first brainstorm/meeting on how such rules could be - both at a general level, but also checking that we can actually establish some rules for the categories that have been proposed already. And then start designing those rules for real.

Some early thoughts/examples on what might be used for rules:

* simple info about known sources, e.g.:
  * publisher id: iNaturalist is always citizenScience, NatureMetrics is "Private Sector"
  * dataset id: INSDC/ENA is all "DNA"
* content of selected fields, e.g.:
  * has something in dna-derived extension
  * uses eventCore
* taxon belongs to a selected checklist
  * "all parasites"
  * "freshwater species"
* spatial rules
  * shape file with marine areas
  * freshwater
* auto-labelling from data formatting tools and similar
  * Data coming from the "eDNA tool" is always DNA metabarcoding
  * CamtrapDP is Machine observation (or is it?)
* Positive/negative lists based on manual curation/refinement (e.g. "no this is not citizen science although the rule suggests so" or "this IS citizen science although the rule suggests it is not")
* ...?

And combinations of the above, including procedures like the Clustering Algorithm. Simpler rules are of course preferable, and could help refine the categories of the vocabulary?
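
A minimal sketch of how such rules and curated overrides might combine, under stated assumptions: rules are plain predicates over a record, and the manually curated positive/negative lists are applied last so that curation always wins. All names (`RULES`, `FORCE_ADD`, `FORCE_REMOVE`, `categorize`), the record fields, and the dataset keys are hypothetical; only the example rules themselves echo the list above.

```python
from typing import Callable

Rule = Callable[[dict], bool]

# Rule-derived categories; each predicate inspects a hypothetical record dict.
RULES: dict[str, Rule] = {
    "CitizenScience": lambda r: r.get("publisher") == "iNaturalist",
    "PrivateSector": lambda r: r.get("publisher") == "NatureMetrics",
    "DNA": lambda r: bool(r.get("dnaDerivedExtension")),
    "SamplingEvent": lambda r: r.get("core") == "Event",
}

# Manual curation, keyed by dataset: tags forced on or off regardless of rules.
FORCE_ADD: dict[str, set[str]] = {"dataset-x": {"CitizenScience"}}
FORCE_REMOVE: dict[str, set[str]] = {"dataset-y": {"CitizenScience"}}


def categorize(record: dict) -> set[str]:
    """Apply every rule, then let curated overrides take precedence."""
    categories = {name for name, rule in RULES.items() if rule(record)}
    key = record.get("datasetKey", "")
    categories |= FORCE_ADD.get(key, set())
    categories -= FORCE_REMOVE.get(key, set())
    return categories


rec = {
    "datasetKey": "dataset-y",
    "publisher": "iNaturalist",
    "dnaDerivedExtension": [{"sequence": "ACGT"}],
}
# The publisher rule suggests CitizenScience, but curation removes it again.
print(sorted(categorize(rec)))  # ['DNA']
```

Keeping the overrides separate from the rules means a correction made by a curator is not lost the next time the rules are re-run.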

CecSve · 2024-04-09T13:39:40Z

> The issue's name is "Add category to dataset" and the vocabulary is called "DatasetCategory", but as I read it, it is a multi-value field at occurrence level. Maybe we should consider renaming the field and issue to reflect that?

The field will contain information at a record level about how the dataset was compiled so it is pointing to the dataset source in a way. However, most users will not access data on GBIF by downloading specific datasets, but rather query across datasets and this is why the information has to be at record level. The original proposal was to call the field category, but adding dataset qualifies the content of the field more precisely for users.

CecSve · 2024-04-09T13:51:45Z

> private sector serves a user need much like the wish to be able to filter on thematic types of data like fresh water, health, marine (see this issue), where the wish is to either produce reports/growth charts OR delimit classical data types of e.g. habitat relevance. I believe it is wise to think about these needs in the same work here (not sure if they should be included in the same overall field).

We could maybe have concepts like ThematicAreaFreshwater, ThematicAreaHealth etc., however, this would depend on both the scope and the expansion of thematic areas. Would all freshwater data be part of the freshwater thematic area by default or is it only mobilized data as part of the thematic area that should automatically be mapped to such a concept? If the latter, then I do not think that a controlled vocabulary inclusion would be the most optimal solution.

Also, are the thematic areas filter options more internal relevant or of public relevance? The scope of these categories should be for external end-users, not for internal GBIFS relevance.

CecSve · 2024-04-09T13:56:27Z

> If I understand it correctly, the consensus is that this field (at occurrence level) eventually contains values that are being assigned based on some rules upon ingestion, minimizing the need for manual interaction/curation.
>
> Some thoughts on this:
>
> Should we have a first brainstorm/meeting on how such rules could be - both at a general level, but also checking that we can actually establish some rules for the categories that have been proposed already. And then start designing those rules for real.
>
> Some early thoughts/examples on what might be used for rules:
>
> * simple info about known sources, e.g.:
>   * publisher id: iNaturalist is always citizenScience, NatureMetrics is "Private Sector"
>   * dataset id: INSDC/ENA is all "DNA"
> * content of selected fields, e.g.:
>   * has something in dna-derived extension
>   * uses eventCore
> * taxon belongs to a selected checklist
>   * "all parasites"
>   * "freshwater species"
> * spatial rules
>   * shape file with marine areas
>   * freshwater
> * auto-labelling from data formatting tools and similar
>   * Data coming from the "eDNA tool" is always DNA metabarcoding
>   * CamtrapDP is Machine observation (or is it?)
> * Positive/negative lists based on manual curation/refinement (e.g. "no this is not citizen science although the rule suggests so" or "this IS citizen science although the rule suggests it is not")
> * ...?
>
> And combinations of the above, including procedures like the Clustering Algorithm. Simpler rules are of course preferable, and could help refine the categories of the vocabulary?

Should we create a new issue for implementation and automated categorization perhaps @timrobertson100?

tobiasgf · 2024-04-09T13:58:50Z

> The field will contain information at a record level about how the dataset was compiled so it is pointing to the dataset source in a way.

OK, then I did misunderstand. If the values/categories have to be the same across all records in a dataset, then we can of course not use the same approach for "thematic data" which varies within datasets (e.g. rats are health relevant, but not all iNaturalist is health relevant. Brown Trout is fresh water but not all iNaturalist is fresh water, ....). Also ENA/INSDC datasets have a mixture of the categories of DNA-associated data, that would make it difficult to categorize at dataset level.

I understand that most datasets are of a single category, but I am not sure if I understand why the category classification needs to refer to dataset level (again with the user in perspective). Some categories will only be possible to infer (from rules) by looking at the single occurrences anyway.

tobiasgf · 2024-04-09T14:01:02Z

> Would all freshwater data be part of the freshwater thematic area by default or is it only mobilized data as part of the thematic area that should automatically be mapped to such a concept?
>
> Also, are the thematic areas filter options more internal relevant or of public relevance? The scope of these categories should be for external end-users, not for internal GBIFS relevance.

All data yes, and the themes are of user relevance (also/primarily)

tobiasgf · 2024-04-09T14:03:35Z

Sorry for expanding the issue into the topic on making it operational. As I indicate, the attempt to design the rules may affect the actual delimitation of categories. But no need to mix in same issue, I guess. Sorry.

thomasstjerne mentioned this issue Nov 5, 2020

Has the addition of DNA to dwc:basisOfRecorded be raised as an issue gbif/doc-publishing-dna-derived-data#36

Closed

dschigel mentioned this issue Nov 12, 2020

umbrella issue related to dwc:basisOfRecord and an Evidence class tdwg/dwc#302

Open

dschigel mentioned this issue Nov 17, 2020

Section 2.2 'basisOfRecord' in table 2 and 4 gbif/doc-publishing-dna-derived-data#131

Closed

timrobertson100 mentioned this issue May 31, 2021

Extend basisOfRecord vocabulary gbif/gbif-api#84

Open

dagendresen mentioned this issue Jun 1, 2021

Living Norway network & access to edit in the GBIF Registry gbif-norway/helpdesk#41

Closed

m-hope mentioned this issue Jun 4, 2021

Basis of record appears as UNKNOWN supplied basis "Genomic DNA" AtlasOfLivingAustralia/la-pipelines#397

Closed

albenson-usgs mentioned this issue Jun 7, 2021

Differentiate between physical occurrences vs. DNA occurrences iobis/Project-team-Genetic-Data#2

Open

CecSve mentioned this issue Jun 16, 2022

Dataset categories - curation before uploading first vocabulary version gbif/vocabulary#115

Open

CecSve mentioned this issue Jul 27, 2022

Flagging occurrence records where community feedback has identified wrong or doubtful identifications gbif/portal-feedback#4187

Open

timrobertson100 mentioned this issue Nov 22, 2022

Possible examples and need for sharing mixed observation and vouchered specimen record datasets gbif/portal-feedback#4432

Open

6 tasks

Jegelewicz mentioned this issue Nov 22, 2022

GBIF vs. LatimerCore vs. MaterialSample tdwg/material-sample#29

Closed

ManonGros mentioned this issue Jan 3, 2023

Thematic analytics gbif/portal-feedback#4505

Open

MortenHofft mentioned this issue Feb 9, 2024

Make it easier to find records with certain characteristics gbif/gbif-web#488

Closed

CecSve mentioned this issue Jun 11, 2024

Observing/collection method filtering tdwg/dwc-qa#210

Open

ManonGros mentioned this issue Oct 11, 2024

DatasetType in the eml profile? gbif/eml-profile#11

Closed
