Version dataset metadata independent of imports #1358
Comments
@thomasstjerne & @mdoering, could you please fix this long-standing bug? GSD metadata in the CoL should reflect the version that was synced into the project, not the version currently imported into the CLB. (For reference, the World Ferns GSD was synced on 2024-07-09 and World Plants on 2024-07-08.)
The September edition was released on 2024-09-25. Ferns were last imported on 18 September, and in July before that. The fern sectors were last synced on 30 September, and before that on 9 July. The metadata for import 65 indeed looks odd.
@yroskov this problem was never mentioned to me before and I am very surprised to see it now. It has been working for more than 2 years.
I raised this many times during our stand-ups (especially in relation to IRMNG).
I believe I know what's going on. If you download the latest archives, they all lack metadata!
Can you point me to an old issue, please?
Dataset metadata is only archived during imports, so when an archive includes no metadata, nothing gets archived. And because the dataset metadata version is tied to the import attempt, changing that requires considerable refactoring. The idea was that we do not want to archive every manual edit made to a dataset. Instead, manual changes via the UI or API are allowed to happen, and a final version is only written to the archive when a new one shows up through an import. It seems we now need an independent metadata versioning system that has its own version number and will be triggered to archive a version when:
Every import and sync would then refer to a specific metadata version which can be retrieved from the archive.
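A minimal sketch of that idea, to make the proposal concrete. Everything here is hypothetical: `MetadataStore`, `Event`, `metadata_for` and the dataset key `1` are illustrative names and not part of the ChecklistBank code base. The conditions that should trigger archiving are deliberately left out; the sketch only assumes that something calls `archive_version` and that each import or sync stores the version number it gets back.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical store that versions dataset metadata independently of imports."""
    current: dict = field(default_factory=dict)  # dataset_key -> editable working copy
    archive: dict = field(default_factory=dict)  # (dataset_key, version) -> frozen copy
    latest: dict = field(default_factory=dict)   # dataset_key -> last archived version

    def edit(self, dataset_key: int, **changes) -> None:
        # Manual edits (UI or API) only touch the working copy; nothing is archived.
        self.current.setdefault(dataset_key, {}).update(changes)

    def archive_version(self, dataset_key: int) -> int:
        # Freeze a copy of the working metadata under the next version number.
        version = self.latest.get(dataset_key, 0) + 1
        self.archive[(dataset_key, version)] = dict(self.current.get(dataset_key, {}))
        self.latest[dataset_key] = version
        return version

    def get(self, dataset_key: int, version: int) -> dict:
        return self.archive[(dataset_key, version)]


@dataclass(frozen=True)
class Event:
    """An import or sync that points at one archived metadata version."""
    kind: str              # "import" or "sync"
    dataset_key: int
    metadata_version: int


def metadata_for(store: MetadataStore, event: Event) -> dict:
    # A release would resolve metadata through the sync event, not the latest import.
    return store.get(event.dataset_key, event.metadata_version)


DATASET = 1  # hypothetical dataset key
store = MetadataStore()

store.edit(DATASET, version="19.4")
sync = Event("sync", DATASET, store.archive_version(DATASET))

store.edit(DATASET, version="24.9")
later_import = Event("import", DATASET, store.archive_version(DATASET))

print(metadata_for(store, sync)["version"])          # 19.4
print(metadata_for(store, later_import)["version"])  # 24.9
```

With this shape, a release built from the sync would show 19.4 even though a newer import (24.9) already exists for the same dataset, which is the behaviour asked for in this issue.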
Unfortunately, this can happen with any source. For example, quite often we get a notification about a new ITIS version and do an import a few days before the release, without including that update in the release. This happens to almost all GSDs imported by "third parties" outside our control, e.g. WCVP, WFO, Bryonames, all Lepidoptera, etc.
But datasets with metadata in imports are versioned fine; they are not a problem!
Describe the bug
The new CoL release of September 2024 contains wrong metadata for World Plants and World Ferns.
The real version of both GSDs is 19.4, Jun 2024 / 2024-06-30. (New data versions were indeed imported into CLB in September 2024, but I did not sync them into the September CoL release.) However, the incorrect version (24.9, Sep 2024) is shown in the GSD metadata in the September release:
https://www.catalogueoflife.org/data/dataset/1140
https://www.catalogueoflife.org/data/dataset/1141