DOIs for CF Convention releases? #127
An excellent idea, I think. |
Makes a lot of sense to me.
Cheers, Roy.
From: [email protected] <[email protected]> on behalf of David Hassell <[email protected]>
Sent: 18 January 2018 19:28
To: cf-convention/cf-conventions
Cc: Subscribed
Subject: Re: [cf-convention/cf-conventions] DOIs for CF Convention releases? (#127)
An excellent idea, I think.
|
Another option is to have a single DOI and recommend that users include the version number when citing CF. What URL should result when dereferencing a CF DOI? I would think either the main CF web page or the current CF specification document. |
It sounds like a good idea to assign DOIs to the CF convention documents. The content to which a DOI points has to be invariable. Therefore, a DOI can only be assigned to a particular version of the CF convention document and not to the CF conventions in general. |
I know that on some DOI services (e.g. https://zenodo.org/) you can have a unique DOI for each release, but also a generic DOI that always resolves to the latest version. I don't know if this feature is ubiquitous, though. For instance, https://doi.org/10.5281/zenodo.832255 resolves to the latest version of cf-python, whatever it may be. Right now it's v2.1, and v2.1 has its own DOI https://zenodo.org/record/1039367 |
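To make the two kinds of identifier in the comment above concrete, here is a minimal Python sketch. It assumes the standard `https://doi.org/` resolver prefix, and it assumes that a Zenodo DOI of the form `10.5281/zenodo.<id>` corresponds to the record page `https://zenodo.org/record/<id>`. The helper names are hypothetical, and the version DOI is inferred from the record URL quoted above rather than stated in the thread.

```python
import re

DOI_RESOLVER = "https://doi.org/"  # standard DOI proxy prefix


def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a bare DOI such as '10.5281/zenodo.832255'."""
    return DOI_RESOLVER + doi.strip()


def zenodo_record_url(doi: str) -> str:
    """Map a Zenodo DOI to its record page (assumed mapping, for illustration)."""
    match = re.fullmatch(r"10\.5281/zenodo\.(\d+)", doi.strip())
    if match is None:
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    return f"https://zenodo.org/record/{match.group(1)}"


# "Generic" DOI quoted in the comment above: always resolves to the latest release.
concept_doi = "10.5281/zenodo.832255"
# Assumption: the DOI matching the v2.1 record URL quoted in the comment above.
version_doi = "10.5281/zenodo.1039367"

print(doi_to_url(concept_doi))
print(zenodo_record_url(version_doi))
```

The point of the sketch is only that both identifiers are ordinary DOIs; what differs is whether the target moves with each release.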
The DOI itself is permanent, the URL that results from dereferencing the DOI can be changed. The object/concept the DOI identifies should be permanent. What that object/concept actually represents and the possible versioning of that object, I believe, is up to those stewarding that object. DataCite [1] is the DOI minting service I've used. Their metadata schema [2] includes a field for version information. There are some notes on versioning on page 28 of the "DataCite Metadata Schema Documentation for the Publication and Citation of Research Data" [3] including:
Not sure what other DOI minting services recommend or how this might work if using the GitHub DOI minting tie-in with FigShare. [2] http://doi.org/10.5438/0014 [3] https://schema.datacite.org/meta/kernel-4.1/doc/DataCite-MetadataKernel_v4.1.pdf |
Yes, I kind of like the idea of having a top-level DOI and one for each version. Though, more DOIs means more things to maintain and more DOIs to include when tracking citations. With a top-level DOI and individual version DOIs, what would be the recommended citation? Including the version information in the citation is more transparent (at least to the human eye). The DataCite metadata schema includes a |
OK, thanks for the clarification. I wasn't aware of that possibility. |
@davidhassell would you be willing to make this happen? |
OK, looks like I'll be the odd one out here. Let me ask a few questions:
I know the community likes DOIs, but I'm not convinced there is any analytical advantage to the function provided by the DOIs. |
I completely reject the idea that a URL on the internet is a suitable fixed point of reference. The "canonical URL" for the CF-conventions has changed over time, rendering unusable any publication citation that relied upon that. DOIs provide a fixed record suitable for citation that is capable of being updated to point to new "landing pages" for the same content. |
Assuming someone maintains the mapping between DOI and the intended digital object's current URL.
Otherwise, DOIs become stale unique strings the same as URLs do.
I said I'd stay out of the persistent identifier flame war, but I failed. Maybe we should use blockchain.
… On Jan 19, 2018, at 11:58 AM, Ryan May ***@***.***> wrote:
I completely reject the idea that a URL on the internet is a suitable fixed point of reference. The "canonical URL" for the CF-conventions has changed over time, rendering unusable any publication citation that relied upon that.
DOIs provide a fixed record suitable for citation that is capable of being updated to point to new "landing pages" for the same content.
|
Sure, everything digital needs upkeep; that's the blessing and the curse. It's not my area of expertise, so I'm not really qualified to debate this from an informed point of view. Therefore, when it comes to best practice for long-term reference and archival, I'll trust what the experts (i.e. digital library people) tell me to do: DOIs. |
The only reason canonical URIs have to change is that they have been chosen and managed without regard to their final purpose. (Something that DOIs are also vulnerable to, though I agree not as commonly.) Put me in the Cool URIs Don't Change camp. |
DOIs were designed to decouple content (CF conventions) from the
particulars of how and where it's served. In an ideal world URLs wouldn't
change, but we all know they do. It's much easier to update the location a
DOI resolves to than to set up forwarding from stable URLs on webservers
that you may or may not have access to, etc...
…On Fri, Jan 19, 2018 at 10:08 AM John Graybeal ***@***.***> wrote:
The only reason canonical URIs have to change is that they have been chosen
and managed without regard to their final purpose. (Something that DOIs are
also vulnerable to, though I agree not as commonly.) Put me in the Cool
URIs Don't Change camp.
|
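The decoupling argument in the comments above can be illustrated with a toy model of the indirection layer. This is only a sketch: `DoiRegistry` is a hypothetical in-memory class, not the API of any real DOI service, and the DOI and the old URL are placeholders. It shows that a citation holding the DOI keeps working when the landing page moves, provided someone actually updates the mapping.

```python
class DoiRegistry:
    """Toy in-memory stand-in for a DOI resolver (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._targets: dict[str, str] = {}

    def register(self, doi: str, url: str) -> None:
        """Record the initial landing page for a DOI."""
        self._targets[doi] = url

    def repoint(self, doi: str, new_url: str) -> None:
        """Change the landing page; the DOI string itself never changes."""
        if doi not in self._targets:
            raise KeyError(f"unknown DOI: {doi}")
        self._targets[doi] = new_url

    def resolve(self, doi: str) -> str:
        """Return the current landing page for a DOI."""
        return self._targets[doi]


registry = DoiRegistry()
cited_doi = "10.1234/example-cf"  # placeholder DOI, not a real one

registry.register(cited_doi, "https://old.example.org/cf")  # placeholder URL
registry.repoint(cited_doi, "https://cfconventions.org")    # site moved

# A paper that cited the DOI still reaches the content after the move.
print(registry.resolve(cited_doi))
```

If nobody calls `repoint` after a move, the DOI goes stale exactly as a URL would, which is the caveat raised earlier in the thread.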
I am happy to make something happen! The DOI server would, I think, keep a copy of the versioned document(s), thereby decoupling the need for a stable URL. |
TLDR version: I will not object further nor complain if you go the DOI path (except occasionally with a wink and nudge to close colleagues). Thanks for listening to my input! I just have a few followups, to fully explain my perspective.
I am not aware of DOI servers being used to archive content. In fact not sure how they would know what to archive, given they just point to another resource, which could have arbitrarily many links to its parts (if the document is maintained as a set of pages, for example). I'm interested to know more.
I accept the judgment of the library community that DOIs are perfect unique identifiers for bibliographic materials, that is their clear community choice. On the other hand, the expert librarians I talk to at Stanford are open to the possibility that DOIs are not the primary references for certain other kinds of digital content. The kind of content where I am most experienced is semantic content, where IRIs are the typical (but not universal) identifier of choice, because of the W3C semantic standards. So, in short, I think one identifier type does not fit all needs.
I accept that DOIs were designed to decouple content; they were poorly designed to resolve content, without knowing what to add to them to make them resolvable. That said, you can generally find a DOI with Google, and yes, DOIs are easy(ier) to re-point by design. I also concede that the DOI infrastructure is well-enough funded (and consistently-enough-used for this kind of thing) that the DOI infrastructure will not cause as many long-term headaches as most IRIs will.
So I will not be trying to argue further, but I do want to note:
These realities seem to map one-to-one with the realities of creating IRIs to decouple the content from the particulars of how and where it's served (I recommend Tim Berners-Lee's Cool URIs document, it's a short read and a fun bit of history). Either way, to have a successful persistent identifier, you have to be thoughtful, you have to invest resources in managing the maintenance and succession processes, and you have to understand that this is an indirection service that is run by an organization, one which you may or may not have full control over for the (eternal!) life of the identifier. If you manage those issues, either technology is equally effective, with only minor differences in cost-per-identifier and user pain to resolve the identifier. |
I just realized that netCDF also has its own DOI as mentioned here: https://www.unidata.ucar.edu/software/netcdf/docs/faq.html#How-should-I-cite-use-of-netCDF-software It is written (if the URL does not work at some point in the future):
|
Was there a conclusion to this issue? Is someone going to move it forward? |
Could we discuss this at the meeting in Reading in June? |
I believe that there was an agreement in Reading to create a DOI for the CF convention documentation. Is that correct? If so, shall we discuss the details on how to do it? We have a few options on how to implement it. One of them is using Zenodo as suggested by @rsignell-usgs, which would also archive the document itself as mentioned by @davidhassell, and would allow a general DOI grouping all releases as suggested by @ethanrd. I use Zenodo in other projects and it is minimal work to operate in a GitHub environment. I'm checking an alternative through the UCSD library, which offers similar resources, and I just learned that they operate in a partnership with NCAR. I'll post here once I have some news. |
@castelao , thanks for picking this issue up again! |
I was really impressed by Zenodo and think it would be a great idea - lots of benefits, low workload. |
The UCSD library could provide that, but they suggested using Zenodo since it can be integrated with GitHub, which I can confirm is nearly zero maintenance. My contact in the library also mentioned that they trust Zenodo due to the solid institutions that support it. I can do the repository setup to connect it with Zenodo automatically if there is a consensus to move this forward. |
As I recall, the decision at the Reading meeting was to mint a DOI for CF in general rather than for any particular version of any particular document. Is there a way using Zenodo with GitHub to mint a DOI that isn't associated with a particular document/artifact/release? Or, perhaps the overarching DOI should be tied to the CF web page repo rather than the CF conventions document repo. (Seems an appropriate repo since we want the DOI to dereference to https://cfconventions.org.) PS David and I have started on a meeting summary document. We'll share it out for comment and such once it isn't quite so rough. |
Sorry for the delay, I'm back. Thanks for the correction @ethanrd. Yes, I also recall an agreement for a single DOI. Although I would recommend using a master DOI with one child DOI for each release, it is possible to use a single DOI for the CF concept. Thus it would not be associated with a specific version. In that case, I would recommend that it point to the general https://cfconventions.org website, not the repository. My question is, how to move forward? If nobody says anything against this in 3 weeks, shall I start implementing such a single DOI? |
Creating a single DOI pointing to https://cfconventions.org would be great, I think, and what was decided at the Reading meeting. We didn't decide to not create further DOIs (e.g. for different conventions versions) simply because we couldn't decide in the limited time how best to proceed. These will come later ... |
Sounds like a thumbs up, @castelao ! |
Great! We need to put some information together to move this forward:
The other fields should be straightforward, but I would print it all here for approval before submitting it. |
I just created a PR (#507) which adds a CITATION.cff file. The .cff file is a possible alternative to creating a separate "How to Cite CF" page. Though it is pretty minimal and clunky so maybe we'll want to do both. For instance, I don't think there's a way to describe citing with the overarching CF DOI vs the CF version DOIs. |
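For readers unfamiliar with the format, the sketch below renders a minimal `CITATION.cff` from Python. It is not the file from PR #507: the title, version, DOI, author, and ORCID are placeholders, and `render_citation_cff` is a hypothetical helper. The keys used (`cff-version`, `message`, `title`, `version`, `doi`, `license`, `authors` with `family-names`, `given-names`, `orcid`) are taken from the Citation File Format 1.2.0 schema as I understand it.

```python
def render_citation_cff(title, version, doi, authors, license_id="CC0-1.0"):
    """Render a minimal CITATION.cff as text (illustrative sketch only)."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this work, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f'doi: "{doi}"',
        f'license: "{license_id}"',
        "authors:",
    ]
    for author in authors:
        lines.append(f'  - family-names: "{author["family"]}"')
        lines.append(f'    given-names: "{author["given"]}"')
        if author.get("orcid"):
            # CFF expects the ORCID as a full URL.
            lines.append(f'    orcid: "https://orcid.org/{author["orcid"]}"')
    return "\n".join(lines) + "\n"


# All values below are placeholders, not the real CF metadata.
example = render_citation_cff(
    title="Example Conventions Document",
    version="1.0",
    doi="10.1234/example",
    authors=[{"family": "Doe", "given": "Jane", "orcid": "0000-0000-0000-0001"}],
)
print(example)
```

Note that the format carries a single `doi` field at the top level, which is consistent with the limitation mentioned above about describing an overarching DOI alongside version DOIs.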
Dear @ethanrd, Gui @castelao, et al. Thanks for your PRs. #507 by Ethan to add the .cff file looks fine to me. Are you suggesting we add some text also in the CF convention document on "How to cite", Ethan? Shall we do that now, in this issue? #443 by Gui has some comments outstanding.
David made suggestions to update this text, to cover UGRID. David, I think it would be better to keep the .json description and the Abstract the same. Would it be OK with you if we conclude this issue first, and then you start a new one to update them both? This is the oldest open conventions issue, and I'm hopeful that we can conclude it very soon. We're nearly there! Best wishes Jonathan |
In website issue 182 Ethan @ethanrd wrote
I'm repeating that here because the licence has also been mentioned in PR #443 linked to this issue. Since Ethan commented two weeks ago today, I think we can regard the choice of CC0 as having been agreed if no-one objects before next Friday 9th. |
Hi Jonathan @JonathanGregory - Sorry, we may have rushed things a bit (after a very long wait). The CC0 license has already been implemented in both the conventions repo and the website repo (PR #504 and website PR #440). The decision was made back in 2022 (see this comment). As you quoted, there was discussion (long after the decision) but no objection. Two weeks ago I created the PRs. With the long delay and given the text is defined by the license/deed, we perhaps rushed the 3-week rule. |
Hi Jonathan @JonathanGregory, In terms of "How to Cite", I was actually thinking about a page on the CF website. But I do like the idea of having some mention in the CF Conventions document. Yes, I think we could discuss that here and start up another PR to add citation information. Should we start another issue in the website repo to discuss a "How to Cite CF" web page or can we do that here as well? |
Dear Ethan @ethanrd
Yes, I don't think that's a problem. Sorry if it looked like I was trying to turn the clock back. The discussion of the licence was mostly in the website repo, but obviously affects the conventions repo as well, and had been mentioned in this issue previously. Because it's an important decision and was an outstanding question on PR #443, I thought it would do no harm to state it clearly in this issue, so that we can be perfectly clear we've followed the usual decision process. I don't expect that anyone will object. Is it OK to leave this issue open for one more week? In any case, we have other things not quite concluded here. Best wishes Jonathan |
It's a good idea to put it on the website. Maybe we could agree the words, and where to put them on the website, in a website issue, and then return to this issue to decide where to put the same words in the conventions document? Also, I think it would be appropriate to state the licence in the conventions document as well. Cheers, Jonathan |
Dear @ethanrd, Gui @castelao and others Is the DOI for the conventions document specifically, or for CF as a whole, including the standard names? At the moment, #443 lists the authors of the convention document as the authors. That is consistent with what Ethan did for "How to cite", and makes sense if the DOI is for the conventions document. There is also a list of contributors. I think they are the contributors to the CF convention, excluding the authors, but I'm not sure. Has this been discussed before? I'm sorry, I don't remember. It feels a bit unfair to me to list all the contributors to the conventions but not the contributors to the standard names or to information management. But if we include all these lists, that will be a lot of people, and will require continual maintenance to keep them up to date. Instead of listing contributors by name, is there a way to refer to the CF website for those lists? Best wishes Jonathan |
Hi all, 1
To my understanding (I might very well be wrong here!) the .cff file is useful as a file in the repo as such, but does not add much to a website or an html document. So I guess that we could have one cff file for the 2 3
I don't know how these headers/footers are set up, but I imagine that the Info Management team could fix this. 4 5
But there is nothing like a header or footer, because the html version is just one long page |
@JonathanGregory, my understanding is that #443 was for the DOI of the CF-Conventions only. And an equivalent would be done for standard names. I think it is important to have the names, with affiliations and preferably with ORCIDs, of contributors in the .zenodo.json, thus on the DOI record. The citation text is less important since, with the DOI record, everyone is appropriately linked. Indeed, there is some work to keep this list updated, but I think it is important to give such credit if we expect the community to dedicate time to contributing to CF. My understanding is that 'CITATION.cff' is meant for machine-to-machine communication. For the first time, we will have to link the repo with Zenodo, and after that, the following releases will trigger a new DOI automatically. During this very first time, it will create an overarching DOI (in this case, overarching among the CF-Conventions versions, not overarching for the whole CF) and another one specific for the release. The following ones will trigger just a new release DOI. I think the 'how to cite' text is important, and I usually instruct people to cite the release DOI only when referring to a specific version, otherwise, the default would be to cite the overarching one. |
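The citing rule described in the comment above (cite the release DOI only when referring to a specific version, otherwise cite the overarching one) can be sketched as a small lookup. The function name and all DOIs below are hypothetical placeholders; no real CF DOIs are assumed.

```python
def doi_to_cite(concept_doi, version_dois, version=None):
    """Pick the DOI to cite.

    With no version given, return the overarching (concept) DOI.
    With a version, return that release's DOI, or fail loudly if none is recorded.
    """
    if version is None:
        return concept_doi
    if version not in version_dois:
        raise KeyError(f"no DOI recorded for version {version!r}")
    return version_dois[version]


# Placeholder identifiers for illustration only.
CONCEPT_DOI = "10.1234/example.100"
VERSION_DOIS = {"1.11": "10.1234/example.101", "1.12": "10.1234/example.102"}

print(doi_to_cite(CONCEPT_DOI, VERSION_DOIS))          # general reference
print(doi_to_cite(CONCEPT_DOI, VERSION_DOIS, "1.11"))  # specific release
```

Raising on an unknown version, rather than falling back to the overarching DOI, is a design choice in this sketch: it avoids silently citing something other than the version the author meant.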
Dear all I agree with Gui @castelao that public credit should be given to the contributors to the conventions for the effort they have dedicated. That is indeed the purpose of the list on the website, which so far I have compiled and maintained. Does everyone agree that we need to list them all individually in If this is necessary, then I think the two lists should be identical, because that will simplify the maintenance. At present, I think As Gui says, we could assign a DOI to the standard name table. I think the contributors to standard names would be the authors of that document, because there are no other authors. I am concerned that the contributions to information management should be recognised as well. Many of their contributions aren't specific to conventions or standard names. They should therefore be identified as contributors to both, I suppose, which I think is another argument for giving URLs to the lists in I agree with @larsbarring that the Best wishes Jonathan |
I agree that all authors should be listed in both the Does anyone know how contributor information in a As far as I can see, the CITATION.cff file is only used to automatically add a citation link/dropdown to the right-hand navigation on the GH repo main page (see my test repo for an example). So, I think the .cff file should be kept pretty minimal, i.e., not adding any information that doesn't show up in the drop-down. @JonathanGregory - I agree with your earlier comments on discussing the content of a "How to Cite CF" website page in a website repo issue. I will try to start an issue for that in the next few days. Unless we need a |
Just to be perfectly clear, no-one objected regarding this:
As @ethanrd said, CC0 has been implemented by stating it in |
@JonathanGregory, I copied the list of contributors (website) to create Zenodo's contributors list. If anyone is missing, it was my mistake. Do you know who is missing? Note that I intentionally removed all authors from the contributors list, since the authors list is a higher 'rank', and my understanding is that it wouldn't require such redundancy. There is no restriction on adding them as well if you prefer. I see the authors' list as the equivalent of the authors of a peer-reviewed paper, and contributors would be the equivalent of everyone in the acknowledgments of that paper. There is value in finding and fixing a typo and it should be recognized, but it should also be clear that it is different from the time committed by the authors. @ethanrd, the value of having everyone explicitly listed with their respective ORCIDs is that the DOI links everyone. Those are fields in the DOI database. This is very different from having text on a website or in the document itself. The DOI database links objects with authors and contributors, and much more metadata for machine-to-machine communication. If you go to the ORCID of those authors and contributors, CF would be listed. Only the 'rank' authors would show up in the citation text. If you look at my ORCID, there is a mix of peer-reviewed papers, software, and data, each one with a different 'rank', as author or contributor. It is important to include the DOI that will be generated in the |
Hi Gui @castelao - Do you mean that the DOI metadata gets automatically harvested (by Zenodo or DataCite, I guess) and pushed to ORCID metadata? That definitely makes it worth listing everybody. Yes, I agree, the actual CF Conventions DOI (the top-level one) needs to be included in the Once we agree on and merge the |
Very close to that. My understanding is that Zenodo has authority to record
a new DOI, and that information propagates automatically (I don't know the
actual details) to other entities, so that ORCID gets those references
automatically.
Once we merge this zenodo.json PR, it might be worth doing a brief Zoom so
we can clarify any questions and run it together.
…On Thu, Feb 15, 2024 at 1:04 PM Ethan Davis ***@***.***> wrote:
Hi Gui @castelao <https://github.com/castelao> - Do you mean that the DOI
metadata gets automatically harvested (by Zenodo or DataCite, I guess) and
pushed to ORCID metadata? That definitely makes it worth listing everybody.
Yes, I agree, the actual CF Conventions DOI (the top-level one) needs to
be included in the CITATION.cff. Yes, the metadata in the CITATION.cff
file could be used beyond the GH dropdown menu. I'm just not sure it is
currently.
Once we agree on and merge the zenodo.json file and then connect the repo
to Zenodo, can we force the minting of DOIs before the release of CF v1.12?
I think a GH release is required. Perhaps we could do a "blank" release to
mint the initial DOIs, delete the release, and then edit the DOI metadata
in Zenodo to point the version DOI to v1.11.
|
Dear Gui @castelao and @ethanrd I agree there is an advantage to listing all the contributors in the Yes, I think the authors of the CF document should also be listed as contributors, as they are in the contributors page. This reflects their contribution to discussions on agreeing conventions, not their authorship of parts of the text. Since this means we will have two lists of contributors, we will have to keep them consistent, and we could think about how to automate that once Cheers Jonathan |
Sounds good. I'll add back the authors into the contributors at
`zenodo.json`.
I like the idea of importing contributors from `zenodo.json` to keep both
consistent.
…On Fri, Feb 16, 2024 at 7:02 AM JonathanGregory ***@***.***> wrote:
Dear Gui @castelao <https://github.com/castelao> and @ethanrd
<https://github.com/ethanrd>
I agree there is an advantage to listing all the contributors in the
zenodo.json file with their ORCIDs, if that has the effect of linking the
CF DOIs automatically to their ORCIDs. Thanks for clarifying.
Yes, I think the authors of the CF document should also be listed as
contributors, as they are in the contributors
<https://cfconventions.org/conventions_contributors.html> page. This
reflects their contribution to discussions on agreeing conventions, not
their authorship of parts of the text. Since this means we will have two
lists of contributors, we will have to keep them consistent, and we could
think about how to automate that once zenodo.json is in place. Maybe
conventions_contributors.md could be generated from zenodo.json whenever
the latter is updated?
Cheers
Jonathan
|
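The idea above of generating `conventions_contributors.md` from `zenodo.json` could look roughly like the following sketch. The sample metadata is entirely made up (names, affiliations, and ORCIDs are placeholders), the Markdown layout is an assumption rather than the website's actual format, and `contributors_markdown` is a hypothetical helper. The `creators` and `contributors` keys with `name`, `affiliation`, `orcid`, and `type` follow the Zenodo metadata layout as I understand it.

```python
import json

# Placeholder metadata in the general shape of a .zenodo.json file.
SAMPLE_ZENODO_JSON = """
{
  "title": "Example conventions document",
  "license": "CC0-1.0",
  "creators": [
    {"name": "Doe, Jane", "affiliation": "Example Institute",
     "orcid": "0000-0000-0000-0001"}
  ],
  "contributors": [
    {"name": "Roe, Richard", "affiliation": "Example University", "type": "Other"},
    {"name": "Doe, Jane", "affiliation": "Example Institute",
     "orcid": "0000-0000-0000-0001", "type": "Other"}
  ]
}
"""


def contributors_markdown(zenodo_json_text):
    """Render the 'contributors' list as a Markdown page, sorted by name."""
    metadata = json.loads(zenodo_json_text)
    lines = ["# Contributors", ""]
    for person in sorted(metadata.get("contributors", []), key=lambda p: p["name"]):
        entry = f"- {person['name']}"
        if person.get("affiliation"):
            entry += f" ({person['affiliation']})"
        if person.get("orcid"):
            entry += f", ORCID [{person['orcid']}](https://orcid.org/{person['orcid']})"
        lines.append(entry)
    return "\n".join(lines) + "\n"


print(contributors_markdown(SAMPLE_ZENODO_JSON))
```

In the sample, the author also appears in `contributors`, matching the decision above to list authors in both places; a script like this could run whenever the JSON file changes so the two lists cannot drift apart.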
I am planning to open a PR to add the CC0 license information to the conventions document. But we should also add the DOI for the particular CF version in the document. Does anyone know how we obtain a separate DOI for each version, and how/if we can get it ahead of actually submitting it to Zenodo (else we have a chicken-and-egg problem)? |
From @castelao #443 (comment):
|
As the associated PR #443 includes ORCIDs for many authors, can cf-convention/discuss#178 also be closed at the same time as this issue is closed? |
Good idea, @larsbarring. I have linked cf-convention/discuss#178 to PR #443 as well, so it should be closed automagically upon merging. Thanks. |
Another thought --- do we want to close this issue when PR #443 is merged ?
... probably there are more aspects that I have missed OR, do we want to keep this issue open and keep the "coordination" of these tasks here |
Dear Lars Thanks for reminding us of all these loose ends. Since there are discussions to be had about what we want, such as you mention, I suggest it would be good to let this issue be closed and start again in a new Discussion (rather than an issue) to consider what else needs to be done. The CFF file could have its own issue to accompany the PR that @ethanrd did. That isn't the same thing as adding the DOI. It's related only because the citation lists the DOI. Best wishes Jonathan |
Referring to Lars's comment above:
in issue 513, Ethan comments that the first can't be done and the other two will be done automatically by Zenodo once the PR for this issue is merged.
I have just added this to the related discussion 296 about old versions of the standard name table.
are now also raised in issue 513. Two weeks ago I asked if it was OK now to merge @castelao's PR, which will implement DOIs for the conventions by Zenodo. @castelao asked the same question. No-one has objected. If no-one objects today, I will merge the PR tomorrow and thus close this issue (the oldest one which is outstanding) - unless someone else does it before me! |
Seems like getting a new DOI for each release of CF would be a good idea.
And getting a DOI is pretty easy for GitHub releases:
https://guides.github.com/activities/citable-code/
What do folks think?
In March 2023 the CF governance panel decided to use Zenodo for CF DOIs, as reported by Ethan @ethanrd. After the annual meeting in September 2023, Gui @castelao prepared pull request 443 to support CF's adoption of GitHub/Zenodo integration.