Update old DataCite schema records to 4.5 #540

Open
1 of 5 tasks
mariagould opened this issue Jan 7, 2024 · 17 comments
Assignees
Labels
datacite Work related to DataCite support in EZID epic High-level project with multiple sub-issues

Comments

@mariagould

mariagould commented Jan 7, 2024

As of January 2025, DataCite will require that DOIs be registered and updated with schema version 4.0 or newer. See the initial announcement: https://datacite.org/blog/deprecating-schema-3/

To keep up with DataCite policies and practices, EZID's DataCite configuration needs to be updated so that DataCite DOIs can no longer be registered or updated with schema versions older than 4.0. This affects DOIs created via the API, UI, and XML deposits. Users will need to be informed about the change in advance and provided with guidance about upgrading.

Steps

  • Identify DataCite DOIs registered with schema versions older than 4.0
  • Notify users associated with these DOIs that they will need to update their metadata by X date, and if not, their metadata may be updated on their behalf
  • Determine whether and how to update metadata on users' behalf if they fail to take action
  • Change default settings in EZID to prevent DataCite DOIs from being registered or updated with schema versions older than 4.0
  • Update documentation as appropriate

Planned Workflow after Jan 2025
Creating new ID:

  • creating 2.x record (API): receive 2.x records -> report error
  • creating 3.x record (API): receive 3.x records -> report error
  • creating 4.x record (API): receive 4.x records -> created in schema 4.x
  • UI: record will be created in schema 4.x

Updating:

  • 2.x record (API and UI) -> report error
  • 3.x record (API and UI) -> report error
  • 4.x record (API and UI) -> schema 4.x in EZID and DataCite
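The planned workflow above amounts to a version gate on incoming XML. A minimal sketch of such a check, assuming the schema version is read from the root element's namespace (`http://datacite.org/schema/kernel-N`); the function names `schema_major_version` and `check_schema_allowed` are hypothetical and not part of EZID:

```python
import re
import xml.etree.ElementTree as ET

# DataCite root elements are namespaced as .../kernel-2.2, .../kernel-3, .../kernel-4.
_KERNEL_TAG = re.compile(
    r"^\{http://datacite\.org/schema/kernel-(\d+)(?:\.\d+)?\}resource$"
)


def schema_major_version(xml_text):
    """Return the major DataCite schema version declared by the root namespace."""
    root = ET.fromstring(xml_text)
    match = _KERNEL_TAG.match(root.tag)
    if match is None:
        raise ValueError("not a DataCite <resource> record")
    return int(match.group(1))


def check_schema_allowed(xml_text, minimum=4):
    """Raise ValueError for records older than `minimum`; return the major version otherwise."""
    major = schema_major_version(xml_text)
    if major < minimum:
        raise ValueError(
            f"DataCite schema {major}.x is no longer accepted; use {minimum}.x"
        )
    return major
```

With this shape, 2.x and 3.x records would be rejected with an error and 4.x records would pass through, matching the create and update rules listed above.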
@mariagould mariagould added the datacite Work related to DataCite support in EZID label Jan 7, 2024
@jsjiang jsjiang changed the title from "Deprecate DataCite schema versions older than 4.x" to "Upgrade old DataCite schema versions to 4.5" Apr 9, 2024
@jsjiang
Contributor

jsjiang commented Apr 9, 2024

Rushiraj created a ticket for a similar topic, #559, with information on how to retrieve DataCite records by schema version.

To get stats on IDs with Schema 3 versions for a specific repository (e.g., cdl.cdl):

curl --location 'https://api.datacite.org/dois?client-id=cdl.cdl&schema-version=3'

@jsjiang jsjiang changed the title from "Upgrade old DataCite schema versions to 4.5" to "Update old DataCite schema records to 4.5" Apr 9, 2024
@jsjiang
Contributor

jsjiang commented Apr 9, 2024

Jing created a related/duplicate ticket, #556, with additional info. Copying the additional info over and closing the duplicate ticket.

We received an email from DataCite regarding the Schema 3 deprecation schedule, with a request to update metadata to Schema 4.

From: Kelly Stathis [email protected]
Date: Tuesday, January 30, 2024 at 9:00 AM
To: EZID [email protected], Rushiraj Nenuji [email protected], John Chodacki [email protected], Jing Jiang [email protected]
Subject: Action Required: Schema 3 usage within your consortium
CAUTION: EXTERNAL EMAIL
Dear California Digital Library team,

I'm writing to share that DataCite plans to deprecate Schema 3 on January 1, 2025, and to request your assistance with communicating this change to the Consortium Organizations within your consortium.

You can read more about what will change here: https://support.datacite.org/docs/updating-from-schema-3-to-schema-4. Once we deprecate Schema 3, repositories will be required to use Schema 4 for DOI registration and metadata updates.

There are 8 Repositories in your consortium with at least one Schema 3 DOI. Of these, 2 actively used Schema 3 in the past year to register or update DOIs. The Repositories actively using Schema 3 will be impacted by this change.

To assist you in understanding this usage, I have attached a spreadsheet of Repositories in your consortium to this email. This is broken down as follows:

• Count of DOIs (Total)
• Count of DOIs registered/updated in 2023
• Count of Schema 3 DOIs
• Count of Schema 3 DOIs registered/updated in 2023
• Count of Schema 3 DOIs missing resourceTypeGeneral
• Count of Schema 3 DOIs missing resourceTypeGeneral registered/updated in 2023
• Count of Schema 3 DOIs with contributorType "Funder"
• Count of Schema 3 DOIs with contributorType "Funder" registered/updated in 2023

The counts of DOIs missing resourceTypeGeneral and using contributorType "Funder" are included because these DOIs are not compatible with Schema 4. For more information, please see the FAQ covering differences between Schema 3 and Schema 4.

Please work with your Consortium Organizations as soon as possible to ensure that each has sufficient time to update their systems and workflows to use DataCite Metadata Schema 4. We're available to answer any questions you have about the process.

Best regards,
Kelly


Kelly Stathis | Technical Community Manager | DataCite
E: [email protected] | ORCID
W: datacite.org | Blog | Twitter | LinkedIn
Support Desk | Support Site | PID Forum

@jsjiang
Contributor

jsjiang commented Apr 9, 2024

DataCite report (Jan 2024) on Schema 3 usage within your consortium:

cdlco.csv

| Repo ID | Repo Name | Total DOIs | Total V3 DOIs | V3 DOIs missing resourceTypeGeneral | V3 DOIs with contributorType "Funder" |
|---|---|---|---|---|---|
| cdl.ucb | UC Berkeley | 39,496 | 24,524 | 7,574 | 0 |
| cdl.ucsb | UC Santa Barbara | 131,803 | 3,856 | 26 | 0 |
| cdl.cdl | CDL | 20,645 | 3,851 | 18 | 0 |
| cdl.ucla | UC Los Angeles | 10,496 | 0 | 0 | 0 |
| cdl.ucsd | UC San Diego | 129,765 | 632 | 530 | 0 |
| cdl.ucr | UC Riverside | 136 | 0 | 0 | 0 |
| cdl.uci | UC Irvine | 1,414 | 3 | 0 | 1 |
| cdl.ucsc | UC Santa Cruz | 146 | 0 | 0 | 0 |
| cdl.ucd | UC Davis | 221 | 1 | 0 | 0 |
| cdl.ucsf | UC San Francisco | 32 | 9 | 0 | 0 |
| cdl.ucm | UC Merced | 5 | 1 | 0 | 0 |

Query to find Schema 3 records:

Query to find Schema 3 records that are missing resourceTypeGeneral:

Query to find schema 3 records that use the contributorType "Funder"

@jsjiang
Contributor

jsjiang commented Apr 9, 2024

  • Identify DataCite DOIs registered with schema versions older than 4.0

Records by schema versions (https://doi.datacite.org/providers/cdlco/dois):

  • Schema 3: 32,691
  • Schema 2.2: 12,190
  • Schema 2.1: 4

v2.1 records: https://doi.datacite.org/providers/cdlco/dois?schema-version=2.1:

Version 3 and version 2.2 records are retrieved and saved in the Google Drive folder:

@jsjiang
Contributor

jsjiang commented Apr 9, 2024

  1. EZID saves DataCite records in two formats:
  • key/value pair in "datacite: xml doc" format
  • key/value pairs in "datacite.fieldname: value" format

Code for validating and formatting:

ezidapp.models.identifier.IdentifierBase.clean():

    def clean(self):
        self.baseClean()
        if self.isAgentPid:
            self.cleanAgentPid()
        self.cleanCitationMetadataFields()
        self.checkMetadataRequirements()
        self.computeComputedValues()

Notes:

  • The checkMetadataRequirements() function calls formRecord to generate an XML record. However, when the record is in the "datacite.fieldname: value" format, the generated XML record is not used afterwards (the metadata field is not updated to the XML version), so the "datacite.fieldname: value" format record is saved as is in EZID.
  2. EZID converts a "datacite.fieldname: value" format record to XML, based on the metadata schema, when registering the record with DataCite. All records in DataCite are therefore in XML format.

proc-datacite.py => _create_or_update() => impl.datacite.uploadMetadata() => impl.datacite.formRecord(): Form an XML record for upload to DataCite, employing metadata mapping if necessary

METADATA_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns="http://datacite.org/schema/kernel-4"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://datacite.org/schema/kernel-4
    http://schema.datacite.org/meta/kernel-4/metadata.xsd">
  <identifier identifierType="{}">{}</identifier>
  <creators>
    <creator>
      <creatorName>{}</creatorName>
    </creator>
  </creators>
  <titles>
    <title>{}</title>
  </titles>
  <publisher>{}</publisher>
  <publicationYear>{}</publicationYear>
"""
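The six `{}` slots in the template are positional (identifierType, identifier, creatorName, title, publisher, publicationYear), and each value has to be XML-escaped before substitution. A minimal sketch of that step; `fill_template` is a hypothetical helper, not EZID's formRecord, and it takes the template as an argument because the excerpt above stops before the closing `</resource>` tag:

```python
from xml.sax.saxutils import escape


def fill_template(template, *values):
    """Escape each value for XML, then substitute into the positional {} slots."""
    # Quotes are escaped too, since the first slot sits inside an attribute value.
    safe = (escape(str(value), {'"': "&quot;"}) for value in values)
    return template.format(*safe)
```

A call would look like `fill_template(METADATA_TEMPLATE, "DOI", doi, creator, title, publisher, year)`, with the remaining elements and closing tag appended by the caller.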

@jsjiang
Contributor

jsjiang commented Jun 4, 2024

The upgradeDcmsRecord function in datacite.py was developed to convert a DataCite Metadata Schema record to the latest version (currently, version 4). What it does currently:

  1. Convert resourceType and resourceTypeGeneral to a version 4 compatible format
  • If the record does not contain a resourceType element:
    • Create one: (:unav)
  • If the record contains a resourceType element:
    • If resourceTypeGeneral="Film", change it to "Audiovisual";
    • If the resourceTypeGeneral attribute is not defined: report an error.
  2. Handle the contributor type "Funder" that went away in version 4
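The resourceType rules in step 1 can be sketched as follows. This is an illustration only, not the upgradeDcmsRecord code: `upgrade_resource_type` is a hypothetical name, it uses the standard-library ElementTree, it leaves the record's namespace unchanged, and the `default_general="Other"` value for a newly created element is an assumption (the list above only gives the element text `(:unav)`).

```python
import xml.etree.ElementTree as ET


def upgrade_resource_type(root, default_general="Other"):
    """Apply the resourceType rules above to a parsed <resource> element, in place."""
    # Reuse the root's namespace prefix, e.g. "{http://datacite.org/schema/kernel-3}".
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    element = root.find(f"{ns}resourceType")
    if element is None:
        # No resourceType: create one with the "(:unav)" placeholder text.
        element = ET.SubElement(root, f"{ns}resourceType")
        element.text = "(:unav)"
        element.set("resourceTypeGeneral", default_general)  # assumed default
    elif element.get("resourceTypeGeneral") is None:
        raise ValueError("resourceType has no resourceTypeGeneral attribute")
    elif element.get("resourceTypeGeneral") == "Film":
        # "Film" was renamed to "Audiovisual" in later schema versions.
        element.set("resourceTypeGeneral", "Audiovisual")
    return root
```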

@jsjiang
Contributor

jsjiang commented Jun 13, 2024

Retrieved DataCite 3 records by campus:

  • Counts of DOIs for each campus have changed slightly compared with DataCite's January report in the attached cdlco.csv file
  • Campuses with DataCite 3 records:
    • UCB
    • UCSB
    • CDL
    • UCSD

Sample command:

curl --location 'https://api.datacite.org/dois?client-id=cdl.cdl&schema-version=3' > datacite_cdl.cdl_v3.json

Record files are saved in the Google Drive folder EZID/Identifiers/DataCite/DataCite_3_records

Note:
Each file only contains 25 records (1st page with default size). Find a way to retrieve all records for each campus.

DataCite API offers two pagination options:

  • Page number: use page[number] and page[size]; can retrieve up to 10,000 records
  • Cursor based: use page[cursor]; no limit on the number of records that can be retrieved

Example to retrieve the first 1,000 records:

curl --globoff --location "https://api.datacite.org/dois?client-id=cdl.cdl&schema-version=3&page[cursor]=1&page[size]=1000" > datacite_cdl.cdl_v3_1.json

Results file contains total records and page counts, plus the URL for retrieving the next page:

"meta": {
    "total": 3567,
    "totalPages": 4,
    ...
},
"links": {
    "self": "https://api.datacite.org/dois?client-id=cdl.cdl&schema-version=3&page[cursor]=1&page[size]=1000",
    "next": "https://api.datacite.org/dois?client-id=cdl.cdl&page%5Bcursor%5D=MTQzODQzNzE4OTAwMCwxMC4xNTE0NC9wbC1jNDkuMzY1&page%5Bsize%5D=1000"
}

Note: need to manually add search criteria "schema-version=3" to the next page url:
Change from:
https://api.datacite.org/dois?client-id=cdl.cdl&page%5Bcursor%5D=MTQzODQzNzE4OTAwMCwxMC4xNTE0NC9wbC1jNDkuMzY1&page%5Bsize%5D=1000

To:
https://api.datacite.org/dois?client-id=cdl.cdl&schema-version=3&page%5Bcursor%5D=MTQzODQzNzE4OTAwMCwxMC4xNTE0NC9wbC1jNDkuMzY1&page%5Bsize%5D=1000
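The manual fix-up of the next-page URL can be automated. A minimal sketch, assuming each response carries its records under `data` and the next-page URL under `links.next` as in the excerpt above; `ensure_query_param`, `http_get_json`, and `fetch_all` are hypothetical helper names, and the `fetch` argument exists only so the loop can be exercised without network access:

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def ensure_query_param(url, key, value):
    """Return url with key=value in its query string, adding it only if absent."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if not any(name == key for name, _ in params):
        params.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


def http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def fetch_all(first_url, schema_version="3", fetch=http_get_json):
    """Follow links.next until exhausted, re-adding schema-version on every page."""
    records = []
    url = first_url
    while url:
        page = fetch(url)
        records.extend(page.get("data", []))
        next_url = page.get("links", {}).get("next")
        # The "next" link drops the schema-version filter, so put it back.
        url = (
            ensure_query_param(next_url, "schema-version", schema_version)
            if next_url
            else None
        )
    return records
```

Re-adding the filter on every page keeps later pages from silently including records of other schema versions.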

@jsjiang
Contributor

jsjiang commented Jun 18, 2024

  • Created script scripts/retrieve_datacite_records.py to automatically retrieve DataCite records and produce DOI lists.
  • The script performs 3 types of query for each campus:
    • v3 records
    • v3 records that are missing the resourceTypeGeneral property
    • v3 records using the contributor funder field
  • Output record files are named in campus_id_v3_querytype_page[no].json format. Page size is set to 1000 records.
  • DOI lists are named in campus_id_v3_querytype.txt format
  • Output files are saved in the Google Drive folder EZID/Identifiers/DataCite/DataCite_3_records
  • Total v3 records: 33,025
  • v3 records without resourceTypeGeneral: 8,609
  • v3 records using funder as contributor: 0

Counts of DOIs by campus and by categories (Retrieved on June 17, 2024):

(ezid-py38) CDL-jjiang-9m:datacite_records jjiang$ wc -l *.txt
    3567 cdl.cdl_v3.txt
      10 cdl.cdl_v3_wo_res_type_gen.txt
       0 cdl.cdl_v3_wt_contrib_funder.txt
   24983 cdl.ucb_v3.txt
    8043 cdl.ucb_v3_wo_res_type_gen.txt
       0 cdl.ucb_v3_wt_contrib_funder.txt
       0 cdl.ucd_v3.txt
       0 cdl.ucd_v3_wo_res_type_gen.txt
       0 cdl.ucd_v3_wt_contrib_funder.txt
       0 cdl.uci_v3.txt
       0 cdl.uci_v3_wo_res_type_gen.txt
       0 cdl.uci_v3_wt_contrib_funder.txt
       0 cdl.ucla_v3.txt
       0 cdl.ucla_v3_wo_res_type_gen.txt
       0 cdl.ucla_v3_wt_contrib_funder.txt
       0 cdl.ucm_v3.txt
       0 cdl.ucm_v3_wo_res_type_gen.txt
       0 cdl.ucm_v3_wt_contrib_funder.txt
       0 cdl.ucr_v3.txt
       0 cdl.ucr_v3_wo_res_type_gen.txt
       0 cdl.ucr_v3_wt_contrib_funder.txt
    3843 cdl.ucsb_v3.txt
      26 cdl.ucsb_v3_wo_res_type_gen.txt
       0 cdl.ucsb_v3_wt_contrib_funder.txt
       0 cdl.ucsc_v3.txt
       0 cdl.ucsc_v3_wo_res_type_gen.txt
       0 cdl.ucsc_v3_wt_contrib_funder.txt
     632 cdl.ucsd_v3.txt
     530 cdl.ucsd_v3_wo_res_type_gen.txt
       0 cdl.ucsd_v3_wt_contrib_funder.txt
       0 cdl.ucsf_v3.txt
       0 cdl.ucsf_v3_wo_res_type_gen.txt
       0 cdl.ucsf_v3_wt_contrib_funder.txt
    3567 cdl.cdl_v3.txt
   24983 cdl.ucb_v3.txt
       0 cdl.ucd_v3.txt
       0 cdl.uci_v3.txt
       0 cdl.ucla_v3.txt
       0 cdl.ucm_v3.txt
       0 cdl.ucr_v3.txt
    3843 cdl.ucsb_v3.txt
       0 cdl.ucsc_v3.txt
     632 cdl.ucsd_v3.txt
       0 cdl.ucsf_v3.txt
   33025 total
      10 cdl.cdl_v3_wo_res_type_gen.txt
    8043 cdl.ucb_v3_wo_res_type_gen.txt
       0 cdl.ucd_v3_wo_res_type_gen.txt
       0 cdl.uci_v3_wo_res_type_gen.txt
       0 cdl.ucla_v3_wo_res_type_gen.txt
       0 cdl.ucm_v3_wo_res_type_gen.txt
       0 cdl.ucr_v3_wo_res_type_gen.txt
      26 cdl.ucsb_v3_wo_res_type_gen.txt
       0 cdl.ucsc_v3_wo_res_type_gen.txt
     530 cdl.ucsd_v3_wo_res_type_gen.txt
       0 cdl.ucsf_v3_wo_res_type_gen.txt
    8609 total
       0 cdl.cdl_v3_wt_contrib_funder.txt
       0 cdl.ucb_v3_wt_contrib_funder.txt
       0 cdl.ucd_v3_wt_contrib_funder.txt
       0 cdl.uci_v3_wt_contrib_funder.txt
       0 cdl.ucla_v3_wt_contrib_funder.txt
       0 cdl.ucm_v3_wt_contrib_funder.txt
       0 cdl.ucr_v3_wt_contrib_funder.txt
       0 cdl.ucsb_v3_wt_contrib_funder.txt
       0 cdl.ucsc_v3_wt_contrib_funder.txt
       0 cdl.ucsd_v3_wt_contrib_funder.txt
       0 cdl.ucsf_v3_wt_contrib_funder.txt
       0 total

@adambuttrick

Noting change in v3 record counts from January 2024:

| repo_id | 2024-01 | 2024-06-17 | change |
|---|---|---|---|
| cdl.cdl | 3851 | 3567 | -284 |
| cdl.ucb | 24524 | 24983 | +459 |
| cdl.ucsb | 3856 | 3843 | -13 |
| cdl.ucsd | 632 | 632 | 0 |

@jsjiang
Copy link
Contributor

jsjiang commented Jun 21, 2024

Retrieved v2.2 records using scripts/retrieve_datacite_records.py (with some modifications).

   11171 cdl.cdl_v22.txt
      49 cdl.ucb_v22.txt
       0 cdl.ucd_v22.txt
       1 cdl.uci_v22.txt
       0 cdl.ucla_v22.txt
       0 cdl.ucm_v22.txt
       0 cdl.ucr_v22.txt
     753 cdl.ucsb_v22.txt
       0 cdl.ucsc_v22.txt
      23 cdl.ucsd_v22.txt
       0 cdl.ucsf_v22.txt
   11997 total
   11169 cdl.cdl_v22_wo_res_type_gen.txt
       0 cdl.ucb_v22_wo_res_type_gen.txt
       0 cdl.ucd_v22_wo_res_type_gen.txt
       0 cdl.uci_v22_wo_res_type_gen.txt
       0 cdl.ucla_v22_wo_res_type_gen.txt
       0 cdl.ucm_v22_wo_res_type_gen.txt
       0 cdl.ucr_v22_wo_res_type_gen.txt
      98 cdl.ucsb_v22_wo_res_type_gen.txt
       0 cdl.ucsc_v22_wo_res_type_gen.txt
       0 cdl.ucsd_v22_wo_res_type_gen.txt
       0 cdl.ucsf_v22_wo_res_type_gen.txt
   11267 total
       0 cdl.cdl_v22_wt_contrib_funder.txt
       0 cdl.ucb_v22_wt_contrib_funder.txt
       0 cdl.ucd_v22_wt_contrib_funder.txt
       0 cdl.uci_v22_wt_contrib_funder.txt
       0 cdl.ucla_v22_wt_contrib_funder.txt
       0 cdl.ucm_v22_wt_contrib_funder.txt
       0 cdl.ucr_v22_wt_contrib_funder.txt
       0 cdl.ucsb_v22_wt_contrib_funder.txt
       0 cdl.ucsc_v22_wt_contrib_funder.txt
       0 cdl.ucsd_v22_wt_contrib_funder.txt
       0 cdl.ucsf_v22_wt_contrib_funder.txt
       0 total

@adambuttrick

Moving to backlog. The plan is to send out a message in the fall to users with records in the old versions of the schema, give them a chance to upgrade, and convert their records if they have not done so after DataCite deprecates the old schema versions in 2025.

@adambuttrick adambuttrick self-assigned this Oct 5, 2024
@adambuttrick

adambuttrick commented Oct 5, 2024

I refactored the retrieve records script to identify the < 4.x schema records by shoulder so we can contact the corresponding users. Updated files are here.

v2_Unique_Shoulder Count
10.5060/d2 1
10.5060/d4 11170
10.5062/f4 464
10.5063/aa 15
10.6085/aa 274
10.6080/k0 49
10.4246/10 8
10.4246/ca 14
10.4246/cw 1
10.7280/s9 1
v3_Unique_Shoulder Count
10.18118/g6 13
10.21418/g8 2
10.6078/d1 21
10.6078/j8 19
10.6078/m7 5
10.7297/x2 8025
10.7299/x7 16881
10.7928/h6 9
10.15144/lt 412
10.15144/mk 474
10.15144/pl 2624
10.5060/d2 11
10.5060/d8 2
10.7293/w2 44
10.18739/a2 2171
10.21229/m9 3
10.25494/p6 42
10.5062/f4 251
10.5063/08 1
10.5063/3x 1
10.5063/5t 1
10.5063/7p 1
10.5063/9k 1
10.5063/aa 1
10.5063/bg 1
10.5063/cf 1
10.5063/f1 138
10.5063/f7 1
10.5063/g7 1
10.5063/h7 1
10.5063/k0 1
10.5063/m0 1
10.5063/ns 1
10.5063/pr 1
10.5063/qr 1
10.5063/sj 1
10.5063/th 1
10.5063/vh 1
10.5063/x9 1
10.5063/z8 1
10.5065/d6 256
10.6085/aa 336
10.7940/m3 624
10.13022/m3 3
10.15782/d6 2
10.21224/p4 1
10.21228/m8 618
10.4246/uc 1
10.6072/h0 5
10.6075/j0 1

@jsjiang
Contributor

jsjiang commented Oct 10, 2024

Unique DOI prefixes: 56

'doi:10.13022/M3',
'doi:10.15144/LT',
'doi:10.15144/MK',
'doi:10.15144/PL',
'doi:10.15782/D6',
'doi:10.18118/G6',
'doi:10.18739/A2',
'doi:10.21224/P4',
'doi:10.21228/M8',
'doi:10.21229/M9',
'doi:10.21418/G8',
'doi:10.25494/P6',
'doi:10.4246/10',
'doi:10.4246/CA',
'doi:10.4246/CW',
'doi:10.4246/UC',
'doi:10.5060/D2',
'doi:10.5060/D4',
'doi:10.5060/D8',
'doi:10.5062/F4',
'doi:10.5063/08',
'doi:10.5063/3X',
'doi:10.5063/5T',
'doi:10.5063/7P',
'doi:10.5063/9K',
'doi:10.5063/AA',
'doi:10.5063/BG',
'doi:10.5063/CF',
'doi:10.5063/F1',
'doi:10.5063/F7',
'doi:10.5063/G7',
'doi:10.5063/H7',
'doi:10.5063/K0',
'doi:10.5063/M0',
'doi:10.5063/NS',
'doi:10.5063/PR',
'doi:10.5063/QR',
'doi:10.5063/SJ',
'doi:10.5063/TH',
'doi:10.5063/VH',
'doi:10.5063/X9',
'doi:10.5063/Z8',
'doi:10.5065/D6',
'doi:10.6072/H0',
'doi:10.6075/J0',
'doi:10.6078/D1',
'doi:10.6078/J8',
'doi:10.6078/M7',
'doi:10.6080/K0',
'doi:10.6085/AA',
'doi:10.7280/S9',
'doi:10.7293/W2',
'doi:10.7297/X2',
'doi:10.7299/X7',
'doi:10.7928/H6',
'doi:10.7940/M3',

However, only 22 prefixes are in the EZID shoulder table. Find out which ones are not matched.

@adambuttrick

adambuttrick commented Oct 11, 2024

@jsjiang After some review, some of these appear to be "super shoulder" (i.e., DOI prefix only) values in EZID, e.g. 10.5063. The revised script derives its values by parsing all the DOIs into their corresponding shoulders (prefix + first two characters), but where a super shoulder exists, this may be an erroneous derivation or something that was never created in EZID. In these cases, I think we can just pull the user accounts and emails associated with the super shoulder/prefix instead.
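The derivation described here, with the super-shoulder fallback, can be sketched like this. It is an illustration rather than the actual script: `derive_shoulder` and the `super_prefixes` argument are hypothetical names, and the set of super shoulders would have to come from the EZID shoulder table.

```python
def derive_shoulder(doi, super_prefixes=frozenset()):
    """Map a DOI to its shoulder: prefix plus the first two suffix characters.

    If the DOI prefix is a known super shoulder, return the bare prefix instead.
    """
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:]
    prefix, sep, suffix = name.partition("/")
    if not sep or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    if prefix in super_prefixes:
        # Super shoulder: the shoulder is the DOI prefix only.
        return f"doi:{prefix}/"
    return f"doi:{prefix}/{suffix[:2].upper()}"
```

For example, `derive_shoulder("10.5063/AA/22", {"10.5063"})` collapses to `doi:10.5063/` instead of producing a two-character shoulder that may not exist in EZID.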

@jsjiang
Copy link
Contributor

jsjiang commented Oct 14, 2024

Not matched prefixes:

DOI:10.15144/LT - no
DOI:10.15144/MK - no
DOI:10.15144/PL - no
DOI:10.4246/10 - super DOI:10.4246/
DOI:10.4246/CA - super
DOI:10.4246/CW - super
DOI:10.4246/UC - super
DOI:10.5060/D2 - no
DOI:10.5060/D4 - no
DOI:10.5060/D8 - no
DOI:10.5063/08 - super DOI:10.5063/
DOI:10.5063/3X -
DOI:10.5063/5T -
DOI:10.5063/7P
DOI:10.5063/9K
DOI:10.5063/AA
DOI:10.5063/BG
DOI:10.5063/CF
DOI:10.5063/F7
DOI:10.5063/G7
DOI:10.5063/H7
DOI:10.5063/K0
DOI:10.5063/M0
DOI:10.5063/NS
DOI:10.5063/PR
DOI:10.5063/QR
DOI:10.5063/SJ
DOI:10.5063/TH
DOI:10.5063/VH
DOI:10.5063/X9
DOI:10.5063/Z8
DOI:10.5065/D6 - no
DOI:10.6085/AA - super DOI:10.6085/
DOI:10.7293/W2 - no

@jsjiang
Contributor

jsjiang commented Oct 14, 2024

datacite_v2_v3_prefix_user_email.txt

user_id	username	displayName	accountEmail	primaryContactEmail	shoulder_prefix
65	eschol_harvester	CDL eScholarship	[email protected]	[email protected]	doi:10.21418/G8
124	merritt	CDL UC3 Merritt	[email protected]	[email protected]	doi:10.6075/J0
124	merritt	CDL UC3 Merritt	[email protected]	[email protected]	doi:10.6078/D1
124	merritt	CDL UC3 Merritt	[email protected]	[email protected]	doi:10.7297/X2
152	opencontext	Open Context	[email protected]	[email protected]	doi:10.6078/M7
179	sb-bren	UCSB Bren School of Environmental Science & Mgmt	[email protected]	[email protected]	doi:10.5062/F4
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.18739/A2
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.21229/M9
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.25494/P6
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.5062/F4
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.5063/
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.5063/F1
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.6085/
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.6085/C3
182	sb-library	UC Santa Barbara Library	[email protected]	[email protected]	doi:10.7940/M3
184	sb-nceas	National Center for Ecological Analysis and Synthesis	[email protected]	[email protected]	doi:10.18739/A2
184	sb-nceas	National Center for Ecological Analysis and Synthesis	[email protected]	[email protected]	doi:10.25494/P6
184	sb-nceas	National Center for Ecological Analysis and Synthesis	[email protected]	[email protected]	doi:10.5063/
184	sb-nceas	National Center for Ecological Analysis and Synthesis	[email protected]	[email protected]	doi:10.5063/F1
185	sb-pisco	Partnership for Interdisciplinary Studies of Coastal Oceans	[email protected]	[email protected]	doi:10.6085/
185	sb-pisco	Partnership for Interdisciplinary Studies of Coastal Oceans	[email protected]	[email protected]	doi:10.6085/C3
186	sb-writ	UCSB Writing Program	[email protected]	[email protected]	doi:10.7940/M3
198	uc-geer	Geotechnical Extreme Events Reconnaissance	[email protected]	[email protected]	doi:10.18118/G6
202	ucb-crcns.org	Collaborative Research in Computational Neuroscience (CRCNS)	[email protected]	[email protected]	doi:10.6080/K0
204	ucb-ist-rit	Berkeley Research IT	[email protected]	[email protected]	doi:10.7928/H6
207	ucb-ling	UC Berkeley Department of Linguistics	[email protected]	[email protected]	doi:10.7297/X2
208	ucb-mvz	Museum of Vertebrate Zoology	[email protected]	[email protected]	doi:10.7299/X7
209	ucb_hrc	UC Berkeley Human Rights Center	[email protected]	[email protected]	doi:10.6078/D1
209	ucb_hrc	UC Berkeley Human Rights Center	[email protected]	[email protected]	doi:10.6078/J8
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.18118/G6
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.21418/G8
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.6078/D1
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.6078/J8
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.6078/M7
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.6080/K0
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.7297/X2
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.7299/X7
210	ucblibrary	UC Berkeley Library	[email protected]	[email protected]	doi:10.7928/H6
216	uci	UC Irvine Libraries	[email protected]	[email protected]	doi:10.7280/S9
217	uci-buslib	UCI Business Librarian	[email protected]	[email protected]	doi:10.7280/S9
220	ucla-library	UC Los Angeles Library	[email protected]	[email protected]	doi:10.15144/S4
232	ucsd_d3r	Drug Design Data (D3R) Resource	[email protected]	[email protected]	doi:10.15782/D6
233	ucsd_datamares	DataMares	[email protected]	[email protected]	doi:10.13022/M3
237	ucsd_hellyj	John Helly, California Coastal Atlas	[email protected]	[email protected]	doi:10.4246/
240	ucsd_lib	UC San Diego Library	[email protected]	[email protected]	doi:10.13022/M3
240	ucsd_lib	UC San Diego Library	[email protected]	[email protected]	doi:10.15782/D6
240	ucsd_lib	UC San Diego Library	[email protected]	[email protected]	doi:10.21224/P4
240	ucsd_lib	UC San Diego Library	[email protected]	[email protected]	doi:10.21228/M8
240	ucsd_lib	UC San Diego Library	[email protected]	[email protected]	doi:10.4246/
240	ucsd_lib	UC San Diego Library	[email protected]	[email protected]	doi:10.6072/H0
240	ucsd_lib	UC San Diego Library	[email protected]	[email protected]	doi:10.6075/J0
247	ucsd_signaling_gateway	UCSD Signaling Gateway	[email protected]	[email protected]	doi:10.6072/H0
295	ucsd_grush	UC San Diego Philosophy Dept.	[email protected]	[email protected]	doi:10.21224/P4
296	ucsd_mwb	UCSD Metabolomics Workbench	[email protected]	[email protected]	doi:10.21228/M8
298	sb-mostofi	UCSB ECE Dept Mostofi Lab	[email protected]	[email protected]	doi:10.21229/M9
314	ucb-geotech	UCB Geotechnical Engineering Research	[email protected]	[email protected]	doi:10.21418/G8
318	dash	CDL UC3 Dash	[email protected]	[email protected]	doi:10.6075/J0
318	dash	CDL UC3 Dash	[email protected]	[email protected]	doi:10.6078/D1
413	sb-istl	Issues in Science and Technology Librarianship	[email protected]	[email protected]	doi:10.5062/F4
438	ucla-physci	UCLA Physical Sciences Division	[email protected]	[email protected]	doi:10.15144/S4

Note:

  • doi:10.15144/S4 - may not need to be included;
    • DOI:10.15144 is not a super shoulder, and prefixes DOI:10.15144/LT, DOI:10.15144/MK, DOI:10.15144/PL are not in EZID.

@jsjiang
Contributor

jsjiang commented Oct 14, 2024

Query:

select user.id as user_id, user.username, user.displayName, user.accountEmail, user.primaryContactEmail,
shoulder.prefix as shoulder_prefix
from ezidapp_user user
join ezidapp_user_shoulders u_s on u_s.user_id = user.id
join  ezidapp_shoulder shoulder on shoulder.id = u_s.`shoulder_id`
where upper(shoulder.prefix) in (
'DOI:10.13022/M3',
'DOI:10.15144/LT',
'DOI:10.15144/MK',
'DOI:10.15144/PL',
'DOI:10.15782/D6',
'DOI:10.18118/G6',
'DOI:10.18739/A2',
'DOI:10.21224/P4',
'DOI:10.21228/M8',
'DOI:10.21229/M9',
'DOI:10.21418/G8',
'DOI:10.25494/P6',
'DOI:10.4246/10',
'DOI:10.4246/CA',
'DOI:10.4246/CW',
'DOI:10.4246/UC',
'DOI:10.5060/D2',
'DOI:10.5060/D4',
'DOI:10.5060/D8',
'DOI:10.5062/F4',
'DOI:10.5063/08',
'DOI:10.5063/3X',
'DOI:10.5063/5T',
'DOI:10.5063/7P',
'DOI:10.5063/9K',
'DOI:10.5063/AA',
'DOI:10.5063/BG',
'DOI:10.5063/CF',
'DOI:10.5063/F1',
'DOI:10.5063/F7',
'DOI:10.5063/G7',
'DOI:10.5063/H7',
'DOI:10.5063/K0',
'DOI:10.5063/M0',
'DOI:10.5063/NS',
'DOI:10.5063/PR',
'DOI:10.5063/QR',
'DOI:10.5063/SJ',
'DOI:10.5063/TH',
'DOI:10.5063/VH',
'DOI:10.5063/X9',
'DOI:10.5063/Z8',
'DOI:10.5065/D6',
'DOI:10.6072/H0',
'DOI:10.6075/J0',
'DOI:10.6078/D1',
'DOI:10.6078/J8',
'DOI:10.6078/M7',
'DOI:10.6080/K0',
'DOI:10.6085/AA',
'DOI:10.7280/S9',
'DOI:10.7293/W2',
'DOI:10.7297/X2',
'DOI:10.7299/X7',
'DOI:10.7928/H6',
'DOI:10.7940/M3')
or upper(prefix) like 'DOI:10.15144%' 
or    upper(prefix) like 'DOI:10.4246%'
or    upper(prefix) like 'DOI:10.5060%'
or    upper(prefix) like 'DOI:10.5063%'
or    upper(prefix) like 'DOI:10.5065%'
or    upper(prefix) like 'DOI:10.6085%'
or    upper(prefix) like 'DOI:10.7293%'
order by user.id, shoulder.prefix
;
