Investigate Dataverse SWORD v2 endpoints #24

Closed
htpvu opened this issue Oct 22, 2019 · 5 comments

@htpvu
Contributor

htpvu commented Oct 22, 2019

[#LAG-2936] Dataverse questions from Hanh.pdf

What we currently know about Dataverse can be found in the attachment above. Additional questions can be routed to Mark Cyzyk. Please CC me on any such communication.

@emetsger

A general document that outlines the investigation and implementation tasks specific to the Dataverse implementation lives here.

@emetsger

LAG-3018 has been opened to track access to Dataverse and its metadata requirements.

@emetsger

LAG-3018 has been closed. I have access to the stage Dataverse for testing, and I have the minimal metadata requirements for the SWORD endpoint as well as the JHU DA metadata requirements here.

My preliminary investigation and Dataverse documentation indicate that Dataverse SWORD deposit is a two-step process: 1) submit the metadata, 2) submit the user content (as a zip). Currently I'm verifying that this is the case by looking at the source code.

The review of the source code would conclude the work for this issue (i.e. complete the investigation of the Dataverse SWORD v2 endpoints).

Assuming that the review of the source code confirms the documented and observed behavior (i.e. that deposit is a two-step process), then that will impact the scope of updating core deposit services. The implementation doc will be fleshed out with consequences.

@emetsger

Preliminary investigation is complete.

SWORD submission to DVN will need to occur as a three-step process:

  1. POST an Atom Entry containing the package metadata (to the SWORD COL-IRI).
  2. POST a zip file containing the user's content using the SimpleZip packaging (to the SWORD EM-IRI).
  3. POST a zero-length entity body to the SWORD SE-IRI with the In-Progress header set to false to publish the dataset.

(An open question is the degree to which metadata is supported by SWORD ingest.)
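The three calls listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Deposit Services implementation: the function names, the example IRIs, and the two metadata fields (`dcterms:title`, `dcterms:creator`) are hypothetical placeholders and say nothing about what metadata Dataverse actually requires. Authentication is omitted, and the requests are built but never sent. In a real exchange the EM-IRI and SE-IRI would be read from the deposit receipt returned by the first call rather than supplied up front.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def build_atom_entry(title: str, creator: str) -> bytes:
    """Serialize a minimal Atom Entry (the fields chosen are illustrative only)."""
    ET.register_namespace("", ATOM_NS)
    ET.register_namespace("dcterms", DCTERMS_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}title").text = title
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}creator").text = creator
    return ET.tostring(entry, encoding="utf-8")


def build_deposit_requests(col_iri, em_iri, se_iri, title, creator, zip_bytes,
                           filename="content.zip"):
    """Return the three POST requests, in order, without sending them."""
    # Step 1: package metadata as an Atom Entry, sent to the COL-IRI.
    metadata = Request(col_iri, data=build_atom_entry(title, creator),
                       method="POST",
                       headers={"Content-Type": "application/atom+xml"})
    # Step 2: user content as a SimpleZip package, sent to the EM-IRI.
    content = Request(em_iri, data=zip_bytes, method="POST",
                      headers={"Content-Type": "application/zip",
                               "Content-Disposition": f"filename={filename}",
                               "Packaging": SIMPLE_ZIP})
    # Step 3: zero-length body with In-Progress: false, sent to the SE-IRI.
    publish = Request(se_iri, data=b"", method="POST",
                      headers={"In-Progress": "false"})
    return [metadata, content, publish]


if __name__ == "__main__":
    # Hypothetical IRIs; real values depend on the Dataverse installation.
    requests_ = build_deposit_requests(
        "https://dataverse.example.org/sword/collection/demo",
        "https://dataverse.example.org/sword/edit-media/demo-dataset",
        "https://dataverse.example.org/sword/edit/demo-dataset",
        "Example dataset", "Doe, Jane", b"PK\x05\x06" + b"\x00" * 18)
    for req in requests_:
        print(req.get_method(), req.full_url, len(req.data), "bytes")
```

Keeping request construction separate from sending makes the ordering dependency visible: the second and third requests cannot be finalized until the first has succeeded, which is the part that does not fit a single-transaction deposit.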

This three-step process is incompatible with the current architecture of Deposit Services, where a package and its metadata are assembled together and sent in a single transaction (i.e. a single SWORD API call containing both the package metadata and content, instead of two separate SWORD API calls). So there are a few options:

  1. Implement deposit to Dataverse separate and distinct from Deposit Services, e.g. as its own service.
    1. This would incur technical debt if we wanted to add Dataverse as a supported repository platform for PASS, or if we wanted to use Deposit Services later with the Data Archive (i.e. incorporating it after the MVP).
    2. This is probably the fastest option, because Deposit Services would not be used in the MVP.
  2. Update Deposit Services to support this three-step deposit.
    1. I can come up with a design and we can decide if it is worth implementing.
  3. Investigate the Dataverse so-called "native" APIs, and see if they allow for the submission of a single package (containing metadata) in one step. This seems unlikely, since SWORD would presumably allow it too if Dataverse supported it.

@emetsger

emetsger commented Oct 30, 2019

For completeness, this issue indicates there is no planned support in Dataverse for incorporating metadata in the uploaded zip.
