Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datalad get fails with 404 due to :persistentId usage in pydataverse #310

Closed
shoeffner opened this issue Jul 17, 2024 · 1 comment · Fixed by #311
Closed

datalad get fails with 404 due to :persistentId usage in pydataverse #310

shoeffner opened this issue Jul 17, 2024 · 1 comment · Fixed by #311

Comments

@shoeffner
Copy link
Contributor

shoeffner commented Jul 17, 2024

While debugging #307, I noticed that the way datalad-dataverse calls the pydataverse API does not work, neither with the demo instance, nor with a local docker-compose setup.

$ datalad get test.csv
get(error): test.csv (file) [external special remote error: Client error '404 Not Found' for url 'http://localhost:8080/api/access/datafile/:persistentId/?persistentId=23&User-Agent=pydataverse&key=<the API token>']

Even when doing the call with curl it does not work:

$ curl 'http://localhost:8080/api/access/datafile/:persistentId/?persistentId=23&User-Agent=pydataverse&key=<the API token>'
{"status":"ERROR","code":404,"message":"API endpoint does not exist on this server. Please check your code for typos, or consult our API guide at http://guides.dataverse.org.","requestUrl":"http://localhost:8080/api/v1/access/datafile/:persistentId/?persistentId=23&User-Agent=pydataverse&key=<the API token>","requestMethod":"GET"}

However, I found out that replacing the placeholder with the actual ID directly, dropping the user agent AND moving the token to the headers seems to work (again, on both instances):

$ curl -H "X-Dataverse-key: ${DATAVERSE_API_TOKEN}" http://localhost:8080/api/access/datafile/23
a       b       c       d       e
1       2       3       4       5
6       7       8       9       0

This route is documented at https://guides.dataverse.org/en/latest/api/dataaccess.html, which makes me wonder if ":persistentId" should again be some kind of DOI.
The key-parameter SHOULD work (see https://guides.dataverse.org/en/latest/api/auth.html), but it does not seem to work; leaving out the User-Agent but adding the key as a get param is not working. This could be a server setting or a dataverse issue though; either way, since the response returns the actual request URL, it might make sense to change that anyways.
I am not sure what a good fix for this is. For one, this is an issue in pyDataverse/api.py:122-125, which is kind of an interesting thing as the get_request allows to pass params and auth which are then ignored completely.

Eventually, I found out that switching to is_pid=False seems to work.
Again, I don't know what a good fix is now. Will all siblings rely on the file id, or is there a way to use persistentIds? If the former is the case, setting is_pid=True in datalad_dataverse/dataset.py:174 resolves the issue. However, if both is possible, one option would be to simply try both modes:

response = self.data_access_api.get_datafile(fid, is_pid=False)
if response.status_code != 200:
    response = self.data_access_api.get_datafile(fid, is_pid=True)
    response.raise_for_status()

However, this might hide the first error, so I am really not sure. We could maybe also collect both responses and, if none is ok, raise a custom exception with both error messages. I will create a draft PR which simply sets is_pid=False for now, looking for your feedback and ideas though.

For more details, see also #307 and the corresponding PR.

@mih
Copy link
Member

mih commented Jul 19, 2024

Thanks for the analysis. It is spot on! I am looking into this right now.

mih added a commit to mih/datalad-dataverse that referenced this issue Jul 19, 2024
Despite the patch-level version change there were breaking API changes.

This change adjusts the test helpers only, as a first step to address
the API changes.

It is based on an analysis in
datalad#310

Co-authored-by: Sebastian Höffner <[email protected]>
mih added a commit to mih/datalad-dataverse that referenced this issue Jul 19, 2024
Despite the patch-level version change there were breaking API changes.

This change adjusts the test helpers only, as a first step to address
the API changes.

It is based on an analysis in
datalad#310

Co-authored-by: Sebastian Höffner <[email protected]>
@mih mih closed this as completed in #311 Jul 19, 2024
mih added a commit to mih/datalad-dataverse that referenced this issue Jul 19, 2024
Despite the patch-level version change there were breaking API changes.

This change adjusts the test helpers only, as a first step to address
the API changes.

It is based on an analysis in
datalad#310

Co-authored-by: Sebastian Höffner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants