File API download bypasses terms of use and guestbook #2911
Updating this to cover terms of use and not increasing download count, which is covered in #3331. |
We are looking forward to this functionality because we are facing copyright issues with organizations that harvest our dataverse using the API. Since they get a direct download link to each file and put it on their sites, users download the files without any knowledge of, or agreement to, the terms of use. |
@solhm thanks for your comment. I just brought up "File API download bypasses terms of use" with @djbrooke @scolapasta and @sekmiller while discussing #3758. |
@alejandratenorio brought up this issue today and we've been discussing it at https://dataverse.zulipchat.com/#narrow/stream/379856-security/topic/No.20Restricted.20Files.20.2F.20Access.20conditions/near/427955246 |
Hi all, Possibly CIMMYT could collaborate on this. As @pdurbin suggested, we would like to have a proposal validated by you before any development. These are our assumptions:
File download:
Guestbook:
We underline the proposed changes. |
Hi all, Due to some observations and comments, we have adjusted our proposal:
CIMMYT Proposal - File download:
Guestbook:
Terms of use:
Private link to accept Terms of Use:
We would like to hear your comments, if you think it could work. |
@alejandratenorio thank you for the detailed writeup! Overall, I think this makes a lot of sense. A few questions:
|
Hi @pdurbin, Thank you very much for your comments. For this part... If you wish to download the datafile XXXX, please go to [insert Datafile URL]... would the second URL always be the same or would it vary and expire over time? If it's the latter, perhaps we could re-use SignedUrls from GDCC/7715 Signed Urls for external tools #9001.
What do you think about making the new behavior the default, since it's more secure... and if installations don't like it, the configuration option could revert to the old behavior?
For guestbook, what about required fields that aren't in the user account? Custom questions can be created and set as required, which complicates things.
Have you considered getting additional feedback from the Dataverse community by posting at https://groups.google.com/g/dataverse-community ? I think others might have opinions on this! I'll also mention this in our internal Slack (DONE).
|
Sure. I think I'm still a bit confused about the proposed multistep solution for downloading files. Is it something like this?
I guess my question is, do they have to parse the text to find the URL? Will this be easy to do? What happens to the existing download URL? Does it stop working? Does the user now get a text file instead? We can go back to Zulip if that's easier! 😄 Or maybe a Google doc where I can leave comments here or there? |
To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'. If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment. |
I left a note at https://dataverse.zulipchat.com/#narrow/stream/379856-security/topic/No.20Restricted.20Files.20.2F.20Access.20conditions/near/464124733 that anyone is welcome to open a fresh issue. |
Currently, when you download a file through the UI, all logic for creating a GuestbookResponse row is done before hitting the API to download the file.
If you download the file directly from the API, no GuestbookResponse row is created, so the count does not go up. This also bypasses the terms of use and guestbook completely. We need to make sure that (a) a row gets created, so counts are accurate, and (b) we decide how we want to handle the bypassing of the terms of use (via a token?) rather than just acting like they don't exist.
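To make the two requirements concrete, here is a minimal sketch in Python of how an API download path could enforce them. It is not Dataverse code: `DownloadService`, `accept_terms`, `download`, and the in-memory storage are all hypothetical names invented for illustration, and the token scheme is only one possible reading of the "via a token?" idea above.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


class TermsNotAccepted(Exception):
    """Raised when a download is attempted without accepting the terms of use."""


@dataclass
class DownloadService:
    # Hypothetical in-memory stand-in for the GuestbookResponse table.
    guestbook_responses: List[dict] = field(default_factory=list)
    # Hypothetical map of issued tokens to the (user, file_id) they cover.
    accepted: Dict[str, Tuple[str, int]] = field(default_factory=dict)

    def accept_terms(self, user: str, file_id: int) -> str:
        """Record that the user accepted the terms and return a token."""
        token = secrets.token_hex(16)
        self.accepted[token] = (user, file_id)
        return token

    def download(self, file_id: int, user: str, token: Optional[str] = None) -> bytes:
        # (b) Refuse the download instead of acting like the terms don't exist.
        if token is None or self.accepted.get(token) != (user, file_id):
            raise TermsNotAccepted(f"terms not accepted for file {file_id}")
        # (a) Create the row on the API path too, so counts stay accurate.
        self.guestbook_responses.append({"file_id": file_id, "user": user})
        return b"file-bytes"  # placeholder for the real file content

    def download_count(self, file_id: int) -> int:
        """Count downloads the same way for UI and API requests."""
        return sum(1 for row in self.guestbook_responses if row["file_id"] == file_id)
```

In this sketch a direct API call without a token is rejected and leaves the count unchanged, while a call that presents a token from `accept_terms` both returns the file and adds one row. Whether a rejected request should return an error, a redirect to the terms page, or something else is left open here, as it is in the discussion above.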