Unclear how to find more granularity of files beyond "File Type" (application, tabulardata, data, etc.) #3597

Closed
pdurbin opened this issue Jan 26, 2017 · 2 comments
Labels
Feature: Search/Browse Type: Bug a defect User Role: Curator Curates and reviews datasets, manages permissions UX & UI: Design This issue needs input on the design of the UI and from the product owner

Comments

@pdurbin
Member

pdurbin commented Jan 26, 2017

If you go to https://dataverse.harvard.edu/dataverse/polbehavior and click "Files", you see "File Type" as in the screenshot below, but the detail is not very granular. Files are only flagged as application, tabulardata, data, etc.:

[Screenshot: Political Behavior Dataverse, 2017-01-26]

What if you want a specific type of file, such as an R file? It's possible, but you have to know which field to search on, and that field is not discoverable from the UI. In this case it is fileContentType. Specifically, a search for fileContentType:type/x-r-syntax ( https://dataverse.harvard.edu/dataverse/harvard?q=fileContentType%3Atype%2Fx-r-syntax ) returns 1,553 results in the Harvard Dataverse as of this writing:

[Screenshot: Harvard Dataverse search results, 2017-01-26]
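For anyone who wants to build that search link for other content types, here is a minimal Python sketch. It only constructs the URL shown above; the function name `content_type_search_url` and the `dataverse_alias` parameter are made up for illustration.

```python
from urllib.parse import urlencode

# Installation used in this issue; swap in another installation's base URL.
BASE_URL = "https://dataverse.harvard.edu"


def content_type_search_url(content_type, dataverse_alias="harvard"):
    """Build the web search URL for files with the given content type.

    Hypothetical helper: it only reproduces the URL pattern quoted above.
    """
    # urlencode percent-encodes ":" and "/" in the query value.
    query = urlencode({"q": f"fileContentType:{content_type}"})
    return f"{BASE_URL}/dataverse/{dataverse_alias}?{query}"


if __name__ == "__main__":
    print(content_type_search_url("type/x-r-syntax"))
```

Running it prints the same link as above, with `:` and `/` percent-encoded.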

fileContentType (and related fields, possibly) should be exposed on the Advanced Search page, which currently looks like this:

[Screenshot: Advanced Search page, Harvard Dataverse, 2017-01-26]

I figured out that type/x-r-syntax was the value to search for by looking at the list of files for a dataset in JSON format: https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=doi:10.7910/DVN/SCN9LA
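A rough sketch of pulling the content types out of that JSON, in Python. The response shape is an assumption here (files nested under `data.latestVersion.files`, each with a `label` and a nested object carrying `contentType`), so check the key names against the actual response before relying on it. The `sample` dict is hand-written for illustration, not real API output.

```python
def content_types(dataset_json):
    """Return {file label: content type} for the latest dataset version.

    Assumes files live under data.latestVersion.files (not confirmed here).
    """
    latest = dataset_json.get("data", {}).get("latestVersion", {})
    result = {}
    for entry in latest.get("files", []):
        # The casing of this key is an assumption, so accept both spellings.
        datafile = entry.get("dataFile") or entry.get("datafile") or {}
        label = entry.get("label") or datafile.get("filename")
        result[label] = datafile.get("contentType")
    return result


# Hand-written sample in the assumed shape (illustrative only).
sample = {
    "status": "OK",
    "data": {
        "latestVersion": {
            "files": [
                {
                    "label": "Thal PB Replication.R",
                    "dataFile": {"contentType": "type/x-r-syntax"},
                }
            ]
        }
    },
}

if __name__ == "__main__":
    print(content_types(sample))
```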

The background on this issue is that this morning @christophergandrud and I were discussing how one would identify all the R files in an installation of Dataverse. As a starting point, we looked at http://dx.doi.org/10.7910/DVN/SCN9LA (screenshot below), which I happened to know has an R file called "Thal PB Replication.R" that can be used to reproduce plots in a paper. (Once this platform for reproducibility has been publicly announced, I'll mention it. 😄 )

Anyway, with fileContentType:type/x-r-syntax as a starting point, one should be able to iterate through all these R files via Search API: http://guides.dataverse.org/en/4.6/api/search.html#iteration
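Here is a minimal sketch of that iteration in Python, in the spirit of the example in the guide linked above. It assumes the Search API accepts `q`, `type`, `start` and `per_page`, and that the response carries `data.items` and `data.total_count`. The names `fetch_page` and `iter_search` are hypothetical helpers, and the fetcher is passed in as a parameter so the paging logic can be exercised without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Search API endpoint of the installation discussed in this issue.
SEARCH_URL = "https://dataverse.harvard.edu/api/search"


def fetch_page(query, start, per_page):
    """Fetch one page of file results (parameter names are assumptions)."""
    params = urlencode(
        {"q": query, "type": "file", "start": start, "per_page": per_page}
    )
    with urlopen(f"{SEARCH_URL}?{params}") as response:
        return json.load(response)


def iter_search(query, fetch=fetch_page, per_page=10):
    """Yield every item matching the query, one page at a time."""
    start = 0
    while True:
        data = fetch(query, start, per_page)["data"]
        items = data["items"]
        yield from items
        start += per_page
        # Stop on an empty page or once all reported results are consumed.
        if not items or start >= data["total_count"]:
            break


if __name__ == "__main__":
    for item in iter_search("fileContentType:type/x-r-syntax"):
        print(item.get("name"))
```

Passing a different `fetch` callable (for example one that adds an API token or talks to another installation) leaves the paging loop unchanged.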

(This issue is related: What are the allowed search fields for the Search API q parameter? #2558).

[Screenshot: dataset page for "Replication Data for: Class Isolation and Affluent Americans' Perception of Social Conditions", Political Behavior Dataverse, 2017-01-26]

@pdurbin pdurbin added Feature: Search/Browse UX & UI: Design This issue needs input on the design of the UI and from the product owner labels Jan 26, 2017
@christophergandrud

Thanks so much for putting this detailed note together @pdurbin!

@pdurbin
Member Author

pdurbin commented Jun 28, 2018

It was a good exercise for me to write up some thoughts here but this issue is referenced from #2822 and #4377 so we can find it again, if necessary (or through search, of course). Until users ask for this, I think this issue is just noise among 850 other open issues so I'm closing it. 😄

@pdurbin pdurbin closed this as completed Jun 28, 2018