Data Package JSON schema #4747
Comments
@pkiraly thanks for following up on the discussion at https://groups.google.com/d/msg/dataverse-community/ao-zFwN_M0M/LDlfR3hfBQAJ by creating this issue.
Hi @pkiraly & @pdurbin! I'm the product owner for the Frictionless Data reproducible research project (https://frictionlessdata.io/reproducible-research/). I came across this issue while checking out Dataverse and wanted to say hi! I would be happy to chat with y'all more about whether there is a potential collaboration between us and Dataverse, or answer any questions you might have about datapackage.json or any of our other software or specs. 😄
@lwinfree hi! I bet @pkiraly is enjoying his weekend by now but I'm in http://chat.dataverse.org for another hour and a half today. Or there's always next week or whenever. 😄
Fast forward a few years and @qqmyers has been adding excellent support for BagIt as a packaging standard in Dataverse (export first and now import!). A good entry point is the docs: https://guides.dataverse.org/en/5.12/installation/config.html#bagit-file-handler Just today the team talked about this issue... ... that mentions BagIt but another packaging standard we've talked about is RO-Crate: Finally, we've supported SWORD for a long time but that's just for import and we never did get around to having some sort of manifest inside the zip to populate file descriptions: (This is just as well, because we have BagIt import now.) First, @pkiraly are you still interested in this Data Package standard from Frictionless Data (by the way, thank you, @lwinfree for attending a community call a while back!)? Second, is your vision for export or import or both? Is it a zip file? Thanks. Oh, finally, other packaging terms I hear about are AIPs, DIPs, and SIPs. I believe these are more conceptual than specific standards. I guess they come from OAIS.
To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'. If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.
I just came across an interesting paper:
Fowler, Dan, Jo Barratt, and Paul Walsh. “Frictionless Data: Making Research Data Quality Visible.” International Journal of Digital Curation 12, no. 2 (May 13, 2018): 274–85.
https://doi.org/10.2218/ijdc.v12i2.577
http://www.ijdc.net/article/view/577
This is a summary of the activities of Open Knowledge International regarding data quality. One of their suggestions is a JSON schema called Data Package (the full description is available at https://frictionlessdata.io/specs/data-package/), which describes the structure of the underlying data, e.g. the column types of a CSV file, dictionaries, signals for NA, etc. They have created Python and JavaScript libraries that read this metadata along with the CSV files and tell programs how to interpret the input file. (External partners created R and Ruby packages.)
I think Dataverse support for it would be very useful. If you are interested, first read the paper, then the description of the Data Package.
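To make the idea concrete, here is a minimal sketch of what such a descriptor could look like and how a program might use it to interpret a CSV file. It uses only the Python standard library rather than the Frictionless libraries, so `read_rows`, `CASTERS`, the resource name `cities`, and the sample data are all made up for illustration; only the descriptor keys (`resources`, `schema`, `fields`, `missingValues`) are taken from the spec as I understand it.

```python
import csv
import io

# A hand-written descriptor in the shape of a datapackage.json file.
# The package, resource, and column names are illustrative assumptions.
descriptor = {
    "name": "example-package",
    "resources": [
        {
            "name": "cities",
            "path": "cities.csv",
            "schema": {
                "fields": [
                    {"name": "city", "type": "string"},
                    {"name": "population", "type": "integer"},
                    {"name": "area_km2", "type": "number"},
                ],
                # Strings that signal a missing (NA) value.
                "missingValues": ["", "NA"],
            },
        }
    ],
}

# Stand-in for the contents of cities.csv (made-up sample data).
CSV_TEXT = "city,population,area_km2\nBoston,650000,232.1\nAtlantis,NA,\n"

# Only the three types used above are handled in this sketch.
CASTERS = {"string": str, "integer": int, "number": float}


def read_rows(csv_text, schema):
    """Cast each CSV cell according to the schema; missing values become None."""
    missing = set(schema.get("missingValues", [""]))
    types = {field["name"]: field["type"] for field in schema["fields"]}
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for name, value in raw.items():
            row[name] = None if value in missing else CASTERS[types[name]](value)
        rows.append(row)
    return rows


rows = read_rows(CSV_TEXT, descriptor["resources"][0]["schema"])
for row in rows:
    print(row)
```

The point is that the types and NA markers live in the metadata, not in the reading code, so the same reader works for any CSV that ships with a descriptor.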