Browser validator: Unclear error "no dataset description found" #2

Open
pederisager opened this issue Sep 17, 2019 · 13 comments
Labels
spec Needs input from specification experts

Comments

@pederisager
Contributor

For two separate datasets I receive this error:
image

What does this error mean? The directories I submitted both have a file called "dataset_description.json" in the first layer of the directory, and this JSON file contains a "description" field.

In general I think novice users would find it very helpful if returned errors also included some indication of what the user should do to fix the problem.

Here is the content of "dataset_description.json" for one of the submitted projects:
{ "@type": "Dataset", "@context": "https://schema.org/", "name": "Red Square Meta-Data", "description": "This repository contains tables extracted from a large number of meta-analyses published in the journal Psychological Bulletin (ISSN: 0033-2909), as well as associated materials and analyses.", "schemaVersion": "0.1.0", "license": "", "author/creator": [ "Jochem Bek", "Remy Hertog", "Peder Mortvedt Isager", "Daniel Lakens", "Maximilian Maier", "Pepijn Obels" ], "citation": "", "funder": "", "url": "", "sameAs": "", "variableMeasured": [ { "@type": "PropertyValue", "name": "x_study", "description": "Author and publication year reference" }, { "@type": "PropertyValue", "name": "x_authors" }, { "@type": "PropertyValue", "name": "x_study_year" }, { "@type": "PropertyValue", "name": "x_study_description" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { 
"@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, { "@type": "PropertyValue", "name": "" }, ], "keywords": [ "meta-data", "meta-analysis", "Psychological Bulletin", "psychology", "database" ], "temporalCoverage": "", "spatialCoverage": "", "datePublished": "", "dateCreated": "2019-03-01", }

@FelixHenninger
Collaborator

Hi @pederisager, and many thanks for your report -- this is very clearly a bug 🐛, so let's track it down! Is there any way you could make a reproducible example from your dataset, for example by creating a zip file of the folder you're submitting for validation? That would be enormously helpful.

Thanks again for taking the time for testing and reporting this issue! 😍 Best,

-Felix

@FelixHenninger FelixHenninger added the bug Something isn't working label Sep 20, 2019
@pederisager
Contributor Author

My pleasure, thank you for tracking it down!

You should be able to reproduce the error with the zipped dir attached.

validator_bug_test.zip

@FelixHenninger
Collaborator

FelixHenninger commented Sep 21, 2019

Hej @pederisager, thanks a lot! Your example looks a-ok (apart from some trailing commas in the JSON at lines 225 and 237), so this is indeed a validator issue -- can you help me test a hypothesis/wild guess? My hunch is this one: Could it be that you're selecting the files by clicking on the validator 'button' and then selecting the folder? If so, does the output make more sense if you drag-and-drop the folder?

If I'm correct, this is my mistake, and a purely UI-level issue: The file selector passes the files to the validator in a different format than the drag-and-drop interface, and it's tricky to correct for that. For the time being, I thought I had removed the 'click to select' message, but apparently I haven't. I'm gonna do that until we figure out how to make the two file input modes equivalent.

Hope that makes sense -- have a great weekend!
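
The trailing-comma problem mentioned above can be reproduced outside the browser validator. The following is a minimal sketch, assuming a shortened, hypothetical stand-in for dataset_description.json; the helper name `check_json` and the naive string-based fix are illustrative only and are not part of the validator.

```python
import json

# Hypothetical, shortened stand-in for dataset_description.json.
# Note the trailing commas after the last list item and after the last key.
raw = (
    '{"name": "Red Square Meta-Data", '
    '"keywords": ["meta-data", "database",], '
    '"dateCreated": "2019-03-01",}'
)


def check_json(text):
    """Return (True, None) if text parses as strict JSON,
    otherwise (False, message including the error position)."""
    try:
        json.loads(text)
        return True, None
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


ok, message = check_json(raw)

# Naive fix for this small example only: it would also alter string
# values that happen to contain ",]" or ",}".
fixed = raw.replace(",]", "]").replace(",}", "}")
ok_fixed, _ = check_json(fixed)

print(ok, message)
print(ok_fixed)
```

Strict JSON does not allow a comma before a closing `]` or `}`, which is why the unmodified file fails to parse even though many editors and JavaScript object literals accept it.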

@pederisager
Contributor Author

The issue is not resolved by drag-and-drop, but I get a different issue message for the same file:
image

If I fix the trailing comma issue you spotted, I get yet another error message for the same file after drag-and-drop (the click method yields the same "no dataset description found" issue as before):
image

@FelixHenninger
Collaborator

Hi Peder,

I'm glad to hear that things work via drag-and-drop! The last two issues you've found are the validator working as it should: Out of the box, the JSON is indeed invalid, and with the file format fixed, the schema doesn't match our expectations. For example:

From my perspective, these are all issues where the spec is underdefined and/or I'm misunderstanding it. I suggest we move this discussion over to the spec document?

Thanks again for your feedback, and kind regards,

-Felix

@pederisager
Contributor Author

Great! I resolved all the issues identified by the validator and then it gives the dataset a "looks great!"

One comment: funder, datePublished, url, and temporalCoverage are all optional/recommended fields. If I delete these fields from the JSON, the validator apparently doesn't complain. Instead of deleting them from the JSON, however, I just want to leave them as empty strings temporarily, so that I'll remember to fill them in when the info is available. Could the validator treat empty strings as if the field was not specified? That would be extremely convenient given that I mostly want to copy a template dataset_description.json from somewhere (that has all possible fields contained within it) and fill it in with info relevant to my data as I go.
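
As a user-side workaround for the template workflow described above, empty-string fields could be stripped before the file is submitted. This is a minimal sketch, not validator behaviour: the function name `drop_empty_strings` is hypothetical, and the template dict is a shortened stand-in for the real file.

```python
import json


def drop_empty_strings(value):
    """Recursively remove dict keys whose value is an empty string.

    Lists are traversed so that nested objects are cleaned too, but
    empty strings that are direct list items are left in place.
    """
    if isinstance(value, dict):
        return {
            key: drop_empty_strings(item)
            for key, item in value.items()
            if item != ""
        }
    if isinstance(value, list):
        return [drop_empty_strings(item) for item in value]
    return value


# Shortened, hypothetical template with placeholders still to be filled in.
template = {
    "name": "Red Square Meta-Data",
    "funder": "",
    "url": "",
    "datePublished": "",
    "temporalCoverage": "",
    "dateCreated": "2019-03-01",
}

cleaned = drop_empty_strings(template)
print(json.dumps(cleaned, indent=2))
```

The template on disk keeps its placeholders as reminders, while only the cleaned copy is validated.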

@nicebread

From a user's perspective, I'd second the wish to have empty templates which can be filled out later (and still be valid). But maybe null instead of empty string? Probably depends on the meaning we want to carry.

@FelixHenninger
Collaborator

I see you both, but I also think that this is a spec issue and discussion should move to the spec document. Would either of you be a champion for this and also for the null proposal in issue #5?

The way things are right now stems from the fact that -- as you noticed (and as we discussed over in #5) -- JSON schemas require any present value to match the pattern. We can change this to allow empty values or null.

Personally I think that the role of a validator (as [ideally] a final arbiter of the spec) should be to alert users to any possible mistake that would result in a parsing error, even if the intention is to fill out values later. We could think about a lenient or template mode which ignores empty strings or the like, but in my view the effort would be better spent working together on the metadata builder tools. But that's just my two cents, again I think that this is a community choice to be made on the spec side.
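
To make the difference between the current behaviour and a possible lenient or template mode concrete, here is a minimal sketch in plain Python. It is not the project's schema or validator: the `RULES` table, the `validate` function, and the `lenient` flag are all hypothetical, and only two fields are modelled.

```python
# Hypothetical, simplified field rules; not the project's actual schema.
RULES = {
    "name": {"type": str, "label": "string", "required": True},
    "funder": {"type": list, "label": "array", "required": False},
}


def validate(doc, lenient=False):
    """Return a list of error strings for doc.

    Strict mode: any field that is present must match its type.
    Lenient mode: "" and None are treated as if the field were absent.
    """
    errors = []
    for field, rule in RULES.items():
        value = doc.get(field)
        present = field in doc
        if lenient and (value is None or value == ""):
            present = False
        if not present:
            if rule["required"]:
                errors.append(f"{field} is required")
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"{field} should be {rule['label']}")
    return errors


doc = {"name": "Red Square Meta-Data", "funder": ""}
print(validate(doc))                # strict: the empty funder is rejected
print(validate(doc, lenient=True))  # lenient: the empty funder is ignored
```

Note that even in lenient mode a required field left empty is still reported, since treating it as absent makes it missing.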

@FelixHenninger FelixHenninger added spec Needs input from specification experts and removed bug Something isn't working labels Sep 22, 2019
@mekline
Contributor

mekline commented Sep 23, 2019

(I brought this up on #5, but for tracking purposes: I agree that null/blank properties would be useful to the community, and happy for us to discuss it, but let's attempt to do whatever Schema.org Dataset expects first off? And then determine what decisions remain from there)

@pederisager
Contributor Author

Agree @FelixHenninger, I had not considered the downsides of having null be an allowed value for the pattern match. You probably have a much better understanding of the nuts and bolts of this than I do, so do take any suggestions with a healthy grain of salt. All I can say is that, as a user, it is confusing that the validator fails for an empty field that is specified as "optional" in the spec document. The fact that I can just delete this field to solve the issue is not obvious, since the validation will sometimes also fail if I delete (required) fields. Some documentation is probably sufficient to resolve this.

@FelixHenninger
Collaborator

FelixHenninger commented Sep 23, 2019

> as a user, it is confusing that the validator fails for an empty field that is specified as "optional" in the spec document.

Ah, I think I see now, thanks! (Sorry, I was being thick and didn't get this earlier.) Would it, from your perspective, be useful for example to change the message text to "Optional field funder should be array" in the last screenshot above?

> You probably have a much better understanding of the nuts and bolts of this [...]

Hell no! Making this up as I go along 😁. Super-glad to have you with us, and to figure this out together!

@pederisager
Contributor Author

> Would it, from your perspective, be useful for example to change the message text to "Optional field funder should be array" in the last screenshot above?

Yes, although I would prefer it to be even more specific. Something like "Optional field funder should be array if specified, or be deleted if the intention is to leave it unspecified". Otherwise I'm not sure it is clear to the user what they should do if they want to leave the field unspecified.
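
The message wording proposed above could be generated from whether a field is required or optional. This is a minimal sketch with a hypothetical helper, `describe_type_error`; the wording of the required-field message is an assumption, since only the optional case is discussed in this thread.

```python
def describe_type_error(field, expected, required):
    """Build a type-mismatch message that also tells the user how to fix it.

    field:    name of the offending field, e.g. "funder"
    expected: human-readable expected type, e.g. "array"
    required: True if the spec marks the field as required
    """
    if required:
        # Assumed wording; the thread only proposes text for optional fields.
        return f"Required field {field} should be {expected}"
    return (
        f"Optional field {field} should be {expected} if specified, "
        "or be deleted if the intention is to leave it unspecified"
    )


print(describe_type_error("funder", "array", required=False))
```

Keeping the fix hint in the message itself means the user does not have to consult the spec to learn that deleting an optional field is allowed.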

@mekline
Contributor

mekline commented Sep 23, 2019 via email
