"null" values allowed? (Clarify with spec) #5

Open
nicebread opened this issue Sep 17, 2019 · 7 comments
Labels
help wanted (Extra attention is needed), spec (Needs input from specification experts)

Comments

@nicebread

The DataSchema Shiny app returns null for some fields if no value is entered, e.g. "description": null in the variableMeasured object.

The spec says:
description |   | OPTIONAL | String | Description of the variable (for humans)

The validator complains: .variableMeasured[12].description should be string

The question is whether optional fields which are empty are allowed to be present and can have null as value.
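To make the three cases concrete, here is a minimal Python sketch: the key is absent, the key is present with null, and the key is present with a string. The `check_optional_string` helper and the sample objects are hypothetical and are not the project's validator; the helper only mimics a rule of the form "optional, but a string if present", which is how the error message above reads.

```python
import json

def check_optional_string(obj, key):
    """Return None if the field is acceptable, else an error message.

    Hypothetical rule: the key may be absent, but if present it must be a string.
    """
    if key not in obj:
        return None  # optional field left out entirely
    if isinstance(obj[key], str):
        return None  # present and of the expected type
    return f".{key} should be string"

# Three illustrative variableMeasured entries (made-up values).
absent = json.loads('{"name": "rt"}')
null_value = json.loads('{"name": "rt", "description": null}')
filled = json.loads('{"name": "rt", "description": "Reaction time in ms"}')

results = {
    "absent": check_optional_string(absent, "description"),
    "null": check_optional_string(null_value, "description"),
    "filled": check_optional_string(filled, "description"),
}
```

Under this rule only the null case is rejected, so the open question is whether the spec should treat an explicit null the same way as an absent key.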

@FelixHenninger
Collaborator

Yes! Should we decide to do this, we can use the new support for multiple types in our validator schema.
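For illustration: in JSON Schema, the `type` keyword may be either a single type name or a list of names, so a nullable string can be declared as `["string", "null"]`. The sketch below is a hand-rolled Python check of only that keyword, not a JSON Schema validator and not the project's code; the `strict` and `nullable` fragments are assumptions made up for this example.

```python
# Checks for a small subset of JSON Schema primitive type names.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}

def matches_type(value, schema):
    """Check a value against `type`, which may be one name or a list of names."""
    declared = schema.get("type")
    if declared is None:
        return True  # no type constraint at all
    names = [declared] if isinstance(declared, str) else declared
    return any(TYPE_CHECKS[name](value) for name in names)

# Hypothetical schema fragments for the description field.
strict = {"type": "string"}
nullable = {"type": ["string", "null"]}
```

With the `strict` fragment a null description is rejected, while the `nullable` fragment accepts both null and a string but still rejects other types.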

@FelixHenninger added the "help wanted" and "spec" labels on Sep 20, 2019
@mekline
Contributor

mekline commented Sep 23, 2019

I know things might be in a temporary state during development, but does the app currently validate against a JSON schema that you're building up, or from an 'official' one referenced from schema.org? In the end, it would be ideal to be able to refer to the Psych-DS dataset_description.json object as 'Schema.org Dataset version X, plus these special fields'.

(I don't know where this is available, and am lost googling for JSON schemas of schemas, but maybe @rubenarslan knows?)

I raise this here because, if the Dataset schema has an opinion about null values, I think we should start there.

@mekline
Contributor

mekline commented Sep 23, 2019

Also - I chatted briefly with @vsoch a while back about a tool she's made which programmatically interacts with schema.org options. It's in Python, but maybe there are some useful ideas there as well?

https://vsoch.github.io/2018/schemaorg/

@vsoch

vsoch commented Sep 23, 2019

I would be glad to help - and yes schemaorg Python handles loading in a particular version, a Schema, and basic validation.

I wouldn’t use something like R for the validation - it’s a great language for scientific programming but is spread a little thin for actual software development. A good starting criterion would be to have a validator that doesn’t have excessive dependencies (node) and can run statically either on the command line or in the browser. An easy solution (and one that is still in the family of scientific programming) would be to use Python; a harder (but possibly more integrated) solution might provide a validator binary in a language like golang that can then compile and run with static JavaScript via WebAssembly.

Anyway if you want some help, glad to offer!

@FelixHenninger
Collaborator

[Melissa:] does the app currently validate against a JSON schema that you're building up, or from an 'official' one referenced from schema.org?

Excellent point, thanks for bringing this one up! It's currently (following BIDS' example) validating against a custom schema that implements a subset of schema.org -- basically, the approach has been to start from the spec, rather than from the schema.org dataset definition.

I've played with a more complete schema.org validation, but because of the near-infinite number of possible nested sub-schemas that can be pulled in, my experience has been that this gets messy very quickly. What I did was to use https://github.com/geraintluff/schema-org-gen (or rather, one of its more up-to-date forks) to generate JSON schemas for the schema.org standards, which the JSON validator can then read. When I last tried, this resulted in hundreds of MB of data, which we would need to bundle (and probably prune first somehow) because we can't assume internet connectivity. I'm going to take a look at how Vanessa does this! (Which is an excellent heuristic in general; many thanks, Melissa, for the pointer!)

Is full to-the-letter schema.org compliance an important feature for you, right now?

[Melissa:] I raise this here because, if the Dataset schema has an opinion about null values, I think we should start there.

I can't speak for schema.org, but Google's structured data validator isn't happy with either null or empty strings for required entries.


[Vanessa:] I would be glad to help - and yes schemaorg Python handles loading in a particular version, a Schema, and basic validation.

Hello @vsoch, great to see you here, and thanks a lot for your comments! 🤩👋🦕 I'm gonna take a good look at your code, would be psyched to chat, and thrilled to get your expert input on all of this -- I wasn't aware that you've worked on such a similar project!

[Vanessa:] I wouldn’t use something like R for the validation - it’s a great language for scientific programming but is spread a little thin for actual software development.

Preach! *ducks*

[Vanessa:] A good starting criterion would be to have a validator that doesn’t have excessive dependencies (node) and can run statically either on the command line or in the browser.

Is node a suggestion or an example of excessive dependencies? 😉 Right now, that's basically what we have in the prototype -- the validator core is built in JS, and runs in-browser, in a node-based CLI, or as an R package via V8 bindings.

[Vanessa:] An easy solution (and one that is still in the family of scientific programming) would be to use Python; a harder (but possibly more integrated) solution might provide a validator binary in a language like golang that can then compile and run with static JavaScript via WebAssembly.

Ooh, these are great ideas, it's going to be worth pinging the iodide folks, and WebAssembly might be the solution to the bottlenecks we're seeing right now.

[Vanessa:] Anyway if you want some help, glad to offer!

I can't speak for Melissa or anyone else around here, but 🙏😍

@vsoch

vsoch commented Sep 23, 2019

Thanks @FelixHenninger - to clarify, I’d generally stay away from node.

@FelixHenninger
Collaborator

FelixHenninger commented Sep 23, 2019

Sure thing @vsoch, such a pleasure to have you around!

[Vanessa:] To clarify, I’d generally stay away from node.

🙈 I share your sentiment here in general, but would suggest that we discuss the project requirements and possibilities in a more interactive channel. From my understanding of the implementation requirements, I don't see any other way right now, but of course that doesn't mean one doesn't exist. If there's a better solution out there, I'm all in!
