Validation endpoint? #156

Closed
nichtich opened this issue Jan 21, 2022 · 9 comments
Labels
feature Additional functionality question Further information is requested

Comments

@nichtich
Member

jskos-server might be the right place for gbv/jskos#105:

Add POST /validate and GET /validate?url= where you can send a JSKOS document (single object or array of objects) to be validated. Optional parameter

Validation could either be agnostic to the current database content (just use the jskos-validate library >=0.5.1 with option rememberSchemes unless a specific type is specified) or it could be a dry run of the import script. The latter would also check whether concepts and/or mappings match an existing vocabulary in the database.

@nichtich nichtich added feature Additional functionality question Further information is requested labels Jan 21, 2022
@stefandesu
Member

We don't have a fixed return format for this yet, right? I think it would be good to know not only if the object is valid JSKOS, but also whether it can be imported (and the reason if it can't). For an array of objects, it would also be good to have that result for each of the objects.

@nichtich
Member Author

We don't have a fixed return format for this yet, right?

We can just pass errors of jskos-validate. This should do:

const validate = require("jskos-validate")
const { guessObjectType } = require("jskos-tools")

// additional parameters (optional)
const unknownFields = params.ignoreUnknownFields
const type = (guessObjectType(params.type, true) || "").toLowerCase()

const rememberSchemes = type ? [] : null
const validator = type ? validate[type] : validate

// one entry per input object: true if valid, otherwise the list of errors
const result = input.map(data => {
  const valid = validator(data, { unknownFields, rememberSchemes })
  return valid ? true : validator.errors
})

@nichtich
Member Author

nichtich commented Jan 26, 2022

To include information about concept schemes stored in jskos-server, rememberSchemes has to be set to an array of all these concept schemes. This should be enabled via a query argument. So we have three optional arguments:

  • unknown (ignore unknown fields)
  • type (expect a given JSKOS type)
  • knownSchemes (boolean to check concepts against schemes)

Are there other checks when importing data, e.g. detection of circular narrower links, duplicate URIs/identifier etc.?

@stefandesu
Member

stefandesu commented Jan 26, 2022

So, the first implementation is in Dev.

  • Validation for URL (via GET)
  • Validation for JSON data (via POST)
  • unknownFields parameter (I aligned it with jskos-validate)
    • implemented, but needs to be restricted to the values 1 and true (analogous to other boolean parameters)
  • type parameter
  • knownSchemes
  • rememberSchemes
    • see question below
  • Tests
  • Documentation
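
For illustration, a GET request using the parameters from the list above could be assembled as follows. This is only a sketch: the helper name buildValidateUrl and the base URL are hypothetical. It assumes the endpoint path /validate from the original proposal, and that boolean parameters are passed as 1.

```javascript
// Hypothetical helper, not part of jskos-server: builds the URL for
// GET /validate?url=... using the parameter names from the task list.
function buildValidateUrl(baseUrl, { url, type, unknownFields, knownSchemes } = {}) {
  const base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/"
  const target = new URL("validate", base)
  if (url) target.searchParams.set("url", url)
  if (type) target.searchParams.set("type", type)
  // boolean parameters are only sent when enabled (assumed to be passed as "1")
  if (unknownFields) target.searchParams.set("unknownFields", "1")
  if (knownSchemes) target.searchParams.set("knownSchemes", "1")
  return target.toString()
}

// placeholder base URL and data URL
const example = buildValidateUrl("http://localhost:3000", {
  url: "http://example.org/data.json",
  type: "concept",
  unknownFields: true,
})
console.log(example)
```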

I still don't fully understand how rememberSchemes would work in this context. If I look at your example code above, rememberSchemes is only given when the type is given. But if I understand correctly, this only makes sense if validate is called twice: first with type scheme and an empty rememberSchemes array to validate the scheme, then a second time with type concept and the rememberSchemes array that now includes the validated scheme (if successful). This is impossible to perform in an HTTP endpoint. Does rememberSchemes even have a use here? (I first thought that you would give the validation function an array that first includes the scheme and then the concepts, but when I look at the code for jskos-validate, this would not work because one call to it can't validate multiple types of objects.)

Are there other checks when importing data, e.g. detection of circular narrower links, duplicate URIs/identifier etc.?

No detection of circular narrower links or anything like that. Duplicate URIs/identifiers are also not detected directly, but trying to POST an existing object will return an error, except when bulk importing. I'm also unsure why this would be relevant for this issue. I thought we just want to check whether a JSKOS object is valid or not.

@nichtich
Member Author

I've updated the code and it works as expected. For instance:

[
  {
    "type": ["http://www.w3.org/2004/02/skos/core#ConceptScheme"],
    "uri": "http://example.org/voc",
    "notationPattern": "[a-z]+"
  },
  {
    "type": ["http://www.w3.org/2004/02/skos/core#Concept"],
    "uri": "http://example.org/1",
    "notation": ["abc"],
    "inScheme": [{"uri": "http://example.org/voc"}]
  },
  {
    "type": ["http://www.w3.org/2004/02/skos/core#Concept"],
    "uri": "http://example.org/2",
    "notation": ["123"],
    "inScheme": [{"uri": "http://example.org/voc"}]
  }
]

results in

[
  true,
  true,
  [
    {
      "message": "concept notation 123 does not match [a-z]+"
    }
  ]
]

With type=concept the first object is invalid (it is a concept scheme, not a concept) but the third is valid because its notation is not checked against the scheme.
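
The mechanism behind this example can be sketched in plain JavaScript. The code below is an illustration only and not the jskos-validate implementation: the function name validateWithRememberedSchemes is hypothetical, and it assumes that notationPattern has to match the whole notation. Schemes seen earlier in the array are remembered, and later concepts are checked against their notationPattern.

```javascript
// Illustration only: a small stand-in for the "remember schemes" idea,
// not the jskos-validate implementation.
const SCHEME = "http://www.w3.org/2004/02/skos/core#ConceptScheme"
const CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"

function validateWithRememberedSchemes(items) {
  const rememberSchemes = [] // schemes seen so far in this array
  return items.map(item => {
    const types = item.type || []
    if (types.includes(SCHEME)) {
      rememberSchemes.push(item)
      return true
    }
    if (!types.includes(CONCEPT)) return true
    const errors = []
    for (const { uri } of item.inScheme || []) {
      const scheme = rememberSchemes.find(s => s.uri === uri)
      if (!scheme || !scheme.notationPattern) continue
      // assumption: the pattern must match the whole notation
      const pattern = new RegExp(`^(${scheme.notationPattern})$`)
      for (const notation of item.notation || []) {
        if (!pattern.test(notation)) {
          errors.push({ message: `concept notation ${notation} does not match ${scheme.notationPattern}` })
        }
      }
    }
    return errors.length ? errors : true
  })
}

// same input as in the example above
const results = validateWithRememberedSchemes([
  { type: [SCHEME], uri: "http://example.org/voc", notationPattern: "[a-z]+" },
  { type: [CONCEPT], uri: "http://example.org/1", notation: ["abc"], inScheme: [{ uri: "http://example.org/voc" }] },
  { type: [CONCEPT], uri: "http://example.org/2", notation: ["123"], inScheme: [{ uri: "http://example.org/voc" }] },
])
console.log(JSON.stringify(results, null, 2))
```

Because the scheme has to appear before its concepts for the check to apply, the order of objects in the array matters in this sketch.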

question: should we better return false instead of true on success so truthy result elements indicate errors?

I thought we just want to check whether a JSKOS object is valid or not.

validation endpoint could be used as dry-run before import so additional integrity constraints of import should optionally be enforced on validation as well. But this is not the case and if so, another issue.

@stefandesu
Member

question: should we better return false instead of true on success so truthy result elements indicate errors?

That's a good point. I'm a bit split on this issue because on the one hand, I agree that it would make things slightly easier to check, but on the other hand, false only makes sense if you reverse the logic, i.e. we don't ask if it's valid, we ask if there are errors. Also, I don't think it would be too bad if you have to do a strict check for true. That makes things safer anyway.
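
A strict check on the client side could look like the following sketch. The helper name summarize is hypothetical; it only assumes the result format shown earlier in this thread, where each element is either true or an array of error objects.

```javascript
// Hypothetical client-side helper: each result element is either `true`
// or an array of error objects, so only a strict `=== true` counts as valid.
function summarize(results) {
  const invalid = []
  results.forEach((result, index) => {
    if (result !== true) invalid.push({ index, errors: result })
  })
  return { valid: invalid.length === 0, invalid }
}

const summary = summarize([
  true,
  true,
  [{ message: "concept notation 123 does not match [a-z]+" }],
])
console.log(summary.valid, summary.invalid.map(entry => entry.index))
```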

validation endpoint could be used as dry-run before import so additional integrity constraints of import should optionally be enforced on validation as well. But this is not the case and if so, another issue.

Yeah, I'm still not sure whether that makes sense, so please make a separate issue that's of lower priority.


I guess I'll now finish the rest of the tasks, in particular tests and documentation.

stefandesu added a commit that referenced this issue Jan 27, 2022
There might be issues with JSON parsing when only a boolean value is returned. This will be reflected in the documentation.
@stefandesu
Member

I added the documentation with some example calls. @nichtich Could you please check the documentation to make sure there were no misunderstandings? I tried to explain how things like knownSchemes and rememberSchemes work, so it might not be 100% correct.

I will add tests after you have checked it, because it might change things.

@nichtich
Member Author

Ok, I've completed the documentation. What's also missing is inclusion of the validation endpoint at /status and in the HTML view at /.

@stefandesu
Member

I will add some more tests tomorrow, in particular those that use the parameters, but after that I think this is finished. 👍
