Creating new resources and validating with JSON Schema #867

Closed
mvdstam opened this issue Aug 27, 2015 · 41 comments · Fixed by #1600

Comments

@mvdstam

mvdstam commented Aug 27, 2015

The official JSON API Schema (as described at http://jsonapi.org/schema) explicitly expects the ID field to be sent by clients when creating new resources. This conflicts with the following example (pulled from the official documentation):

POST /photos HTTP/1.1
Content-Type: application/vnd.api+json
Accept: application/vnd.api+json

{
  "data": {
    "type": "photos",
    "attributes": {
      "title": "Ember Hamster",
      "src": "http://example.com/images/productivity.png"
    },
    "relationships": {
      "photographer": {
        "data": { "type": "people", "id": "9" }
      }
    }
  }
}

When I validate the above against the official JSON Schema document, validation fails. Perhaps a separate JSON Schema document could be provided solely for creating new resources, where the ID field is optional?
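
For illustration, the failure can be reproduced with a minimal Python sketch. It does not load the published schema: `RESOURCE_REQUIRED` and `missing_members` are hypothetical stand-ins for the schema's `required` rule on resource objects, assumed here to list both `type` and `id`.

```python
import json

# Assumption: the published schema requires both members on every resource object.
RESOURCE_REQUIRED = ("type", "id")

def missing_members(resource, required=RESOURCE_REQUIRED):
    """Return the required member names that are absent from a resource object."""
    return [name for name in required if name not in resource]

# The request body from the documentation example above (shortened).
create_request = json.loads("""
{
  "data": {
    "type": "photos",
    "attributes": {"title": "Ember Hamster"},
    "relationships": {
      "photographer": {"data": {"type": "people", "id": "9"}}
    }
  }
}
""")

print(missing_members(create_request["data"]))  # ['id']
```

The nested resource identifier under `photographer` does carry an `id`; only the primary resource object being created lacks one.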

@ethanresnick
Member

See #851, which is another reason not to use the published schema to validate incoming requests. Basically, that schema is only for validating output/response documents, and we're already thinking about whether to create a separate one for incoming requests (or just to keep a single one, but make it more lenient).

@sebilasse

Citing the official documentation

Exception: The id member is not required when the resource object originates at the client and represents a new resource to be created on the server.

ethanresnick added a commit to ethanresnick/json-api-1 that referenced this issue Sep 13, 2015
@ethanresnick
Member

@dgeb @tkellen @steveklabnik Do you guys have an opinion here on whether it's worth it to maintain multiple, stricter schemas or just go with one looser one? I was leaning toward the looser schema approach and started mocking it up, as you can see from the links. But, on reflection, ditching the id requirement for all resource objects seems like a big loss, so now I'm not sure.

@mvdstam
Author

mvdstam commented Sep 13, 2015

If I may offer my 2 cents (as someone who has used JSON-API in a production environment and had the chance to look at it quite thoroughly): one of the nicest aspects of JSON-API is that once a request passes JSON Schema validation, you can trust it because the schema is so strict, and you know that certain data is at your disposal.
I'd go for multiple stricter schemas instead of one looser schema, because going down that road means potentially making the schema even looser as more use cases show up, thus making the standard less, well, standard. The id field thing is a major dealbreaker IMHO and should be required for all requests aside from updating existing resources.

@sebilasse

@ethanresnick: same as @mvdstam
I would also go for 2 schemas. Most people will use it as a "Request Schema" with its target "Response Schema". This is an official part of JSON Schema ("hyperschema"); this article (how they do it at Heroku) might be helpful: https://brandur.org/elegant-apis
and btw:
I already designed a TypeScript interface for those two, but then I realized that typson (which I wanted to use for JSON Schema conversion as a starter) only supports features <TS 1.3.
I am starting to rewrite typson with the official TS compiler API which will always be up to date so that it can make oneOf or allOf for union or constraint types.
Afterwards I'll publish the resulting schemas here ...

@tkellen
Member

tkellen commented Sep 14, 2015

@ethanresnick I see no harm in maintaining multiple schemas. In fact, I think it's a pretty great idea (as long as their separate usages are clearly defined).

@eneuhauser

Here is an updated schema to allow for creating new resources. I'd like to get some feedback before creating a pull request.

In this version, there is a new post type, which allows only "a single resource object as primary data". resource now extends a newResource definition to require the id. The one downside to this approach is that servers returning a single resource without an id could yield false positives in schema validation, but I feel that is a small compromise for a simpler implementation.
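
The layering described here can be sketched in a few lines of Python. This is not the proposed schema itself: `DEFINITIONS` is a heavily simplified, hypothetical version of the two definitions, and `is_valid` is a toy checker that understands only `$ref`, `allOf`, and `required`.

```python
# Hypothetical, simplified definitions. The "$ref" values are bare names used as
# shorthand for "#/definitions/<name>" references.
DEFINITIONS = {
    "newResource": {"required": ["type"]},
    "resource": {
        "allOf": [{"$ref": "newResource"}, {"required": ["id"]}],
    },
}

def is_valid(schema, instance):
    """Toy checker: supports only `$ref`, `allOf`, and `required`."""
    if "$ref" in schema:
        return is_valid(DEFINITIONS[schema["$ref"]], instance)
    if not all(is_valid(sub, instance) for sub in schema.get("allOf", [])):
        return False
    return all(name in instance for name in schema.get("required", []))

new_photo = {"type": "photos", "attributes": {"title": "Ember Hamster"}}
saved_photo = {"type": "photos", "id": "1", "attributes": {"title": "Ember Hamster"}}

print(is_valid(DEFINITIONS["newResource"], new_photo))  # True
print(is_valid(DEFINITIONS["resource"], new_photo))     # False: id is missing
print(is_valid(DEFINITIONS["resource"], saved_photo))   # True
```

The false positive mentioned above follows from the first result: any document checked against the id-less definition passes without an `id`, regardless of who sent it.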

@ethanresnick
Member

Thanks @eneuhauser!

I may be misunderstanding something, but I think that schema will accept a single id-less resource in any case, right? That is, it won't yield a false positive only when the server returns a single resource without an id but also, say, if a client tries to PATCH a resource without providing the id?

If that's right, your version is still very interesting because it's stricter than the version I proposed, so maybe it's a good compromise. But, if we do decide that we want to make all the schemas as strict as possible, I don't think we can get around actually providing distinct files for the different cases.

@eneuhauser

@ethanresnick, you are correct, a PATCH without an ID would also yield a false positive. As an aside, the spec specifically says this is invalid, but in my implementation it's moot since the URI has the ID. The request schema and response schema solution would still have this problem.

I'm using RAML to define my API and extend the JSON API schema with my own schema to explicitly enumerate the attributes. With my proposed change, you could reference the #/definitions/post with POST calls and #/definitions/success with PATCH calls just as easily as having two different schema files.
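
A reference such as `#/definitions/post` is a JSON Pointer in fragment form, so resolving it does not depend on RAML. The Python sketch below shows only that resolution step; `SCHEMA` and `ENTRY_POINT_BY_METHOD` are hypothetical, and whether a particular validator lets you start validation from a sub-definition is something to check in that validator's documentation.

```python
# Hypothetical stand-in for the proposed schema; only the definition names
# ("post", "success") come from the discussion.
SCHEMA = {
    "definitions": {
        "post": {"title": "document for creating a resource"},
        "success": {"title": "document carrying existing resources"},
    }
}

# Assumption: one entry point per HTTP method.
ENTRY_POINT_BY_METHOD = {
    "POST": "#/definitions/post",
    "PATCH": "#/definitions/success",
}

def resolve_fragment(schema, ref):
    """Follow a same-document reference such as '#/definitions/post'."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    pointer = ref[1:]
    if pointer == "":
        return schema  # '#' alone refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("expected a JSON Pointer after '#'")
    node = schema
    for token in pointer.split("/")[1:]:
        # JSON Pointer escapes: '~1' stands for '/', '~0' stands for '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node

print(resolve_fragment(SCHEMA, ENTRY_POINT_BY_METHOD["POST"])["title"])
# document for creating a resource
```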

@ethanresnick
Member

The request schema and response schema solution would still have this problem.

Right. So I was imagining that one schema would be for the "create resource document" case and another would be for "all other request/response documents". Then, for each of these two cases, we'd need two files, one with additionalProperties: true and one with it false, per #851. So we already have 4 schema files under the multiple files approach, actually. Then, if we want to subdivide the "all other request/response documents" case further (say, to separate out the response docs), we'd end up with 2x as many schema files for each subsequent division.

That's why I was a bit hesitant about that approach. At the same time, the changes are small and pretty mechanical, and having these schemas helps implementors, so maybe it's the way to go. If @tkellen's still onboard given the above, then I am too.

With my proposed change, you could reference the #/definitions/post with POST calls and #/definitions/success with PATCH calls just as easily as having two different schema files.

Can you explain to me how this works? Is it something that a user can do only if they're using RAML, or does a generic JSON Schema parser allow the user to call out some definition in the schema and check just against that?

@sebilasse

Can you explain to me how this works?

Have you read the article I posted above?

@ethanresnick
Member

@sebilasse I just did, and here's what I got from it:

  1. That JSON References can reference definitions in other documents, which (if common tooling actually supports it) could allow us to define multiple schemas in a more DRY way.
  2. That Hyperschema allows one to specify the schema to use for a request to a certain link, and the schema that will be used in the response. This is cool, but I'm not sure how we can apply it (directly) to any schema we'd offer on the jsonapi.org site, because every link object in the Hyperschema needs an href (not just a method) and JSON API can't know which URIs the user's API will have.

@eneuhauser

@ethanresnick, below is an example RAML. At the top, you can see the two custom schemas that extend the JSON API Schema. You don't have to use RAML; you could have the custom schemas without defining them in a RAML file.

In my custom schemas, I overwrote the attributes to enumerate the specific properties for my implementation. In this case, I specify that it only allows a title attribute. This schema allows for any relationships, but they could be explicitly defined here as well. This could just as easily be used outside of RAML as well.

@sebilasse, I hadn't heard of hyper-schema before. It looks really interesting and I would like to play with that more. In your specific implementation, you could use hyper-schema that references definitions in http://jsonapi.org/schema.

I believe implementations should be using a custom schema that extends the base JSON API Schema with more detailed rules. If you don't extend the base JSON Schema as demonstrated below, it opens the API up to more issues than just omitting the ID. Potentially, a client could be PATCHing a resource with a misspelled attribute. By explicitly declaring the attributes, you could avoid those issues.

Take a look at how the RAML below uses the JSON API Schema as a base and extends it with the specific rules.

#%RAML 0.8
---
title: JSON API Examples
version: v1.0
baseUri: http://jsonapi.org
mediaType: application/vnd.api+json
schemas:
  - BlogPost: |
      {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "title": "JSON API Existing Blog Post Example Schema",
        "allOf": [
          {
            "$ref": "http://jsonapi.org/schema#/definitions/success"
          },
          {
            "properties": {
              "data": {
                "allOf": [
                  {
                    "$ref": "http://jsonapi.org/schema#/definitions/resource"
                  },
                  {
                    "properties": {
                      "attributes": {
                        "type": "object",
                        "properties": {
                          "title":  { "type": "string" }
                        },
                        "required": ["title"],
                        "additionalProperties": false
                      }
                    }
                  }
                ]
              }
            }
          }
        ]
      }
  - NewBlogPost: |
      {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "title": "JSON API New Blog Post Example Schema",
        "allOf": [
          {
            "$ref": "http://jsonapi.org/schema#/definitions/post"
          },
          {
            "properties": {
              "data": {
                "allOf": [
                  {
                    "$ref": "http://jsonapi.org/schema#/definitions/newResource"
                  },
                  {
                    "properties": {
                      "attributes": {
                        "type": "object",
                        "properties": {
                          "title":  { "type": "string" }
                        },
                        "required": ["title"],
                        "additionalProperties": false
                      }
                    }
                  }
                ]
              }
            }
          }
        ]
      }

/posts:
  get:
    responses:
      200:
        body:
          schema: BlogPost
  post:
    body:
      schema: NewBlogPost
  /{postId}:
    get:
      responses:
        200:
          body:
            schema: BlogPost
    patch:
      body:
        schema: BlogPost
    delete:

@sebilasse

@ethanresnick

of course, common tooling supports it.
Finally some time to dive in now. I'll work on some schemas now and answer afterwards.

sorry, just wanted to underline the needs for different schemas for a consumer here ...

sebilasse pushed a commit to redaktor/json-api that referenced this issue Sep 14, 2015
please see / should fix
json-api#867
json-api#851

partially fixes
json-api#406
---> see my following answer in
json-api#867 regarding versioning
@sebilasse

@tkellen @ethanresnick @eneuhauser @mvdstam @bf4
Ok - let me explain this:

We have three DRY schemas for

  • create
    • CAN have "additionalProperties" (which should be ignored by responder/server)
    • DOES NOT have to have "id"
  • request
    • CAN have "additionalProperties" (which should be ignored by responder/server)
    • MUST have "id"
  • response
    • MUST NOT have "additionalProperties"
    • MUST have "id"

regarding versioning
so far it is only implemented visually (by folder structure).
I would recommend implementing it strictly. JSON schemas should always correspond to the spec version, because most of the time when the spec changes, the schema will change as well...

This means that the schema would implement "jsonapi" with the version, e.g.: "1.1" as the default value in the schema -->

  • the schema in folder /1_1 only validates against a document if the document has a

    "jsonapi":{ "version": "1.1" }

https://github.com/redaktor/json-api/tree/gh-pages/schemas/1_1 🔍

sebilasse pushed a commit to redaktor/json-api that referenced this issue Sep 14, 2015
@ethanresnick
Member

@sebilasse Thanks for putting these together! I'll take a look at them later tonight. On a quick glance, though, they look very nice 👍

One thing, though: whether additionalProperties are allowed or not isn't based on whether it's a request or response document. Instead, it's based on whether the schema is being used to test a program's output or validate input.

That is, imagine that a server generates a response document. If the server wants to check that it generated it correctly, it needs to test with additionalProperties: false. But, if a client wants to process that same document from the server, it needs to read it with additionalProperties: true, because the server could be using a newer version of the JSON API spec with extra properties that the client doesn't know about and should simply ignore. Conversely, if the client wants to check (e.g. in a unit test) that it's generating a request document properly, it should set additionalProperties: false, while the server receiving that request document should check it with additionalProperties: true.

Now, if it's too much work to make both the true and false versions for each of these—and I'm beginning to think it might be—then I'd suggest always setting additionalProperties: true, because using a schema to validate input (not check output) is more common, and rejecting documents improperly is more consequential for the health of the spec.

As far as versioning goes... JSON API's "never remove, only add" versioning strategy is atypical, and it means that having a separate schema for each version might not make sense. But, regardless, let's punt on that question for now since 1.1 isn't even out yet :)

sebilasse pushed a commit to redaktor/json-api that referenced this issue Sep 14, 2015
@sebilasse

ok - I see - I did the request/response naming because when we think of using JSON hyperschema, there is a "requester" and a "responder" :

If the server wants to check that it generated it correctly

generated means 'response' in JSON hyperschema

if a client wants to process that same document

yes, but also if the client 'request' something at the server, the spec. says

Client and server implementations MUST ignore members not recognized by this specification.

and

having a separate schema for each version might not make sense

well, you have only one "public" schema [ @ethanresnick : I did that after your comment ... ]
it makes sense for the user if he writes a service locator or has to discover APIs ...
If a server response e.g. says {"jsonapi": {"version": "8.8"}} the user can use the schema I posted just before and be sure that it is jsonapi 8.8 in fact. And for example if the user did not implement version 8.8 yet but for example 8.4 he can check if the response also conforms to the 8.4 schema ...
We could also specify in the top level schemas that
if it does not have jsonapi or jsonapi.version use the 'original' (1.0) schema ...

-->
https://github.com/redaktor/json-api/tree/gh-pages/schemas/ 🔍
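
The version switch proposed in this comment could look roughly like the Python sketch below. `schema_folder` is a hypothetical helper; the `1_1` naming follows the linked folder structure, while the `1_0` default name is an assumption.

```python
def schema_folder(document, default="1_0"):
    """Map a document's declared `jsonapi.version` to a folder name such as '1_1'.

    Falls back to `default` when the optional `jsonapi` object or its
    `version` member is absent.
    """
    jsonapi = document.get("jsonapi")
    version = jsonapi.get("version") if isinstance(jsonapi, dict) else None
    if not isinstance(version, str):
        return default
    return version.replace(".", "_")

print(schema_folder({"jsonapi": {"version": "1.1"}, "data": []}))  # 1_1
print(schema_folder({"data": []}))                                 # 1_0
```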

@sebilasse

And @tkellen @ethanresnick regarding the original schema one question:
Is there any particular reason that you use
oneOf in the top level ?

oneOf in JSON schema means match EXACTLY one
Can't a schema match 'info' AND 'success' ?
Shouldn't it be an

anyOf ?
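
The difference between the two keywords can be shown with a small Python sketch. The branches here are hypothetical and deliberately simplified (each requires a single top-level member); the published schema may define `success` and `info` differently.

```python
# Hypothetical, simplified branches: each one only requires one top-level member.
BRANCHES = {
    "success": {"required": ["data"]},
    "info": {"required": ["meta"]},
}

def matching_branches(document):
    """Names of the branches whose required members are all present."""
    return [
        name
        for name, schema in BRANCHES.items()
        if all(member in document for member in schema["required"])
    ]

def one_of(document):
    """oneOf semantics: exactly one branch must match."""
    return len(matching_branches(document)) == 1

def any_of(document):
    """anyOf semantics: at least one branch must match."""
    return len(matching_branches(document)) >= 1

doc = {"data": {"type": "photos", "id": "1"}, "meta": {"total": 1}}
print(matching_branches(doc))  # ['success', 'info']
print(one_of(doc))             # False
print(any_of(doc))             # True
```

With branches that overlap like this, a document carrying both `data` and `meta` is rejected under `oneOf` and accepted under `anyOf`.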

@ethanresnick
Member

I did the request/response naming because when we think of a client / user using JSON hyperschema :

My point is that it's not a question of naming. Please see issue #851 and let me know if what I'm saying makes more sense then.

If a server response e.g. says {"jsonapi": {"version": "8.8"}} the user can use the schema I posted just before and be sure that it is jsonapi 8.8 in fact.

Yes, but remember that the jsonapi object is optional, and that a server (or client) can use features from any version of JSON API without declaring the version at all in the document. Therefore, the schemas can't require this element. More importantly, though, an implementation (whether client or server) is almost always going to want to validate input with the latest schema that was available when the implementation was written, no matter what version the other party says it's using. For example, if I write a server for JSON API 1.8, that means that I can understand any new features added up to 1.8 that a client might use. So I just validate every document sent to me using the 1.8 schema. If the client was written for a version before 1.8, it won't use some of the 1.8 features, but its input will still pass the 1.8 schema (again, because JSON API is an add-only specification). And, if the client was written after 1.8, my implementation doesn't know how to handle any of the post-1.8 features, so I have to ignore any new document members, which is exactly what validating against the 1.8 schema will do.
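
The "ignore what you don't know" behavior can be sketched in Python as follows. `KNOWN_RESOURCE_MEMBERS` is an assumed list of the resource-object members an implementation understands, and `future-member` is an invented name standing in for something a later spec version might add.

```python
# Assumption: the resource-object members this implementation was written against.
KNOWN_RESOURCE_MEMBERS = {"type", "id", "attributes", "relationships", "links", "meta"}

def split_members(resource, known=KNOWN_RESOURCE_MEMBERS):
    """Return (recognized, ignored): the part to process and the names to skip."""
    recognized = {key: value for key, value in resource.items() if key in known}
    ignored = sorted(key for key in resource if key not in known)
    return recognized, ignored

# "future-member" is hypothetical; it stands in for a member from a newer spec version.
from_newer_client = {"type": "people", "id": "1", "future-member": {"x": 1}}
recognized, ignored = split_members(from_newer_client)

print(recognized)  # {'type': 'people', 'id': '1'}
print(ignored)     # ['future-member']
```

The document is still accepted; the unknown member is simply dropped before processing, which is the effect a lenient schema has on input from a newer peer.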

@ethanresnick
Member

@sebilasse yes, i think it can be an anyOf. Also, success can require a data member.

@sebilasse

@ethanresnick - important when you check the schemas: In JSON schema additionalProperties defaults to an empty schema ... So in the sense of extending I did not explicitly write it as 'true' where it is ...

I have read #851

one for the test suite case and one for incoming document validation

isn't
test suite case = response
and
incoming document validation = request
?

... and I would have commented there what @bf4 had commented there.
I also see a benefit that you can "document" each version with JSON Schema ...

@sebilasse

this might get a bit philosophical ...
from my personal perspective currently it is dangerous to specify anything as "add-only" - nobody knows what the future will bring and what laws or copyrights will put restrictions on what - however

currently it is not add-only because if the
1.0 schema says id is required
but
1.1 schema should say for creation id is not required

;) going to bed now - late here...

@ethanresnick
Member

In JSON schema additionalProperties defaults to an empty schema ... So in the sense of extending I did not explicitly write it as 'true' where it is

Ok. But the problem is the few places where you have additionalProperties set explicitly to false... see below.

isn't test suite case = response and incoming document validation = request?

No :) Let me give an example. Imagine the client requests GET /people/1 and the server responds with this document:

{
  "data": {
    "type": "people",
    "id": "1",
    "relationships": {
       "employer": { "data": { "type": "companies", "id": "4"} }
    },
    "first-name": "John",
    "last-name": "Smith"
  }
}

Here, the server has made a mistake: it's put the attributes directly on the resource object rather than under the "attributes" key. However, even though the server's made a mistake, the client MUST accept this response and ignore the unknown properties. That is, the client MUST read it as though the server had sent:

{
  "data": {
    "type": "people",
    "id": "1",    
    "relationships": {
      "employer": { "data": { "type": "companies", "id": "4"} }
    }
  }
}

So, on the client, the schema that's validating the server's response MUST have additionalProperties set to true. Otherwise, it would improperly reject the server's response.

However, if the server author had written some unit tests to make sure that their server was behaving properly, those unit tests would want to check the server's response with additionalProperties: false, so that the erroneous response I showed above would get flagged as a mistake by the tests.

That's what I mean when I say that the same document—which could be a request document or a response document—needs to have additionalProperties: true when it's being read by the party that's receiving it, but should have additionalProperties: false if the sender is instead using a schema to test the documents that it's writing.
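
As a rough Python sketch of that distinction, the same document can be checked in two modes. `check_resource` and `KNOWN_RESOURCE_MEMBERS` are hypothetical; the boolean flag mirrors the role of `additionalProperties` without using a real JSON Schema validator.

```python
# Assumption: the members a resource object may carry.
KNOWN_RESOURCE_MEMBERS = {"type", "id", "attributes", "relationships", "links", "meta"}

def check_resource(resource, additional_properties):
    """Return a list of problems; an empty list means the resource is accepted."""
    problems = [
        f"missing member: {name}" for name in ("type", "id") if name not in resource
    ]
    if not additional_properties:
        # Strict mode: unknown members are errors, as in a sender's unit tests.
        unknown = sorted(set(resource) - KNOWN_RESOURCE_MEMBERS)
        problems += [f"unknown member: {name}" for name in unknown]
    return problems

# The erroneous response from the example above.
response_data = {
    "type": "people",
    "id": "1",
    "relationships": {"employer": {"data": {"type": "companies", "id": "4"}}},
    "first-name": "John",
    "last-name": "Smith",
}

print(check_resource(response_data, additional_properties=True))
# []
print(check_resource(response_data, additional_properties=False))
# ['unknown member: first-name', 'unknown member: last-name']
```

The receiving client runs the lenient check and accepts the document; the server's own test suite runs the strict check and catches the misplaced attributes.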

Now, what I was saying here is that the test case situation is much less common than the other situation, so if we were only going to set additionalProperties to one value, we'd want to set it to true—on both requests and response documents.

Here's also a more concrete example of why additionalProperties has to be true on responses: Suppose a client is written when JSON API is at version 1.0. Then, in version 1.2, JSON API adds a new member to a resource object. Now, imagine that the 1.0 client tries to use a 1.2 server. Because the client was written when 1.0 was the latest version, it only has the 1.0 response schema. So, if that schema has additionalProperties: false, the 1.0 client will end up rejecting the response from the 1.2 server. But it should accept that response and just ignore the unknown members.

currently it is not add-only because if the 1.0 schema says id is required but 1.1 schema should say for creation id is not required

But the schema isn't normative :) The schema is just meant as a helpful tool for developers. Only the text on the specification page is normative. And, as you pointed out earlier, the 1.0 text already says that id is optional on creation. So we're not actually making any changes to that between 1.0 and 1.1.

@sebilasse

@ethanresnick
preclaimer: In general I do not like the idea that the jsonapi.version is optional.
I would do it like jsonrpc does it, e.g. v2.0 : ” A String specifying the version of the JSON-RPC protocol. MUST be exactly "2.0". ”

... apart from that you wrote:

Here, the server has made a mistake: it's put the attributes directly on the resource object rather than under the "attributes" key.

But how can the server make this mistake at all ?
Any server response MUST answer with a valid response against the "response" schema
(in this case 'success' -> see -> /data -> /resource).

spec.:

Unless otherwise noted, objects defined by this specification MUST NOT contain any additional members.

schema ...response/#/definitions/data:

"additionalProperties": false

= the server is not able to put the properties 'directly' under 'data'

ok - about

However, if the server author had written some unit tests to make sure that their server was behaving properly, those unit tests would want to check the server's response with additionalProperties: false, so that the erroneous response I showed above would get flagged as a mistake by the tests.

Please let me ping @geraintluff - he has specified this v5 proposal which might be an alternative:
https://github.com/json-schema/json-schema/wiki/Ban-unknown-properties-mode-(v5-proposal)
(Geraint, I'd be more than happy if you could add your 5 cents here - your JSON schema ideas might be better than mine ;)

a more concrete example [...]

this is exactly the reason why I think, versioning is fine.
my logic is that each incoming data validates against request and each outgoing data validates against response :

step by step

1.0 client ~~ 1.2 server ~~> 1.0 client

  • The client sends any 1.0 'request'. When the server validates against the top schema 'request' it will go through each 1.0 property and check if its fine (additionalProperties are true for .../{v}/request) ...

Another benefit is now that the server can 'reason' the version and due to the "top schema version switch" the server knows that somebody wants to communicate in version 1.0. A clever server might now respond with a 1.0 response (same logic as in RPC)

– however –

  • if the server is stupid and responds with a 1.2 response, this response must be valid against .../1_2/response which is just saying that no more properties than 1.2 are allowed for outgoing.
  • the client of course validates the incoming .../1_2/response
    against
    .../1_0/request (again additionalProperties are true for .../{v}/request)

So think of
request = incoming for validator
response = outgoing from validator (e.g. you gathered different data sources for response and want to test if your generated response is valid)

@sebilasse

@ethanresnick Please let me also answer this in detail (just fyi) :

That Hyperschema allows one to specify the schema to use for a request to a certain link, and the schema that will be used in the response. This is cool, but I'm not sure how we can apply it (directly) to any schema we'd offer on the jsonapi.org site, because every link object in the Hyperschema needs an href (not just a method) and JSON API can't know which URIs the user's API will have.

This is not entirely true. JSON Schema was invented by Kris Zyp and he is so genius ;)
E.g. hyper-schema supports the whole palette of URI Template, as defined in RFC 6570 -
so

a certain link

can be any template and it can also be an empty template, see 5.1.1. URI Templating

just in case anybody wants to communicate in realtime about this -
we're on Slack with GitHub integration ...

@ethanresnick
Member

@eneuhauser

I believe implementations should be using a custom schema that extends the base JSON API Schema with more detailed rules. If you don't extend the base JSON Schema as demonstrated below, it opens the API up to more issues than just omitting the ID.

I agree with this in theory, but I think the picture changes somewhat when you consider that many server implementations will be built on top of generic libraries (like JSONAPI::Resources or endpoints). Those generic libraries will likely run a validation step where it would be convenient to check (against a schema) that the request document is valid JSON API at all. But, for validating the particular fields, it may be more convenient for those libraries to use whatever internal model class the user has defined, rather than forcing the user to provide an extended schema (or generating an extended schema from those models). So I don’t think we can just assume that subschemas will be created (even though that would be nice, because it would give the users a way to pull out just the relevant definitions without us needing separate files).

@sebilasse

But how can the server make this mistake at all ?

You’re right that the spec prohibits the server from responding that way. That’s why I called that response a mistake. It’s a bug in the server. But the whole point of unit tests (where you’d have additionalProperties: false) is that they’re meant to catch bugs.

the client of course validates the incoming .../1_2/response
against .../1_0/request (again additionalProperties are true for .../{v}/request)

Ok, I think I understand what you’re saying now. You’re saying that every incoming piece of data counts as a request (and is checked against the request schema), and every outgoing piece of data counts as a response (and is checked against the response schema).

There are two problems here:

First, there is a naming issue. (I apologize for saying the issue wasn’t naming earlier; I misunderstood.) In HTTP, “request” and “response” mean different things than how you’re using them. Request means “something sent from the client to the server” and response is “something sent from the server to the client”. So, by that meaning, it wouldn’t make sense for the client to check the incoming data from the server against something called a “request” schema.

But, more importantly, there’s a conceptual problem: the incoming data to the client is in a different format than the incoming data to the server, so they can’t both be validated by the same schema, whether that schema’s called “request” or anything else. To give an example: It’s completely valid for a client to receive only a “meta” key from the server, but it’s not valid for the server to receive only “meta” from the client.

No matter how you slice it, there are four basic cases:

  1. The client checking that it’s receiving proper data from the server
  2. The server checking that it’s receiving proper data from the client
  3. The client checking that it’s sending proper data to the server
  4. The server checking that it’s sending proper data to the client

As I’ve said two or three times already, though, maybe the latter two cases can be ignored, if we think that using schemas to handle them (e.g. in unit tests) is uncommon enough that we don’t need to make distinct, maximally-strict schemas for these cases. But, if we were to make such a schema, it would need additionalProperties: false, while the first two schemas would need additionalProperties: true.

In general I do not like the idea that the jsonapi.version is optional.

I understand. But it’s not going to change now post 1.0, and implementations have to account for it being optional.

This conversation seems to be going in circles, so I’m going to bow out now, but I’d be more than happy to review PRs from either @eneuhauser or @sebilasse that take this thread into account.

@sebilasse

First, there is a naming issue. (I apologize for saying the issue wasn’t naming earlier; I misunderstood.)

Right - and OK, now I've also got your point !!! (so it is not a circle anymore ;)
btw: I learned that JSON API exists two days ago - so please forgive me, still in learning curve.

Let me make it even more clear :
What I meant is
"Yes, case 3 and 4 can be ignored" - but they might have also been the reason that geraint wrote
https://github.com/json-schema/json-schema/wiki/Ban-unknown-properties-mode-(v5-proposal) and so we could use his wonderful validators supporting it already to use it for unit testing !

however before I can do a PR - what would you like ?
When we take the create ('id' optional) into account : 5 schemas or 3 (+ geraint's proposal as validator setting for unit tests)

For case 1 and 2 we should simply list the differences between those two here - so that I can change the schemas...

@sebilasse

@ethanresnick btw - currently reading a nice article beginning with

What if a CMS could track how important every piece of content in the system is at any given time?

This leads me to think about the "$schema" keyword.
The 1.1 spec. could recommend any user to use it and include it in any JSON API document.
The purpose of the "$schema" keyword is exactly what is cited above :
To tell a machine against which "$schema" this JSON document should validate :

In the case of our JSON Schema it validates against the "meta-schema" (the spec. for JSON Schema)
and
in the case of our JSON API it should validate against our JSON Schema ...

And when you have annoying people saying "I want jsonapi.version required", they could easily add another schema that extends ours for their own system, and so on ...

and pps. - our v5 proposal for multilanguage doc.

@ethanresnick
Member

btw: I learned that JSON API exists two days ago - so please forgive me, still in learning curve.

I understand, and I appreciate you wanting to help out! Apologies if I was a bit snappy; yesterday was a long day.

When we take the create ('id' optional) into account : 5 schemas or 3 (+ geraint's proposal as validator setting for unit tests)

When we take create into account, I count 6 being needed; there are the four I mentioned above, except that cases 2 and 3 each get split into two, so we have:

  1. The client checking that it’s receiving proper data from the server
  2. The server checking that it’s receiving proper data from the client on create requests
  3. The server checking that it’s receiving proper data from the client on non-create requests
  4. The client checking that it’s sending proper data to the server on create requests
  5. The client checking that it’s sending proper data to the server on non-create requests
  6. The server checking that it’s sending proper data to the client
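
A rough way to picture how these six cases could differ, assuming (as suggested earlier in this thread) that they vary only in whether id is required and whether unknown members are tolerated. The name resource_schema and its parameters are made up for illustration, and the schema is heavily simplified; treat this as a sketch rather than the published schema.

```python
def resource_schema(is_create_request, validating_own_output):
    """Build a simplified resource-object schema for one validation case.

    Hypothetical helper: the real schema covers far more than this.
    """
    # On create requests the client may omit "id"; everywhere else it is required.
    required = ["type"] if is_create_request else ["type", "id"]
    return {
        "type": "object",
        "required": required,
        "properties": {
            "type": {"type": "string"},
            "id": {"type": "string"},
            "attributes": {"type": "object"},
            "relationships": {"type": "object"},
            "links": {"type": "object"},
            "meta": {"type": "object"},
        },
        # Strict about what we send ourselves, lenient about what we receive.
        "additionalProperties": not validating_own_output,
    }


# Case 2: the server checking a create request it receives.
server_checks_create = resource_schema(is_create_request=True, validating_own_output=False)
# Case 6: the server checking a response it is about to send.
server_checks_response = resource_schema(is_create_request=False, validating_own_output=True)
```

Under these assumptions the six cases collapse into combinations of two flags, which is one reason generating them from shared parts looks feasible.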

Now, as for whether to do all 6 or just do 3: let's start with 3. Then, once those are done, we can see if we can add the other 3 in a DRY way.

For now, let's avoid geraint's proposal because it isn't standardized yet. If it makes it into JSON Schema draft 5, we can reconsider.

For case 1 and 2 we should simply list the differences between those two here - so that I can change the schemas...

I don't remember them all by heart, so you might have to dig through the specification to get an exhaustive list. But here are the big ones:

  • The client must send "data" with a single resource object, while the server can send:
    • data with a single resource object or data with an array of resource objects, or no data at all
    • errors (but if the server sends errors it can't also send data)
    • optional meta and links

I think if we start with that, we'll be good, and we can improve it over time.
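
The differences listed above can be sketched as a pair of top-level checks. This is a minimal illustration, not an exhaustive reading of the specification; the function names are hypothetical, and accepting null for "data" in server documents is an assumption that goes beyond the list.

```python
def check_client_document(doc):
    """Problems found in a document sent by the client (empty list = none)."""
    problems = []
    # The client must send "data" holding a single resource object.
    if not isinstance(doc.get("data"), dict):
        problems.append("'data' must be a single resource object")
    return problems


def check_server_document(doc):
    """Problems found in a document sent by the server (empty list = none)."""
    problems = []
    # If the server sends errors it can't also send data.
    if "data" in doc and "errors" in doc:
        problems.append("'data' and 'errors' must not appear together")
    # "data" may be absent, a single resource object, or an array of them.
    # Accepting null (None) as well is an assumption beyond the list above.
    if "data" in doc and not isinstance(doc["data"], (dict, list, type(None))):
        problems.append("'data' must be a resource object or an array of them")
    # "meta" and "links" are optional, so their absence is never a problem.
    return problems
```

For example, check_client_document({"data": [{"type": "photos"}]}) reports a problem because the client sent an array, while the same document passes check_server_document.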

This leads me to think about the "$schema" keyword. The 1.1 spec. could recommend any user to use it

Let's talk about that in a separate issue. I'm definitely up for considering it (i.e. whether to recommend $schema), but you should know that it probably wouldn't make it into the spec until at least 1.2 (1.1 is already overloaded with goals).

@eneuhauser

I have created two new files for creates and updates to have their own schema. In doing so, I had a question about the spec: in a client request, is it valid to include meta and jsonapi properties? I assume top-level links and included would not be acceptable, but I could see cases where meta and jsonapi would be passed by the client.

I ran into another problem in separating the schemas: should the schema be self-contained or should the create and update schema reference the primary schema? Referencing the primary schema would be DRY, but many tools do not respect the external schemas very well. I'll defer to the group whether it's better to be self-contained or DRY.

I had considered moving the schema under a /schema folder, but GitHub Pages does not support 301 redirects and it's common for the schema to live at the root.

@ethanresnick and @sebilasse, please check out my branches and let me know what you think. I wanted a ruling on which style to use before submitting a pull request.

I have not updated the FAQ page yet to point to the new schema. Once we decide on a direction, I will include the FAQ update in my pull request.

@ethanresnick
Member

I have created two new files for creates and updates to have their own schema.

Thanks @eneuhauser! I'll try to take a look at them over the weekend.

For now, please make sure they account for the latest couple of bugfixes. And, if these schemas are meant to be used for validating incoming data (i.e. a client validating a server's response, or a server validating a client's request), make sure they've got additionalProperties: true everywhere as per the long discussion earlier.

in a client request, is it valid to include meta and jsonapi properties? I assume top-level links and included would not be acceptable, but I could see cases where meta and jsonapi would be passed by the client.

I think assuming that meta and jsonapi are allowed but that links and included aren't is a safe assumption. I may try to work up a PR to the spec itself with this language.
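
As a small sketch of that assumption (it is not yet spec language, and the names below are hypothetical): a request check would flag top-level links and included, accept data, meta and jsonapi, and leave any other unknown member alone so that it can be ignored.

```python
# Top-level members assumed acceptable in a client request.
ALLOWED_IN_REQUEST = {"data", "meta", "jsonapi"}
# Top-level members assumed NOT acceptable in a client request.
REJECTED_IN_REQUEST = {"links", "included"}


def rejected_top_level_members(doc):
    """Return the sorted top-level members a client request should not carry."""
    return sorted(set(doc) & REJECTED_IN_REQUEST)


def unrecognized_top_level_members(doc):
    """Return members that are neither allowed nor rejected; these get ignored."""
    return sorted(set(doc) - ALLOWED_IN_REQUEST - REJECTED_IN_REQUEST)
```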

Referencing the primary schema would be DRY, but many tools do not respect the external schemas very well.

Right, that was my concern as well. Maybe we can host separate DRY schemas and then use some build tool that will create self-contained ones? Or even just link users to some tool online that does that, so they can make them for themselves?
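
To illustrate what such a build step might do, here is a minimal sketch that inlines every "$ref" so the result is self-contained. It assumes refs have the form "file.json#/json/pointer" or "#/json/pointer", that pointers only walk through objects, and that there are no recursive references (those would loop forever). The load_external callable is hypothetical: it stands in for whatever reads and parses a schema file.

```python
import copy


def resolve_pointer(document, pointer):
    """Follow a JSON Pointer such as '/definitions/resource' through objects."""
    node = document
    for token in pointer.lstrip("/").split("/"):
        if token:
            # Undo JSON Pointer escaping: ~1 means "/", ~0 means "~".
            token = token.replace("~1", "/").replace("~0", "~")
            node = node[token]
    return node


def inline_refs(node, root, load_external):
    """Return a copy of node with every "$ref" replaced by its target."""
    if isinstance(node, list):
        return [inline_refs(item, root, load_external) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str):
        location, _, pointer = ref.partition("#")
        # An empty location means the ref points into the current document.
        target_root = load_external(location) if location else root
        target = copy.deepcopy(resolve_pointer(target_root, pointer))
        # Refs inside the target are relative to the document it came from.
        return inline_refs(target, target_root, load_external)
    return {key: inline_refs(value, root, load_external) for key, value in node.items()}
```

A create schema that references a shared definitions file would then be bundled with something like inline_refs(create_schema, create_schema, load_external), and the DRY sources would stay the only files edited by hand.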

@eneuhauser

Thanks @ethanresnick for the feedback.

Now, as for whether to do all 6 or just do 3: let's start with 3. Then, once those are done, we can see if we can add the other 3 in a DRY way.

The files I submitted addressed:

  4. The client checking that it’s sending proper data to the server on create requests
  5. The client checking that it’s sending proper data to the server on non-create requests
  6. The server checking that it’s sending proper data to the client

Sorry to drum up this subject, but I feel that if there were only three, having the less restrictive schemas isn't as good.

Unless otherwise noted, objects defined by this specification MUST NOT contain any additional members. Client and server implementations MUST ignore members not recognized by this specification.

<rant>In the schema world, those two statements are mutually exclusive. For implementers of JSON API, it's more useful to have a schema say you're violating the current version of the spec than to give false positives in the event you're using an old schema to validate against a newer version of the spec in a production environment. The schema already has to be vague, so allowing additional attributes takes away all its teeth. As a service implementer, I would not run schema validation on incoming/outgoing requests in general, particularly a schema that isn't specific enough for my particular implementation. This schema CANNOT define required attributes that actually would be useful for the client/server to validate.</rant>

All that said, I'd be all for creating the first three schemas so long as the last three schemas exist. Having a build tool would ease that process. It just so happens I already have a build tool that could work. It was designed to piece together RAML documents using Gulp and generate an HTML file. In it, there are tasks that piece together schema files with shared definitions. It works nicely if each definition is its own file, which then gets concatenated into the self-contained files.

I will work on a new pull request that creates all 6 documents using my build tool. I'll be sure to include the two bugfixes you referenced.

@ethanresnick
Member

Sorry to drum up this subject, but I feel that if there were only three, having the less restrictive schemas isn't as good... For implementers of JSON API, it's more useful to have a schema say you're violating the current version of the spec than to give false positives in the event you're using an old schema to validate against a newer version of the spec in a production environment. The schema already has to be vague, so allowing additional attributes takes away all its teeth.

The problem with that logic, I think, is that clients and servers won't generally be maintained by the same organization. If organization X deploys a 1.0 server, they may well forget about it at some point (or at least not have the manpower to update it to the latest spec version). So that 1.0 server really, really should be set up to ignore but not reject additional properties, or it won't work with any 1.1+ clients. Independent evolvability of client/server is essential for REST and a good protocol in general.

In the schema world, those two statements are mutually exclusive.

I get that, so I understand why they seem odd. But, in the protocol design world, this apparent "contradiction" is actually a logical prerequisite for extensibility. Please take a look at this great article if you have a second. Basically, it proves that you need to make the set of legally producible messages smaller than the set of legally consumable messages (which is what additionalProperties: true does) if you want protocol extensibility. So this decision isn't some arbitrary quirk of the spec text or esoteric schema point: it's a critical part of the design!
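
To make the asymmetry concrete, here is a minimal sketch, assuming a producer that refuses to emit members it does not know and a consumer that simply ignores them. The function names are hypothetical, and "future" stands in for any member a later spec version might add; the member set is the one JSON API 1.0 defines for resource objects.

```python
# Members that JSON API 1.0 defines for a resource object.
KNOWN_RESOURCE_MEMBERS = {"type", "id", "attributes", "relationships", "links", "meta"}


def check_outgoing(resource):
    """Producer side: refuse to emit members this spec version does not define."""
    unknown = sorted(set(resource) - KNOWN_RESOURCE_MEMBERS)
    if unknown:
        raise ValueError("refusing to send unknown members: " + ", ".join(unknown))
    return resource


def read_incoming(resource):
    """Consumer side: keep the members we understand and ignore the rest."""
    return {
        name: value
        for name, value in resource.items()
        if name in KNOWN_RESOURCE_MEMBERS
    }


# A newer client sends a member that this 1.0 server has never heard of.
from_newer_client = {"type": "photos", "id": "1", "future": True}
understood = read_incoming(from_newer_client)
```

The 1.0 server keeps working because it drops "future" on the way in, while check_outgoing would still stop the same server from emitting that member itself.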

Having a build tool would ease that process. It just so happens I already have a build tool that could work.

Awesome! It'd be nice to incorporate this build tool into the rake command that the JSON API repo already uses for building (e.g. to compile the sass). That way, we could keep the DRY versions of the schemas in the json-api repo too, and just edit those. Do you mind sharing this build tool somewhere and letting us incorporate it?

I will work on a new pull request that creates all 6 documents using my build tool. I'll be sure to include the two bugfixes you referenced.

👍 💯

@ethanresnick
Member

@eneuhauser Any update on this?

@Petah

Petah commented Oct 16, 2015

So is there a schema we can use to validate creating resources? I am wanting to use it for unit testing purposes.

(Sorry if I missed it in the long chain above.)

@aaronshaf

This bit me today. Would be great to have schemas for both request and response.

@aaronshaf

Perhaps instead of /schema, there could be /schemas/response and /schemas/request

/schema could be redirected?

@sebilasse

@eneuhauser Sorry for missing your last reply.

but many tools do not respect the external schemas very well

I don't think so, and the DRY branch is fine as a start. There are always one or two leading tools in each language, and personally I prefer to split dereferencing and validation into two separate tools, and sometimes cache the dereferenced schema, and so on.

Just please see the above comments by @aaronshaf and @bf4 – I am open to work on this.

@bf4
Contributor

bf4 commented Jun 3, 2016

Would be great to use the schema to validate the examples :)

@junglebarry

Hi,

On the basis that this issue doesn't seem to be progressing, could I suggest that the FAQ be updated with an interim message about the schema? For now, it could be explicit that the schema should be applied only to responses, not requests, which would help newcomers to avoid this pitfall until the replacements are ready.

Many thanks!

@handrews

Hi folks, I'm popping in from the JSON [Hyper-]Schema project where we're about to release a new set of drafts. This issue was just brought to my attention, and I think a lot of things we've been doing in the last two drafts would make things easier for you:

  1. Hyper-Schema is a lot more flexible now, and I suspect there might be some really interesting ways to use it with / describing JSON API. While most of the essential keywords (links, href, rel, targetSchema) are the same, nearly everything else has been tweaked or entirely reworked.
  2. readOnly and writeOnly are now in the validation spec. They (particularly when used with Hyper-Schema) make it much easier to write a single validation schema for a resource. An implementation can ignore readOnly fields in requests and writeOnly fields in responses, or insist that they be absent. This is an area that needs more fleshing out, but that is part of what those keywords are intended to do.
  3. I need to catch up on your additionalProperties discussions here (I've just barely skimmed the high points of this issue), but there's a lot of discussion that's gone on in the past year about the best way to use it, and some of the things people want to do with it are often better handled in other ways. We intend to focus a lot on the use cases that are not being well-served by the current keywords in the next draft.
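
As a rough illustration of point 2, here is a sketch of the "ignore" behaviour with a single schema. The schema, its property names (name, createdAt, password) and the helper functions are all hypothetical; only the readOnly and writeOnly keywords come from the validation spec, and how strictly an implementation treats them is left open.

```python
# Hypothetical single schema for the attributes of an "accounts" resource.
ACCOUNT_ATTRIBUTES = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "createdAt": {"type": "string", "readOnly": True},
        "password": {"type": "string", "writeOnly": True},
    },
}


def drop_flagged(instance, schema, flag):
    """Return a copy of instance without the properties whose schema sets flag."""
    properties = schema.get("properties", {})
    return {
        name: value
        for name, value in instance.items()
        if not properties.get(name, {}).get(flag, False)
    }


def view_for_request(instance, schema):
    # In a request, ignore anything the schema marks readOnly.
    return drop_flagged(instance, schema, "readOnly")


def view_for_response(instance, schema):
    # In a response, ignore anything the schema marks writeOnly.
    return drop_flagged(instance, schema, "writeOnly")
```

An implementation that prefers to insist on absence could raise an error instead of dropping the flagged properties.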

I'll comment more when I have a chance to dig through all of this, but I wanted to say "hi" and offer my assistance. I expect to formally publish the draft-07 documents by the end of the month, but if anyone wants a preview we have the work-in-progress documents posted.
