Definition of bitwise enums (flags) #24
@queequac I'm not entirely sure I follow your question. Are you trying to generate code? Validate data? Something else? Some of these things are better-supported than others by the current JSON Schema vocabulary. The best answer for you will depend on what you're trying to do. The validation vocabulary is actually not well-suited to code generation (although many people use it for that, usually with various restrictions and/or extensions). We hope to address code generation separately from validation in the near future somehow. The features overlap but are not identical. |
@handrews I think this is best explained with an example. An enumeration in C# can be defined as follows:
These values can then be combined using the bit-wise OR operator:
What's interesting about this is that
Bit-wise enumerations in JSON
A problem arises when attempting to serialize these values to/from JSON when using names for enumeration values, because there is no explicit label for the combined value. Serializers can approach this in different ways. For example, Manatee.Json serializes into a delimited string.
Enumerations in Schema
To validate a string representation of an enumeration value in JSON, the native enumeration is typically translated to an enumeration schema.
While this works for enumerations that are merely explicit values, it fails to support these bit-wise enumerations. (e.g. There is no JSON Schema mechanism to support a value that is The interesting twist, however, is that JSON Schema does not define that enumeration values MUST or even SHOULD be strings. Instead they just need to be unique JSON values. The JSON being validated simply needs to be equal to one of these values. This gives rise to a question: how would one combine arbitrary JSON values in the way that these other languages combine their enumeration values? |
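To make the combining behaviour concrete, here is a rough Python analogue using the standard library's `enum.Flag`. It is not the C# code from this comment; the class name `Values` is made up for illustration, and the member names and numbers are borrowed from the `Value1`..`Value4` / 1, 2, 4, 8 pairs used elsewhere in this thread.

```python
from enum import Flag


class Values(Flag):
    # Hypothetical flag enumeration; each member occupies exactly one bit.
    Value1 = 1
    Value2 = 2
    Value3 = 4
    Value4 = 8


# Combine two members with bit-wise OR. The result has no declared name
# of its own, which is exactly what makes name-based serialization awkward.
combined = Values.Value1 | Values.Value3

print(combined.value)             # 5
print(Values.Value1 in combined)  # True
print(Values.Value2 in combined)  # False
```

A plain `"enum": ["Value1", "Value2", "Value3", "Value4"]` schema has no entry that corresponds to `combined`, whichever way a serializer chooses to write it out.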
This is a good question, I'm really not sure how to handle it, or even if it is in-scope for JSON Schema. One thing that we've recommended recently for people who want to associate more descriptive strings with enumerated values is to use {
"oneOf": [
{"const": 1, "title": "Value1"},
{"const": 2, "title": "Value2"},
{"const": 4, "title": "Value3"},
{"const": 8, "title": "Value4"}
]
} This gets all of the information into the schema, but "title" is not a validation keyword. It's just an annotation- a bit of information that an application can use if the instance is valid against that subschema. Playing with this a bit more, we could come up with {
"oneOf": [
{"enum": [1, "Value1"]},
{"enum": [2, "Value2"]},
{"enum": [4, "Value3"]},
{"enum": [8, "Value4"]}
]
} Now we can validate the instance in either string or integer form. There is nothing in JSON Schema that says that those two forms are interchangeable- Hmm... I'm just tossing out some ideas here. No clear solution yet. |
This still doesn't address the topic of the issue, which is values that consist of two or more discrete values. For example, You'd still have to list out all possible combinations. |
@gregsdennis yeah that's why I said "No clear solution yet." I probably should have said "no solution at all yet" :-) Just trying to poke around and see what constructs we might be able to build off of. This may end up being outside of the scope of validation spec (meaning that you could add an extension keyword of some sort, but it wouldn't be standardized as part of what all validators need to implement). This is the sort of thing that you could easily implement as a preprocessing build step- add your own keyword, then have a script that goes through and works out all of the combinations and dumps them into an I'm not saying it's out of scope yet, but it's an option worth considering. This specific feature is a shorthand for how certain languages use enumerated values, so it's not as universal as most existing keywords (arguments to the contrary are welcome, though). |
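A minimal sketch of what such a preprocessing step might look like, in Python. The keyword name `x-flags` and the `", "` separator are assumptions made up for this example; nothing in JSON Schema defines them. The script walks a schema, and wherever it finds the extension keyword it replaces it with a plain `enum` that lists every non-empty combination.

```python
import json
from itertools import combinations


def expand_flags(names, separator=", "):
    """Return every non-empty combination of names, joined by separator."""
    expanded = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            expanded.append(separator.join(combo))
    return expanded


def preprocess(schema, keyword="x-flags"):
    """Recursively replace the hypothetical extension keyword with an enum."""
    if isinstance(schema, dict):
        result = {}
        for key, value in schema.items():
            if key == keyword:
                result["enum"] = expand_flags(value)
            else:
                result[key] = preprocess(value, keyword)
        return result
    if isinstance(schema, list):
        return [preprocess(item, keyword) for item in schema]
    return schema


source = {
    "properties": {
        "flagEnumProp": {"x-flags": ["Value1", "Value2", "Value3"]}
    }
}
expanded = preprocess(source)
print(json.dumps(expanded, indent=2))
```

Two caveats with this sketch: the output grows as 2^n - 1 entries for n flags, and only one ordering of each combination is emitted, so a serializer that writes `"Value3, Value1"` would not validate against the generated list.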
The only thing I can think to do that would be within the domain of the spec would be to allow for an array of values where the items each were one of the enum values. Schema: {
"properties" : {
"flagEnumProp" : {
"type" : "array",
"items" : { "enum" : [ "Value1", "Value2", "Value3", "Value4" ] }
}
}
} Instance: {
"flagEnumProp" : [ "Value1", "Value3" ]
} but then, the spec wouldn't have to explicitly allow for this since that's currently supported. I think what's being asked is to have the |
@queequac would the array solution above allow you to do what you want? You can likely configure your serializer to (de)serialize flag enums as an array of named values. |
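As a rough illustration of that serializer configuration, here is a Python sketch that converts a flag combination to and from an array of names. The `Values` enum and the helper names `to_names` / `from_names` are hypothetical and do not come from any particular serializer.

```python
from enum import Flag


class Values(Flag):
    # Hypothetical flag enumeration matching the names used in the schema above.
    Value1 = 1
    Value2 = 2
    Value3 = 4
    Value4 = 8


def to_names(flags):
    """Serialize a flag combination as a list of member names."""
    return [member.name for member in Values if member in flags]


def from_names(names):
    """Deserialize a list of member names; an unknown name raises KeyError."""
    result = Values(0)
    for name in names:
        result |= Values[name]
    return result


print(to_names(Values.Value1 | Values.Value3))    # ['Value1', 'Value3']
print(from_names(["Value3", "Value1"]).value)     # 5
```

Because the names are OR-ed together, the order of the array does not affect the resulting value.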
@gregsdennis To be honest, I am not 100% sure. Okay, I do see the point of having an option for how to combine values. But that's just one half of the story. Personally, I feel much more uncomfortable with the fact that enumerations in JSON Schema are just a bunch of anything. For me this is too theoretical. Why should someone use arbitrary objects? Having this absolute degree of freedom for enum values might only be plausible for people who come from languages that do not have a built-in enum type. But even then it's not handy, since it's still anything and not an enumeration. Even in JavaScript projects you very often see enumerations being expressed like the following: To sum things up: The current enum in JSON Schema does not really help people with enumerations; it is too simplistic and does not address real-world problems. To me it smells wrong that it is easier to mimic the current schema definition of enum with oneOf or other onboard means than to express a "real" enum. Not everybody lives in a world where no built-in enum exists, so I'd prefer to have a standardized way that fits better for languages that do have one. :) |
By the way, I like TypeScript's definition of enums: Enums allow us to define a set of named constants. This is pretty close to what you have in most languages. It would also be fine for me to leave the constants' actual type open and to have a fallback strategy if no values were explicitly defined for an enumeration. Nevertheless, I do not think it's appealing to try to solve this with proprietary constructs in today's JSON Schema. And if we found a solution that allows us to express sets of named numerical or string constants within the standard, this would be a huge improvement. The move to bitwise combinations would also be within reach. (Restricting this capability to numerical constants only, of course.) |
So here it is, a maximally backward-compatible proposal... just a first idea of how we could extend today's specification without breaking it.
enum
An enum is a set of constants, each defined through a key. An instance validates successfully if its value is equal to one of the constants in the set. If keys and constants are identical, the key SHOULD be a string but might be of any other type. In this case the If keys and constants shall be different, keys MUST be strings and the
Note: You are still free to use numbers, strings (or any other type) for constants.* But knowing the schema you have not lost the meaning of the constant being just a magic number (or object) otherwise. * While I have to admit, it feels still strange allowing mixed types for constants instead of restricting them to a uniform type per set… but doing so it might no longer make sense to support Boolean, null and so on. And in case of objects, is it equal object types according to some schema? Personally, I would restrict constants here to a uniform type of either number or string per set. Option: In case of
flags
The keyword
Note: Since instances are dealing with constants only, validation is quite easy. While Greg's sample |
Is this just for validating instances, or is it for something else (user interfaces)? What would be an example of a schema, and valid and invalid instances? |
@awwright Primarily for validation, secondly for type introspection (and maybe code generation). If you left out the keys, you'd end up with magic numbers only. As I said, these are just some first ideas, and I gave some options to be considered. Samples have been included, but I will give some more with valid and invalid instances (while leaving out the first one with the array, since that is already possible today).
Valid instances:
Valid instances: If I am not totally wrong, flags instances could be validated like the following:
Just thinking out loud: It could make sense to put the new stuff under one new keyword instead of mixing it up with enum. Something like a keyword
Valid instances:
Valid instances: |
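One way the numeric flags check could work, sketched in Python. The mapping of names to numbers (`CONSTANTS`) is a hypothetical stand-in for the keyed constants in the proposal, and whether 0 (no flags set) should be valid is left open in the thread; this sketch assumes it is.

```python
from functools import reduce
from operator import or_

# Hypothetical keyed constants: name -> numeric value, one bit each.
CONSTANTS = {"Value1": 1, "Value2": 2, "Value3": 4, "Value4": 8}


def is_valid_flags(instance, constants=CONSTANTS):
    """Return True if instance is an integer built only from declared bits."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(instance, bool) or not isinstance(instance, int):
        return False
    # OR all declared constants together to get the mask of allowed bits.
    mask = reduce(or_, constants.values(), 0)
    # Valid when no bit outside the mask is set (assumption: 0 is accepted).
    return instance >= 0 and (instance & ~mask) == 0


print(is_valid_flags(5))    # True  (Value1 | Value3)
print(is_valid_flags(16))   # False (bit 16 is not declared)
print(is_valid_flags("5"))  # False (not an integer)
```

The check is a single mask comparison, so no list of combinations needs to be generated or stored.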
How would this be different than specifying something like |
Which aspect are you questioning? Just the note on how to validate flags and/or the keyword |
I realize that your example is exploratory, but I really am not a fan of contextual keywords, as you show Secondarily, you have these keywords under the That said:
|
It should be noted that with bitwise enums, sequencing has no impact on the overall value.
is the same as
|
@handrews I think that the meta-schema may benefit from this. Currently from draft 7 meta-schema {
"definitions" : {
...
"simpleTypes" : { "enum" : [ "array", "boolean", ... ] },
...
},
"properties" : {
...
"type" : {
"anyOf" : [
{ "$ref" : "#/definitions/simpleTypes" },
{
"type" : "array",
"items" : { "$ref" : "#/definitions/simpleTypes" },
"minItems" : 1,
"uniqueItems" : true
}
]
},
...
}
} Really the functionality you want is a set of values that may also be combined in a non-repetitive manner by use of an array. I think this could be simplified with a {
"definitions" : {
...
"simpleTypes" : { "flags" : [ "array", "boolean", ... ] },
...
},
"properties" : {
...
"type" : { "$ref" : "#/definitions/simpleTypes" },
...
}
} (This is really a return to the simplicity of the definition of Within the meta-schema,
(This assumes a simplistic interpretation of the values that don't have numerics behind the scenes, thus a numeric value would not be implicitly valid in an instance.) |
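A sketch of how a validator might evaluate such a keyword, written in Python. The `flags` keyword is only a proposal in this thread, and the function name and exact rules below (single value, or a non-empty array of unique values from the set) are assumptions that mirror the existing `type` definition rather than anything specified.

```python
def validate_flags(instance, allowed):
    """Hypothetical 'flags' check: one allowed value, or a unique array of them."""
    if isinstance(instance, list):
        return (
            len(instance) >= 1                                   # like minItems: 1
            and all(isinstance(item, str) and item in allowed
                    for item in instance)                        # each item is a declared value
            and len(set(instance)) == len(instance)              # like uniqueItems: true
        )
    return instance in allowed


simple_types = ["array", "boolean", "integer", "null",
                "number", "object", "string"]

print(validate_flags("string", simple_types))              # True
print(validate_flags(["string", "null"], simple_types))    # True
print(validate_flags(["string", "string"], simple_types))  # False
print(validate_flags([], simple_types))                    # False
```

As in the note above, this treats the values as plain names with no numerics behind them, so a number such as `3` is rejected.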
@gregsdennis Sorry that it took me that long to answer, was really offline during the holidays. :) Your proposal for flags is really simple and addresses one of the issues very well. Nevertheless, I'd still prefer not to keep the numbers totally behind the scenes. As we discussed above, nearly all programming languages are dealing with numbers in this case. We should at least keep the OPTION to have numbers for those who prefer to work with them (or even need to). Looking at your three questions above: Having this option on today's Final note: Having numbers (maybe even just defined implicitly), ordering as mentioned in #21 would be easy. |
@gregsdennis @queequac I'm still following this, but can't shake the feeling that this is very specific to how strongly typed low-level languages work with this sort of data. And I also have a vague feeling that that puts this outside of the standard validation vocabulary. But I don't have a clear argument in support of that, so I want to leave this open for further discussion. Since draft-08 is dealing with a lot of issues around extensibility and what it means to have the different standardized vocabularies, and how additional vocabularies can be added, I'm going to move this issue into the draft-future milestone. I think that once we sort out draft-08, it will become more clear where and how we should draw the line between standard and extension vocabularies. I think it will also provide practical guidance on how to successfully build and distribute an extension vocabulary, which we don't have right now. That makes people reluctant to do so, and therefore people want all of their ideas in the standard vocabulary. Right now, there's no good interoperability story for extensions. So definitely continue discussing ideas, but let's wait until we have a better feel for how this might work with, for instance, a set of strongly typed or low-level-storage keywords as an additional vocabulary. Note: the draft-future milestone doesn't preclude this being addressed in draft-09. The draft-09 milestone is, at this point, just for things defining the scope of draft-09. As we start that draft, we'll move in other things that seem like they'll fit. |
Yes, this is a language-specific thing. Not all languages support enums, and only a subset of those support bitwise operations on them. But then, enums are still included in the spec, so why not support bitwise operations, too? That some languages don't support a feature isn't necessarily grounds for JSON schema to not support that feature. I know for C# it's just syntactic sugar because underlying the enum is an integer. This is how bitwise operations are supported. That said, there is no checking for invalid values. If I can't define an integer value of 7 using bitwise operations on declared values, there's nothing preventing me from casting a 7 to my enum type. It's still valid, both at compile time and run time. (I'm not sure what my point is here. It works both as an argument against [any integer is valid] and for [not all integers should be valid] this feature.) Even considering that, with the current spec:
Regarding names
I think that the way that the
Regarding numerical equivalents
One would have two options:
|
@gregsdennis it's not so much that I'm skeptical of this topic in particular, but that I don't think that we have a good general heuristic for what should and shouldn't go in the primary validation spec. In fact, most of the purely validation proposals that are still open are kind of marginal in some way: They apply to only some languages, or they are shortcuts that may break desirable schema design properties, or they get into very complex and rare scenarios. I feel like we've more or less covered the obvious, reasonably universal things, and now we need to decide how broadly relevant a concept needs to be before it is included in the main validation spec. In order to do that, I think we need to have a good feel for how easy it is to make extended vocabularies and have reasonable expectations of interoperability. Once we know those things, then I think it will be obvious whether this concept belongs in the main validation spec, or whether it is better thought of as the first proposal in some sort of extension vocabulary. I need to think about how to motivate that discussion and decision. But first we need to get through 512-515. |
I don't mind putting this off until post-draft-8. I think that we've nailed down what it is we're looking to support for this feature. I definitely agree with figuring out 512-515 first. |
This seems like a good candidate for an extension keyword. Moving to the vocabularies repository. |
So far enumerations had two drawbacks for me:
Values can be anything. Typically I have a name and a numerical value in most programming languages that support enums as a built-in type. So I have to decide, go with numbers or go with the name.
There seems to be no support for combination of enum values, so I cannot realize flags or a bitwise enumeration. Most likely this is because the value is not defined by a single built-in type (number or string for example). In case of strings a comma or pipe would be good for multiple flags. In case of numbers simply adding the value.
How could this be addressed? Unfortunately I guess this will be possible through a custom type only, if this can be expressed in JSON Schema at the moment at all.
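To illustrate the two representations mentioned above (a comma- or pipe-delimited string of names, and a single number), here is a small Python sketch that converts the former into the latter. The name-to-number table and the function name are hypothetical.

```python
import re

# Hypothetical enumeration: each name maps to a single bit.
CONSTANTS = {"Value1": 1, "Value2": 2, "Value3": 4, "Value4": 8}


def parse_flag_string(text, constants=CONSTANTS):
    """Turn 'Value1, Value3' or 'Value1|Value3' into its numeric value."""
    names = [part.strip() for part in re.split(r"[,|]", text)]
    value = 0
    for name in names:
        if name not in constants:
            raise ValueError(f"unknown flag name: {name!r}")
        # OR instead of plain addition, so a repeated name is not counted twice.
        value |= constants[name]
    return value


print(parse_flag_string("Value1, Value3"))  # 5
print(parse_flag_string("Value1|Value3"))   # 5
```

For distinct single-bit values, OR-ing gives the same result as simply adding the numbers.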