-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enum case sensitivity #193
Comments
If we go with the strict deserialization (option 3):
|
I'd lean towards specifying option 3, personally. However, it seems reasonable to have conjure-java continue deviating from that behavior until we can be confident that its current case-conversion behavior isn't being used in practice in the field. |
I'm in favor of option 3 as well -- I don't like complicated things. |
Enum values are not allowed to use spaces per the spec. This change does not add the negative test while we wait on a resolution to palantir/conjure#193
Enum values are not allowed to use spaces per the spec. This change does not add the negative test while we wait on a resolution to palantir/conjure#193
Enum values are not allowed to use spaces per the spec. This change does not add the negative test while we wait on a resolution to palantir/conjure#193
I'm also a fan of simple things -- and believe it's simpler in implementation to say enum values are case-insensitive, that implementations will always naturally produce values in uppercase, than it is to special-case handling in all languages that the deserialized values are uppercase, and may fill an |
Should we differentiate between unknown and unknowable? I would prefer not to deviate handling enum and union key interpretation. It would be odd if one were case senseitive and the other not. Do we have any prior art in the conjure spec for case insensitivity? |
I'd like to close this out and go with option (3), as implemented palantir/conjure-java#143 - we've been shoring up the spec throughout the open-sourcing process and I think this is just one more under-defined topic to iron out. I'm confident that this won't cause lots of disruption for users because TypeScript users have never been able to switch on lowercase enum values, so I don't think they've ever been used. |
Will closing this out also involve updating all implementations to explicitly treat received lowercase values with an explicit, differentiated error from unknown but uppercase values?
|
Yeah, I think it makes sense to limit the unknown variant handling to strings which could actually be valid but unknown variants. That's just an ASCII SCREAMING_SNAKE_CASE check, right? |
The same goes for unknown union variant names too, I suppose. |
Oh yeah - I meant that we should make sure that unknown union variant names are eligible to be union variant names. The check would be for snakeCase for those I believe rather than SCREAMING_SNAKE_CASE. |
lowerCamelCase |
Er yeah, that >_> |
We're already blocked from supporting this by existing typescript clients which require exact matches on both enum values and union type names. |
For context, @markelliot raised concerns in a slack thread that the newly imposed enum strictness imposes an unfair burden on any existing user-facing services we have that are relying on the old lenient behaviour. palantir/conjure-java#232 is a feature flag which would allow existing services to keep the old behaviour around. |
## Before this PR Pre-2.8.0, conjure-java generated very lenient code: an enum defined as `[BLUE, GREEN]` would accept values `blue`, `bLuE` and `bLUE` as BLUE. It would also tolerate unknown values, such as "RED", "", "!@%^£". When looking into tightening up the [Conjure spec](palantir/conjure#193), we decided that all conjure languages should not be required to implement this behaviour. In 2.8.0, we tightened up the validation of conjure-generated enums, I expected this to be safe because conjure-typescript only ever emitted uppercase values. Unfortunately, this caused P0s at some products where bad values had already been persisted into databases, from non-typescript origins. To alleviate some of this pain, we added a [`--useInsensitiveEnums`](#232) flag to bring back the old behaviour. Unfortunately, this would still throw when deserializing strange values like `""`, which have already been persisted to databases. Also, existing products do not know whether they have nonsensical values in their database already, and realistically do not have serialization tests that exercise every enum edge case, so they didn't know whether they needed to opt-in. ## After this PR To avoid causing lots more P0s and keeping feature flags forever, this PR reverts conjure-java's codegen to the old pre-2.8.0 behaviour. The Conjure spec still says only deserialize and serialize UPPERCASE values, but this PR intentionally diverges from the spec in order to avoid causing pain for existing products. We think this will be safe for inter-language communication because a string `"bLuE"` is deserialized as `MyEnum.BLUE` and re-serialized as `"BLUE"` so typescript clients keep working. In the future, we may want to implement a feature where servers can reject all 'unknown' values in incoming requests. cc @diogoholanda @markelliot @cakofony
Motivation
From discussion on palantir/conjure-java#134, it appears that the case sensitivity of Conjure enums is not defined precisely enough in the wire format spec. @markelliot provided context that early adopters may still be relying on the case insensitivity of enums. This needs to be defined precisely so that new languages (e.g. Rust) can be implemented correctly and precise positive and negative test cases can be added to the conjure-verification server.
Given an enum definition:
Should a message of
{"direction": "up"}
be accepted by a Conjure server as equivalent to{"direction": "UP"}
? Also should this lowercase presentation of"up"
be preserved when re-serialized?(Previously discussed #184)
Option 1: lenient deserialization, round-trip preserves case
The enum values
RIGHT
,right
,rIgHt
should all deserialize without error and be equivalent to a canonical enum value ofDirection.RIGHT
. Re-serializing these values should produce the same input strings:RIGHT
,right
andrIgHt
.Clearly this option minimizes the chance of servers rejecting an input. However, it does require all Conjure language implementations to store the original casing information for each enum to enable re-serialization, requiring slightly higher memory overhead for each enum instance. This is not currently supported by conjure-java and would be a significant undertaking to add to conjure-typescript (as typescript enums are currently pure strings and there isn't a convenient way of overriding equality in typescript so that a switch statement would match both
rIgHt
andRIGHT
).Option 2: lenient deserialization, no round-trip
This is conjure-java's behaviour, as of 2.6.2.
Notably conjure-typescript does not support this lenient deserialization as there isn't a serialization layer yet (palantir/conjure-typescript#48). As such, I think we can be sure that no conjure-typescript clients tolerate receiving lowercase enums from a server, even if they may somehow still be sending them.
Option 3: strict deserialization
A request containing
right
would be considered invalid, and would be rejected by the server. Clients could be certain that all enum responses would be uppercase. This minimizes the amount of memory necessary to store enums internally and also makes new language implementations simpler.Implementation for conjure-java: palantir/conjure-java#143
The text was updated successfully, but these errors were encountered: