Scim users get distorted between construct, post, get. #1754
also: - no more Arbitrary instances for un-normalized types. - more coherent normalization. - fixes a couple of failing test cases.
I don't remember why I did this, but I think the reason has evaporated. Now it seems quite silly.
failure on concourse:
failure when running tests locally:
I predict both are flakes and won't reproduce for quite a while, but I'm not entirely confident about the latter.
I saw it one more time, then couldn't reproduce it in 20 or so other local runs, or on concourse. Can't think of any connection between this PR and that test. Hm...
normalizeRichInfoAssocListInt = nubOrdOn nubber . filter ((/= mempty) . richFieldValue)
  where
    -- see also: https://github.com/basvandijk/case-insensitive/issues/31
    nubber = Text.toLower . Text.toCaseFold . CI.foldedCase . richFieldType
we might get away with only this:
-    nubber = Text.toLower . Text.toCaseFold . CI.foldedCase . richFieldType
+    nubber = richFieldType
haven't tried.
Yeah, Text.toCaseFold should be exactly the same as CI.foldedCase, since CI wraps a Text here, if I'm not mistaken. And we shouldn't call both toLower and toCaseFold.
Nope, this will trigger that thing that I think is a bug in case-insensitive again: basvandijk/case-insensitive#31 (comment)
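For readers following this thread, here is a minimal sketch of why mixing lower-casing and case folding is risky. It only uses `Data.Text.toLower` and `Data.Text.toCaseFold`. The names `normKey` and `sameKey` are hypothetical and are not part of hscim or this PR. It does not reproduce the linked case-insensitive issue, it only shows that the two functions can disagree.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical single normalizer for keys (an assumption of this sketch,
-- not the PR's code): Unicode case folding only.
normKey :: Text -> Text
normKey = T.toCaseFold

-- | Compare two keys case-insensitively, always through the one normalizer.
sameKey :: Text -> Text -> Bool
sameKey a b = normKey a == normKey b

main :: IO ()
main = do
  -- Lower-casing and case folding can disagree, e.g. on the German sharp s.
  print (T.toLower "Straße", T.toCaseFold "Straße")
  -- Normalizing one side with 'toLower' and the other with 'toCaseFold'
  -- can therefore make keys that should match compare as different.
  print (T.toLower "Straße" == T.toCaseFold "STRASSE")
  -- Using the same normalizer on both sides avoids that mismatch.
  print (sameKey "STRASSE" "Straße")
```

The point of the sketch is the design choice rather than the specific function: whichever normalizer is picked, both sides of every comparison should go through it.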
Looks reasonably clean. I'm not super confident, but it looks like we are now normalising everything at every stage, so it's more likely to be correct (especially if we decide to just ignore locale-specific issues and uncommon unicode characters like Cherokee letters), although I suspect some normalisation steps are unnecessary. I left a few comments below.
@@ -41,6 +41,23 @@ data Schema
   | CustomSchema Text
   deriving (Show, Eq)

+fakeEnumSchema :: [Schema]
Maybe adding a comment that this is just for testing is a good idea.
 -- (FUTUREWORK: The "recursively" part is a bit of a waste and could be dropped, but we would
 -- have to spend more effort in making sure it is always called manually in nested parsers.)
 jsonLower :: Value -> Value
 jsonLower (Object o) = Object . HM.fromList . fmap lowerPair . HM.toList $ o
   where
-    lowerPair (key, val) = (toLower key, jsonLower val)
+    lowerPair (key, val) = (CI.foldCase key, jsonLower val)
It's probably not ideal to use case-folded strings as JSON keys (Unicode recommends to use case-folding only for comparison). Why is this JSON normalisation still needed? I would guess that once all the case comparisons on the haskell side are done correctly, JSON could be generated just using the "original" strings. Am I thinking about this wrong?
The original idea of jsonLower was to run it in json parsers initially, so that the actual parser could rely on everything being lower-case. This was an easy way of working around the fact that json and therefore aeson is strictly case-sensitive.
So morally, this is just lower-casing Values that are about to be deconstructed. But I will double-check and add this to the haddocks.
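To make the "keys only, recursively" behaviour concrete, here is a minimal sketch on a miniature JSON type. `Json` and `lowerKeys` are hypothetical stand-ins for aeson's `Value` and hscim's `jsonLower`, chosen so the example does not depend on a particular aeson version. It uses `Data.Text.toCaseFold` where the PR uses `CI.foldCase`.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map.Strict as Map
import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical miniature JSON type, standing in for aeson's 'Value'.
data Json
  = JObject (Map.Map Text Json)
  | JArray [Json]
  | JString Text
  | JNull
  deriving (Eq, Show)

-- | Case-fold object keys recursively; leave string values untouched.
-- If two keys fold to the same text, 'Map.fromList' keeps the later pair.
lowerKeys :: Json -> Json
lowerKeys (JObject o) = JObject . Map.fromList . fmap foldPair . Map.toList $ o
  where
    foldPair (key, val) = (T.toCaseFold key, lowerKeys val)
lowerKeys (JArray xs) = JArray (lowerKeys <$> xs)
lowerKeys same = same

main :: IO ()
main =
  print . lowerKeys $
    JObject
      ( Map.fromList
          [ ("UserName", JString "Alice"),
            ("Emails", JArray [JObject (Map.fromList [("Value", JString "A@Example.com")])])
          ]
      )
```

A parser that runs after `lowerKeys` can match on folded attribute names only, while string values such as the email address keep their original case.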
I double-checked, and I was right:
git grep -Hn jsonLower
libs/hscim/src/Web/Scim/Capabilities/MetaSchema.hs:84: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Capabilities/MetaSchema.hs:95: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Capabilities/MetaSchema.hs:114: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Class/Group.hs:63: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Class/Group.hs:76: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/AuthenticationScheme.hs:70: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/Common.hs:97:jsonLower :: Value -> Value
libs/hscim/src/Web/Scim/Schema/Common.hs:98:jsonLower (Object o) = Object . HM.fromList . fmap lowerPair . HM.toList $ o
libs/hscim/src/Web/Scim/Schema/Common.hs:100: lowerPair (key, val) = (CI.foldCase key, jsonLower val)
libs/hscim/src/Web/Scim/Schema/Common.hs:101:jsonLower (Array x) = Array (jsonLower <$> x)
libs/hscim/src/Web/Scim/Schema/Common.hs:102:jsonLower same@(String _) = same -- (only object attributes, not all texts in the value side of objects!)
libs/hscim/src/Web/Scim/Schema/Common.hs:103:jsonLower same@(Number _) = same
libs/hscim/src/Web/Scim/Schema/Common.hs:104:jsonLower same@(Bool _) = same
libs/hscim/src/Web/Scim/Schema/Common.hs:105:jsonLower same@Null = same
libs/hscim/src/Web/Scim/Schema/ListResponse.hs:62: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/Meta.hs:70: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/ResourceType.hs:59: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/User/Address.hs:39: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/User/Certificate.hs:32: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/User/Email.hs:47: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/User/IM.hs:32: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/User/Name.hs:39: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/User/Phone.hs:32: parseJSON = genericParseJSON parseOptions . jsonLower
libs/hscim/src/Web/Scim/Schema/User/Photo.hs:32: parseJSON = genericParseJSON parseOptions . jsonLower
libs/wire-api/test/unit/Test/Wire/API/User/RichInfo.hs:171: jsonroundtrip = unsafeParse . Scim.jsonLower . Aeson.toJSON
... aaand the haddocks already pretty much say this much. @pcapriotti if you can think of anything to add, please let me know (i can also do it in a separate PR).
normalizeRichInfoAssocListInt = nubOrdOn nubber . filter ((/= mempty) . richFieldValue)
  where
    -- see also: https://github.com/basvandijk/case-insensitive/issues/31
    nubber = Text.toLower . Text.toCaseFold . CI.foldedCase . richFieldType
Yeah, Text.toCaseFold should be exactly the same as CI.foldedCase, since CI wraps a Text here, if I'm not mistaken. And we shouldn't call both toLower and toCaseFold.
very possible! i wanted to mention that my priority was soundness, not efficiency, but forgot.
This was "needed" because hscim inconsistently used 'Text.toLower', 'CI.foldedCase' etc. throughout the code base, and since they are behaving slightly differently, we had to make sure here to catch them all. Since we have normalized that, we can simplify.
hm...
looks like this doesn't work out of the box?
This reverts commit f368bde.
Case handling in scim is a bit of a mess.

One of our bigger issues was this: we parse RichInfo data in scim from a json schema that contains the key/value pairs both as a json object and, for when an ordering of keys is required, as an assoc array. The parser constructs the union of the unordered and the ordered map. The problem is that jsonLower (a relatively new helper function that attempts to deal with the criminally insane idea that scim json needs to be case insensitive in its object attributes) honours this requirement, but of course does not go through all the string values in richFieldType in the assoc list that correspond to the object attributes in the map. Fixed eg. here (calling the smart constructor that eliminates duplicates rather than the ADT constructor).

To make things more fun, there is this issue with lower-casing in haskell, which we fix here by running all lower-casers in sequence, and overall by using case-insensitive consistently over the various lower-casing functions in text or base.

This PR also introduces a function normalizeLikeStored that normalizes Scim users. It is used a lot in tests, but also to make sure that the spar responses are a bit more "normal".

I also add some new tests, and increase the entropy when creating scim data in /services/spar/test-integration/Util/Scim.hs (because I ran into collisions there).

Sorry, this isn't very coherent. Maybe the changes will make more sense than this attempt at summarizing them? You can skim through the commit history, but I don't recommend reading them in order for a review.
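The union-plus-deduplication step described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `Field`, `mkFields` and `unionFields` are hypothetical names, keys are normalized with `Data.Text.toCaseFold`, and putting the ordered entries first is a choice made for the example. `nubOrdOn` comes from `Data.Containers.ListUtils` in the containers package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Containers.ListUtils (nubOrdOn)
import qualified Data.Map.Strict as Map
import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical field type; the real rich-info field type may differ.
data Field = Field {fieldType :: Text, fieldValue :: Text}
  deriving (Eq, Show)

-- | Smart constructor: drop fields with empty values, then keep the first
-- field for every case-folded key.
mkFields :: [Field] -> [Field]
mkFields = nubOrdOn (T.toCaseFold . fieldType) . filter (not . T.null . fieldValue)

-- | Union of the ordered assoc list and the unordered map. Ordered entries
-- come first, so they win when keys differ only in case (an assumption).
unionFields :: [Field] -> Map.Map Text Text -> [Field]
unionFields ordered unordered =
  mkFields (ordered <> [Field k v | (k, v) <- Map.toList unordered])

main :: IO ()
main =
  print
    ( unionFields
        [Field "Department" "Eng"]
        (Map.fromList [("department", "Eng"), ("room", "42")])
    )
```

Building the list through `mkFields` rather than through the plain constructor is what keeps keys that differ only in case from showing up twice after the union.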
Checklist
changelog.d
.