Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Update UUID_REGEX #183

Merged
merged 2 commits into from
May 30, 2024

Conversation

kevin-glare
Copy link

@kevin-glare kevin-glare commented May 21, 2024

The current format (/\A\h{8}-\h{4}-\h{4}-[89AB]\h{3}-\h{12}\z/i) supports only the UUID variant DCE Security (89AB).
So I've updated the regular expression to support more uuid variants.
I also noticed that in the tests there is an option using such a regular expression.

Copy link
Owner

@davishmcclurg davishmcclurg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally went with this variant because the json schema spec says:

uuid:
A string instance is valid against this attribute if it is a valid string representation of a UUID, according to [RFC4122].

And the variant described in RFC4122 matches [89AB], but I'm ok with loosening the restriction here.

I'm curious: what UUIDs are you seeing that don't match?

@@ -74,7 +74,7 @@ module Format
IP_REGEX = /\A[\h:.]+\z/.freeze
INVALID_QUERY_REGEX = /\s/.freeze
IRI_ESCAPE_REGEX = /[^[:ascii:]]/
UUID_REGEX = /\A\h{8}-\h{4}-\h{4}-[89AB]\h{3}-\h{12}\z/i
UUID_REGEX = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should leave this using \A \h and \z. Should just need to change the {3} to {4} and remove [89AB].

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much! I have already fixed this to /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/i.

The regular expression UUID_REGEX =/\A\h{8}-\h{4}-\h{4}-[89 AB]\h{3}-\h{12}\z/i will not allow the following valid UUIDs to be checked:
• UUID version 1 with odd ninth octets and a sixth octet not equal to 4 or 5.

For example, the following valid UUID will not pass verification:
• d5b7f84c-1b49-11ef-9262-0242ac120002

This is because the regular expression checks whether the ninth octet is one of 89 AB, and in this UUID, the ninth octet is 2.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remembered another example - Nil UUID
00000000-0000-0000-0000-000000000000

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, the following valid UUID will not pass verification: • d5b7f84c-1b49-11ef-9262-0242ac120002

This is because the regular expression checks whether the ninth octet is one of 89 AB, and in this UUID, the ninth octet is 2.

It looks like that UUID validates ok:

repos/json_schemer main % bin/console --simple-prompt
>> schema = JSONSchemer.schema({ 'format' => 'uuid' })
=> #<JSONSchemer::Schema @value={"format"=>"uuid"} @parent=nil @keyword=nil>
>> schema.valid?('d5b7f84c-1b49-11ef-9262-0242ac120002')
=> true
>> schema.valid?('invalid')
=> false

The [89AB] character is a 9 in that example.

I remembered another example - Nil UUID
00000000-0000-0000-0000-000000000000

Nil UUID is handled separately:

NIL_UUID = '00000000-0000-0000-0000-000000000000'

UUID_REGEX.match?(data) || NIL_UUID == data

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for the confusion. I spent a lot of time, but I could not figure out 100% why this uuid dff55168-b012-11e8-3c87-fa163eb33c9d does not pass validation.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dff55168-b012-11e8-3c87-fa163eb33c9d doesn't pass because it has a 3 where an 8, 9, A or B are currently expected ([89AB]). Do you know where that UUID came from?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uuid came to me from an external system. [89AB] is a prerequisite for all versions and variants of uuid? I'm trying to find a different uuid from this condition but I can't do it(
Therefore, I cannot guarantee that this uuid is correct. But in postgresql it is considered valid.

@davishmcclurg davishmcclurg merged commit 414ca1e into davishmcclurg:main May 30, 2024
26 checks passed
@davishmcclurg
Copy link
Owner

Thanks @kevin-glare! I just released as this in 2.3.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants