Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: custom comparators for array, map and struct #3385

Merged
merged 3 commits into from
Sep 27, 2019

Conversation

vpapavas
Copy link
Member

@vpapavas vpapavas commented Sep 18, 2019

Description

Do deep custom equality for array, map or struct. I.e., use our custom comparator for the nested fields in array, map and struct.

If container types (array, map, struct) contain for instance decimal schema, the UdfIndex code that tries to find the correct udf/udaf per parameter type fails as it checks whether the runtime parameter and signature parameter are equal. I happened upon this then when trying to support decimals for the UDAF sum_list.

I changed the code to check whether schemas are deep compatible.

Testing done

Added extra tests in SchemaUtilTest

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@vpapavas vpapavas requested a review from a team as a code owner September 18, 2019 23:48
@ghost
Copy link

ghost commented Sep 18, 2019

@confluentinc It looks like @vpapavas just signed our Contributor License Agreement. 👍

Always at your service,

clabot

@vpapavas vpapavas requested a review from agavra September 18, 2019 23:48
Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

final Field fieldA = fieldsA.get(i);
final Field fieldB = fieldsB.get(i);
if (!fieldA.name().equals(fieldB.name())
|| fieldA.index() != fieldB.index()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the order of fields matters 🤔 - it might be better to go through structA and get by name in structB. We can leave that for a follow-up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I checked is that in Connect, Field equals compares the index of two fields, hence order matters?

@agavra agavra requested a review from a team September 19, 2019 15:33
Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vpapavas

When reviewing this PR it would have been useful to have some context on why you're making the change. Is there some bug you're trying to fix? Or some future functionality this is needed for?

Your changes LGTM on the surface and without knowing the context...

}

private static boolean structEquals(final Schema structA, final Schema structB) {
return structA.fields().isEmpty()
|| structB.fields().isEmpty()
|| Objects.equals(structA.fields(), structB.fields());
|| compareFieldsOfStructs(structA, structB);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not particularly pertinent to your change, but do you know why a two structs are considered equal if one side has no fields?

Maybe this has something to do with the semantics being "are they compatible" and not "are they equal"...?

As above, if that's the case I think we should rename these methods

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good question. I don't know but I will discuss with Almog today and try to find out why.

@@ -348,24 +349,57 @@ public static boolean areCompatible(final Schema arg1, final Schema arg2) {
}

private static boolean mapEquals(final Schema mapA, final Schema mapB) {
return Objects.equals(mapA.keySchema(), mapB.keySchema())
&& Objects.equals(mapA.valueSchema(), mapB.valueSchema());
return areCompatible(mapA.keySchema(), mapB.keySchema())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the change in semantics from equals to compatible I wonder if we should change the name of these methods? eg. mapCompatible

@big-andy-coates
Copy link
Contributor

BTW, do you think this might be related to: #2376???

@vpapavas
Copy link
Member Author

Thank you for your comments @big-andy-coates ! You are right that there was no context, sorry about that. I edited the description.

I don't think it is related to #2376 as the code I changed is only called by UDF/UDAF code (KsqlFunction and UdfIndex). But I can look into it some more if they provide details on how to reproduce the error.

@vpapavas vpapavas merged commit fe63d21 into confluentinc:master Sep 27, 2019
@vpapavas vpapavas deleted the decimal-array branch November 23, 2020 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants