-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-15473: Hide duplicate plugins in /connector-plugins #14398
KAFKA-15473: Hide duplicate plugins in /connector-plugins #14398
Conversation
Signed-off-by: Greg Harris <[email protected]>
@yashmayya Could you take a look at this? I'd also appreciate your opinion on whether this warrants blocking the 3.6.0 release, or if this is just a cosmetic improvement that can wait for a future release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gharris1727 thanks for the fix and the great explanation here. The change looks good to me and I think it does qualify as a release blocker because there could definitely exist clients that don't expect duplicate plugins in the GET /connector-plugins
response (and fail to handle it gracefully). I would note that while such clients are potentially unlikely to exist in the wild, we cannot guarantee the non-existence of such clients and should ensure as best as we can that we don't cause any breakages.
Edit: I just saw this comment which states that clients would already need to handle duplicate plugins. However, the duplicates introduced in #14089 (that is being fixed here) would be a lot more in number for most installations of Kafka Connect. I'm now on the fence about whether or not this qualifies as a release blocker.
I was initially concerned with the discrepancy in the implementations of the compareTo
and equals
methods in the PluginDesc
class and wondered if this could lead to other unexpected issues or bugs in the future. However, it looks like the contract for Comparable::compareTo does allow for such a discrepancy:
It is strongly recommended, but not strictly required that (x.compareTo(y)==0) == (x.equals(y)). Generally speaking, any class that implements the Comparable interface and violates this condition should clearly indicate this fact. The recommended language is "Note: this class has a natural ordering that is inconsistent with equals."
Could we add such a note to the PluginDesc
class for posterity?
I'd also like to note that we're using SortedSet<PluginDesc>
in the PluginScanner
and PluginScanResult
classes but Set<PluginDesc>
in the Plugins
class. For instance, PluginScanResult::sinkConnectors
returns a SortedSet<PluginDesc<SinkConnector>>
but Plugins::sinkConnectors
returns a Set<PluginDesc<SinkConnector>>
. From the SortedSet Javadoc:
Note that the ordering maintained by a sorted set (whether or not an explicit comparator is provided) must be consistent with equals if the sorted set is to correctly implement the Set interface. (See the Comparable interface or Comparator interface for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a sorted set performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the sorted set, equal. The behavior of a sorted set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
I think this could definitely lead to at least some confusion if not potential new bugs in the future and I wonder if we should re-visit this.
Thanks @gharris1727 for the quick fix. While the changes are in place, IMO it is ok to not qualify this as a release blocker as duplicates plugins can be seen in other ways as well looks like. But I do agree that with Yash that we might see more number of duplicates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Yes, the number of duplicates is much higher and might actually be noticed now. Previously it was really only the dual converter/headerconverters which caused duplicates, and there would just be pairs of them. Now, if i install every public plugin, some are really duplicated a lot:
|
I don't believe we have this discrepancy, and don't need this caveat. Because PluginDesc::equals is using klass.equals, and PluginDesc::compareTo is using name.compareTo and location.compareTo, they should be consistent. We can change the equals method to use name.equals and location.equals to make it more obvious, but that would be mostly a readability change. (The same class name but loaded from different classloaders produces Class objects where equals is false). |
Ah, good point, I totally missed the implication of the |
@gharris1727 Please reach out to other committers in KafkaConnect area to expedite the review. It will be helpful to review and accept it sooner if we determine it as a blocker for 3.6.0. |
cc @mimaison and @C0urante please see the context for this solution in the ticket: https://issues.apache.org/jira/browse/KAFKA-15473 Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the quick fix Greg!
@satishd I'd like to backport this to 3.6 so that, even if we determine this is not a release blocker, it can be picked up in the event that we generate a new release candidate for 3.6.0. Is that acceptable to you? |
That seems reasonable to me. We can push this into 3.6 branch and make sure a followup 3.6.x release will have the fix. |
Reviewers: Yash Mayya <[email protected]>, Sagar Rao <[email protected]>, Hector Geraldino <[email protected]>, Chris Egerton <[email protected]>
Reviewers: Yash Mayya <[email protected]>, Sagar Rao <[email protected]>, Hector Geraldino <[email protected]>, Chris Egerton <[email protected]>
Reviewers: Yash Mayya <[email protected]>, Sagar Rao <[email protected]>, Hector Geraldino <[email protected]>, Chris Egerton <[email protected]>
Reviewers: Yash Mayya <[email protected]>, Sagar Rao <[email protected]>, Hector Geraldino <[email protected]>, Chris Egerton <[email protected]>
In #14089 the PluginDesc::compareTo method was changed, such that plugins with the same name and version but different classloaders could be given a consistent order. This caused compareTo to return non-zero values where previously it returned zero.
The compareTo method is used by TreeSet to determine uniqueness in lieu of the more typical equals() in other Set implementations. This means that the PluginScanResult now stored multiple copies of a plugin with the same "name", which was beneficial for the connect-plugin-path script where the distinction was valuable. DelegatingClassLoader::computePluginLoaders already disambiguates among multiple PluginDescs for the same plugin name by picking the "last" one, as defined by compareTo.
However, the ConnectorPluginsResource does not disambiguate in the same way, and instead just forwards the list of plugins as PluginInfo objects, which contain only the class, type, and version of the plugin. This means that when a plugin is provided by multiple sources but with the exact same class, type, and version, the
/connector-plugins
endpoint contains confusing duplicates.Instead of showing duplicates without any information to distinguish them, we should hide the trivial duplicates (duplicates where PluginInfo::equals is true). In order to verify this, the ConnectorPluginsResourceTest is changed to assert on the returned List rather than a Set, to detect duplicates.
Committer Checklist (excluded from commit message)