Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement equals/hashcode for named DocValueFormat inner classes #6357

Merged
merged 1 commit into from
Mar 7, 2023
Merged

Implement equals/hashcode for named DocValueFormat inner classes #6357

merged 1 commit into from
Mar 7, 2023

Conversation

bryanlb
Copy link
Contributor

@bryanlb bryanlb commented Feb 17, 2023

Description

Implements equals and hashCode methods for the named inner class DocValueFormat.DateTime.

Per the contract as described in the Writeable interface, any class implementing equals/hashcode must ensure that a copy made be serializing and deserializing must be equal. As InternalDateHistogram / InternalHistogram implement equals/hashcode and are Writable classes, this requires that DateTime also implement equals and hashcode , as the DateMathParser will otherwise cause it to fail equivalence.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:


@Override
public boolean equals(Object obj) {
if (obj.getClass().equals(this.getClass()) == false) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please point out where the parser participates in equals / hashCode path? We are talking about DateTime that implements DocValueFormat, rigth?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran across this when attempting to compare the equivalence of an InternalDateHistogram instance that had been serialized and deserialized.

The equals method of the InternalDateHistogram fails on the format field (DocValueFormat$DateTime), which contains a JavaDateMathParser instance in the parser field. The other fields on the DateTime class (formatter, timezone, resolution) do successfully equate, due to both JavaDateFormatter and ZoneOffset implementing equals, and DateFieldMapper.Resolution being an enum.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran across this when attempting to compare the equivalence of an InternalDateHistogram instance that had been serialized and deserialized.

This is not in scope of the OpenSearch, right? The parser class is just an utility class and is not supposed to be used for equivalence or anything like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today you cannot serialize an InternalDateHistogram using the provided writeTo method, and re-hydrate that back into an InternalDateHistogram using the InternalAggregations.readFrom and pass equivalence due to this object reference. I would expect an item that had been serialize/deserialized to maintain equivalence, unless this is not something that is intended to be supported?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser is synthesized from formatter: it is not serialized and consequently, not deserialized (there are no reasons to do that, parser is a typical transient property).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is the issue I'm running into. Because it's not serialized and instead is derived from the formatter it breaks the ability to compare instances of any InternalAggregation that has a DateTime.

Perhaps a better approach on this would be to override the equals()/hashcode of DateTime and exclude the parser field, since it is directly derived from the formatter. Since DateTime implements a Writeable it would make more sense that this should be able to maintain equivalence across serialization.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIU the comparison logic you are using seems to be relying on per-field comparison and not the equals (since it would have failed even before this point). In this case, the parser should be excluded or, alternatively, you could rely on toString() comparison. I don't think this is an issue in OpenSearch, there are no requirements / promises / use cases to have such strict equivalence guarantees, especially for utility classes like parsers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a higher-level example would better illustrate the issue:

public void testEquivalence() {
   DateFormatter formatter = DateFormatter.forPattern("epoch_second");
   ZoneOffset zoneOffset = ZoneOffset.ofHours(1);
   Resolution resolution = Resolution.MILLISECONDS;

   DocValueFormat.DateTime dateFormat1 = new DocValueFormat.DateTime(formatter, zoneOffset, resolution);
   DocValueFormat.DateTime dateFormat2 = new DocValueFormat.DateTime(formatter, zoneOffset, resolution);

   assertEquals(dateFormat1, dateFormat2);
}
java.lang.AssertionError: expected: org.opensearch.search.DocValueFormat$DateTime<DocValueFormat.DateTime(format[epoch_second] locale[], +01:00, MILLISECONDS)> but was: org.opensearch.search.DocValueFormat$DateTime<DocValueFormat.DateTime(format[epoch_second] locale[], +01:00, MILLISECONDS)>
Expected :org.opensearch.search.DocValueFormat$DateTime<DocValueFormat.DateTime(format[epoch_second] locale[], +01:00, MILLISECONDS)>
Actual   :org.opensearch.search.DocValueFormat$DateTime<DocValueFormat.DateTime(format[epoch_second] locale[], +01:00, MILLISECONDS)>

I do think after this conversation though, that this is probably better resolved in the DocValueFormat.DateTime class instead of the JavaDateMathParser, otherwise this above example would still fail due to the fact it uses a RoundupParser.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only DocValueFormat.DateTime but DocValueFormat in general

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

Codecov Report

Merging #6357 (fe01144) into main (c772596) will increase coverage by 0.12%.
The diff coverage is 42.85%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@             Coverage Diff              @@
##               main    #6357      +/-   ##
============================================
+ Coverage     70.58%   70.71%   +0.12%     
- Complexity    58841    59041     +200     
============================================
  Files          4799     4799              
  Lines        282421   282428       +7     
  Branches      40717    40720       +3     
============================================
+ Hits         199341   199711     +370     
+ Misses        66584    66315     -269     
+ Partials      16496    16402      -94     
Impacted Files Coverage Δ
...org/opensearch/common/time/JavaDateMathParser.java 96.46% <42.85%> (-3.54%) ⬇️
...java/org/opensearch/client/indices/DataStream.java 0.00% <0.00%> (-76.09%) ⬇️
...adonly/AddIndexBlockClusterStateUpdateRequest.java 0.00% <0.00%> (-75.00%) ⬇️
...readonly/TransportVerifyShardIndexBlockAction.java 9.75% <0.00%> (-73.18%) ⬇️
...rch/test/junit/listeners/ReproduceInfoPrinter.java 9.52% <0.00%> (-68.26%) ⬇️
...a/org/opensearch/client/cluster/ProxyModeInfo.java 0.00% <0.00%> (-60.00%) ⬇️
...n/admin/indices/readonly/AddIndexBlockRequest.java 17.85% <0.00%> (-53.58%) ⬇️
...ch/transport/ReceiveTimeoutTransportException.java 50.00% <0.00%> (-50.00%) ⬇️
...pensearch/indices/breaker/CircuitBreakerStats.java 27.77% <0.00%> (-41.67%) ⬇️
...indices/readonly/TransportAddIndexBlockAction.java 20.68% <0.00%> (-41.38%) ⬇️
... and 537 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@dblock
Copy link
Member

dblock commented Feb 28, 2023

@bryanlb What would you like to do with this?

@bryanlb
Copy link
Contributor Author

bryanlb commented Mar 1, 2023

@dblock I'd still like to see this addressed, with the most recent suggestion from @reta. We can close this PR out and if there's interest in fixing this, I can followup with a PR that introduces those equals()/hashcode() fixes on the DocValueFormat class.

@andrross
Copy link
Member

andrross commented Mar 1, 2023

I ran across this when attempting to compare the equivalence of an InternalDateHistogram instance that had been serialized and deserialized.

@bryanlb Can you share more details about what you're trying to do at a higher level that led you to need to do this?

@bryanlb
Copy link
Contributor Author

bryanlb commented Mar 1, 2023

@andrross sure! I have a slightly unique case where I'm importing the server package as a dependency, and then using the AggregationBuilder classes to create Lucene collectors. I've had no problems with this approach - consuming the collectors, or utilizing the returned InternalAggregation classes.

I only ran into this issue when attempting to verify that serializing an InternalAggregation and deserializing it resulted in an equivalent object. While the Writeable interface notes that implementers aren't required to implement equals/hashcode, it does state that "If the implementer also implements equals and hashCode then a copy made by serializing and deserializing must be equal and have the same hashCode". In the case of InternalDateHistogram this class does implement both an equals and hashcode, but does not maintain this stated contract.

@andrross
Copy link
Member

andrross commented Mar 1, 2023

@bryanlb So are you using the OpenSearch server as a library in a separate (non-OpenSearch) application? It's always interesting to hear unique use cases :) I will caution that there are no stability guarantees with these internal classes so you may frequently encounter breaking changes.

That's a fair point about not correctly implementing the contract defined by the Writeable interface. I'm not sure about implementing equals/hashcode in DocValuesFormat (though I am no expert in this area). A possible alternative: the only information InternalDateHistogram sends over the wire for its DocValuesFormat field is it's name (DocValuesFormat is a NamedWriteable), would it be fair to use format.getWriteableName() in the equals/hashcode implementation of InternalDateHistogram? However, if it is fair to always uniquely identify a DocValuesFormat by its name, then this can be implemented within the DocValuesFormat implementations themselves.

@reta
Copy link
Collaborator

reta commented Mar 3, 2023

That's a fair point about not correctly implementing the contract defined by the Writeable interface. I'm not sure about implementing equals/hashcode in DocValuesFormat (though I am no expert in this area).

@andrross Just to give some context on the issue at large, the InternalDateHistogram and others like InternalHistogram do implement the equals / hashCode using internal state. That basically assumes that internal components do implement the equals / hashCode as well, which in case of InternalDateHistogram means equality for DocValueFormat:

    ...
    private final DocValueFormat format;
    ...

It think this is indeed a problem (may be uncovered in unusual way by @bryanlb) but we need to fix.

@andrross
Copy link
Member

andrross commented Mar 3, 2023

It think this is indeed a problem (may be uncovered in unusual way by @bryanlb) but we need to fix.

@reta Got it. Seems like we should override equals/hashcode in each DocValueFormat concrete class. Should be pretty straightforward.

@bryanlb
Copy link
Contributor Author

bryanlb commented Mar 3, 2023

@andrross @reta I can get an update done for those changes next week. Do we want to keep it tied to this PR or would you prefer a new one that references this?

@andrross
Copy link
Member

andrross commented Mar 3, 2023

@bryanlb Either way is fine. Just make sure you update the title and commit message if you continue to use this PR.

@bryanlb bryanlb changed the title Override equals & hashCode for JavaDateMathParser Implement equals/hashcode for named DocValueFormat inner classes Mar 6, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2023

Gradle Check (Jenkins) Run Completed with:

@reta
Copy link
Collaborator

reta commented Mar 6, 2023

@bryanlb LGTM, could you please add the entry to CHANGELOG? thank you

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2023

Gradle Check (Jenkins) Run Completed with:

@andrross andrross added the backport 2.x Backport to 2.x branch label Mar 7, 2023
@andrross andrross merged commit 5a7fd4f into opensearch-project:main Mar 7, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-6357-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5a7fd4f4c0efc1f300b05a250f484517426fb2e4
# Push it to GitHub
git push --set-upstream origin backport/backport-6357-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-6357-to-2.x.

@kotwanikunal kotwanikunal added backport 2.x Backport to 2.x branch and removed backport 2.x Backport to 2.x branch labels Mar 8, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-6357-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5a7fd4f4c0efc1f300b05a250f484517426fb2e4
# Push it to GitHub
git push --set-upstream origin backport/backport-6357-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-6357-to-2.x.

kotwanikunal pushed a commit to kotwanikunal/OpenSearch that referenced this pull request Mar 8, 2023
@kotwanikunal
Copy link
Member

Backport: #6584

kotwanikunal pushed a commit to kotwanikunal/OpenSearch that referenced this pull request Mar 8, 2023
…nsearch-project#6357)

Signed-off-by: Bryan Burkholder <[email protected]>
(cherry picked from commit 5a7fd4f)
Signed-off-by: Kunal Kotwani <[email protected]>
kotwanikunal added a commit that referenced this pull request Mar 8, 2023
* Implement equals/hashcode for named DocValueFormat inner classes (#6357)

Signed-off-by: Bryan Burkholder <[email protected]>
(cherry picked from commit 5a7fd4f)
Signed-off-by: Kunal Kotwani <[email protected]>

* Fix compiler warning and base blob path case (#6558)

Signed-off-by: Samuel Herman <[email protected]>
Co-authored-by: Samuel Herman <[email protected]>
(cherry picked from commit 8c0e5d0)
Signed-off-by: Kunal Kotwani <[email protected]>

---------

Signed-off-by: Bryan Burkholder <[email protected]>
Signed-off-by: Kunal Kotwani <[email protected]>
Signed-off-by: Samuel Herman <[email protected]>
Co-authored-by: Bryan Burkholder <[email protected]>
Co-authored-by: samuel-oci <[email protected]>
Co-authored-by: Samuel Herman <[email protected]>
mingshl pushed a commit to mingshl/OpenSearch-Mingshl that referenced this pull request Mar 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants