-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use new Enso Hash Codes and Comparable #6060
Use new Enso Hash Codes and Comparable #6060
Conversation
90b5d40
to
410cbdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments, but I need to understand the changes more...
@@ -49,3 +53,14 @@ type Day_Of_Week | |||
Day_Of_Week.Thursday -> DayOfWeek.THURSDAY | |||
Day_Of_Week.Friday -> DayOfWeek.FRIDAY | |||
Day_Of_Week.Saturday -> DayOfWeek.SATURDAY | |||
|
|||
## PRIVATE | |||
type Day_Of_Week_Comparator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks exactly as I have envisioned that when dreaming about Comparator
design.
distribution/lib/Standard/Base/0.0.0-dev/src/Network/HTTP/Header.enso
Outdated
Show resolved
Hide resolved
distribution/lib/Standard/Database/0.0.0-dev/src/Data/SQL_Type.enso
Outdated
Show resolved
Hide resolved
if Java_Image.is_equals x.opencv_mat y.opencv_mat then Ordering.Equal else | ||
Nothing | ||
|
||
hash x = x.opencv_mat.hashCode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good. Delegating to Java hashCode
when one knows there is a Java object is good.
import java.util.function.BiFunction; | ||
import java.util.function.Function; | ||
|
||
public class EnsoObjectWrapper implements Comparable<EnsoObjectWrapper> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the class final
. Anyway, I am not yet sure what EnsoObjectWrapper
is good for...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's me put Enso objects into the MulitValueIndex and hence aggregate and distinct on them using the Enso hash codes and equality.
When @Akirathan has time to do a nicer integration can switch to it but now let's all the stuff work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. #5855 was a good start for the integration, I believe.
- custom_comparator: | ||
If `Nothing` will get an ordered comparator from each element. | ||
Otherwise can support a custom fallback compare function. | ||
new : Nothing | (Any -> Any -> Ordering) -> ObjectComparator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removal of this class is probably good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is good indeed. I was always confused by the name.
std-bits/base/src/main/java/org/enso/base/ObjectComparator.java
Outdated
Show resolved
Hide resolved
private static void initCallbacks() { | ||
if (EnsoCompareCallback == null) { | ||
var module = Context.getCurrent().getBindings("enso").invokeMember("get_module", "Standard.Base.Data.Ordering"); | ||
var type = module.invokeMember("get_type", "Comparable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of poking around in Enso doesn't make me particularly happy.
However the only alternative I can offer is to always make sure each call from Enso to Java (that needs this callback) passes the compare_callback
function to Java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't like such a poking in Enso. But I guess it is OK for this PR, as a temporary solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. I am glad that we got rid of Standard.Base.Data.Ordering.Comparator
, as that name always confused me. Great that we will have the ability to aggregate over enso objects in columns, although the performance won't be good, but that we be tackled in time.
std-bits/base/src/main/java/org/enso/base/ObjectComparator.java
Outdated
Show resolved
Hide resolved
std-bits/base/src/main/java/org/enso/base/polyglot/EnsoObjectWrapper.java
Outdated
Show resolved
Hide resolved
- custom_comparator: | ||
If `Nothing` will get an ordered comparator from each element. | ||
Otherwise can support a custom fallback compare function. | ||
new : Nothing | (Any -> Any -> Ordering) -> ObjectComparator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is good indeed. I was always confused by the name.
Also, if you could please add, at least a very simple, benchmark to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good as a workaround to get Enso objects comparisons/hashing supported in the Table library, although it's not what I'd imagine for the longer-term solution.
The logic for folding primitives (added to be consistent with how Enso compares them) is duplicated between the engine and our Table utilities - which may lead to inconsistency - but it does not have to be anymore as we can have this logic unified in common-polyglot-core-utils
. I hope #5259 will be implemented in the not-very-far future to ensure this is consistent.
shifted = case first_day of | ||
Day_Of_Week.Sunday -> day_number | ||
_ -> (day_number + 7 - (first_day.to_integer start_at_zero=True)) % 7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shifted = case first_day of | |
Day_Of_Week.Sunday -> day_number | |
_ -> (day_number + 7 - (first_day.to_integer start_at_zero=True)) % 7 | |
shifted = (day_number + 7 - (first_day.to_integer start_at_zero=True)) % 7 |
btw. won't just that work? I'm not sure if we have to special-case the Sunday given that we do %
later anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stack overflow - to_integer would be called each time,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the culprit is #6065.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right! Okay then let's keep it.
I guess it would have worked if we did:
day_number day = case day of
Day_Of_Week.Sunday -> 0
Day_Of_Week.Monday -> 1
Day_Of_Week.Tuesday -> 2
Day_Of_Week.Wednesday -> 3
Day_Of_Week.Thursday -> 4
Day_Of_Week.Friday -> 5
Day_Of_Week.Saturday -> 6
shifted = ((day_number self) + 7 - (day_number first_day)) % 7
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This was only changed to avoid having ==
in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was just wondering on a side if we could further simplify it, removing the possibility of the infinite loop altogether.
wrapped a b = case a of | ||
Nothing -> if b.is_nothing then Ordering.Equal else if missing_last then Ordering.Greater else Ordering.Less | ||
_ -> case b of | ||
Nothing -> if missing_last then Ordering.Less else Ordering.Greater | ||
_ -> by a b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it could be a separate method that will be useful in other places:
make_comparator_nullsafe missing_last by = a-> b-> ...
public int compare(Object thisValue, Object thatValue) throws ClassCastException { | ||
public int compare(Object thisValue, Object thatValue) { | ||
// NULLs | ||
if (thisValue == null) { | ||
if (thatValue != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, this should also be removed and delegate to some common method in common-polyglot-core-utils
. But ofc ok as a workaround, just - I assume this will be implemented by @Akirathan in #5259 (although I see that this is not scheduled nor assigned to him - do we plan to do this anytime soon @jdunkerley @JaroslavTulach?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code delegates to EnsoObjectWrapper
and that class is using org.graalvm.polyglot.Context
. We cannot/shouldn't use org.graalvm.polyglot.Context
in the engine - e.g. I am afraid that isn't much to share in current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand?
This is just a workaround that should be replaced with a proper solution. The workaround uses a Context, because it is its way to 'access' the current implementation and delegate to it (through a slow Java-to-Enso call).
The whole idea of extracting the shared code is so that the logic for computing the hash for particular primitive types can be exposed in pure Java, so that our library could call into it directly in Java, without the overhead of an Java-to-Enso call. That is what is described in #5259, although maybe I did not describe this well enough. If it's not clear - let's discuss.
std-bits/table/src/main/java/org/enso/table/data/table/Column.java
Outdated
Show resolved
Hide resolved
// We could do a fast-path for some known primitive types, but it doesn't matter as it will be | ||
// replaced with hashing soon anyway. | ||
return equalityFallback.apply(leftValue, rightValue); | ||
return new EnsoObjectWrapper(leftValue).equals(new EnsoObjectWrapper(rightValue)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another place that will need to be addressed by #5259 - especially as here we are lacking the fast-path for primitives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could move foldObject
from MultiValueKey
to something next to ObjectComparator
and then delegate to such helper for equality? Then we would get the fast-path for primitives, which is pretty important.
Although I'm fine if we leave this as-is for now and handle the proper solution in #5259.
Test.specify "until hashing is supported, should throw an error when trying to aggregate a custom object" <| | ||
Test.specify "should be able to create distinct on Enso objects" <| | ||
t = Table.new [["X", [My.Data 1 2, My.Data 3 4, My.Data 1 2]]] | ||
t.distinct . should_fail_with Illegal_Argument | ||
t.distinct ["X"] . at "X" . to_vector . should_equal [My.Data 1 2, My.Data 3 4] | ||
|
||
t2 = Table.new [["X", [Day_Of_Week.Monday, Day_Of_Week.Tuesday, Day_Of_Week.Monday, Day_Of_Week.Monday, Day_Of_Week.Tuesday, Day_Of_Week.Wednesday]]] | ||
t2.distinct ["X"] . at "X" . to_vector . should_equal [Day_Of_Week.Monday, Day_Of_Week.Tuesday, Day_Of_Week.Wednesday] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add tests like this to Table.join
too.
Also, ideally would also test some My_Atom
that has a custom comparator defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just use My
type - there is custom_objects
column here, tested on line 844.
std-bits/table/src/main/java/org/enso/table/data/table/join/scan/MatcherFactory.java
Outdated
Show resolved
Hide resolved
* develop: Layout fixes (#6066) Use new Enso Hash Codes and Comparable (#6060) Search suggestions by static attribute (#6036) Use .node-version for pinning Node.js version (#6057) Fix code generation for suggestion preview (#6054) Implementation of EnsoGL predefined Rectangle shape. (#6033) Tidy up the public module level statics (#6032) Cursor aware Component Browser (#5770)
Pull Request Description
Enables
distinct
,aggregate
andcross_tab
to use the Enso hashing and equality operations.Also, I rewired the way the ObjectComparators are obtained in polyglot code to be more consistent.
Add Comparator for
Day_Of_Week
,Header
,SQL_Type
,Image
andMatrix
.Also, removed the custom
==
from these types as needed. (Closes #5626)Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
./run ide build
.