Report only unique warnings #6372
@@ -172,4 +172,11 @@ spec =

file.delete_if_exists

Test.specify 'should not duplicate warnings' <|
This feels more like an integration test, but at least it demonstrates that we don't generate rubbish anymore.
I think it's good to have such integration tests.
One comment about the test
Oh, I feel bad, like I'm constantly criticising. But I really have doubts whether this is the right solution. Can we discuss it? I think we should deduplicate the warnings when adding/combining them. Otherwise, this only hides the duplication: from the UI perspective it's all good, but the problem stays under the hood. If the warnings duplicate so easily, they can easily use a lot of memory. Why keep the duplicates in memory when we can just run the deduplication on merge instead?
I am glad this change helps the UI, but I am not sure I understand the logic that drives it.
engine/runtime/src/main/java/org/enso/interpreter/runtime/error/Warning.java
My first attempt at this PR had uniqueness introduced at all data structures that export One potential positive side of this PR, apart from being relatively simple, is that we can still make use of duplicate reassigned warnings, if we ever decide to resurrect them for the IDE purposes. It won't be possible if we de-duplicate at the insertion. But first, I also want to add some benchmarks to see how (a large number of) warnings affect simple computations.
Displaying of warnings is done, like for any other structure, by attaching an appropriate visualization. By doing the de-duplication at the point when we return all warnings of the value, we only display the unique ones.
OK, it still hits a stack overflow (SO) on a fold with a vector of 10000 warnings. This needs more work.
Indeed, I also thought that it may be useful at some point. But I'm afraid the performance drain could be too much. And also, at least currently I'm not quite certain we need it that much from Libs perspective - the major goal is to be able to display a warning when there is a risk of data loss/corruption and be sure that the user can track its source. A single path should be enough for most cases I think.
That would be ideal.
...runtime/src/bench/java/org/enso/interpreter/bench/benchmarks/semantic/WarningBenchmarks.java
Warning.get_all result_2 . length . should_equal 1

result_3 = b + b + d
Warning.get_all result_3 . map (x-> x.value.to_text) . should_equal ["Foo!", "Foo!"]
This shows that the warning message isn't important - two warnings can have the same message and they will be treated as different. OK.
Just a note that the warnings don't really have a message but a generic payload which can be any object (in these examples they are just `Text` payloads).
Btw. looking around the
sgtm
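The point made in this thread, that identity and not payload decides whether two warnings are duplicates, can be sketched in plain Java. This is only an illustration: `SketchWarning`, `UniqueWarnings`, and the `sequenceId` field are hypothetical names, not the runtime's `Warning` class, and a `LinkedHashMap` stands in for whatever set the interpreter actually uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a warning: an arbitrary payload plus a unique sequence id. */
record SketchWarning(Object payload, long sequenceId) {}

final class UniqueWarnings {
  private UniqueWarnings() {}

  /** Keeps the first occurrence of every sequence id, preserving encounter order. */
  static List<SketchWarning> unique(List<SketchWarning> warnings) {
    Map<Long, SketchWarning> bySequenceId = new LinkedHashMap<>();
    for (SketchWarning warning : warnings) {
      // The payload is deliberately ignored: only the id decides uniqueness.
      bySequenceId.putIfAbsent(warning.sequenceId(), warning);
    }
    return new ArrayList<>(bySequenceId.values());
  }
}
```

Mirroring the `b + b + d` case from the test: if `b` and `d` carry distinct warnings that both hold the payload `"Foo!"`, `unique(List.of(b, b, d))` keeps two entries, because the repeated `b` collapses while `d` stays.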
a1f9039 to bb80aa9
Took me a while to bring it into an acceptable state and make native image happy. In the end, the change is rather minimal and perf is pretty good. Previously it would run out of heap space or hit a stack overflow pretty easily.
engine/runtime/src/main/java/org/enso/interpreter/runtime/error/WithWarnings.java
  Arrays.sort(arr, Comparator.comparing(Warning::getCreationTime).reversed());
}

/** Converts set to an array behing a truffle boundary. */
- /** Converts set to an array behing a truffle boundary. */
+ /** Converts set to an array behind a truffle boundary. */
The Enso part (tests) looks perfect - that's exactly what we needed and I'm really glad we also got a benchmark.
One small suggestion/question in line.
@@ -172,10 +172,10 @@ Object doWarning(
       }
     }
     arguments[thatArgumentPosition] = that.getValue();
-    ArrayRope<Warning> warnings = that.getReassignedWarnings(this);
+    ArrayRope<Warning> warnings = that.getReassignedWarningsAsRope(this);
At this point, when storing the warnings as sets, what is the point of getting these warnings as a rope?
Won't a simple array suffice? I don't see any reason to keep using `ArrayRope`s in the warning system at this moment.
Not really. Gathering warnings means that we don't know the target size of the array/collection up front. A lot of the methods that create or append to collections/arrays would then have to sit behind a truffle boundary, as they are blacklisted. `ArrayRope<Warning>` is not an ideal data structure, yes. Eventually I would like to get rid of it completely, but that would be a bigger change.
You also have to remember that this is added in generated code, e.g. `ReadArrayElementMethodGen`.
Oh, so we are appending the warnings using the ArrayRope and only at the end merging them into a Set? Ok I think I missed that part. Makes sense now.
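The append-then-merge idea summarised above can be sketched as follows. This is a simplified, hypothetical rope (`SketchRope` is not Enso's `ArrayRope`, and its method names are made up for illustration): appending allocates one node without copying, and the unique set is only built once, when the warnings are finally read.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified rope: either a leaf holding items or a node joining two ropes. */
final class SketchRope<T> {
  private final List<T> leaf; // non-null only for leaves
  private final SketchRope<T> left; // non-null only for join nodes
  private final SketchRope<T> right;

  private SketchRope(List<T> leaf, SketchRope<T> left, SketchRope<T> right) {
    this.leaf = leaf;
    this.left = left;
    this.right = right;
  }

  static <T> SketchRope<T> of(List<T> items) {
    return new SketchRope<>(List.copyOf(items), null, null);
  }

  /** Appending creates a single join node; nothing is copied and no target size is needed. */
  SketchRope<T> append(SketchRope<T> other) {
    return new SketchRope<>(null, this, other);
  }

  /** Flattens the rope once, left to right, dropping duplicates along the way. */
  Set<T> toUniqueSet() {
    Set<T> result = new LinkedHashSet<>();
    // An explicit stack instead of recursion, so deep ropes do not overflow the call stack.
    Deque<SketchRope<T>> pending = new ArrayDeque<>();
    pending.push(this);
    while (!pending.isEmpty()) {
      SketchRope<T> node = pending.pop();
      if (node.leaf != null) {
        result.addAll(node.leaf);
      } else {
        pending.push(node.right);
        pending.push(node.left);
      }
    }
    return result;
  }
}
```

The explicit stack in `toUniqueSet` is a design choice of this sketch only; it is meant to show one way a long chain of appends can be flattened without deep recursion.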
}

@Benchmark
public void sameWarningVecSum() {
What are the benchmark results?
I am interested in seeing the benchmark results. I like the overall changes. I'd like us to move even further, especially by not using `EconomicSet` directly, but rather relying on some Enso specific implementation of hash map/set.
engine/runtime/src/main/java/org/enso/interpreter/runtime/data/Array.java
@Override
public boolean equals(Object a, Object b) {
  if (a instanceof Warning thisObj && b instanceof Warning thatObj) {
    return thisObj.getCreationTime() == thatObj.getCreationTime();
I got confused by the "creation time" in the past and (almost) again right now. Can't we rename to something like `sequenceId`?
}

@CompilerDirectives.TruffleBoundary
private EconomicSet<Warning> createSetFromArray(Warning[] entries) {
I believe we should share the `Map` implementation with "EnsoHashMap" and together move towards a hash map/set that works in PE mode well and doesn't have to hide behind `@TruffleBoundary`. CCing @Akirathan
Wasn't the `EconomicSet` supposed to be PE friendly?
Graal's collections are primarily memory-friendly. But they sometimes do call some blacklisted methods, requiring `@TruffleBoundary` annotations. It's not a bug, it's by design.
> Wasn't the `EconomicSet` supposed to be PE friendly?

The primary usage of `EconomicXyz` is in the Graal compiler. Originally the compiler was using plain `HashMap` or `HashSet`, but when the compiler is running in JVM mode, the usages of `HashXyz` by the compiler and by the user program influenced each other. In particular: if the user program uses `HashXyz` with objects with a "bad" hashing function with too many collisions, the `HashXyz` switches to a slower "defensive mode"; however, that influences all the `HashXyz` usages across the whole JVM. To isolate the compiler from such performance degradation, the `EconomicXyz` implementations were created.
When Thomas wrote them, I don't think he was thinking about PE much.
> I believe we should share the Map implementation with "EnsoHashMap" and together move towards a hash map/set that works in PE mode well and doesn't have to hide behind `@TruffleBoundary`.

Tracked in #5233. I have just added a new task to that issue.
Tracking as a real issue
    Paths.get("../../distribution/component").toFile().getAbsolutePath()
)
.option("engine.MultiTier", "true")
.option("engine.BackgroundCompilation", "true")
For an unknown reason, we set `engine.BackgroundCompilation` to `false` as the default, even in benchmarks; see https://github.com/enso-org/enso/blob/develop/build.sbt#L1029. Good that you override it here, but maybe we should get rid of the default option in build.sbt?
I think @JaroslavTulach made this possible recently in #6335 so maybe that was just forgotten there?
noWarningsVec = createVec.execute(INPUT_VEC_SIZE, elem);
sameWarningVec = createVec.execute(INPUT_VEC_SIZE, elemWithWarning);
It would be nice to randomize the contents of these vectors a bit, for example like in EqualsBenchmarks.generatePrimitiveVector - to be sure that there will be no magical constant fold. But I guess that even without the randomization, this benchmark should give us relevant numbers.
Yes, although later I'm checking the consistency of the end result which now is pretty easy to calculate :)
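The two concerns in this thread, varied input and an end result that is still easy to check, can be combined by seeding the generator. The sketch below is an assumption-laden illustration: `BenchInput`, `randomVector`, and `expectedSum` are hypothetical helpers, not the existing `EqualsBenchmarks.generatePrimitiveVector`, and no benchmark harness is involved.

```java
import java.util.Random;

/** Hypothetical helper for building repeatable, non-constant benchmark input. */
final class BenchInput {
  private BenchInput() {}

  /** Fills a vector with pseudo-random values in [0, 1000) derived from a fixed seed. */
  static long[] randomVector(int size, long seed) {
    Random random = new Random(seed);
    long[] vector = new long[size];
    for (int i = 0; i < size; i++) {
      vector[i] = random.nextInt(1000);
    }
    return vector;
  }

  /** Reference sum computed outside the measured code, used to check the benchmark result. */
  static long expectedSum(long[] vector) {
    long sum = 0;
    for (long value : vector) {
      sum += value;
    }
    return sum;
  }
}
```

Because the seed is fixed, every run sees the same data, so the expected sum can be computed once during setup and compared against whatever the measured code returns.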
engine/runtime/src/main/java/org/enso/interpreter/runtime/data/Array.java
engine/runtime/src/test/java/org/enso/interpreter/test/WarningsTest.java
This change makes sure that reported warnings are unique, based on the value of internal clock tick and ignoring differences in reassignments. The `Warning.get_all` is the main entry point to all requests for values' warnings. So rather than implementing de-duplication all over the place, we just do it in one place. There is a cost involved in deduplication (sorting, copying the array + going over the whole array) but that seems like an acceptable cost.
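The cost described above (copying the array, sorting it, and one pass over the whole array) can be sketched as below. This is an illustration under assumptions, not the PR's code: `TickWarning` and `DedupOnRead` are hypothetical names, and a sort followed by an adjacent-duplicate pass is used here in place of the set-based implementation discussed in the review.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical stand-in for a warning identified by its internal clock tick. */
record TickWarning(Object payload, long creationTime) {}

final class DedupOnRead {
  private DedupOnRead() {}

  /** Returns the warnings newest first, keeping one entry per clock tick. */
  static TickWarning[] uniqueNewestFirst(TickWarning[] all) {
    // Copy first, so the caller's array is left untouched.
    TickWarning[] sorted = Arrays.copyOf(all, all.length);
    // After sorting, warnings with the same tick sit next to each other.
    Arrays.sort(sorted, Comparator.comparingLong(TickWarning::creationTime).reversed());
    int kept = 0;
    for (int i = 0; i < sorted.length; i++) {
      if (kept == 0 || sorted[kept - 1].creationTime() != sorted[i].creationTime()) {
        sorted[kept++] = sorted[i];
      }
    }
    return Arrays.copyOf(sorted, kept);
  }
}
```

Doing this in the single place where all warnings of a value are returned keeps the stored structure untouched, which matches the trade-off argued for in the description: duplicates may still exist internally, but callers only ever see unique warnings.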
Benchmarks:
2678d35 to 74392dd
28362e5 to 5a7ad12
Pull Request Description
This change makes sure that reported warnings are unique, based on the value of internal clock tick and ignoring differences in reassignments.
Before:
After:
On the positive side, no further changes, like in LS, have to be done.
Closes #6257.
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala, Java, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
`./run ide build`.