Enso stopped working properly when creating node during Libs compilation #6841
Comments
Reproduced the issue. Two weird things were spotted:
- File version mismatch, or "oh no, we are desynchronized again, aren't we"? The file is opened properly, and the first file update (with id_map) is successful:
However, the second, which seemingly sets up the metadata, is not:
So during the next update we reopen the text file, and all subsequent applyEdits are successful. I don't know which version is "right", so it's hard to say if it's a problem of engine or IDE.
- Engine does not send any expression updates. Simply, they do not arrive.
I attach the logs from the offending run: |
More notes:
|
I see the following exception in Greg's log:
that doesn't look like a clean shutdown... Moreover, I am not sure who initiated the shutdown. IDE? Why? |
This is fine. Will reduce/remove that in the future to avoid unnecessary confusion. |
I am trying to reproduce with:
but I don't see any problem. Possibly related to the fact that I don't know how to force the "Compiling standard libraries..." message. Just removing IRs: |
Interesting. I see the message every time I open a project. AFAIK the IDE shows the message by default and hides it on the first received "executionComplete" message. |
I added some sleep time between loading of libraries, to simulate the problem locally. I haven't yet been able to reproduce exactly this issue, |
It does indeed only happen on new projects. Very weird. |
@farmaazon So I've added some debugging to figure out where the mismatch is happening.
as |
Hubert Plociniczak reports a new STANDUP for yesterday (2023-05-30): Progress: Trying to reproduce the problem described in the ticket. So far unsuccessful. Closed a few bugs that were likely fixed by GUI/LS changes - no longer able to reproduce them. Various meetings and syncs. It should be finished by 2023-05-31. Next Day: Next day I will be working on the #6841 task. Continue the investigation. |
The difference appears to be in the first section of metadata.
and here is what client expects
Hence different SHAs. I think that's the source of the problem. The IDE never sends any patches for that. |
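To illustrate why a single unsent metadata patch is enough to desynchronize the two sides, here is a minimal sketch. It assumes the file version is a digest of the whole file contents, metadata included. The class name, the `version` helper, the sample file contents and the choice of SHA-256 are all assumptions made for the example (the thread only says "SHAs"); this is not the engine's or the IDE's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class VersionMismatchSketch {
    // Hypothetical helper: the "version" of a file is the hex digest of its full contents.
    // SHA-256 is a stand-in; the digest actually used is not stated in this thread.
    static String version(String contents) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(contents.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Both sides agree only if they hold byte-identical contents.
    static boolean inSync(String serverContents, String clientContents) {
        return version(serverContents).equals(version(clientContents));
    }

    public static void main(String[] args) {
        // Made-up file contents: same code, different first metadata section.
        String code = "main = 1\n";
        String server = code + "#### METADATA ####\n[]\n{}\n";
        String client = code + "#### METADATA ####\n[[0,1]]\n{}\n";
        System.out.println("in sync: " + inSync(server, client));
    }
}
```

Under this assumption the code portion can be identical on both sides and the versions still differ, so every later edit that carries the client's expected version is rejected until the file is reopened.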
And
wasn't sent either to the server. |
I'm sorry, but you probably missed some information in my comment #6841 (comment) (the third point: the version mismatch seems to be an unrelated issue). I already did some investigation and reported it in #6843; however, I just stopped at "the IDE is wrong, we need to check deeply". So your investigation already went deeper, and was referred to in that issue. Thanks for your effort. I thought this task would be mainly about "why engine does not report expression updates", not the synchronization issue. |
There are two tasks really. The first one I was able to reproduce, the second I was not. At least not yet.
If it helps, that's OK. I hacked IDE a bit to include a patch from the |
Re lack of expression updates: |
At the beginning of the execution `EnsureCompiledJob` acquired write compilation lock. When compiling individual modules it would then
- acquire file lock
- acquire read compilation lock

The second one was spurious since it already kept the write lock. This sequence meant however that `CloseFileCmd` or `OpenFileCmd` can lead to a deadlock when requests come in close succession. This is because commands:
- acquire file lock
- acquire read compilation lock

So `EnsureCompiledJob` might have the (write) compilation lock but the commands could have file lock. And the second required lock for either the job or the command could never be acquired. Flipping the order did the trick. Partially solves #6841.
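The lock-ordering idea described above can be sketched as follows. This is a minimal illustration, assuming that the fix amounts to every participant taking the two locks in the same global order (file lock first, then compilation lock); which side was actually flipped in the PR is not stated here. `LockOrderSketch`, `ensureCompiledJob`, `fileCommand` and `runConcurrently` are hypothetical stand-ins, not the engine's classes.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LockOrderSketch {
    // Hypothetical stand-ins for the engine's file lock and compilation lock.
    static final ReentrantLock fileLock = new ReentrantLock();
    static final ReentrantReadWriteLock compilationLock = new ReentrantReadWriteLock();

    // Stand-in for the job: file lock first, then the write compilation lock.
    static void ensureCompiledJob() {
        fileLock.lock();
        try {
            compilationLock.writeLock().lock();
            try {
                // compile modules here
            } finally {
                compilationLock.writeLock().unlock();
            }
        } finally {
            fileLock.unlock();
        }
    }

    // Stand-in for an open/close command: same order, read compilation lock.
    static void fileCommand() {
        fileLock.lock();
        try {
            compilationLock.readLock().lock();
            try {
                // open or close the file here
            } finally {
                compilationLock.readLock().unlock();
            }
        } finally {
            fileLock.unlock();
        }
    }

    // Runs job and command concurrently; returns true if both finish within the timeout.
    static boolean runConcurrently(int rounds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> job = pool.submit(() -> { for (int i = 0; i < rounds; i++) ensureCompiledJob(); });
            Future<?> cmd = pool.submit(() -> { for (int i = 0; i < rounds; i++) fileCommand(); });
            job.get(10, TimeUnit.SECONDS);
            cmd.get(10, TimeUnit.SECONDS);
            return true;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + runConcurrently(10_000));
    }
}
```

With a single acquisition order, neither thread can hold one lock while waiting for the other in the opposite direction, which is the circular wait the description above identifies. If `ensureCompiledJob` took the write compilation lock before the file lock, the same harness could hang.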
Hubert Plociniczak reports a new STANDUP for yesterday (2023-05-31): Progress: Managed to reproduce reliably the problem with desynchronization. Investigated and reported the findings, looks like it is mostly a GUI issue where it does not send the necessary file patches thus resulting in version mismatch. It should be finished by 2023-05-31. Next Day: Next day I will be working on the #6841 task. Investigate why no expression updates appear after re-sync. |
Hubert Plociniczak reports a new 🔴 DELAY for yesterday (2023-06-01): Summary: There is 2 days delay in implementation of the Enso stopped working properly when creating node during Libs compilation (#6841) task. Delay Cause: The ticket has multiple issues, it seems. Fixing one problem unblocks others. |
Hubert Plociniczak reports a new STANDUP for yesterday (2023-06-01): Progress: Investigated backend getting stuck after re-sync on new project startup. Turns out we had a deadlock that could be reproduced reliably. PR is ready. But there is still a problem with expression updates not including information about newly added expressions (everything appears to be Nothing). It should be finished by 2023-06-02. Next Day: Next day I will be working on the #6841 task. Investigate another part of the problem. |
Hubert Plociniczak reports a new STANDUP for the provided date (2023-06-02): Progress: Started investigating the remaining problem with wrong expression updates for #6841 but got distracted and decided to pick it up next week. Instead started looking into #6897. Runtime exception was preventing IDE from receiving any expression updates (including method pointers and calls). PR is ready, although runtime exceptions can still wreak havoc if they appear again. It should be finished by 2023-06-02. Next Day: Next day I will be working on the #6841 task. Go back to #6841, also investigate if we can remove the remaining runtime exceptions. |
At the beginning of the execution `EnsureCompiledJob` acquired write compilation lock. When compiling individual modules it would then
- acquire file lock
- acquire read compilation lock

The second one was spurious since it already kept the write lock. This sequence meant however that `CloseFileCmd` or `OpenFileCmd` can lead to a deadlock when requests come in close succession. This is because commands:
- acquire file lock
- acquire read compilation lock

So `EnsureCompiledJob` might have the (write) compilation lock but the commands could have file lock. And the second required lock for either the job or the command could never be acquired. Flipping the order did the trick. Partially solves #6841.

# Important Notes
For some reason we don't get updates for the newly added node, as illustrated in the screenshot, but that could be related to the close/open action. Will need to dig more.
![Screenshot from 2023-06-01 16-45-17](https://github.com/enso-org/enso/assets/292128/900aa9b3-b2b2-4e4d-93c8-267f92b79352)
Hubert Plociniczak reports a new 🔴 DELAY for yesterday (2023-06-06): Summary: There is 5 days delay in implementation of the Enso stopped working properly when creating node during Libs compilation (#6841) task. Delay Cause: Temporarily moved to other issues, picking the problem again now that deadlock is resolved. The bug is hard to track down. |
Hubert Plociniczak reports a new STANDUP for yesterday (2023-06-06): Progress: Back to investigating lack of expression updates. Confirming that edits are recorded and applied to the source. For some reason the corresponding nodes for new expressions aren't created. Will need to dig further. It should be finished by 2023-06-07. Next Day: Next day I will be working on the #6841 task. Continue the investigation, sync with Dmitry for potential hints. |
Hubert Plociniczak reports a new 🔴 DELAY for yesterday (2023-06-07): Summary: There is 3 days delay in implementation of the Enso stopped working properly when creating node during Libs compilation (#6841) task. Delay Cause: Put wrong delay in the last report. Race condition is proving to be rather hard to identify & fix. |
Hubert Plociniczak reports a new STANDUP for yesterday (2023-06-07): Progress: Identified a race condition between edits and open file cmd which eliminates them. Put up draft PR demonstrating the problem. The problem is that command and job are executed by different schedulers, making the synchronization rather hard. Potentially the order of execution should be sequential. Investigating potential fixes. It should be finished by 2023-06-10. Next Day: Next day I will be working on the #6841 task. Continue the investigation now that problem is identified. |
Hubert Plociniczak reports a new STANDUP for yesterday (2023-06-08): Progress: PR with a fix is ready. runtime commands can now be executed asynchronously, as before, as well as synchronously. Next Day: Next day I will be working on the #6841 task. Address PR review, pick up next task. |
There was an inherent race condition between edit, close & open commands which could not be prevented solely using locks. `EditFileCmd` triggered `EnsureCompiledJob`, which was applying edits collected over time. At the same time, `CloseFileCmd` and `OpenFileCmd` were executed asynchronously and required the compilation lock and the file lock. Additionally, open file was resetting the module's runtime source irrespective of any edits that could already have been applied by the asynchronous execution in `EnsureCompiledJob`. This was visible especially during early manipulation of the project, when open/close was performed due to a bug in IDE (#6843).

Now commands can be run either synchronously or asynchronously. Only that way can we ensure that `close` & `open` commands finish by the time any edits are being applied to the module's sources. Closes #6841.

# Important Notes
In the given video, `"foo"` would be greyed out because it would never be part of the module's (runtime) sources. Therefore no IR or instrumentation would be generated for it, meaning it would never be present in the `expressionUpdates` information necessary for IDE.
[Kazam_screencast_00014.webm](https://github.com/enso-org/enso/assets/292128/226a17b8-729a-415a-803f-003a9695b2f1)
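The race described in this PR text can be sketched in a few lines. This is a simplified model under stated assumptions: commands and jobs run on two separate single-threaded schedulers, an open command resets the module's runtime source to the on-disk contents, and a job appends an edit. `SyncCommandSketch`, `Module`, `openThenEdit` and `editThenLateOpen` are hypothetical names for illustration only; the real engine classes are more involved.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SyncCommandSketch {
    // Hypothetical module state: the runtime's view of the module source.
    static final class Module {
        private String runtimeSource = "";
        synchronized void reset(String onDisk) { runtimeSource = onDisk; }
        synchronized void applyEdit(String appended) { runtimeSource += appended; }
        synchronized String source() { return runtimeSource; }
    }

    // Stand-in for an open command: resets the source, discarding earlier edits.
    static Runnable openFileCmd(Module m, String onDisk) { return () -> m.reset(onDisk); }

    // Stand-in for the job that applies collected edits.
    static Runnable ensureCompiledJob(Module m, String edit) { return () -> m.applyEdit(edit); }

    // Synchronous mode: wait for the command before the job is scheduled elsewhere.
    static String openThenEdit(String onDisk, String edit) {
        ExecutorService commands = Executors.newSingleThreadExecutor();
        ExecutorService jobs = Executors.newSingleThreadExecutor();
        Module m = new Module();
        try {
            commands.submit(openFileCmd(m, onDisk)).get(); // block until open has finished
            jobs.submit(ensureCompiledJob(m, edit)).get();
            return m.source();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            commands.shutdown();
            jobs.shutdown();
        }
    }

    // The bad interleaving that asynchronous commands permit, replayed deterministically.
    static String editThenLateOpen(String onDisk, String edit) {
        Module m = new Module();
        m.reset(onDisk);
        ensureCompiledJob(m, edit).run(); // edit is applied first
        openFileCmd(m, onDisk).run();     // late open wipes the edit
        return m.source();
    }

    public static void main(String[] args) {
        String onDisk = "main = 1\n";
        String edit = "foo = \"foo\"\n";
        System.out.println("sync keeps edit: " + openThenEdit(onDisk, edit).contains("foo"));
        System.out.println("late open keeps edit: " + editThenLateOpen(onDisk, edit).contains("foo"));
    }
}
```

In the second interleaving the edit never reaches the runtime source, which matches the symptom in the notes above: the new expression has nothing behind it to report updates for. Holding locks does not help there, because each step is individually well-formed; only the ordering is wrong.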
Hubert Plociniczak reports a new STANDUP for the provided date (2023-06-09): Progress: More testing of the fix for the race condition, PR merged. Investigated issues with using Scala-generated classes in Java code and the related IDE problems. Looks like the problem could be fixed either on javac or scalac side but both are unlikely in a short term (will file a ticket for the latter). We will likely rewrite Next Day: Next day I will be working on the #6841 task. Continue investigation into visualizations. |
Hubert Plociniczak reports a new 🔴 DELAY for yesterday (2023-06-15): Summary: There is 2 days delay in implementation of the Enso stopped working properly when creating node during Libs compilation (#6841) task. Delay Cause: Discovered some serious flaws in the existing design of job execution that needed to be addressed. |
Please ignore. Wrong ticket. |
Discord username
No response
What type of issue is this?
Intermittent – Occurring irregularly
Is this issue blocking you from using Enso?
Is this a regression?
What issue are you facing?
When I create a node too fast (before Libs are compiled), the IDE loses connection to the Engine.
2305242131_shareX.mp4
Expected behaviour
User shouldn't be allowed to create a node before Libs are compiled.
How we can reproduce it?
No response
Screenshots or screencasts
No response
Logs
2305242131_shareX 20230524-192640-619-enso-project-manager.log
Enso Version
2023.5.24 nightly
Browser or standalone distribution
Standalone distribution (local project)
Browser Version or standalone distribution
standalone
Operating System
Windows
Operating System Version
Win11pro 22H2 22621.1555
Hardware you are using
12th Gen Intel(R) Core(TM) i9-12900HK / RTX3060 Laptop / Nvidia Drivers 531.68