-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"no entry found for key" panic in helix-view #3459
Comments
I believe this is the line in question |
If I remove content of one rust module to delete it and paste into another module, when saving empty module file with |
I can't reproduce that but I'm not sure that I'm following the steps correctly. Can you create an asciinema recording with the crash? |
Made simple example:
|
I just had a bunch of panics back to back, and can confirm that I'm seeing it triggered when I save the file(s). |
Okay this is pretty useful data points! When saving the view id is used to manipulate document history. On |
One more data point: I just experienced the panic (same line number) on |
Most definitely! I've built #2267 locally, I'll test with that for a bit and update the thread. |
I think I can see an issue that might happen with helix/helix-term/src/commands/typed.rs Lines 578 to 592 in dd2b9e5
However, this does not explain helix/helix-term/src/commands/typed.rs Lines 489 to 501 in dd2b9e5
As we can see, it's supposed to block on the async jobs before proceeding to @bcspragu it would be helpful if you could figure out the issue with your backtrace and get it populated with symbols correctly. I'm not sure if this is a platform issue, or if maybe you just need to comment out this line: Line 18 in dd2b9e5
I also can't think of how this could be happening with |
Ah, didn't even occur to me that my stacktrace was useless. I've been building
I just tried taking the same steps, but I ran into a different issue while trying to replicate the original one, getting an error like:
And the helix.log
I'm going to switch back to the main branch in the mean time, with debug symbols + |
Thanks for running my branch! This is useful, I'll take a look through it. |
Legible stack trace, taken at master/HEAD:
The logs from the session were tens of thousands of lines, so I've trimmed to ~the last minute of logs before the crash: helix.log
EDIT: A
|
Ok yeah that stack trace pretty much confirms for me that it's a race: the write-quit command is finishing the whole process of writing the file and closing the view before the auto format callback gets a chance to run, and by the time it does, the view is closed. Thanks for taking the time to help debug! Theoretically, this should be fixed by #2267 because it is serializing the whole process, so the file write and close should not happen until the auto format is done and applied to the in-memory document, but I'll try to figure out what's causing the channel close issue. As a side note, I'm really curious what machine you're running this on. Judging by your earlier logs, I can see there are a lot of interrupts in the tokio tasks by observing all the times that it reawaits the future. |
Nothing too esoteric, a 2021 Lenovo Slim 7 Pro with a Ryzen 4800H, 8C/16T, running Arch Linux. I'm not familiar with tokio/have no idea what would cause interrupts, but system load is fairly low (1-2%) most of the time. |
@bcspragu ok, I think I may have found the problem. Can you pull the latest change on my branch and try again? |
I wasn't able to cause the crash by moving yanked + pasted chunks around between a few different files, which is good! I'm going to use it for a day of work and I'll report back. Recently, I've been seeing crashes every few hours, so if it doesn't crash after a full day, that's a good sign |
Awesome, thanks for testing! |
@dead10ck I'm still getting a panic on the same line (line number has changed in your branch though): stack trace
Nothing interesting in the logs prior to the crash, though I hadn't specified helix.log
|
Hmm, ok, thanks so much for testing and the stack trace. I'll try to dig some more. |
things were smooth till today, I got a big
|
Ok, I'm fairly certain I've narrowed down the cause. It's pretty easily reproducible. I'll post details and hopefully a fix later when I can get to it. But short version is this can happen basically any time you have more than two splits and do a Edit: alternatively, you can turn off auto formatting |
Yup, that matches my experience. Using only |
Okay, so the problem is basically that when we do auto formatting, the code that takes the LSP's response and applies the changes to the document are just getting the currently focused view and giving that to the function, basically always assuming that the document that we're applying the change to is in focus, and not in a background view. helix/helix-term/src/commands.rs Lines 2475 to 2498 in 86a8ea5
This is usually fine for a single view, even if it's a buffer in the background, because it's still the same view and the selection will get updated accordingly for when you switch back to it. But it's obviously a problem for when there are multiple views, because if you don't have the target document in focus, it will ask the document to update the wrong view, hence the crash. This is easily reproducible:
You need 3 or more documents open to hit this because when you split, you actually start with the previous document open in it before opening another, so it is in fact associated with that view. So in using the above example, view 1 is associated with doc 1, view 2 is associated with doc 1 and doc 2, and view 3 is associated with docs 2 and 3. So I suspect that we haven't had any reports of this until now because OP is the first user that makes use of many views as part of their normal workflow. I think the solution is to explicitly pass in the view we want to apply the change to as part of the function arguments. However, this raises a UX issue: when we're doing a Furthermore, part of what the view is used for is to save the old state so that we can undo it if we want: helix/helix-view/src/document.rs Lines 795 to 805 in 86a8ea5
So, our options seem to be:
The latter seems the least desirable. 3 would seem like the best balanced approach, but seems like it could end up with confusing undo history. e.g. if we had two views of a document, and we pick view 1, we'll save the document state with the selection of view 1. So if you go to view 2 and undo the formatting change, you'd end up with the selection you previously had in view 1 in view 2. However, this seems like the best option for the time being, without having to totally redesign the history system all at once in order to fix this crash. @archseer do you care to weigh in? Also, it's worth noting that no matter what we end up picking, I'm fairly certain it will still have to be based on top of #2267 because without it, the race condition is still possible, i.e. we could end up closing the view before we got the chance to apply the auto formatting change to it. |
I opted to just pick the first view of the document we're trying to save. I have a branch up here that's based on top of #2267, and will submit a PR once that is merged. @bcspragu could you please try out this branch when you get a chance? I was no longer able to reproduce the issue locally after this change. |
I think that's a good choice, we do this in a couple of other scenarios too 👍🏻 |
I used your branch for a full work day yesterday, frequently with 5+ splits open (frequently with 2 of them the same large file), using |
Awesome, glad to hear it! Thanks so much for working with me to solve it. |
Actually I was waiting to submit the fix for this because it needed #2267 as well, but I didn't want to just keep piling on the same PR. I'll submit a PR for a fix for this now though :) |
PR here (it's much smaller): #4384 |
Summary
Helix panics (about 1-2 times a day with ~8 hour usage) with the following error:
Error
I think it happens on save, but not every time I save (I'm saving with
:wa
very frequently because of this issue), but it always catches me off guard, so I can never remember exactly what I was doing when it happened.I don't think
GOPACKAGESDRIVER
has anything to do with it, it's just what I use to get Go's LSP working with Bazel. I'm pretty confident I've seen this failure even when not running withGOPACKAGESDRIVER
.Reproduction Steps
Unfortunately, I don't know exactly what triggers it. I usually have lots of splits open, and it seems to be either a
:w
or:wa
command that triggers the failure.Helix log
~/.cache/helix/helix.log
Most of these errors are related to
$PROJECT/.postgres-data/data
, which is owned by some other user (it's running in a Docker container). I believe I've seen this failure in other projects that wouldn't have this.postgres-data
directory, and also when editing non-Go projects (e.g. TypeScript)Platform
Linux
Terminal Emulator
st (4ef0cbd8)
Helix Version
helix 22.05 (8deaebd) (built from HEAD ~4 days ago)
The text was updated successfully, but these errors were encountered: