Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pasting specific texts into vim (WSL) breaks some keys #16655

Closed
csdivad opened this issue Feb 2, 2024 · 8 comments · Fixed by #17738
Closed

Pasting specific texts into vim (WSL) breaks some keys #16655

csdivad opened this issue Feb 2, 2024 · 8 comments · Fixed by #17738
Labels
Area-Input Related to input processing (key presses, mouse, etc.) Area-VT Virtual Terminal sequence support In-PR This issue has a related PR Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Tag-Fix Doesn't match tag requirements Priority-1 A description (P1) Product-Conhost For issues in the Console codebase

Comments

@csdivad
Copy link

csdivad commented Feb 2, 2024

Windows Terminal version

1.20.10303.0 and 1.19.10302.0

Windows build number

10.0.22621.0

Other Software

vim 9.1.31 in WSL
nvim v0.9.5 in WSL

Steps to reproduce

  1. Open vim in WSL using Terminal 1.20.10303.0 or 1.19.10302.0. Version 1.18.10301.0 is unaffected.
  2. Enter insert mode by pressing i.
  3. Paste in the following text with right click or shift+insert: https://gist.githubusercontent.com/csdivad/8c1069a35b52fb806ee8e4484b38f1f3/raw/37616af9daee626df6e7065059de5d5d87d8c9ec/test.txt

This input is a base64 encoded chunk of Lorem Ipsum.
The issue only happens with specific inputs. I encountered it first when pasting another base64 encoded text, but that had sensitive data in it.
Then I played around with dd if=lorem.txt bs=1 count=... | base64 -w 180 until I found a configuration that also triggered the issue.

Expected Behavior

Escape, backspace and Function keys works properly after the paste.
Only the pasted text appears in the editor.

Actual Behavior

You are now stuck in editing mode. Esc, backspace and other keys are now sending weird escape sequences.
After the last character a 1~ appears. Pressing esc inserts ^[, pressing backspace inserts ^?.

@csdivad csdivad added Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Feb 2, 2024
@csdivad csdivad changed the title Pasting specific texts into vim (WSL) sometimes breaks special keys Pasting specific texts into vim (WSL) sometimes breaks some keys Feb 2, 2024
@csdivad csdivad changed the title Pasting specific texts into vim (WSL) sometimes breaks some keys Pasting specific texts into vim (WSL) breaks some keys Feb 2, 2024
@j4james
Copy link
Collaborator

j4james commented Feb 3, 2024

Git bisect suggests that this broke in revision efbfb12. I'm not sure why though.

@csdivad I think you should be able to escape from this state in vim by pressing Ctrl+C.

@j4james
Copy link
Collaborator

j4james commented Feb 3, 2024

I have a little more information now. It looks like the bracketed paste sequences are getting corrupted somehow. The paste starts with the opening sequence (\e[200~), but only ends with part of the closing sequence (1~).

As a result, vim thinks you're still pasting content, so it won't process any command keys unless you manually terminate the paste. Ctrl+C is one way of doing that, but you could also enter the closing bracket sequence manually: Esc [ 2 0 1 ~.

The length of the paste seems to be a factor in the bug, because for every additional character I add to the pasted content, one more character is included in the closing bracket sequence, until it's sending the full sequence. At that point, additional content doesn't seem make any difference.

@j4james
Copy link
Collaborator

j4james commented Feb 3, 2024

I think I know why it's failing now. There's assumedly a 4096 byte buffer somewhere, and the pasted content along with the bracketed paste sequences adds up to just over 4096 bytes. As a result, the closing bracket sequence is cut off, and what we end up receiving in the initial packet is something like this:

\e[200~ ... PASTED CONTENT ... \e[20

When the parser gets to that last escape, we have a heuristic that looks for a single escape, or a two byte Alt+key sequence, but this fits neither of those patterns, so it seems we just drop those characters!

Then when the next packet arrives, we get the remaining characters for the closing bracket (1~), but since the start of the sequence has been lost, it's now just treated as plain text.

The more I think about, the more convinced I am that what we're doing here is completely wrong, but I don't know what the correct solution would be.

@DHowett
Copy link
Member

DHowett commented Feb 5, 2024

if (_isEngineForInput)
{
const auto win32 = _engine->EncounteredWin32InputModeSequence();
if (!win32 && run.size() <= 2 && run.front() == L'\x1b')

I wonder if it's sufficient to always switch into a normal VT state machine mode once we've encountered a Win32 Input Mode sequence, e.g.:

        if (_isEngineForInput && !_engine->EncounteredWin32InputModeSequence())
        {

@zadjii-msft
Copy link
Member

Could we just amend the heuristic to also account for bracketed paste mode? Like, if we've started a bracketed paste, don't ever flush the state machine at the end of a write? Not sure how bad that would be.

@zadjii-msft zadjii-msft added Product-Conhost For issues in the Console codebase Area-Input Related to input processing (key presses, mouse, etc.) Area-VT Virtual Terminal sequence support labels Feb 6, 2024
@j4james
Copy link
Collaborator

j4james commented Feb 6, 2024

Sorry, I think I've been misleading everyone. I've just been looking at this again, and I now realise that my initial understanding was wrong.

When the first packet ends with half an escape sequence (the \e[20), I thought it just dropped those characters, but that's not actually what's happening. They've already been parsed and are now "preserved" in the state machine, which is why it's not in the ground state. When the next packet arrives, it does actually continue parsing the sequence from where it left off, but that's the point when something goes wrong which I'm not sure I understand.

It gets to a point where it decides it should flush the content to the input queue. However, the only thing it can flush is what it has in the current buffer, which is the second half of the sequence. So I think that's how the first half ends up being dropped, and vim sees the second half of the sequence as additional pasted text.

I think it's best I leave this to someone else to figure out, because I seem to be out of my depth here.

@carlos-zamora carlos-zamora added Priority-1 A description (P1) and removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Feb 7, 2024
@csdivad
Copy link
Author

csdivad commented Mar 3, 2024

Any update on this?
The issue is now present in the Store release (1.19.10573.0).

@j4james
Copy link
Collaborator

j4james commented Aug 11, 2024

I've just had this issue come up again in a different context (a VT query response that exceeded 4K), and I think I have a better understanding of what's going wrong now.

As I mentioned above:

It gets to a point where it decides it should flush the content to the input queue. However, the only thing it can flush is what it has in the current buffer, which is the second half of the sequence.

But the state machine already has a process for dealing with this. If you exit the state machine and you're not in the ground state, it can save the current run in the _cachedSequence field. Then if it needs to flush at a later point in time, it can combine the _cachedSequence with the latest run so you won't lose anything.

The problem is that this process is only applied to the output state machine. See here:

// If the engine doesn't require flushing at the end of the string, we
// want to cache the partial sequence in case we have to flush the whole
// thing to the terminal later. There is no need to do this if we've
// reached one of the string processing states, though, since that data
// will be dealt with as soon as it is received.
if (!_cachedSequence)
{
_cachedSequence.emplace(std::wstring{});
}
auto& cachedSequence = *_cachedSequence;
cachedSequence.append(run);

So I think all we need to do is copy that chunk of code up a couple of lines into the _isEngineForInput branch and all will be good.

I've done a basic test with this patch, and it seems to fix my problem as well as the vim pasting bug described in this issue. I don't know if there are any edge cases I'm missing though.

@microsoft-github-policy-service microsoft-github-policy-service bot added the In-PR This issue has a related PR label Aug 18, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs-Tag-Fix Doesn't match tag requirements label Aug 20, 2024
DHowett pushed a commit that referenced this issue Sep 26, 2024
## Summary of the Pull Request

When conhost receives input from a conpty connection, and that input
arrives in a block larger than our 4K buffer, we can end up with a VT
sequence that's split at the buffer boundary. Previously that could
result in the start of the sequence being dropped, and the remaining
characters being interpreted as individual key presses.

This PR attempts to fix the issue by caching the unprocessed characters
from the start of the sequence, and then combining them with the second
half of the sequence when it's later received.

## Validation Steps Performed

I've confirmed that pasting into vim now works correctly with the sample
data from issue #16655. I've also tested with a `DECCTR` report larger
than 4K which would previously have been corrupted, and which now works
as expected.

## PR Checklist
- [x] Closes #16655

(cherry picked from commit 131728b)
Service-Card-Id: PVTI_lADOAF3p4s4AmhmszgSCpCk
Service-Version: 1.21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Input Related to input processing (key presses, mouse, etc.) Area-VT Virtual Terminal sequence support In-PR This issue has a related PR Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Tag-Fix Doesn't match tag requirements Priority-1 A description (P1) Product-Conhost For issues in the Console codebase
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants