Norwegian characters trick highlighting #28

Closed
Osse opened this issue Jun 12, 2011 · 11 comments

Comments

@Osse

Osse commented Jun 12, 2011

If a word in the text starts with æ, ø or å the highlighting is disturbed from the next word on. The highlighted character is the second or third character in the word. Some words aren't highlighted at all.

See the following picture. The last sentence of the second paragraph contains some Norwegian gibberish.
[Image: Without EasyMotion]

In the following picture <Leader>w is started.
[Image: Without EasyMotion]
From the first word after "øl", the highlighting is incorrect. E.g. the 'g' is highlighted instead of the 'v' in 'og'. 'øser' isn't highlighted at all, nor is the word changed to show the target key. But if I press 'w', which is the correct target key, the cursor moves to the first character of the word.

After 'øser', 'og' and 'ærlige' are correctly changed to show the target key but the highlighting is wrong. On the word 'mennesker' the target key is put on the second character instead of the first and the highlighting is moved to the third character. Again, pressing 'z' takes me to the first character of the word. From the next line on, everything is correct.

Long story short: The movements are always correct; the highlighting is disturbed, but only if a word starts with æ, ø or å. I am guessing there is a regex somewhere that messes things up. I'm not very good with Vimscript. I'll try to take a look at it when I have the time.

PS: The same bug also happens with ä, ö, é and è.

@mathstuf
Contributor

Seems to be that word locations are found and stored as byte offsets. When easymotion turns ø into u, the bytes are now different, hence the offsetting.
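
A minimal sketch of the effect described above, written in Python rather than Vimscript and not taken from EasyMotion's source. It assumes a UTF-8 buffer, a made-up sample line, and a hypothetical helper `byte_offsets` that records where each word starts in bytes. Replacing the two-byte `ø` with a one-byte target key moves every later word on the line one byte to the left, so offsets recorded before the replacement point one position too far to the right.

```python
def byte_offsets(line, encoding="utf-8"):
    """Return the 0-based byte offset at which each space-separated word starts."""
    offsets = []
    pos = 0
    for word in line.split(" "):
        if word:
            offsets.append(pos)
        # Advance by the encoded size of the word plus the separating space.
        pos += len(word.encode(encoding)) + 1
    return offsets


original = "øl og vin"  # illustrative sample line, not from the screenshots
replaced = "ul og vin"  # 'ø' overwritten by a hypothetical one-byte target key 'u'

before = byte_offsets(original)  # offsets stored before the replacement
after = byte_offsets(replaced)   # where the words really are afterwards
shift = [b - a for b, a in zip(before, after)]
```

Here `shift` is zero for the first word and one for every word after it, which matches the symptom of the highlight landing one character late.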

@Lokaltog
Member

I thought I fixed this issue quite a while ago. It was a bug with the way vim handles regexps and international characters, but I managed to work around it. Let me look into it. :)

P.S. Nice examples of "Norwegian gibberish" :P

@Lokaltog
Member

Ok, so it looks like this issue is caused when international characters are replaced with the jump target char. It seems this issue isn't affected by other international characters (which was the issue I fixed earlier).

@Osse
Author

Osse commented Jun 17, 2011

Which other international character did you fix it for? I'm wondering since I also experienced it with é and è. I think mathstuf is right about the multibyte issue. If I use latin-1 I don't have any issues. If I use utf-8 I have issues for all the characters I mentioned initially, which are all multibyte.

Thanks for the praise of my Norwegian :P
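
The latin-1 versus utf-8 observation can be checked outside Vim. The snippet below is only an illustration in Python: it encodes the characters named in this thread and counts the bytes each one needs. The variable names are made up for the example.

```python
# Characters reported in this issue, written as escapes so the encoding
# of this snippet itself does not matter: æ ø å ä ö é è
chars = "\u00e6\u00f8\u00e5\u00e4\u00f6\u00e9\u00e8"

# Bytes needed per character in each encoding.
latin1_sizes = [len(c.encode("latin-1")) for c in chars]
utf8_sizes = [len(c.encode("utf-8")) for c in chars]
```

Every entry in `latin1_sizes` is 1 and every entry in `utf8_sizes` is 2, so byte columns and character columns only diverge in the UTF-8 case.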

@mathstuf
Contributor

On Fri, Jun 17, 2011 at 10:51:34 -0700, Osse wrote:

> Which other international character did you fix it for? I'm wondering
> since I also experienced it with é and è. I think mathstuf is right
> about the multibyte issue. If I use latin-1 I don't have any issues.
> If I use utf-8 I have issues for all the characters I mentioned
> initially, which are all multibyte.

The earlier fix covered multibyte characters that stay in the text.
This issue occurs when the multibyte char is replaced with the one-byte
ASCII value for the target marker. Similar issues would, I imagine,
occur with multibyte target chars in an ASCII file.

@Lokaltog
Member

3df6a76 is an old, incomplete fix, but it partially solves a related issue with unicode chars. The problem seems to be that unicode chars (e.g. é, æ, ø, å) can occupy more than one byte, but they only occupy one screen column. The fix basically counts the number of bytes every character occupies and compensates for this. The best solution would be if vim provided a function to highlight a screen column and not a "byte column".
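
A rough sketch of the "count the bytes and compensate" idea, in Python and not based on the actual code in 3df6a76. The function name `byte_col_to_char_col` is hypothetical. It assumes 1-based columns, a UTF-8 line, and a byte column that falls on a character boundary (otherwise the decode step raises an error).

```python
def byte_col_to_char_col(line, byte_col, encoding="utf-8"):
    """Convert a 1-based byte column into a 1-based character column.

    Assumes byte_col points at the first byte of a character.
    """
    # Take the bytes that precede the column and count how many
    # characters they decode to.
    prefix = line.encode(encoding)[: byte_col - 1]
    return len(prefix.decode(encoding)) + 1


line = "øl og vin"  # illustrative sample line
og_char_col = byte_col_to_char_col(line, 5)   # 'og' starts at byte column 5
vin_char_col = byte_col_to_char_col(line, 8)  # 'vin' starts at byte column 8
```

For this line, each word after `øl` has a character column one lower than its byte column, because `ø` takes two bytes but one character. This covers only the byte-versus-character mismatch; characters wider than one screen cell need a separate width calculation.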

@mathstuf
Contributor

On Fri, Jun 17, 2011 at 11:20:12 -0700, Lokaltog wrote:

> The best solution would be if vim provided a function to highlight a
> screen column and not a "byte column".

Try :help /\%v. That might be what is really wanted. Just need a
function to convert from byte column to visual column.

@Lokaltog
Member

I've tried %v before, but it matches the virtual column, which doesn't seem to work correctly.

@mathstuf
Contributor

On Fri, Jun 17, 2011 at 12:15:22 -0700, Lokaltog wrote:

> I've tried %v before, but it matches the virtual column, which
> doesn't seem to work correctly.

Right. The problem is that searchpos() returns a byte column. If there
is a function to get the vcolumn from the byte column, that'd be the
solution.

There is then, of course, the problem of characters that take up n
vcolumns being replaced with characters that take up a different
number of vcolumns (tabulators and Asian languages for the most part). I
can think of two solutions for this. The first is to pad the result to
the vwidth of the replaced character (lots of fun with tabulators...).
The second is to search again for the next character to be replaced
after each replacement, to properly handle byte and vcolumn changes due
to the replacement. Or we could, of course, ignore the problem and
say that characters wider than one vcolumn are not supported at all, and
that tabulators are probably a dumb thing to jump to using EasyMotion
anyway (throw an error if it is the character given).

I don't think the problem can be completely ignored since any of these
multi-width characters (byte or vcolumn) can pop up on a 't' or 'T'
jump.
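
To make the vcolumn discussion concrete, here is a simplified Python sketch of how a screen column could be computed and how the first suggestion (padding) might look. None of this is EasyMotion code or a Vim API: `display_width`, `virtual_col`, and `padded_target` are hypothetical names. The sketch assumes a tab stop of 8, treats East Asian wide and fullwidth characters as two cells, and ignores combining characters.

```python
import unicodedata


def display_width(ch, col, tabstop=8):
    """Screen cells taken by ch when it starts at 0-based screen column col."""
    if ch == "\t":
        # A tab extends to the next multiple of tabstop.
        return tabstop - (col % tabstop)
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def virtual_col(line, char_index, tabstop=8):
    """1-based screen column where the character at char_index starts."""
    col = 0
    for ch in line[:char_index]:
        col += display_width(ch, col, tabstop)
    return col + 1


def padded_target(target, replaced, col, tabstop=8):
    """Pad a one-cell target key so it fills the cells that `replaced` used."""
    return target + " " * (display_width(replaced, col, tabstop) - 1)
```

With padding, a target key that replaces a wide character or a tab keeps the rest of the line in the same screen columns, at the cost of writing extra spaces into the displayed text. The second suggestion, searching again after each replacement, avoids the padding but needs the positions to be recomputed every time.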

@Lokaltog
Member

I think I managed to solve this issue in 3cd718f, please let me know if any of you experience a similar issue again (note that the fix is on the develop branch).

@Osse
Author

Osse commented Jun 18, 2011

After I found out that the default leader key had changed it seems to work for me! :P Thanks! :)

By the way, I really love that EasyMotion now also can jump to searches! Brilliant!
