META - UTF-8 and BiDi support for the various languages #19
Comments
#include <stdio.h>

int main(void) {
    printf("ÄÖæxßf®\n");
    return 0;
}

Actual result: Expected result: |
Flags can't contain strange chars, and by strange I mean non-7-bit-ASCII; that's how r_name_filter() works. Not going to fix/change this before the release. Flags need to be rewritten to use sdb.
|
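To illustrate why a string such as "ÄÖæxßf®" cannot survive as a flag name, here is a minimal sketch of a 7-bit ASCII name filter. It is not radare2's r_name_filter(); the function name `ascii_name_filter` and the exact set of allowed characters are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for a flag-name filter (NOT the real
 * r_name_filter()). Keeps ASCII letters, digits, '.' and '_' and replaces
 * every other byte with '_'. Returns the number of replaced bytes. */
size_t ascii_name_filter(char *name) {
    size_t replaced = 0;
    for (char *p = name; *p; p++) {
        unsigned char c = (unsigned char)*p;
        int ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                 (c >= '0' && c <= '9') || c == '.' || c == '_';
        if (!ok) {
            /* Every byte of a multi-byte UTF-8 sequence is >= 0x80,
             * so each one lands here and is replaced separately. */
            *p = '_';
            replaced++;
        }
    }
    return replaced;
}
```

Under these assumptions, "Ä" (two UTF-8 bytes, 0xC3 0x84) becomes two underscores, which is why non-ASCII strings lose their content once they are turned into flag names.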
@radare updated the bug, thanks! |
In Hebrew, using the xterm terminal, the comments are displayed, but the text comes out reversed. This is a general xterm problem: the first character is placed on the left and each following character after it, i.e. LTR instead of RTL. Example given:
As written in the xterm README (https://github.com/joejulian/xterm/blob/master/README.i18n) 👍 After a little research, I am looking into another console that might support RTL languages; it would be more correct to run r2 under that console. |
@holdsworth try mlterm |
@XVilka mlterm doesn't display Hebrew at all; with Konsole it works perfectly fine for some reason. |
please update the checkboxes |
please be more precise |
cc @kazarmy |
CC @queenp |
I've run r2 in a (Linux) Emacs shell and it works fine for RTL and Arabic shaping (visual mode is unusable though). For consoles that don't support RTL, implementing a full-fledged Unicode Bidirectional Algorithm in r2 appears to require humongous and probably-not-worth-it effort but a simple algorithm that reverses Arabic and Hebrew character sequences shouldn't be hard to do. An Arabic shaping algorithm shouldn't be hard to do either. |
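A minimal sketch of the "simple algorithm" mentioned above, assuming valid UTF-8 input and only the basic Hebrew and Arabic blocks (U+0590..U+06FF, all of which encode as 2 bytes). The function names are hypothetical, not existing r2 API. It reverses each run of such characters separately, so word order across spaces is not handled, and combining marks (niqqud, harakat) end up on the wrong side of their base letter; it is nowhere near the Unicode Bidirectional Algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if s points at a 2-byte UTF-8 sequence that encodes a code
 * point in U+0590..U+06FF (Hebrew and Arabic blocks), else 0. */
static int is_rtl_pair(const unsigned char *s) {
    if (s[0] < 0xD6 || s[0] > 0xDB || (s[1] & 0xC0) != 0x80) {
        return 0;
    }
    uint32_t cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    return cp >= 0x0590 && cp <= 0x06FF;
}

/* Reverse, in place, every maximal run of Hebrew/Arabic characters in a
 * NUL-terminated UTF-8 string. All other bytes are left untouched. */
void reverse_rtl_runs(char *str) {
    unsigned char *s = (unsigned char *)str;
    size_t i = 0;
    while (s[i]) {
        if (!is_rtl_pair(s + i)) {
            i++;
            continue;
        }
        size_t start = i;
        while (is_rtl_pair(s + i)) {
            i += 2;
        }
        /* Swap whole 2-byte characters from both ends of the run. */
        size_t lo = start, hi = i - 2;
        while (lo < hi) {
            unsigned char a = s[lo], b = s[lo + 1];
            s[lo] = s[hi];
            s[lo + 1] = s[hi + 1];
            s[hi] = a;
            s[hi + 1] = b;
            lo += 2;
            hi -= 2;
        }
    }
}
```

Calling `reverse_rtl_runs()` on a buffer holding a Hebrew word stored in logical order leaves the surrounding ASCII alone and rewrites the word in visual order, which is what an LTR-only terminal needs in order to display it readably.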
Maybe have an option to work with FriBiDi somehow for bidirectional text. But first, all visual modes should be fixed to work properly with Unicode and RTL chars. |
Yep. Output of:
appears promising ( |
We need a better page for bidirectional text support across terminals, like we did for TrueColor, so it will be easier to push terminal developers and to track the progress. |
radareorg/radare2#9608 is also related |
@Vane11ope can you please try to enable the unicode reflines and stuff in the visual panels code? |
@cyanpencil @Vane11ope @kazarmy please help to review the current state, and what is needed to be done. |
@kazarmy do you by any chance have a binary on which we can test the things you just posted? Or did you test them by inserting a comment in the disasm? |
Try looking at the r2r bins. The test binaries were produced using string literals (C / C++). Btw, there are some UTF utility functions in I see you did:
Thanks! |
merged @kazarmy checkbox with main post |
Unfortunately, I knew that file existed but did not read it thoroughly, and re-implemented function Thanks for the heads up! |
From what I know, shaping and diacritics will require a dependency on ICU, no less. Those are very complex algorithms and depend heavily on the language. |
Actually at least for Arabic, shaping is not that complex. Remember that in real life shaping has to be done manually by kids. |
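As a rough sketch of what basic Arabic shaping involves, the code below picks a contextual form (isolated, final, initial, medial) from the Arabic Presentation Forms-B block based on whether the neighbouring letters join. It works on code points, covers only four letters, and ignores ligatures such as lam-alef and any diacritics; the table layout and the function names are assumptions made for illustration, not an existing r2 or ICU API.

```c
#include <stddef.h>
#include <stdint.h>

/* One row per supported letter. The presentation forms are consecutive:
 * isolated, final (+1), initial (+2), medial (+3). Letters that join only
 * to the preceding letter (dual == 0) have just the first two. */
typedef struct {
    uint32_t base;     /* code point in the Arabic block */
    uint32_t isolated; /* isolated form in Presentation Forms-B */
    int dual;          /* 1 if the letter also joins to the next letter */
} ShapeRow;

static const ShapeRow shape_table[] = {
    { 0x0627, 0xFE8D, 0 }, /* ALEF */
    { 0x0628, 0xFE8F, 1 }, /* BEH  */
    { 0x0644, 0xFEDD, 1 }, /* LAM  */
    { 0x0645, 0xFEE1, 1 }, /* MEEM */
};

static const ShapeRow *find_row(uint32_t cp) {
    for (size_t i = 0; i < sizeof(shape_table) / sizeof(shape_table[0]); i++) {
        if (shape_table[i].base == cp) {
            return &shape_table[i];
        }
    }
    return NULL;
}

/* Shape n code points given in logical order; unknown ones are copied. */
void shape_arabic(const uint32_t *in, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        const ShapeRow *cur = find_row(in[i]);
        if (!cur) {
            out[i] = in[i];
            continue;
        }
        const ShapeRow *prev = i > 0 ? find_row(in[i - 1]) : NULL;
        const ShapeRow *next = i + 1 < n ? find_row(in[i + 1]) : NULL;
        int joins_prev = prev && prev->dual; /* previous letter connects forward */
        int joins_next = cur->dual && next;  /* this letter connects forward */
        if (joins_prev && joins_next) {
            out[i] = cur->isolated + 3; /* medial */
        } else if (joins_next) {
            out[i] = cur->isolated + 2; /* initial */
        } else if (joins_prev) {
            out[i] = cur->isolated + 1; /* final */
        } else {
            out[i] = cur->isolated;
        }
    }
}
```

For the word beh-alef-beh (U+0628 U+0627 U+0628) this yields an initial beh, a final alef, and an isolated beh, because alef never connects to the letter that follows it. A complete table for the whole alphabet is larger but follows the same pattern.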
For a future reference - mintty/mintty#837 |
This library seems also interesting https://github.com/JuliaStrings/utf8proc |
See also radareorg/radare2#12629 |
See https://terminal-wg.pages.freedesktop.org/bidi/ for details/proposal about BiDi |
So it started: now implemented in VTE: https://terminal-wg.pages.freedesktop.org/bidi/implementations.html#vte See also https://gist.github.com/XVilka/a0e49e1c65370ba11c17 |
Note that with the release of GNOME 3.34, BiDi support is available in GNOME Terminal out of the box, which makes testing/implementing it in other programs, such as radare2, much easier. FYI @deepakchethan |
This issue has been moved from radareorg/radare2 to radareorg/ideas as we are trying to clean our backlog and this issue has probably been created a long while ago. This is an effort to help contributors understand what are the actionable items they can work on, prioritize issues better and help users find active/duplicated issues more easily. If this is not an enhancement/improvement/general idea but a bug, feel free to ask for re-transfer to main repo. Thanks for your understanding and contribution with this issue. |
p= commands and Unicode (UTF-8): Support scr.utf8=true in charts output (p=) radare2#14351
p- commands and Unicode (UTF-8): Support Table API and scr.utf8=true in p-h command radare2#14352