docs(adr): Textual switch to UTF-8 #13163
@@ -51,11 +51,71 @@ This also prevents users signing over any hashed transaction data (fee, transact

We propose to maintain functional tests using bijectivity in the SDK.

-### 2. Only ASCII 32-127 characters allowed
+### 2. UTF-8 characters allowed, but signing devices MAY convert them before display

Here's a relevant thread on why we chose ASCII in the first place: #10701 (comment)
Probably should explain this in a paragraph and not directly in a markdown header?
-Ledger devices have limited character display capabilities, so all strings MUST only contain ASCII characters in the 32-127 range.
+The SIGN_MODE_TEXTUAL specification allows all UTF-8 characters. The textual strings will contain all characters as-is, with the modifications below:
+- the line feed `\n` character (ASCII: 10) is escaped using quotation marks: `"\n"`. This is to disambiguate with the `\n` control character used to signal a screen change on the signing device.
+- the quotation mark character `"` (ASCII: 34) is escaped with a backslash prefix: `\"`. This is to allow bijectivity if the signing device decides to convert UTF-8 characters into its own set of displayable characters, using `"` as a control character.
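
As a rough illustration only, the two rules in this revision could be sketched as follows. `encode_proposed` is a hypothetical helper name, not an SDK function, and the sketch assumes the rules apply character by character to a field value such as a memo.

```python
def encode_proposed(s: str) -> str:
    """Apply the two rules from this revision (hypothetical helper)."""
    out = []
    for ch in s:
        if ch == "\n":
            # Rule 1: wrap the line feed character in quotation marks.
            out.append('"\n"')
        elif ch == '"':
            # Rule 2: prefix the quotation mark with a backslash.
            out.append('\\"')
        else:
            # Everything else is emitted as-is, including the backslash
            # and any other control character.
            out.append(ch)
    return "".join(out)
```

Under these rules a memo of `"foo"` is rendered as `\"foo\"`, while a control character such as bel (0x07) passes through unchanged.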
It's actually the backslash character `\`, not quotation marks, that needs to be quoted in order to legibly and bijectively quote newlines. As written, this allows all other ASCII control characters, e.g. nul, bel, etc., to transmit as themselves despite having no printable representation, even on Unicode-enabled devices.
I strongly recommend following an established standard quotation algorithm rather than trying to invent something new. There are a large number of existing designs that follow the basic pattern of backslash followed by:
- `b`: backspace
- `f`: form feed
- `n`: newline
- `r`: carriage return
- `t`: horizontal tab
- `v`: vertical tab
- `0`: nul
- `\`: backslash
- some means of quoting other control characters by number

These transformations ought to be part of the textual spec, rather than relegated to the Ledger, because they're necessary for bijectivity and accurate legible quoting.
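
A minimal sketch of such a backslash scheme, under stated assumptions: the single-letter escapes come from the list above, while the `\xNN` form for other control characters, the restriction to C0 controls plus DEL, and the names `escape` and `unescape` are illustrative choices rather than anything the spec or the SDK defines.

```python
# Single-letter escapes from the list above.
ESCAPES = {
    "\b": "b", "\f": "f", "\n": "n", "\r": "r",
    "\t": "t", "\v": "v", "\0": "0", "\\": "\\",
}
UNESCAPES = {v: k for k, v in ESCAPES.items()}
HEX_DIGITS = "0123456789abcdef"


def escape(s: str) -> str:
    out = []
    for ch in s:
        if ch in ESCAPES:
            out.append("\\" + ESCAPES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Assumption: remaining C0 controls and DEL are quoted by number.
            out.append(f"\\x{ord(ch):02x}")
        else:
            out.append(ch)  # printable characters, including '"', as-is
    return "".join(out)


def unescape(s: str) -> str:
    out = []
    i = 0
    while i < len(s):
        ch = s[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(s):
            raise ValueError("dangling backslash")
        code = s[i + 1]
        if code in UNESCAPES:
            out.append(UNESCAPES[code])
            i += 2
        elif code == "x":
            digits = s[i + 2 : i + 4]
            if len(digits) != 2 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("malformed \\x escape")
            out.append(chr(int(digits, 16)))
            i += 4
        else:
            raise ValueError(f"unknown escape \\{code}")
    return "".join(out)
```

Because the backslash itself is escaped, `unescape(escape(s)) == s` holds for any input string, which is the bijectivity property the functional tests would check. Quotation marks need no special handling here: a memo of `"foo"` is emitted unchanged, and a memo containing a line feed between `foo` and `bar` is emitted as `foo\nbar`.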
Oh ok. I completely misunderstood you during our Monday call: when you mentioned "quotation", I understood quotation marks, hence this PR. A lot of this needs to be rewritten then.
Memo: foo

// JSON: {"memo": "\"foo\""}
Memo: \"foo\"
Since we don't use quotation marks as metacharacters, we could also just say the following without risk of ambiguity:
Memo: "foo"
Memo: \"foo\"

// JSON: {"memo": "foo\nbar"}
Memo: foo"\n"bar // Where \n is the single line-feed character
Or `foo\nbar`.
Closing in favor of #13434
Description
Closes: #XXXX