Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarifying the legibility of binary values in text memo #2310

Closed
MisterTicot opened this issue Oct 10, 2019 · 5 comments
Closed

Clarifying the legibility of binary values in text memo #2310

MisterTicot opened this issue Oct 10, 2019 · 5 comments
Labels
docs issue/PR that affects documentation

Comments

@MisterTicot
Copy link

Documentation from various places, such as there states that MEMO_TEXT must be a string encoded in ASCII or UTF-8.

In practice, this is possible to use any arbitrary (=binary) value. For example, I just validated a transaction with base64 memo ABCD: here.

One can make the case that an ASCII/UTF-8 strings can be made of any binary data anyway, but that's not what most reader assume when reading that the value must be an ASCII/UTF-8 string.

I suggest that the documentation gets updated to better match the network behavior, or alternatively that the desired behavior gets enforced by the network.

Meanwhile, please let me know if this is intended, as there's a discussion going on about whether or not Trezor should allow for binary data in MEMO_TEXT: trezor/trezor-firmware#610

@MisterTicot MisterTicot added the docs issue/PR that affects documentation label Oct 10, 2019
@MonsieurNicolas
Copy link
Contributor

You are correct, the payload for MEMO_TEXT can be anything, I think it was called TEXT as it's the main use case, but in reality at the core layer it's really a "variable size buffer", the same way you can stuff any character (including \0) in std::string.

@MisterTicot
Copy link
Author

Thank you for the quick reply.

@prusnak
Copy link

prusnak commented Oct 12, 2019

@MonsieurNicolas How hard would it be to introduce another type MEMO_BINARY which would cleanly indicate there are non-printable characters present in the payload. They way MEMO_TEXT is misused for this is just awful. Having another type would be perfect for distinguishing this (especially when you have types for ID, HASH, etc. so it seems there is a will to distinguish the fields).

@MisterTicot
Copy link
Author

@prusnak

I can't answer for Stellar teams, but it seems like a backward-compatibility breaking change. So far, my observation is that Stellar devs have been avoiding this like plague, which is the usual practice in blockchain environments.

Of course, in term of classical software development, it is a reasonable change. In fact, in Cosmic.link, I use a virtual "base64" (binary) type.

To answer your question, I think the issues could be:

  1. Introducing additional validation logic into core. Not the end of the world, but checking for printable character is not exactly light either.
  2. Breaking ecosystem logic that use binary MEMO_TEXT. This is bad, but fixable. The more annoying issue is:
  3. Introducing an inconsistency into ledger history. Basically, services that parse the history for whatever reason will need to be aware that:
    • Before ledger N, MEMO_TEXT can be either printable characters or binary value, and MEMO_BINARY doesn't exists.
    • After ledger N, MEMO_TEXT can only be made of printable characters and binary values goes into MEMO_BINARY.
    • If parsing memos related to a particular service, the logic must be aware of how that service handled the change.

Finally, after all those troubles, you could still fail to guarantee that your string is universally printable because:

  • UTF-8 is so big that most systems can print only a subset of it - and you don't know which beforehand.
  • UTF-8 has characters that are printable only in combinations with other. Something that is even more annoying to check for.

So really, you would have printable strings only on the paper.

I think this is more reasonable to implement an "unclean" fix into Trezor right now than asking Stellar to go through all that trouble.

What you could do is handle the MEMO_TEXT type the same way you handle manageData value parameter. Additionally, you could use a simple check to decide if the value is to be printed as characters or as binary - I do it cheaply this way. This is not fail-safe, but has a very high probability of success (I'd say > 99.99%).

@MonsieurNicolas
Copy link
Contributor

@MisterTicot is right on point. It doesn't seem to really improve what SDKs have to do.
The simplest thing to do would be to rename "TEXT" into "BINARY" (which doesn't break on the wire compat) and add a new "TEXT" type that is super restrictive like alphanum+space but that seems maybe too much, and SDKs still have to deal with heuristics to print "BINARY".
If you guys want to discuss further, I'd recommend the dev mailing list as we do not discuss protocol changes in this repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs issue/PR that affects documentation
Projects
None yet
Development

No branches or pull requests

3 participants