Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_custody_atoms: avoid application of modulo operation (%) to a negative value #1923

Closed
ericsson49 opened this issue Jun 19, 2020 · 5 comments · Fixed by #1943
Closed

get_custody_atoms: avoid application of modulo operation (%) to a negative value #1923

ericsson49 opened this issue Jun 19, 2020 · 5 comments · Fixed by #1943
Labels
general:presentation Presentation (as opposed to content)

Comments

@ericsson49
Copy link
Contributor

get_custody_atoms applies modulo operation to a negative dividend: (-len(bytez) % BYTES_PER_CUSTODY_ATOM).
Modulo operation implementations differ from language to language, when dividend (or divisor) is negative. Therefore that may lead to a bug in a language, which implements modulo operation in a different way than python does (i.e. returning a negative result when a dividend is negative).

@protolambda in a personal communication has suggested that such an ambiguity is better to be avoided in the spec.

@dankrad
Copy link
Contributor

dankrad commented Jun 20, 2020

This is a very useful shorthand to get the number of missing elements to pad to a certain length. It's certainly unfortunate if we can't use it :(

@ericsson49
Copy link
Contributor Author

I have mixed feelings about it. An alternative is definitely bulky and this padding approach is short and elegant.

However, we encounter more and more cases when concise is not clear (e.g. #1746 or #1924).
I.e. a short form may be ambiguous and, as a result, unsafe, in the sense that client implementers may implement it in different ways, potentially leading to a consensus break.

In the particular case, the modulo operation is clearly defined in Python, which is great. However, people are people and make mistakes, the particular case is not obvious to humans. On the other side, bugs here are easy to reveal with appropriate tests.

As a part of overall efforts to make the spec more formal and less ambiguous, my goal is at least to inform EF team about such ambiguities (and suggest a solution, when possible).

@mkalinin
Copy link
Contributor

mkalinin commented Jun 22, 2020

We used to have this problem with Java and JavaScript is affected as well. Both, Java and JavaScript by default does it in the following way:

-4 % 3 = -1

While Python follows the canonical math approach:

In mathematics, the result of the modulo operation is an equivalence class, and any member of the class may be chosen as representative; however, the usual representative is the least positive residue, the smallest non-negative integer that belongs to that class, i.e. the remainder of the Euclidean division.

EDT:

Thus -4 % 3 = 2 in Python and Ruby. And some other languages. There is a helper method in Java SDK Long.remainderUnsigned that does the calculation in the same way as Python My bad, this this the wrong method (thanks to @ericsson49). We've spent a time to learn this ambiguity and make a workaround. And I think that modulo operation should at least be explicitly defined in the spec and covered with test that will cut this ambiguity off on the early stage.

Java spec also have this operation description: https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.17.3 -- in Java % is for remainder, not for a modulo operation.

@hwwhww hwwhww added the general:presentation Presentation (as opposed to content) label Jun 22, 2020
@mratsim
Copy link
Contributor

mratsim commented Jun 22, 2020

Many languages, including C, Rust, Java Nim, have the mod or % operator be the remainder, which is the same sign as the dividend, while Python, R and solidity uses the sign of the divisor (https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages)

Either we define modulo like @mkalinin sugested, or we use explicit sign(b) * (abs(a) % abs(b)) in the spec to remove ambiguities, which is the approach taken by IETF specs. This makes it clear to implementers that expected sign.

@djrtwo
Copy link
Contributor

djrtwo commented Jun 23, 2020

A couple of things:

  1. We previously removed this usage from the spec (long ago, not sure the example) because the usage is very unclear to implementers in other languages
  2. We don't support negative integers in any consensus types and want to avoid them in an intermediate calculations (because a strongly typed language would not handle them "magically" like python).

I suggest we avoid this all together

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general:presentation Presentation (as opposed to content)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants