-
Notifications
You must be signed in to change notification settings - Fork 997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply strict uint64 casting #1746
Conversation
I doubt this would introduce any requisite changes on clients.
What do you mean? It did add a bunch of casts. Are you saying that the linter didn't force any of these?
Slightly resistant for this reason. Does the |
Thanks to @protolambda, it does check the boundaries and raise
I only added casting for the variables where x = 1
y = -1
return unit64(x + y) The linter can't detect that We could cover more cases once (i) |
@hwwhww Remerkleable already does that, I implemented that shortly after that issue. See changelog: https://github.com/protolambda/remerkleable/blob/master/CHANGELOG.rst |
051a9ee
to
b8e5941
Compare
Is this ready to go out of draft mode? And which release would we like to target with this? |
b8e5941
to
224ef35
Compare
@protolambda @djrtwo It does make the spec a bit more lengthy.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with some of the changes, but feel uneasy about some of the other changes. Would like a review from others here. At least it's all cosmetic/presentation/type checks.
specs/phase0/beacon-chain.md
Outdated
@@ -729,14 +728,18 @@ def compute_shuffled_index(index: uint64, index_count: uint64, seed: Bytes32) -> | |||
|
|||
# Swap or not (https://link.springer.com/content/pdf/10.1007%2F978-3-642-32009-5_1.pdf) | |||
# See the 'generalized domain' algorithm on page 3 | |||
for current_round in range(SHUFFLE_ROUND_COUNT): | |||
pivot = bytes_to_int(hash(seed + int_to_bytes(current_round, length=1))[0:8]) % index_count | |||
for current_round in map(uint64, range(SHUFFLE_ROUND_COUNT)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the type of casting I'm most uneasy with. It just feels like a Python hack.
Does the following work for mypy?
current_round: uint64
for current_round in range(SHUFFLE_ROUND_COUNT):
Or is that too verbose for spec also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't work for mypy since the issue is that the range(SHUFFLE_ROUND_COUNT)
generator returns int
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current_round
is also only converted to a single byte (see other comment about uint8
). Avoiding this map
would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about changing the type of the first parameter of int_to_bytes
to int
?
current_round
can be kept int
then.
Or the goal is to get read of unnecessary int
s?
Alternatively one can introduce something like urange
method, which will be the same map
applied to range
, though.
def urange(*args):
return map(uint64, range(*args))
As uint64
is heavily used in pyspec, introducing unsigned range
looks pretty reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Agreed with that
current_round
can beuint8
. -
Currently,
uint8
is not compatible withuint64
formypy
in remerkerble layer when we want to pass it toint_to_bytes
. Although it's fine to setcurrent_round
touint8
type in pyspec right now since we haven't changed our CI linter setting to as strict as Add SSZ utils to mypy checking scope #1707. -
For implementation-wise, I feel having
int_to_bytes
be an instance function in SSZ level would be neater.- We can define
uint.uint_to_bytes
inremerkerble
:
def uint_to_bytes(self, byteorder: str) -> bytes: return self.to_bytes(self.type_byte_length, byteorder=byteorder)
- To decrease the requirements for pyspec readers to understand remerkerble, we can modify
int_to_bytes
to:
def int_to_bytes(n: uint) -> bytes: """ Return the serialization of ``n`` in ``ENDIANNESS``-endian in the byte length of ``uint`` type. """ return n.uint_to_bytes(byteorder=ENDIANNESS)
- We can define
^^^^ Updated: should be n.uint_to_bytes(byteorder=ENDIANNESS)
instead of uint.uint_to_bytes(byteorder=ENDIANNESS)
What do you think?
@ericsson49
Hi!
- To me, the goal is setting a boundary of all variables in pyspec since Python
int
is unbounded. I hope it would decrease the ambiguity and make other implementers' lives better? - IMHO It seems better to have
urange
supports alluint
types. Especially if we want to setcurrent_round
touint8
as @protolambda suggested. I'd say if we really want to avoidmap
for readability, the simplest way is:
for current_round in range(SHUFFLE_ROUND_COUNT):
current_round = uint8(current_round) # cast `int` to `uint8` type
What do you think?
- What do you think about the above proposal for
int_to_bytes
change?
Thank you both! 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can define uint.uint_to_bytes in remerkerble:
We already have that, as well as stream variants, for every type :).
But for spec style and less hidden things, we generally avoid it.
The methods:
def encode_bytes(self) -> bytes:
def serialize(self, stream: BinaryIO) -> int:
def decode_bytes(cls: Type[V], bytez: bytes) -> V:
def deserialize(cls: Type[V], stream: BinaryIO, scope: int) -> V:
So uint8(123).encode_bytes()
would just work to get it as bytes.
And there are a ton more util methods for size bounds, working with merkle backings, etc., but it's meant for tooling more than the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh great!
But for spec style and less hidden things, we generally avoid it.
Agreed with that we should generally avoid exposing SSZ APIs. Another solution is that we can describe int_to_bytes(n: uint)
in the same way we describe hash_tree_root
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created an alternative solution PR: #1935
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hwwhww
Hi! Sorry, I missed a notification, so responding somewhat late.
To me, the goal is setting a boundary of all variables in pyspec since Python int is unbounded. I hope it would decrease the ambiguity and make other implementers' lives better?
That's a great goal. It definitely helps. Basically, a typical developer would prefer using a bounded integer type whenever possible. And resolving possible range can be a problem.
In the particular case, it's however obvious that current_round
is bounded.
But as current_round
is intended to be uint8
it definitely worth using an explicit type/conversion here.
However, unbounded int
cannot be completely avoided, I guess, e.g. legendre_bit
uses it and private keys are sometimes used. But it would be definitely great to separate cases where bounded int
s are enough from those, where arbitrary precision int
s are more appropriate.
IMHO It seems better to have urange supports all uint types. Especially if we want to set current_round to uint8 as @protolambda suggested. I'd say if we really want to avoid map for readability, the simplest way is:
I think it's a matter of taste. Basically, all variants are clear, but some slightly less concise.
specs/phase0/beacon-chain.md
Outdated
index=state.eth1_deposit_index, | ||
root=state.eth1_data.deposit_root, | ||
) | ||
|
||
# Deposits must be processed in order | ||
state.eth1_deposit_index += 1 | ||
state.eth1_deposit_index = uint64(state.eth1_deposit_index + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm it's so strange. exit_queue_epoch += Epoch(1)
is okay, but state.eth1_deposit_index += uint64(1)
returns error:
Incompatible types in assignment (expression has type "uint", variable has type "uint64")
Strange dark mypy hole. 🤦♀️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The integer additions etc. are implemented on the uint base class in remerkleable, to not duplicate it everywhere. I will try to add typevar annotations that might help mypy understand it, then you can try this in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that epoch thing works, then state.eth1_deposit_index += DepositIndex(1)
should also work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nvm, DepositIndex doesn't have its own type in the pyspec. Hmm, that error is weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still wonder about this thing, but suppose the other typing fixes outweigh this style issue.
9a789cd
to
cd91380
Compare
Hey, we didn't include this issue since the v0.12 release was kept small where possible, but it could still be useful for better type checking. I'll try and resolve the merge conflicts, and I think we can define a |
See |
50252ec
to
4428a6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The typing fixes are generally good, but we need to iterate some more on things like the shuffle rounds type, otherwise we only confuse implementers with the new typing.
specs/phase0/beacon-chain.md
Outdated
@@ -729,14 +728,18 @@ def compute_shuffled_index(index: uint64, index_count: uint64, seed: Bytes32) -> | |||
|
|||
# Swap or not (https://link.springer.com/content/pdf/10.1007%2F978-3-642-32009-5_1.pdf) | |||
# See the 'generalized domain' algorithm on page 3 | |||
for current_round in range(SHUFFLE_ROUND_COUNT): | |||
pivot = bytes_to_int(hash(seed + int_to_bytes(current_round, length=1))[0:8]) % index_count | |||
for current_round in map(uint64, range(SHUFFLE_ROUND_COUNT)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current_round
is also only converted to a single byte (see other comment about uint8
). Avoiding this map
would be nice.
specs/phase0/beacon-chain.md
Outdated
source = hash(seed + int_to_bytes(current_round, length=1) + int_to_bytes(position // 256, length=4)) | ||
source = hash( | ||
seed | ||
+ int_to_bytes(current_round, length=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this count as a limitation to the shuffle-round counts, making its maximum value 255 (inclusive)? Should we change it to a uint8 type? And the SHUFFLE_ROUND_COUNT
too? I don't think clients actually use uint64
here at least. E.g. lighthouse uses uint8
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my response above #1746 (comment)
For some of the |
Merged new arithmetic operators for better typing into |
There are some issues of the It disallows the Python sugar syntax like: num_validators = spec.SLOTS_PER_EPOCH * 8
return [spec.MAX_EFFECTIVE_BALANCE] * num_validators Error message:
We can change it to: num_validators = spec.SLOTS_PER_EPOCH * 8
default_balances = []
for _ in range(num_validators):
default_balances.append(spec.MAX_EFFECTIVE_BALANCE)
return default_balances But it would be much verbose than before. Is it possible to allow |
Ah yes, forgot about non-integer |
Updated |
Made |
Update: releasing a new |
…exing, avoid negative comparison in test
Released See |
remerkleable==0.1.17 + infer type
Updating some python code that uses the spec, and noticed that with these new changes, we can change previous v0.12 type hacks like |
Where are we at on this PR @hwwhww? I'm interested in getting this into v0.12.2 |
Ok |
@djrtwo I'm slightly inclined to merge this one before #1935. But it's okay either way. Also, to double confirm with you and @protolambda, do you think #1935 is ready to go? I haven't seen any objections from client teams so far. 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@djrtwo I think this can go into next release, when would that be?
Child of #1707, #1739, and #1701.
Issue
We've already stated that all calculations should be in
uint64
domain (#1739). However, it doesn't apply for Python nativeint
.This draft applies some strict
uint64
castings to known function parameter typings by fixing #1707int
v.s.uint64
incompatible errors thatmypy
found.Pros:
remerkleable
) layer!Cons:
Set it to draft since I'm wondering if it would introduce a serious implementation burden for the client teams.