Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSZ implementation for exec. spec - Support for Python 3 typing. #1077

Merged
merged 52 commits into from
Jun 4, 2019

Conversation

protolambda
Copy link
Contributor

@protolambda protolambda commented May 12, 2019

I'm working on tests now, and shifted away from the SSZ readability/typing ambitions after my POC presented in #1064. I hope others can pick up the SSZ typing idea now, with this code ready to transform the spec. See comments in issue. Finished it (testing PRs feature-wise complete, merging coming along, I got to work on SSZ again)

What still needs work:

  • less hacky spec builder / type field translation
  • check that when constants are changed (config override), types adapt. Otherwise, re-introduce the re-init function we have now for old non-python type system. Introduced re-init function back. May explore module-reloading as alternative later, after phase-1 spec-building is in.
  • fix integer/bytesN types being recognized or not. Just need to be unique declarations to work for the SSZ matching code. Otherwise no-overhead int/bytes objects. (Or maybe Dankrad wants to box them anyway for bounds checking?)
    • After discussion with @vbuterin, boxing just the less-used integer types seemed like the easy yet reasonable solution. We let int default to uint64, and don't have to worry about ugly slow integer boxes in 99% of the spec.
  • make it run outside of POC setting, in exec. spec. minor detail. Fixed it. The ssz_static build output now matches that of dev, but with a type based ssz system 🎉

@protolambda
Copy link
Contributor Author

Let's don't forget about our oldest current open spec PR: #695 (from before executable-spec time 😂). We finally get to do SSZ typing well with this PR, and have static-type checking in the spec. 🎉

@protolambda
Copy link
Contributor Author

  • is_list_type(typ) can be issubclass(typ, list), is_vector_type(typ) can be issubclass(typ, Vector), and similar for Container
    • Although, not completely sure about issubclass, functions may be more readable for non-python programmers.
  • issubclass(typ, bool) can typ == bool for readability in serialize_basic. Bools don't generally have subclasses (do we plan to?)
  • type inference like List[infer_type(obj[0])] is not going to work. Integer typing is lost. And no annoations available sadly (reason for inferring). We can either default to uint64, or throw an error when someone tries to infer a bar List[uintX] or uintX type.
  • is_uint(obj.__class__) is not going to work. NewType is not changing the __class__ afaik
  • is_list_type doesn't have to check attributes, it can do issubclass. Same for vector. Same for container.
  • I like the read_elem_typ usage in hash-tree-root, abstracts away the difference between bytes and a List nicely.
  • zip(obj.get_field_values(), typ.get_field_types()) can be replaced with obj.get_fields().items(). Similar change for signing root
  • we may want to rename SSZContainer to just Container, more consistent with List and Vector, nice and brief.
  • we need to update the spec builder to just keep classes as is :)
    If we're ambitious, we can take the following steps:
    1. extract the constants from the spec (just define names + types), or define them in python style with values (comments still around for X& seconds?)
    2. import all helpers at once with a from spec_things import *, then manage helpers / type declarations / imports from elsewhere there.
    3. Run and edit the spec as a python notebook, converted back and forth to markdown. See https://github.com/aaren/notedown and issue RST/Sphinx docs #835. We should have had it during the experimentation phase months ago, to make life easy. Not sure if it is still a great time. (and if we can make it work for phase-1 modification compatibility, or still need the spec builder)

I'm pro NewType, as it is good for performance. I also know others are in favor of boxed integers (please make a POC whoever has the time), for nice type checking + under/overflow bounds checking, at a performance cost. Can we make a decision yet? @dankrad @hwwhww

@vbuterin
Copy link
Contributor

is_list_type(typ) can be issubclass(typ, list), is_vector_type(typ) can be issubclass(typ, Vector), and similar for Container

I tried this and it actually didn't work:

>>> issubclass(ssz.List[ssz.uint64], list)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: issubclass() arg 1 must be a class

type inference like List[infer_type(obj[0])] is not going to work. Integer typing is lost. And no annoations available sadly (reason for inferring). We can either default to uint64, or throw an error when someone tries to infer a bar List[uintX] or uintX type.

So the approach I was assuming we would take here is that plain int objects would be uint64's, and all other types of uints would need a wrapper class like Vector. As right now we don't even use any other kind of int anywhere, so it would not really cost much. I want to fight hard for full type inference capability 😄

zip(obj.get_field_values(), typ.get_field_types()) can be replaced with obj.get_fields().items(). Similar change for signing root

Great! Will change.

we may want to rename SSZContainer to just Container, more consistent with List and Vector, nice and brief.

Agree! Will change.

@hwwhww
Copy link
Contributor

hwwhww commented May 26, 2019

Questions:

  1. Are you taking typing.List as the SSZ List type?
  2. Have you made the serialize work well with NewType() uint yet? The latest commit and 81cb4a2 are both broken, just wonder if the type issues are solved wrt using NewType instead of class. :)

--

  • My gut feeling is that adding ListMeta and UintMeta/Uint would enable us to apply similar rules to deal with the types easily and more readable.
  • TBH function attribute NewType(...).byte_len does look hacky. I’d say I’m fine with status quo if it does work. I’m interested in making the UintMeta/Uint PoC after the current code passes the tests. That is, I don’t think the box checks are required of "minimal" SSZ, but if the performance is not that bad, it may help increase more confidence in test generator, but that could be in other iterations.

@protolambda
Copy link
Contributor Author

@vbuterin

issubclass(ssz.List[ssz.uint64], list)

Right. It works for Vector, but List is very different. List is from the typing package, and more of an add-on to make a normal list appear to be typed.

from typing import List, NewType

print(issubclass(List, list))
print(issubclass(List[str], list))
print(issubclass(list, List))
# print(issubclass(list, List[str])) # TypeError: Parameterized generics cannot be used with class or instance checks

uint8 = NewType('uint8', int)
print(issubclass(List[uint8], list))
# print(issubclass(list, List[uint8])) # TypeError: Parameterized generics cannot be used with class or instance checks

Prints all True in Python 3.7

I can't seem to reproduce your error however. I wonder if it's the Python version, or something else.

Here's the current Vector behavior:

print(issubclass(Vector, Vector))
print(issubclass(Vector[str, 123], Vector))
print(issubclass(Vector, Vector[str, 123]))

uint8 = NewType('uint8', int)
print(issubclass(Vector[uint8, 123], Vector))
print(issubclass(Vector, Vector[uint8, 123]))
False
True
False
True
False

We may want to change the behavior of issubclass(Vector, Vector), but creating an instance of Vector, without element type and length, is not allowed, so we don't normally see this case.

@protolambda
Copy link
Contributor Author

protolambda commented May 26, 2019

@hwwhww

Are you taking typing.List as the SSZ List type?

Sort of. We use List[abc] in type annotations, and those annotations are used to deal with SSZ encoding/hashing fields of our containers. However, values are just list, and there element type cannot be inferred (afaik) :(. We could just create another class similar to the Vector + VectorMeta metaclass hack, and have typed lists. But then we would need to do: SSZList[Validator](validator0, validator1, ...) when creating the list, and need to convert lists to typed lists. However, I believe one can still do something like SSZList[FooBar](foo.bar() for foo in foos), similar to normal list comprehension. And I'm writing SSZList here, but if don't deal with typing.List, we can just call ours List.
It would make SSZ-life a lot easier and consistent, but I don't think it's very Python at all 🙈.

Have you made the serialize work well with NewType() uint yet? The latest commit and 81cb4a2 are both broken, just wonder if the type issues are solved wrt using NewType instead of class. :)

uintX NewTypes are very hard to deal with. They are not classes. They act like the int underneath in the type system during runtime (hence the speed boost over boxed integers). However, since e.g. the uint8 declaration doesn't disappear, it's a unique object, and annotations still reference it during runtime, we get to use this non-class object as if it is a type, and just check type equivalence by checking the equivalence of the referenced NewType output. It's a hack, but gets us zero-overhead integers. At the cost of only being able to deal with these integers when the values are passed with either explicit type argument to the serialize/hash_tree_root functions or this typing is retrieved from annotations from the encompassing container.

One work-around would be to default to uint64 for a bare integer. And then box (i.e. wrap in actual class) or remove the others. (note that we have BytesN[123] to deal with data of such sizes still.)
Either way, I feel the limits of Python typing here, and I'm open to suggestions.

My gut feeling is that adding ListMeta and UintMeta/Uint would enable us to apply similar rules to deal with the types easily and more readable.

See above comment on custom List type. We can do it, but we're not using the default list anymore. On the other hand, we don't need a typed list everywhere, just in containers etc. So it may actually be a fine choice. (i.e. we only read the verbosity of our custom List in type declarations and when we create a new instance)

TBH function attribute NewType(...).byte_len does look hacky.

It does work. But yes, it is super hacky. But the alternative of boxing integers doesn't sound great either. Maybe we can just go for the default uint64 route, and box / remove the others? (We're only really using uint64, and byte only in a bytes(N) context. And then BytesN[x] covers all other raw data.

@protolambda protolambda marked this pull request as ready for review May 27, 2019 22:53
@protolambda
Copy link
Contributor Author

@hwwhww The CI doesn't see the flake8 version in the Makefile (I updated it) since it's not in the requirements file (where the checksum is taken from), and still uses the old cached flake8, with bugged linting for type annotations. :(
Also see comment about CI + venv requirements in the other PR about merging in the contract. Similar problems there.

All issues (merge conflicts, testing, linting) resolved. Just need the partials finished + conforming to PEP-8, and we can get this merged.
I do think the partials are nice, but also a bit of scope-creep in this PR. We're not implementing SSZ-unions/nulls, yet we do partials already. @vbuterin your call when you like them to get in. PR is otherwise ready for review/merge.

@protolambda
Copy link
Contributor Author

Had to fix a dependency issue (fixed in commit 8e8ef2d), but verified that the ssz_static test suites generated by the old ssz implementation currently on dev are identical to the tests generated by the implementation in this branch. 🎉

@protolambda
Copy link
Contributor Author

@hwwhww Fixed the linting caching problem. Can we get this PR into final review, then merge?

Copy link
Contributor

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments. Awesome python wizardry!
I did not do a deep review of generalized indices

specs/core/0_beacon-chain.md Outdated Show resolved Hide resolved
specs/core/0_beacon-chain.md Outdated Show resolved Hide resolved
specs/core/0_beacon-chain.md Outdated Show resolved Hide resolved
specs/core/0_beacon-chain.md Outdated Show resolved Hide resolved
specs/core/0_beacon-chain.md Outdated Show resolved Hide resolved
specs/core/0_beacon-chain.md Outdated Show resolved Hide resolved
test_libs/pyspec/eth2spec/utils/ssz/ssz_partials.py Outdated Show resolved Hide resolved
Copy link
Contributor

@hwwhww hwwhww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Witchcraft! 🧙‍♂️

I only went through ssz_typing.py!

I’d (strongly) suggest moving the new merkle_minimal.py and ssz_partials.py to another PR so that it could get a proper review.

@protolambda
Copy link
Contributor Author

Ran the SSZ static generators again, and they result in the same test-vectors as running the generator on the dev branch (master has different type definitions, but same ssz implementation as dev now).
Ready to merge, if reviewers are all good with it 👍

@protolambda
Copy link
Contributor Author

I’d (strongly) suggest moving the new merkle_minimal.py and ssz_partials.py to another PR so that it could get a proper review.

merkle_minimal.py didn't really change much. Just a few simple efficiency gains while we are modifying (and re-testing) the code around it anyway. Fixed the power-of-2 util to be even more efficient, yet readable (see #1115 (comment))

The partials implementation is moved out to the ssz-partials branch (based on discussion in the Tue 4 Jun call). When this is merged we'll rebase the partials branch, and open a PR for it. @vbuterin

@djrtwo djrtwo merged commit e8b4c4c into dev Jun 4, 2019
@djrtwo djrtwo deleted the ssz-impl-rework branch June 4, 2019 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope:SSZ Simple Serialize
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants