Static layout analysis hooks for near-zero-cost serialization/deserialization #24
Merged
…atch the specification
… seem to be shaping up the way I want them to.
…arate attributes compute_bit_length_set() and bit_length_range
…for implicit fields (variable-length arrays and unions)
In Libuavcan, there is a huge chunk of heavily templated code responsible for data serialization, which upon monomorphization resolves into a series of invocations of a low-level bit-level data copy routine: https://github.com/UAVCAN/libuavcan/blob/fd8ba19bc9c09c05a1ab60289b3e7158810e9bd0/libuavcan/src/marshal/uc_bit_array_copy.cpp#L12-L58. In Libcanard, there are basic functions canardEncodePrimitive(..) and canardDecodePrimitive(..) which serve a similar purpose, except that they lack any type safety because C.

Bit-level copying is very slow. I don't have exact numbers at hand, but I'm considering obtaining them later, once our new UAVCAN v1.0 implementations are available. Bit-level serialization is slow, but it is also generic, meaning that it is always applicable regardless of data alignment. Yet, if one were to take a very careful look at our standard definitions (and most of the known third-party definitions, e.g., ArduPilot, OlliW), they would see that the majority of fields are always at least byte-aligned, meaning that slow bit-level copying could be avoided if we could determine such always-aligned fields statically.
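To make the cost difference concrete, here is a minimal sketch of the two copy paths. It is not the Libuavcan or Libcanard routine; the function names and the LSB-first bit numbering within a byte are assumptions made for illustration only.

```python
def copy_bits(src: bytes, src_bit: int, dst: bytearray, dst_bit: int, n_bits: int) -> None:
    """Generic path: copies one bit per iteration, valid for any offsets."""
    for i in range(n_bits):
        s = src_bit + i
        d = dst_bit + i
        # Assumed bit numbering: bit 0 is the least significant bit of a byte.
        bit = (src[s // 8] >> (s % 8)) & 1
        cleared = dst[d // 8] & ~(1 << (d % 8)) & 0xFF
        dst[d // 8] = cleared | (bit << (d % 8))


def copy_bytes(src: bytes, src_bit: int, dst: bytearray, dst_bit: int, n_bits: int) -> None:
    """Fast path: only valid when both offsets and the length are whole bytes."""
    assert src_bit % 8 == 0 and dst_bit % 8 == 0 and n_bits % 8 == 0
    s, d, n = src_bit // 8, dst_bit // 8, n_bits // 8
    dst[d:d + n] = src[s:s + n]  # a single block copy, the memcpy() analogue


payload = bytes([0xDE, 0xAD, 0xBE, 0xEF])

# Byte-aligned destination: both paths produce the same buffer.
slow = bytearray(8)
fast = bytearray(8)
copy_bits(payload, 0, slow, 16, 32)
copy_bytes(payload, 0, fast, 16, 32)

# Destination shifted by one bit: only the generic path is applicable.
shifted = bytearray(2)
copy_bits(payload, 0, shifted, 1, 8)
```

The generic path runs its loop body once per bit, while the fast path is one slice assignment; the point of static layout analysis is to prove when the second form is safe.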
Earlier we invented the concept of compile-time alignment checks, which was discussed at length here: https://forum.uavcan.org/t/new-dsdl-directive-for-simple-compile-time-checks/190. This same concept can be extended to facilitate static bit layout analysis in order to allow code generators to emit the fastest possible serialization code, resorting to slower methods only when faster ones cannot be proven to be safe. One can define the following arbitrary categories of serialization approaches:
- memcpy() applied to the whole data structure. This is applicable when the alignment, size, and byte order requirements of the native platform match those of UAVCAN. As you will see later, these properties are discoverable and checkable at code generation time, provided that we can make reliable assumptions about the target platform (such as byte order and type alignment requirement, which is usually sizeof(type); violation of such assumptions can be detected later at compile time, so they are safe).
- memcpy() applied to a given field. E.g., if we have a uint64 field in a data structure, the byte order of the platform is little-endian, and the field is always aligned at the byte boundary, we can directly memcpy the field into the final buffer, again avoiding bit-level copying.

The proof of alignment is obtained by PyDSDL by manipulating bit length sets of serialized representations of data types. A code generator can ask PyDSDL whether there exists a serialized representation of a composite or array type that would NOT meet a specified alignment goal (say, one byte (8 bits), or 64 bits if we're serializing a uint64 field), and then use the answer to choose the appropriate (fastest safe) serialization strategy at code generation time (there are some known performance issues: #23).
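The selection logic described above can be sketched as follows. This is a simplified model that uses plain Python sets of bit offsets rather than PyDSDL objects; always_aligned and choose_strategy are hypothetical names, not part of any library.

```python
from typing import Iterable


def always_aligned(offsets: Iterable[int], alignment_bits: int) -> bool:
    """True if no possible offset violates the alignment goal."""
    return all(offset % alignment_bits == 0 for offset in offsets)


def choose_strategy(offsets: Iterable[int], field_bits: int,
                    platform_little_endian: bool = True) -> str:
    """Pick the fastest strategy that is provably safe for every offset."""
    offsets = list(offsets)
    if platform_little_endian and field_bits % 8 == 0 and always_aligned(offsets, 8):
        return "memcpy"
    return "bit_copy"  # generic fallback, applicable at any offset


# A uint64 field whose possible offsets are all multiples of 8 bits.
aligned_case = choose_strategy({0, 64, 128}, field_bits=64)

# One misaligned possibility is enough to force the fallback.
misaligned_case = choose_strategy({0, 3}, field_bits=64)
```

A single counterexample offset rules out the fast path, which is why the question put to PyDSDL is phrased as "is there any representation that does NOT meet the goal".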
This new logic is exposed to the user via the following new API entries:
- BitLengthSet and its overloaded + operator.
- CompositeType.iterate_fields_with_offsets(base_offset: BitLengthSet = None) -> Iterator[Tuple[Field, BitLengthSet]]
- FixedLengthArrayType.enumerate_elements_with_offsets(base_offset: BitLengthSet = None) -> Iterator[Tuple[int, BitLengthSet]]
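As a conceptual model of what such an iterator yields, the sketch below accumulates the set of possible offsets field by field. It deliberately does not call PyDSDL; add_sets and fields_with_offsets are hypothetical stand-ins, and the field names and bit lengths are made up for illustration.

```python
from itertools import product
from typing import Iterator, List, Optional, Set, Tuple


def add_sets(a: Set[int], b: Set[int]) -> Set[int]:
    """Every possible sum of one element from each set."""
    return {x + y for x, y in product(a, b)}


def fields_with_offsets(
    fields: List[Tuple[str, Set[int]]],
    base_offset: Optional[Set[int]] = None,
) -> Iterator[Tuple[str, Set[int]]]:
    """Yield each field name with the set of bit offsets it may start at."""
    offset = set(base_offset) if base_offset else {0}
    for name, bit_lengths in fields:
        yield name, offset
        offset = add_sets(offset, bit_lengths)  # new set; the yielded one is untouched


# Hypothetical message: a fixed 8-bit field, a variable-length field that
# occupies 8, 16, or 24 bits, and a trailing 16-bit field.
message = [
    ("flags", {8}),
    ("data", {8, 16, 24}),
    ("checksum", {16}),
]
layout = list(fields_with_offsets(message))
nested_layout = list(fields_with_offsets(message, base_offset={4}))
```

In this model the trailing field may start at three different offsets, all byte-aligned, so a fast path is still provable; starting the same message at a 4-bit base offset shifts every offset off the byte boundary.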
Early selection of the serialization strategy has an important implication on the serialization of nested data structures. A data structure can be nested into another one at an arbitrary alignment, which would defeat the purpose of static layout analysis since the code generator wouldn't be able to make any assumptions about the base offset. Additionally, the misaligned origin of a data structure does not necessarily imply that every following field of it will be misaligned as well. Consider the following example:
Due to the one-bit union tag preceding the actual value, both a and b are misaligned. Now, imagine that the above union is nested into another structure as follows:

The union is padded so that the one-bit tag brings the following data fields into alignment. The example demonstrates that in order to take full advantage of the layout analysis, a code generator must model nested object hierarchies holistically rather than atomically. The per-type generated serialization functions/methods may have some basic alignment requirement chosen by the author of the code generator (for example, they may require the serialization buffer to be always byte-aligned, or require that its alignment matches the largest alignment requirement of any nested type). The serialization code of an outer (containing) type would then determine statically whether the alignment requirement of a serialization function is met. If the alignment requirement is not met, the code generator would emit serialization code in-place, as if the definition of the included type were copy-pasted into the outer type, instead of invoking its serialization function.
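The offset arithmetic behind this argument can be sketched in a few lines. The 7-bit padding, the function names, and the "call"/"inline" labels are assumptions chosen for illustration; the sketch only models offsets and does not generate any code.

```python
from typing import Set


def union_field_offsets(base_offsets: Set[int], tag_bits: int = 1) -> Set[int]:
    """Union fields start right after the tag, whatever the base offset is."""
    return {offset + tag_bits for offset in base_offsets}


def is_aligned(offsets: Set[int], alignment_bits: int = 8) -> bool:
    return all(offset % alignment_bits == 0 for offset in offsets)


def plan_nested(base_offsets: Set[int], required_alignment_bits: int = 8) -> str:
    """Call the per-type serializer only if its alignment requirement is met."""
    return "call" if is_aligned(base_offsets, required_alignment_bits) else "inline"


# Standalone union: it starts at bit 0, so its fields start at bit 1.
standalone_fields = union_field_offsets({0})

# The same union nested after 7 bits of padding (assumed): its origin is
# misaligned, yet the tag brings the fields to bit 8.
nested_fields = union_field_offsets({7})
nested_plan = plan_nested({7})
```

Here the nested union cannot be handed to a byte-aligned per-type function because its origin is at bit 7, so its code is emitted in-place; the in-place code can then use the fast path for the fields, since they are proven to start at bit 8.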
You can see field iterators in action in this carefully crafted unit test: https://github.com/UAVCAN/pydsdl/blob/f998ad6f744b853d9b97e240ab0302df27ddd598/pydsdl/_serializable.py#L1284-L1383
...also in this Jinja2 code generation template for PyUAVCAN:
Where alignment_prefix is defined as: 'aligned' if offset.is_aligned_at_byte() else 'unaligned'.

In the following example (sourced from PyUAVCAN as well), look at the case of CompositeType, where we check whether the current item matches the alignment requirement of the serialization method of t (which is _serialize_aligned_(..)). If there is a match, we invoke the method; otherwise, we emit serialization code in-place.

One can see more examples in my PyUAVCAN repo (which is still a WIP, of course): https://github.com/pavel-kirienko/pyuavcan/blob/uavcan-v1.0/pyuavcan/dsdl/_templates/serialization.j2, also https://github.com/pavel-kirienko/pyuavcan/blob/6f9234ab918beefce56ef3f58920773df29079e5/pyuavcan/dsdl/_serialized_representation/_serializer.py#L56-L187
I expect that this feature will allow us to greatly simplify implementations. Particularly, libuavcan may no longer need the rather complex primitive marshaling templates, since generated serialization methods can operate on raw byte pointers now.