Merge pull request #24 from UAVCAN/static-layout-analysis-hooks
Static layout analysis hooks for near-zero-cost serialization/deserialization
pavel-kirienko authored Apr 28, 2019
2 parents 101fb7f + 5e5e8b0 commit 9cef74c
Showing 13 changed files with 920 additions and 247 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -18,4 +18,4 @@ script:
- ./test.sh
- coveralls # Publish the coverage stats online.
- git clone https://github.com/UAVCAN/dsdl --branch=uavcan-v1.0 dsdl-test # TODO: Switch to master once v1.0 is merged
- ./test_namespace.py dsdl-test/uavcan
- ./demo.py dsdl-test/uavcan
44 changes: 11 additions & 33 deletions README.md
@@ -12,6 +12,8 @@ PyDSDL supports all DSDL features defined in the UAVCAN specification,
and performs all mandatory static definition validity checks.
Additionally, it checks for bit compatibility for data type definitions under the same major version.

A brief usage example is provided in the file `demo.py`.

## Installation

PyDSDL requires Python 3.5 or newer.
@@ -91,11 +93,13 @@ response structure of the service type, respectively.
Every data type (i.e., the `SerializableType` root class) has the following public attributes
(although they raise `TypeError` when used against an instance of `ServiceType`):

- `bit_length_range: Tuple[int, int]` - returns a named tuple containing `min:int` and `max:int`, in bits,
which represent the minimum and the maximum possible bit length of an encoded representation.
- `compute_bit_length_values() -> Set[int]` - this function performs a bit length combination analysis on
the data type and returns a full set of bit lengths of all possible valid encoded representations of the data type.
Due to the involved computations, the function can be expensive to invoke, so use with care.
- `bit_length_set: BitLengthSet` - the set of bit length values of all serialized representations of the type.
The type `BitLengthSet` is similar to the native set of integers `typing.Set[int]`: it is iterable and comparable,
plus there are several important convenience methods for bit length set manipulation.
- `__str__()` - a string representation of a data type is a valid DSDL expression that would
have yielded the same data type if evaluated by a DSDL processor.
For example: `saturated uint8[<=2]`, `uavcan.node.Heartbeat.1.0`.
- `__hash__()` - data types are hashable.
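
The idea behind a bit length set can be sketched without PyDSDL itself. The following is a minimal illustration using plain Python sets, not the actual `BitLengthSet` implementation: the helper names `concatenate` and `all_aligned_at` are hypothetical and exist only for this sketch, and the field sizes are made-up values.

```python
from itertools import product
from typing import Iterable, Set


def concatenate(*fields: Iterable[int]) -> Set[int]:
    """Possible bit lengths of several fields laid out back to back.

    Each argument is the set of possible bit lengths of one field.
    (Hypothetical helper; not part of the PyDSDL API.)
    """
    return {sum(combination) for combination in product(*fields)}


def all_aligned_at(offsets: Iterable[int], bit_length: int) -> bool:
    """True if every possible offset is a multiple of bit_length.

    (Hypothetical helper; not part of the PyDSDL API.)
    """
    return all(offset % bit_length == 0 for offset in offsets)


# A fixed 16-bit field followed by a made-up variable-length field
# that may occupy 8, 16, or 24 bits.
lengths = concatenate({16}, {8, 16, 24})

# A field placed after these two may start at any of these bit offsets.
print(sorted(lengths))
print(min(lengths), max(lengths))   # The minimum and maximum bit length.
print(all_aligned_at(lengths, 8))   # Is every possible offset byte-aligned?
print(all_aligned_at(lengths, 16))  # Is every possible offset 16-bit-aligned?
```

In this sketch, a field that follows the two fields above is always byte-aligned but not always 16-bit-aligned, which is the kind of question the static layout analysis is meant to answer.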

Instances of `CompositeType` (and its derivatives) contain *attributes*.
Per the specification, an attribute can be a field or a constant.
@@ -149,32 +153,6 @@ representation of the contained value.
- `Container` - generic container; has `element_type: Type[Any]` and is iterable.
- `Set` - a DSDL constant homogeneous set.

## Usage example

```python
import sys
import pydsdl

try:
    composite_types = pydsdl.read_namespace('path/to/root_namespace', ['path/to/dependencies'])
except pydsdl.InvalidDefinitionError as ex:
    print(ex, file=sys.stderr)                      # The DSDL definition is invalid
except pydsdl.InternalError as ex:
    print('Internal error:', ex, file=sys.stderr)   # Oops! Please report.
else:
    for t in composite_types:
        if isinstance(t, pydsdl.ServiceType):
            blr, blv = 0, {0}
        else:
            blr, blv = t.bit_length_range, t.compute_bit_length_values()
        # The above is because service types are not directly serializable (see the UAVCAN specification)
        print(t.full_name, t.version, t.fixed_port_id, t.deprecated, blr, len(blv))
        for f in t.fields:
            print('\t', str(f.data_type), f.name)
        for c in t.constants:
            print('\t', str(c.data_type), c.name, '=', str(c.value.native_value))
```

## Development

### Dependencies
@@ -201,8 +179,8 @@ If you really need to import a specific entity, consider prefixing it with an un
scope leakage, unless you really want it to be externally visible.

```python
from . import data_type # Good
from .data_type import CompositeType # Pls no
from . import _serializable # Good
from ._serializable import CompositeType # Pls no
```

### Writing tests
105 changes: 105 additions & 0 deletions demo.py
@@ -0,0 +1,105 @@
#!/usr/bin/env python3
#
# This is a helper script used for testing the parser against a specified namespace directory.
# It just directly invokes the corresponding API, prints the output, and exits.
#

import sys
import time
import pydsdl
import logging

logging.basicConfig(stream=sys.stderr, level=logging.INFO, format='%(levelname)-8s %(message)s')

target_directory = sys.argv[1]
lookup_directories = sys.argv[2:]


def _print_handler(file: str, line: int, text: str) -> None:
    print('%s:%d:' % (file, line), text)


def _show_fields(indent_level: int,
                 field_prefix: str,
                 t: pydsdl.CompositeType,
                 base_offset: pydsdl.BitLengthSet) -> None:
    # This function is intended to be a crude demonstration of how the static bit layout analysis can be leveraged
    # to generate very efficient serialization and deserialization routines. With PyDSDL it is possible to determine
    # whether any given field at an arbitrary level of nesting always meets a certain alignment goal. This information
    # allows the code generator to choose the most efficient serialization/deserialization strategy. For example:
    #
    #   - If every field of a data structure is a standard-bit-length field (e.g., uint64) and its offset meets the
    #     native alignment requirement, the whole structure can be serialized and deserialized by simple memcpy().
    #     We call it "zero-cost serialization".
    #
    #   - Otherwise, if a field is standard-bit-length and its offset is always a multiple of eight bits, the field
    #     itself can be serialized by memcpy(). This case differs from the above in that the whole structure may not
    #     be zero-cost-serializable, but some or all of its fields still may be.
    #
    #   - Various other optimizations are possible depending on whether the bit length of a field is a multiple of
    #     eight bits and/or whether its base offset is byte-aligned. Many optimization possibilities depend on a
    #     particular programming language and platform, so they will not be reviewed here in detail. Interested readers
    #     are advised to consult with existing implementations.
    #
    #   - In the worst case, if none of the possible optimizations are discoverable statically, the code generator will
    #     resort to bit-level serialization, where a field is serialized/deserialized bit-by-bit. Such fields are
    #     extremely uncommon, and a data type designer can easily ensure that their data type definitions are free from
    #     such fields by using @assert expressions checking against _offset_. More info in the specification.
    #
    # The key part of static layout analysis is the class pydsdl.BitLengthSet; please read its documentation.
    indent = ' ' * indent_level * 4
    for field, offset in t.iterate_fields_with_offsets(base_offset):
        field_type = field.data_type
        prefixed_name = '.'.join(filter(None, [field_prefix, field.name or '(padding)']))

        if isinstance(field_type, pydsdl.PrimitiveType):
            print(indent, prefixed_name, '# byte-aligned: %s; standard bit length: %s; standard-aligned: %s' %
                  (offset.is_aligned_at_byte(),
                   field_type.standard_bit_length,
                   field_type.standard_bit_length and offset.is_aligned_at(field_type.bit_length)))

        elif isinstance(field_type, pydsdl.VoidType):
            print(indent, prefixed_name)

        elif isinstance(field_type, pydsdl.VariableLengthArrayType):
            offset_of_every_element = offset + field_type.bit_length_set  # All possible element offsets for this array
            print(indent, prefixed_name, '# length field is byte-aligned: %s; every element is byte-aligned: %s' %
                  (offset.is_aligned_at_byte(), offset_of_every_element.is_aligned_at_byte()))

        elif isinstance(field_type, pydsdl.FixedLengthArrayType):
            for index, element_offset in field_type.enumerate_elements_with_offsets(offset):
                # Real implementations would recurse; this is not shown in this demo for compactness.
                print(indent, prefixed_name + '[%d]' % index, '# byte-aligned:', element_offset.is_aligned_at_byte())

        elif isinstance(field_type, pydsdl.CompositeType):
            print(indent, str(field) + ':')
            _show_fields(indent_level + 1, prefixed_name, field_type, offset)

        else:
            raise RuntimeError('Unhandled type: %r' % field_type)


def _main():
    try:
        started_at = time.monotonic()
        compound_types = pydsdl.read_namespace(target_directory, lookup_directories, _print_handler)
    except pydsdl.InvalidDefinitionError as ex:
        print(ex, file=sys.stderr)                      # The DSDL definition is invalid.
    except pydsdl.InternalError as ex:
        print('Internal error:', ex, file=sys.stderr)   # Oops! Please report.
    else:
        for t in compound_types:
            if isinstance(t, pydsdl.ServiceType):
                print(t, 'request:')
                _show_fields(1, 'request', t.request_type, pydsdl.BitLengthSet())
                print(t, 'response:')
                _show_fields(1, 'response', t.response_type, pydsdl.BitLengthSet())
            else:
                print(t, ':', sep='')
                _show_fields(1, '', t, pydsdl.BitLengthSet())
            print()

        print('%d types parsed in %.1f seconds' % (len(compound_types), time.monotonic() - started_at))


_main()
25 changes: 13 additions & 12 deletions pydsdl/__init__.py
@@ -3,14 +3,14 @@
# This software is distributed under the terms of the MIT License.
#

import os
import sys
import os as _os
import sys as _sys

if sys.version_info[:2] < (3, 5):   # pragma: no cover
    print('A newer version of Python is required', file=sys.stderr)
    sys.exit(1)
if _sys.version_info[:2] < (3, 5):   # pragma: no cover
    print('A newer version of Python is required', file=_sys.stderr)
    _sys.exit(1)

__version__ = 0, 6, 2
__version__ = 0, 7, 3
__license__ = 'MIT'

# Our unorthodox approach to dependency management requires us to apply certain workarounds.
@@ -20,8 +20,8 @@
# when done, we restore the path back to its original value. One implication is that it won't be possible
# to import stuff dynamically after the initialization is finished (e.g., function-local imports won't be
# able to reach the third-party stuff), but we don't care.
_original_sys_path = sys.path
sys.path = [os.path.join(os.path.dirname(__file__), 'third_party')] + sys.path
_original_sys_path = _sys.path
_sys.path = [_os.path.join(_os.path.dirname(__file__), 'third_party')] + _sys.path

# Never import anything that is not available here - API stability guarantees are only provided for the exposed items.
from ._namespace import read_namespace
@@ -42,12 +42,13 @@
# Data type model - attributes.
from ._serializable import Attribute, Field, PaddingField, Constant

# Data type model - auxiliary.
from ._serializable import BitLengthRange, ValueRange, Version

# Expression model.
from ._expression import Any
from ._expression import Primitive, Boolean, Rational, String
from ._expression import Container, Set

sys.path = _original_sys_path
# Auxiliary.
from ._serializable import ValueRange, Version
from ._bit_length_set import BitLengthSet

_sys.path = _original_sys_path