-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
serialize: make new serialization traits object-safe #871
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
piodul
force-pushed
the
serialize-trait-genericide
branch
from
December 8, 2023 06:36
e53348b
to
b945435
Compare
They are not going to be used after in the adjusted design, so remove them.
The tests in the `writers` module used a helper function to run the same code with both counting writers and buffer-backed writers, which required implementing a trait (Rust doesn't have generic closures). This complexity is no longer necessary.
The writer traits are using a type-level trick to force implementors of `SerializeCql::serialize` to call `CellWriter::finish()`: the latter returns `CellWriter::WrittenCellProof` which the former is required to return. The fact that the proof is an associated type of the writer and the `serialize` method is generic over the writer type makes sure that the proof truly comes from the `finish()` call on the writer provided to the method and not from any other writer. We are going to change `CellWriter` into a struct, so this trick will no longer be applicable. Instead, introduce a `WrittenCellProof` struct, which is a zero-sized type generic over an invariant lifetime. The `CellWriter` struct will consume itself and return a `WrittenCellProof` with the same lifetime parameter as the original cell writer; `serialize` will require the implementor to return it. The lifetime invariance makes sure that one proof cannot be assigned to another if their lifetimes are not exactly the same.
Adjust the interface to use the buffer backed row writer. Rename BufBackedRowWriter to RowWriter to make our life easier later when we replace the writer trait with the buffer-backed row writer struct.
Adjust the interface to use the buffer backed cell writer. Rename BufBackedCellWriter to CellWriter to make our life easier later when we replace the writer trait with the buffer-backed cell writer struct.
Remove RowWriter/CellWriter/CellValueBuilder traits and rename their buffer-backed implementations to take their place as structs. Fix the existing direct uses of buf-backed structs to refer to them by their new name.
Put them in the same `impl` block as other methods of the same structs for improved visual clarity.
We are going to remove `preliminary_type_check` methods by moving the type checking logic to `serialize`. Add default `preliminary_type_check` impls in the trait definitions to make it possible to gradually remove their impls.
Moves the type checking logic of types that used `impl_exact_preliminary_type_check` into their `serialize` method.
As we get rid of `preliminary_type_check`, error variants that were used to represent type check failure of some sub-type (e.g. a tuple element) will not be used anymore. Remove them now and adjust the rest of the code.
Type checks for compound types only check that the type is of the right shape (e.g. it's a UDT, a list, etc.). This is already checked in `serialize`, so remove the `preliminary_type_check` impls.
Remaining `SerializeCql::preliminary_type_check` impls can be removed because they either not do anything or call `preliminary_type_check` on other types.
In case when serialization of one of the values fails, a row represented by a map would return a type check error. A serialization error should be returned instead.
There's not many of them, so remove them all in one go. The ColumnTypeCheckFailed serialization error variant is also removed as it is no longer used.
As we removed all the non-default impls of `preliminary_type_check`, the method now does nothing. Remove it and all the remaining callers.
The bound on `impl<T: SerializeRow> SerializeRow for &T` implicitly requires the type `T` to be sized, preventing it from being used with `dyn SerializeRow`. Relax the restriction by adding `+ ?Sized` to be able to use it with trait objects.
As a final confirmation of the work in the PR and to prevent regressions, add tests which explicitly use `dyn SerializeCql` and `dyn SerializeRow`.
piodul
force-pushed
the
serialize-trait-genericide
branch
from
December 8, 2023 11:20
b945435
to
5733a2f
Compare
Lorak-mmk
approved these changes
Dec 9, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
As @Lorak-mmk stated in #858 (comment), the new serialization traits are incompatible with dynamic dispatch -
dyn SerializeCql
/dyn SerializeRow
just isn't usable, while this was supported forValue
/ValueList
.This is due to the fact that Rust disallows using traits via
dyn
unless they meet some conditions - and our new traits don't meet them:preliminary_type_check
andserialize
. Having a separate type checking phase helps reduce the amount of actual type checking that needs to be performed when serializing (e.g. in case of collections). However,preliminary_type_check
is an associated function instead of a method by design (i.e. it doesn't take aself
parameter), and that prevents from making it object-safe.serialize
method takes a generic "writer" parameter that it can use to serialize the object. The idea behind the generic writer was to have two implementations: one that doesn't write anything but only counts bytes, and the other that actually appends bytes into a byte buffer. This would allow us to useserialize
for both things: calculating the needed buffer size and also to actually serialize. However, as Rust uses monomorphisation to implement generics, they cannot be directly used with dynamic dispatch.While it would be possible to keep the current definitions intact and work around the problem by introducing "erased" versions of the traits (credit to
erased_serde
for inspiration), it increases complexity and could make serialization of type-erased types inefficient (due to the writers needing to be type-erased, too). Instead, let's simplify it:serialize
, so move all type checking there from the existingpreliminary_type_check
methods and then remove them. While it introduces more type checks than necessary, it is not clear whether they have a measurable performance impact and it will be possible to extend the trait to reintroduce the optimization in a backwards-compatible way - for now, let's keep it simple.size_hint
method for the traits; it would be optional, but would have to be manually implemented (which is not a huge amount of work).Pre-review checklist
I have adjusted the documentation in./docs/source/
.I added appropriateFixes:
annotations to PR description.