Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flatten attribute to derive SerializeRow #1144

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

nrxus
Copy link
Contributor

@nrxus nrxus commented Dec 7, 2024

This is similar to the flatten attribute in serde.

For now I've only added support when using match_by_name flavor, largely because that's the way I am using scylla. I'd assume that extending it to support the enforce_ordering flavor would actually be easier but it might also need new traits similar to how I had to add some for match_by_name. Both the parent struct and the flattened struct need to be match_by_name, if the parent struct isn't then there is a clear compiler error, if the flattened struct isn't then there is also a compiler error but it'd be a little more opaque because it'd be produced by lack of implementation of a doc hidden trait. If this was ever extended to enforce_ordering, then it will also work similarly that you cannot mix and match flavors when flattening.

I have only added this attribute to SerializeRow because it was easier than DeserializeRow but I also want to add it to that macro in a future PR next chance I get to dig into this code. However, I have done enough digging to know that implementing this attribute on DeserializeRow will require a breaking change in the DeserializeRow trait. Realistically I doubt many people are manually implementing that trait so I don't think it'd actually be a breaking change in practice, but by semver rules alone it would be.

All the new traits/structs/enums needed for this change are inside the _macro_internal subdmodule such that no new public API is exposed. Maybe in the future those could be made public but it felt too early to know if all the signatures were exactly how we wanted to expose them or not.

For context, I am currently dealing with an issue that if I have different insert queries where one sets N columns and another one sets the same N and one extra, then I have two make two structs with N repeated fields. With this PR I'd be able to to instead flatten the struct with N fields inside the other struct to make my code more maintainable.

General Strategy

ser::row::ByName

A new internal-only type ser::row::ByName is added that wraps a struct that implements a new trait: SerializeRowByName. This new type has a single function ser::row::ByName::serialize and attempts to serialize an entire RowSerializationContext, returning an error if any of the columns in the context were not serialized or do not belong to the struct. This is basically the implementation of SerializeRow::serialize for any struct that implements SerializeRowByName but split into its own internal-type so that the macro doesn't have to create this shared code. This couldn't be added as a default implementation in one of our traits because we need to call for some functions using Self as a generic parameter which caused some compilation errors.

SerializeRowByName

When deriving SerializeRow using the match_by_name flavor the struct will also implement a new internal-only trait: SerializeRowByName. This trait has a single type associated type Partial, and a function partial() that creates it. The partial struct has 3 main parts:

  1. For every field that is a column (will not be flattened): A reference to the field.
  2. For every field that is a nested struct (will be flattened) The partial view of that nested struct. This means that the nested struct must also implement: SerializeRowByName such that partial() can be called on it.
  3. A hashset to check that every field (not column) has been serialized. For fields that are columns we do this via their column-name. For nested structs, we do this via the field name of the nested struct.

The partial struct is required to implement a new trait PartialSerializeRowByName

PartialSerializeRowByName

PartialSerializeRowByName has two required functions:

  • serialize_field: takes the spec of a single column and attempts to serialize the corresponding field to it. If this column does not belong to this partial struct then the caller is told that the column is unused so that the caller can instead try to use a different field for this same column (i.e., when testing to see if any nested structs can serialize to that column). If the column is used, then a check is done to see if that column has completed the serialization of this field so that it can remove it out of its missing set. The caller is informed if that column has finished the serialization of this partial struct or not.

  • check_missing: consumes the partial struct while checking if all the fields in this struct were serialized, returning an error if not. This is used inside ser::row::ByName::serialize to verify that the a struct has been fully serialized. If a field has not finished serializing and the field is a nested struct (i.e., not just a column) then we should get the error from the nested struct instead for better error messaging.

To do this signaling, a new internal-only enum ser::row::FieldStatus was added that returns whether a column was used for the field, was used and completed the field, or was used by the field is still missing more columns.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.

Copy link

github-actions bot commented Dec 7, 2024

cargo semver-checks found no API-breaking changes in this PR! 🎉🥳
Checked commit: c03bf5b

@nrxus nrxus force-pushed the flatten-serialize-name branch 7 times, most recently from 464fdee to 0de5bb0 Compare December 9, 2024 04:19
Currently only the `match_by_name` flavor is supported. All the needed
structs/traits to make this work are marked as `#[doc(hidden)]` to not
increase the public API surface. Effort was done to not change any of
the existing API.
@nrxus nrxus force-pushed the flatten-serialize-name branch from 0de5bb0 to dc2a19e Compare December 9, 2024 04:25
@nrxus
Copy link
Contributor Author

nrxus commented Dec 9, 2024

I apologize about the multiple force pushes yesterday, I was chewing on it yesterday a bit more since it wasn't reviewed yet and decided to move some stuff around to make it more clear that all the new structs are internal to the macro implementation only (by moving it to that sub-module). All that should be done by now.

I have also started work on adding the flatten attribute for the enforce_order flavor as well but in a separate branch. Let me know if you have a preference on whether to put that as its own PR or push it here so that the entire support for #[flatten] for SerializeRow is all on in one PR.

this was causing a compile time error when trying to use it in my own
project. Usure why there wasn't a compile time error in the test
@nrxus
Copy link
Contributor Author

nrxus commented Dec 10, 2024

I have finished the work to also support flattening when serializing with flavor = "enforced_order". It's on a separate branch though since I wasn't sure if this PR was already considered too big or not. Let me know if you'd like me to merge it into here so you can review it all as one piece or if I should just wait until this is reviewed and merged as the changes are already large.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant