Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue where custom JSON serializer generates Base64 encoded data in state store #539

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

abogdanov37
Copy link
Contributor

Description

Refactor ObjectSerializer class to add ability set custom object mapper.
Refactor DaprStateAsyncprovider class to use custom serializers based on ObjectSerializer class as Json serializer.
Add unit tests.

Issue reference

#503

Checklist

Please make sure you've completed the relevant tasks for this PR, out of the following list:

  • Code compiles correctly
  • Created/updated tests
  • Extended the documentation

Copy link
Contributor

@halspang halspang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code seems fine in general, just some formatting callouts. My concern here is how this fits into the larger picture. Especially in regards to the conversation that happened on the issue and how it relates to #509.

We also have an IT failure which should be looked into.

@artursouza What are your thoughts?

@@ -20,7 +20,7 @@
/**
* An error message for an invalid constructor arg.
*/
private final String errorMsg = "actor needs to be initialized with an id!";
private static final String errorMsg = "actor needs to be initialized with an id!";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Static final variables should be capitalized: ERROR_MSG

@@ -74,6 +73,28 @@ public int hashCode() {

}

class CustomJsonSerializer implements DaprObjectSerializer{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a space before {

Copy link
Member

@artursouza artursouza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abogdanov37 Thanks for your contribution.

My main concern is backwards compatibility. This time, I think I found a way out, please, see my comment.

@halspang Yes, we need to be mindful about the vision to move default serializer out of this package. This improvement (IMO) does not invalidate that.

@@ -69,7 +73,7 @@
}

T response = this.stateSerializer.deserialize(s, type);
if (this.isStateSerializerDefault && (response instanceof byte[])) {
if (this.isStateSerializerJson && (response instanceof byte[])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this logic will make this not backwards compatible - it means that values saving with the current version of the code will not be able to be read from a newer version.

After thinking about this same problem in another PR, I think this should be done by a configuration. When registering the ActorType, a config can be set to determine is the data is written as Text or binary (Base64). The default behavior is Base64. This way, existing code does not break.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! I understood your point. But we have a unit test that checks that binary processing is OK. See https://github.com/dapr/java-sdk/blob/master/sdk-actors/src/test/java/io/dapr/actors/runtime/DaprStateAsyncProviderTest.java#L166
I add this test for a custom serializer and it works well.
I can add a method to the serializer interface with the name useMimeType or checkMimeType and change behavior use this method

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abogdanov37 Adding a new method to the interface would also be a breaking change since apps would need to implement that new interface method after upgrading.

I thought about bringing this in as a configuration per Actor Type, which means that it would arrive in this class as a new argument in the constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artursouza since Java 8 we have a default method. Because of that, it won't be breaking changes. I still can't imagine a case in which current changes will be breaking. Did you find this case with integration tests? May be beta tester give this information to you?
I have a real system with a custom serializer and many states stored in binary format. I try changed SDK for binary format. All works as expected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abogdanov37 There are 2 things here:

  1. Adding to the serializer interface might be done without breaking by using default - OK, but I would still be concerned over adding more to that interface. I would like to keep it as simple as possible. So, I would be OK if this setting was outside the serializer.

  2. This is a breaking change because data saved with the old code cannot be read with the new code - we don't have test for that yet, I am inferring this by looking at the code. If you can prove that data saved with the old code is read with the new, that would close this concern.

If not, I plan to address this in the next milestone. I am not 100% satisfied with our current serializer approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Thanks for example. Sorry for my bad testing. I check only for objects but not for primitives (((

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found my place in my code where I convert Base64. I can reproduce this behavior in tests. Take a time to update code to avoid breaking changes. Thank you @artursouza!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abogdanov37 You might want to enable the new behavior via a System property, this way, apps can opt in when they know it will not break them (new app, for example).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        T response;
        try {
          response = this.stateSerializer.deserialize(s, type);
        } catch (IOException ex) {
          byte[] data = Arrays.copyOfRange(s, 1, s.length - 1);
          response = this.stateSerializer.deserialize(Base64.getDecoder().decode(data), type);
        }

this solution works for all types exclude String. Strings pass through as is (base64 encoded)
Feature flag is a great idea!
By System property you mean JVM property?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artursouza I push another solution based on "fair" conversion.
If we look there

generator.writeBinaryField("value", (byte[]) value);
we can see that we have a transformation to Base64
I add code to transform Base64 from store to origin byte array which can be processed.

@artursouza artursouza self-assigned this Jun 3, 2021
@abogdanov37 abogdanov37 requested review from a team as code owners July 12, 2021 11:47
@@ -68,6 +68,19 @@
return Mono.empty();
}

if (!this.isStateSerializerDefault) {
if (s.length == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed only in the deserialize path?

Copy link
Contributor Author

@abogdanov37 abogdanov37 Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In another way, we'll get an exception as I remember correcly

s = OBJECT_MAPPER.readValue(s, byte[].class);
if (type.getType().equals(String.class)) {
byte[] out = new byte[s.length + 2];
out[0] = '"';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not cover the scenarios where the string contains a quote already since it would not be escaped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quotes will be added in the serialization process. Give me please an example I'll add it in tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my "string" with quotes

Copy link
Contributor Author

@abogdanov37 abogdanov37 Aug 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add two test cases for quoted strings. As I saw in debugging, additional quotes had been escaped with the serializer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still concerned about this logic. This should not be needed. I will need more time to try this PR on my desktop first.

@codecov
Copy link

codecov bot commented Aug 12, 2021

Codecov Report

Merging #539 (2d2e162) into master (df54a5d) will decrease coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #539      +/-   ##
============================================
- Coverage     81.03%   80.98%   -0.05%     
- Complexity      991      995       +4     
============================================
  Files            88       88              
  Lines          3042     3051       +9     
  Branches        332      335       +3     
============================================
+ Hits           2465     2471       +6     
- Misses          409      412       +3     
  Partials        168      168              
Impacted Files Coverage Δ
...k-actors/src/main/java/io/dapr/actors/ActorId.java 100.00% <ø> (ø)
...io/dapr/actors/runtime/DaprStateAsyncProvider.java 69.64% <100.00%> (+8.77%) ⬆️
sdk/src/main/java/io/dapr/utils/Retry.java 50.00% <0.00%> (-25.00%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update df54a5d...2d2e162. Read the comment docs.

@cicoyle
Copy link
Contributor

cicoyle commented Dec 21, 2023

gentle ping - @abogdanov37 mind resolving the conflicts and addressing the ^ comments?

@cicoyle
Copy link
Contributor

cicoyle commented Feb 19, 2024

gentle ping - @abogdanov37

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants