From f555c5a985b5112186a37ee08b30eac377efe1f6 Mon Sep 17 00:00:00 2001
From: Kyungtak Woo
Date: Thu, 16 Jul 2020 15:44:00 -0500
Subject: [PATCH 1/5] Initial design doc for JsonSerialization using Source Generators

---
 .../docs/JsonSerializationSourceGeneration.md | 183 ++++++++++++++++++
 1 file changed, 183 insertions(+)
 create mode 100644 src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md

diff --git a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
new file mode 100644
index 000000000000..5312588154fb
--- /dev/null
+++ b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
@@ -0,0 +1,183 @@
+# System.Text.Json build-time generation for serializers using SourceGenerators
+
+## Motivation
+
+There are comprehensive [documents](https://github.com/dotnet/designs/pull/113) detailing the needs and benefits of generating JSON serializers at compile time. Some of these benefits are improved startup time, reduction in private memory usage, and the most obvious, faster runtime for serialization/deserialization. After discussing some approaches and pros/cons of some of them we decided to go ahead and implement this feature using [Roslyn Source Generators](https://github.com/dotnet/roslyn/blob/master/docs/features/source-generators.cookbook.md). This document will outline the roadmap for the initial experiment and highlight actionable items for the base prototype.
+
+## New API Proposal
+
+```C#
+namespace System.Text.Json.Serialization
+{
+    ///
+    /// When placed on a type, will source generate de/serialization for the specified type and its descendants.
+    ///
+    ///
+    /// Must take into account that type discovery using this attribute is at compile time using Source Generators.
+    ///
+    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, AllowMultiple = false)]
+    public sealed class JsonSerializableAttribute : JsonAttribute
+    {
+        public JsonSerializableAttribute() { }
+
+        public JsonSerializableAttribute(Type type) { }
+    }
+}
+```
+
+## Example Usage
+
+```C#
+ // (Base Case) Codegen De/Serialization as extensions of SerializableClass.
+[JsonSerializable]
+public partial class SerializableClass
+{
+    public string Email { get; set; }
+    public string Password { get; set; }
+}
+
+// (Pass in Type) Codegen De/Serialization extending SerializableClassExtension using ExternClass.
+[JsonSerializable(typeof(ExternalClass))]
+public static partial class SerializableClassExtension { }
+
+// Using the generated source code.
+SerializableClass obj = SerializableClass.Deserialize(json);
+string serializedObj = SerializableClass.Serialize(obj);
+
+// (WIP) High level usage of serialization using contexts.
+using (var context = new MyJsonSerializerContext(options))
+{
+    SerializableClass obj = context.SerializableClass.Deserialize(json);
+}
+```
+
+## Feature Behavior
+There are 3 main points in this project: type discovery, source code generation, generated source code integration (with user applications):
+
+### Type Discovery
+
+Type discovery can be divided into two different models, an implicit model (where the user does not have to specify which types to generate code for) and an explicit model (user specifically specifies through code which types they want us to generate code for).
+
+#### Implicit Model
+Various implicit approaches have been discussed such as source generating all partial classes or scanning for calls into the serializer using the Roslyn tree scanning. These models can be revisited in the future as the value/feasibility of the approach becomes clearer based on user feedback. It is important to note that some downsides to such model can result in missing types to source generate or source generating types when not needed due to a bug or edge cases we didn’t consider.
+
+#### Proposed Explicit Model
+There are two scenarios within the proposed explicit model:
+
+1. Base case: Code generates a partial class to the attribute target class/struct.
+
+```c#
+ // (Base Case) Codegen De/Serialization as extensions of SerializableClass.
+[JsonSerializable]
+public partial class SerializableClass
+{
+    public string Email { get; set; }
+    public string Password { get; set; }
+    public bool RememberMe { get; set; }
+}
+```
+2. Pass in type: Code generates a partial class to the attribute target class/struct using the passed in type. This scenario can be used if you don't want to make your serializable class partial or you don't own the serializable class.
+
+```c#
+// (Pass in Type) Codegen De/Serialization extending SerializableClassExtension using ExternClass.
+[JsonSerializable(typeof(ExternalClass))]
+public static partial class SerializableClassExtension { }
+```
+
+The proposed approach for source code generation requires JSON serializable types defined by the user to be partial since source generation does not change the user’s code and we want to extend it and to allow the serialization of non-public members for owned types.
+
+The output of this phase would be a list of reflection-type-like model where we can iterate through the type's members in order to codegen recursively. The scope of this phase is to only find the root serializable types instead of the whole type-graph since we want to recursively codegen without storing the whole type-graph in memory.
+
+We believe that an explicit model using attributes would be a simple first-approach to the problem given that the source code generation needs the user to declare their type class as partial anyway. We can then use Roslyn tree to find the JsonSerializable attribute for both types the user owns and doesn’t own to source generate using Roslyn Source Generators.
+
+### Source Code Generation
+This phase consists of taking the discovered types and recursively codegenerating the serialization methods.
+
+#### Proposed Approach
+The expected code generation has already been [tackled](https://github.com/dotnet/runtimelab/compare/master...steveharter:ApiAdds) by @steveharter focusing mainly on performance gains and extendibility to the current codebase. This approach increases performance drastically in 2 different ways. The first would be during the first-time/warm-up performance for both CPU and memory by avoiding costly reflection to build up a Type metadata cache mentioned here. The second would be throughput improvement by avoiding the initial metadata-dictionary lookup on calls to the serializer by generating ```CreateObjectFunc```, ```SerializeFunc``` and ```DeserializeFunc``` when creating its ```ClassInfo```.
+
+#### Sketch of SouceGenerated Code (for simple POCO using SerializableClass)
+CreateObjectFunc:
+```c#
+private static object CreateObjectFunc()
+{
+    return new SerializableClass();
+}
+```
+
+SerializeFunc:
+```c#
+private static void SerializeFunc(Utf8JsonWriter writer, object value, ref WriteStack writeStack, JsonSerializerOptions options)
+{
+    SerializableClass obj = (SerializableClass)value;
+
+    _s_Property_Email.WriteValue(obj.Email, writer);
+    _s_Property_Password.WriteValue(obj.Password, writer);
+}
+```
+
+DeserializeFunc:
+```c#
+    private static SerializableClass DeserializeFunc(ref Utf8JsonReader reader, ref ReadStack readStack, JsonSerializerOptions options)
+    {
+        bool ReadPropertyName(ref Utf8JsonReader reader)
+        {
+            return reader.Read() && reader.TokenType == JsonTokenType.PropertyName;
+        }
+
+        ReadOnlySpan<byte> propertyName;
+        SerializableClass obj = new SerializableClass();
+
+        if (!ReadPropertyName(ref reader)) goto Done;
+        propertyName = reader.ValueSpan;
+
+        if (propertyName.SequenceEqual(_s_Property_Email.NameAsUtf8Bytes))
+        {
+            reader.Read();
+            _s_Property_Email.ReadValueAndSetMember(ref reader, ref readStack, obj);
+            if (!ReadPropertyName(ref reader)) goto Done;
+            propertyName = reader.ValueSpan;
+        }
+
+        if (propertyName.SequenceEqual(_s_Property_Password.NameAsUtf8Bytes))
+        {
+            reader.Read();
+            _s_Property_Name.ReadValueAndSetMember(ref reader, ref readStack, obj);
+            if (!ReadPropertyName(ref reader)) goto Done;
+            propertyName = reader.ValueSpan;
+        }
+
+        reader.Read();
+
+    Done:
+        if (reader.TokenType != JsonTokenType.EndObject)
+        {
+            throw new JsonException("Could not deserialize object");
+        }
+
+        return obj;
+    }
+}
+
+```
+
+It is also important to notice that in case there are nested types within the root type we are recursing over, a new class with name ```FoundTypeNameSerializer``` will have to be created in order to completely serialize and deserialize.
+
+Even if the source generation fails, we can always fallback to the slower status quo by using ```Reflection```.
+
+#### Alternatives
+Some alternatives such as the use of interfaces for the functions mentioned above or the creation of individual JsonConverter for each type were talked about. However, due to performance and the direction we are taking with the initial prototype, we believe these are not necessary. After user feedback we can revisit this if needed.
+
+### Generated Source Code Integration
+There are [discussions](https://gist.github.com/steveharter/d71cdfc25df53a8f60f1a3563d13cf0f) regarding integration of the approach mentioned above.
+
+The high level registration for the generated source code implies that the Json options class is modified by calling generated code where we use de/serialization API entry points for the extended class that auto-registers itself.
+
+For the most part the source code generation approach mentioned above solves this problem since the serialize and deserialize functions would live within the type class and once the serializer is initialized, the serializer can be used calling ```JsonSerializer.Serialize()``` or ```SourceGeneratedType.Serialize()```. In order to continue with this approach, an extension of the type class would be needed where the users would have to explicitly declare their types as partial classes for it to be extended by the source generator while the types they don’t own would be entirely created.
+
+Even if this initialization isn't performed, we can always fallback to the slower status quo ```JsonSerializer``` methods.
+
+## Future Considerations
+
+* **Versioning**: This will be needed in order to determine compatibility and to be able to detect the bugs related to the different releases of this feature.
+* **Error Handling**: Currently if something goes wrong in the source generation or code generation, Roslyn's SourceGenerator default message is shown to the user. This needs to be handled to show compilation errors from the source generated code to the user to be more verbose.
\ No newline at end of file

From 615e6cb21e3c606f66d5dd863c34c12948f99195 Mon Sep 17 00:00:00 2001
From: Kyungtak Woo
Date: Fri, 17 Jul 2020 14:33:45 -0500
Subject: [PATCH 2/5] Some Api changes to doc

---
 .../docs/JsonSerializationSourceGeneration.md | 212 +++++++++++-------
 1 file changed, 129 insertions(+), 83 deletions(-)

diff --git a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
index 5312588154fb..025c02c7d4b5 100644
--- a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
+++ b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
@@ -2,54 +2,7 @@

 ## Motivation

-There are comprehensive [documents](https://github.com/dotnet/designs/pull/113) detailing the needs and benefits of generating JSON serializers at compile time. Some of these benefits are improved startup time, reduction in private memory usage, and the most obvious, faster runtime for serialization/deserialization. After discussing some approaches and pros/cons of some of them we decided to go ahead and implement this feature using [Roslyn Source Generators](https://github.com/dotnet/roslyn/blob/master/docs/features/source-generators.cookbook.md). This document will outline the roadmap for the initial experiment and highlight actionable items for the base prototype.
-
-## New API Proposal
-
-```C#
-namespace System.Text.Json.Serialization
-{
-    ///
-    /// When placed on a type, will source generate de/serialization for the specified type and its descendants.
-    ///
-    ///
-    /// Must take into account that type discovery using this attribute is at compile time using Source Generators.
-    ///
-    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, AllowMultiple = false)]
-    public sealed class JsonSerializableAttribute : JsonAttribute
-    {
-        public JsonSerializableAttribute() { }
-
-        public JsonSerializableAttribute(Type type) { }
-    }
-}
-```
-
-## Example Usage
-
-```C#
- // (Base Case) Codegen De/Serialization as extensions of SerializableClass.
-[JsonSerializable]
-public partial class SerializableClass
-{
-    public string Email { get; set; }
-    public string Password { get; set; }
-}
-
-// (Pass in Type) Codegen De/Serialization extending SerializableClassExtension using ExternClass.
-[JsonSerializable(typeof(ExternalClass))]
-public static partial class SerializableClassExtension { }
-
-// Using the generated source code.
-SerializableClass obj = SerializableClass.Deserialize(json);
-string serializedObj = SerializableClass.Serialize(obj);
-
-// (WIP) High level usage of serialization using contexts.
-using (var context = new MyJsonSerializerContext(options))
-{
-    SerializableClass obj = context.SerializableClass.Deserialize(json);
-}
-```
+There are comprehensive [documents](https://github.com/dotnet/designs/pull/113) detailing the needs and benefits of generating JSON serializers at compile time. Some of these benefits are faster throughput, **improved startup time**, and **reduction in private memory usage** for serialization/deserialization. After discussing some approaches and pros/cons of some of them we decided to go ahead and implement this feature using [Roslyn Source Generators](https://github.com/dotnet/roslyn/blob/master/docs/features/source-generators.cookbook.md). This document will outline the roadmap for the initial experiment and highlight actionable items for the base prototype.
 ## Feature Behavior
 There are 3 main points in this project: type discovery, source code generation, generated source code integration (with user applications):
@@ -73,15 +26,14 @@ public partial class SerializableClass
 {
     public string Email { get; set; }
     public string Password { get; set; }
-    public bool RememberMe { get; set; }
 }
 ```
 2. Pass in type: Code generates a partial class to the attribute target class/struct using the passed in type. This scenario can be used if you don't want to make your serializable class partial or you don't own the serializable class.

 ```c#
-// (Pass in Type) Codegen De/Serialization extending SerializableClassExtension using ExternClass.
+// (Pass in Type) Codegen De/Serialization extending SerializerForExternalClass using ExternClass.
 [JsonSerializable(typeof(ExternalClass))]
-public static partial class SerializableClassExtension { }
+public static class SerializerForExternalClass { }
 ```

 The proposed approach for source code generation requires JSON serializable types defined by the user to be partial since source generation does not change the user’s code and we want to extend it and to allow the serialization of non-public members for owned types.
@@ -90,64 +42,111 @@ The output of this phase would be a list of reflection-type-like model where we

 We believe that an explicit model using attributes would be a simple first-approach to the problem given that the source code generation needs the user to declare their type class as partial anyway. We can then use Roslyn tree to find the JsonSerializable attribute for both types the user owns and doesn’t own to source generate using Roslyn Source Generators.

+#### New API Proposal
+
+```C#
+namespace System.Text.Json.Serialization
+{
+    ///
+    /// When placed on a type, will source generate de/serialization for the specified type and all types in its object graph.
+    ///
+    ///
+    /// Must take into account that type discovery using this attribute is at compile time using Source Generators.
+    ///
+    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, AllowMultiple = false)]
+    public sealed class JsonSerializableAttribute : JsonAttribute
+    {
+        ///
+        /// Takes target class/struct to construct a facade Serializer class as TargetNameSerializer.
+        ///
+        public JsonSerializableAttribute() { }
+
+        ///
+        /// Takes type as an argument and uses it to create a facade Serializer class as TargetNameSerializer.
+        ///
+        public JsonSerializableAttribute(Type type) { }
+    }
+}
+```
+
+#### Validation and Testing
+
+For validations we will handle cases where the type representation is missing required fields to source generate. This can be done in the current phase or the Source Code Generation phase but must be handled in both.
+
+Testing for this phase will consist of unit tests where given different source code and referenced assemblies, we verify that the source generation pass detects and creates all of the type representation with necessary data.
+
 ### Source Code Generation
 This phase consists of taking the discovered types and recursively codegenerating the serialization methods.

 #### Proposed Approach
-The expected code generation has already been [tackled](https://github.com/dotnet/runtimelab/compare/master...steveharter:ApiAdds) by @steveharter focusing mainly on performance gains and extendibility to the current codebase. This approach increases performance drastically in 2 different ways. The first would be during the first-time/warm-up performance for both CPU and memory by avoiding costly reflection to build up a Type metadata cache mentioned here. The second would be throughput improvement by avoiding the initial metadata-dictionary lookup on calls to the serializer by generating ```CreateObjectFunc```, ```SerializeFunc``` and ```DeserializeFunc``` when creating its ```ClassInfo```.
+The expected code generation has already been [tackled](https://github.com/dotnet/runtimelab/compare/master...steveharter:ApiAdds) by @steveharter focusing mainly on performance gains and extendibility to the current codebase. This approach increases performance drastically in 2 different ways. The first would be during the first-time/warm-up performance for both CPU and memory by avoiding costly reflection to build up a Type metadata cache mentioned here. The second would be throughput improvement by avoiding the initial metadata-dictionary lookup on calls to the serializer by generating ```CreateObjectFunc```, ```SerializeFunc``` and ```DeserializeFunc``` when creating its ```JsonClassInfo``` (metadata).
+
+The proposed approach consists of an initialization phase where generated code will call an initialization method within the created facade class where a ```JsonClassInfo``` is created with the functions mentioned above and registered into options with the necessary ```JsonPropertyInfo```. For each call into the serializer using the generated code, the POCO would call a public overload into the ```JsonSerializer``` that also takes the metadata ```JsonClassInfo``` created during the initialization method.
+
+#### Sketch of SourceGenerated Code (for simple POCO using SerializableClass)
+
+Class variables for code generated SerializableClassSerializer:
+```c#
+private static bool _s_isInitiated;
+private static JsonClassInfo _s_classInfo;
+
+private static JsonPropertyInfo _s_Property_Email;
+private static JsonPropertyInfo _s_Property_Password;
+```
+
+These functions would be used to create a JsonClassInfo:

-#### Sketch of SouceGenerated Code (for simple POCO using SerializableClass)
-CreateObjectFunc:
 ```c#
+// CreateObjectFunc
 private static object CreateObjectFunc()
 {
     return new SerializableClass();
 }
 ```

-SerializeFunc:
 ```c#
+// SerializeFunc
 private static void SerializeFunc(Utf8JsonWriter writer, object value, ref WriteStack writeStack, JsonSerializerOptions options)
 {
-    SerializableClass obj = (SerializableClass)value;
+    SerializableClassSerializer obj = (SerializableClassSerializer)value;

     _s_Property_Email.WriteValue(obj.Email, writer);
     _s_Property_Password.WriteValue(obj.Password, writer);
 }
 ```

-DeserializeFunc:
 ```c#
-    private static SerializableClass DeserializeFunc(ref Utf8JsonReader reader, ref ReadStack readStack, JsonSerializerOptions options)
+// DeserializeFunc
+private static SerializableClassSerializer DeserializeFunc(ref Utf8JsonReader reader, ref ReadStack readStack, JsonSerializerOptions options)
+{
+    bool ReadPropertyName(ref Utf8JsonReader reader)
     {
-        bool ReadPropertyName(ref Utf8JsonReader reader)
-        {
-            return reader.Read() && reader.TokenType == JsonTokenType.PropertyName;
-        }
+        return reader.Read() && reader.TokenType == JsonTokenType.PropertyName;
+    }

-        ReadOnlySpan<byte> propertyName;
-        SerializableClass obj = new SerializableClass();
+    ReadOnlySpan<byte> propertyName;
+    SerializableClassSerializer obj = new SerializableClassSerializer();

+    if (!ReadPropertyName(ref reader)) goto Done;
+    propertyName = reader.ValueSpan;
+
+    if (propertyName.SequenceEqual(_s_Property_Email.NameAsUtf8Bytes))
+    {
+        reader.Read();
+        _s_Property_Email.ReadValueAndSetMember(ref reader, ref readStack, obj);
         if (!ReadPropertyName(ref reader)) goto Done;
         propertyName = reader.ValueSpan;
+    }

-        if (propertyName.SequenceEqual(_s_Property_Email.NameAsUtf8Bytes))
-        {
-            reader.Read();
-            _s_Property_Email.ReadValueAndSetMember(ref reader, ref readStack, obj);
-            if (!ReadPropertyName(ref reader)) goto Done;
-            propertyName = reader.ValueSpan;
-        }
-
-        if (propertyName.SequenceEqual(_s_Property_Password.NameAsUtf8Bytes))
-        {
-            reader.Read();
-            _s_Property_Name.ReadValueAndSetMember(ref reader, ref readStack, obj);
-            if (!ReadPropertyName(ref reader)) goto Done;
-            propertyName = reader.ValueSpan;
-        }
-
+    if (propertyName.SequenceEqual(_s_Property_Password.NameAsUtf8Bytes))
+    {
         reader.Read();
+        _s_Property_Password.ReadValueAndSetMember(ref reader, ref readStack, obj);
+        if (!ReadPropertyName(ref reader)) goto Done;
+        propertyName = reader.ValueSpan;
+    }
+
+    reader.Read();

     Done:
         if (reader.TokenType != JsonTokenType.EndObject)
@@ -158,26 +157,73 @@ DeserializeFunc:
         return obj;
     }
 }
+```

+User-facing methods for (de)serialization (assuming SerializableClassSerializer is initialized):
+```c#
+public static SerializableClassSerializer Deserialize(string json)
+{
+    return JsonSerializer.Deserialize(json, this._s_classInfo);
+}
+
+public static string Serialize()
+{
+    return JsonSerializer.Serialize(this, this._s_classInfo);
+}
 ```

 It is also important to notice that in case there are nested types within the root type we are recursing over, a new class with name ```FoundTypeNameSerializer``` will have to be created in order to completely serialize and deserialize.

 Even if the source generation fails, we can always fallback to the slower status quo by using ```Reflection```.

+#### Validations and Tests
+
+Validations for this phase will happen for each type where we will use Roslyn's API to verify that the code we are generating is valid C# syntax code. If validations fail we will only include in the output the generated code that does not contain errors and leave the rest for fallback methods. These validation errors should produce output to users at compile time.
+
+There will be unit tests that check the source code generation by checking output source code generation given multiple types that could be discovered in the Type Discovery phase.
+
 #### Alternatives
-Some alternatives such as the use of interfaces for the functions mentioned above or the creation of individual JsonConverter for each type were talked about. However, due to performance and the direction we are taking with the initial prototype, we believe these are not necessary. After user feedback we can revisit this if needed.
+An alternative approach involving the creation of individual JsonConverter for each type was talked about. However, we believe that the current design provides the potential perf benefits of that approach in a way that is more serviceable, scalable, and has better integration with the serializer (to utilize support for more complex features)

 ### Generated Source Code Integration
 There are [discussions](https://gist.github.com/steveharter/d71cdfc25df53a8f60f1a3563d13cf0f) regarding integration of the approach mentioned above.

-The high level registration for the generated source code implies that the Json options class is modified by calling generated code where we use de/serialization API entry points for the extended class that auto-registers itself.
+The high level registration for the generated source code implies that the Json options class is modified by calling generated code where we use de/serialization API entry points for the extended class that auto-registers itself. This also implies the use of a shared options for all the types in a project that can be circumvented by creating a feature that moves the options to be class level like mentioned [here](https://github.com/dotnet/runtime/issues/36671).
 For the most part the source code generation approach mentioned above solves this problem since the serialize and deserialize functions would live within the type class and once the serializer is initialized, the serializer can be used calling ```JsonSerializer.Serialize()``` or ```SourceGeneratedType.Serialize()```. In order to continue with this approach, an extension of the type class would be needed where the users would have to explicitly declare their types as partial classes for it to be extended by the source generator while the types they don’t own would be entirely created.

+For cases where users may not have enough context to call more specific overloads proposed (such as ASP.NET) we are considering ways of looking up the type's metadata that points to the type's JsonClassInfo so part of this feature's benefits could be received.
+
 Even if this initialization isn't performed, we can always fallback to the slower status quo ```JsonSerializer``` methods.

+#### Example Usage
+
+```C#
+ // (Base Case) Codegen De/Serialization as extensions of SerializableClass.
+[JsonSerializable]
+public partial class SerializableClass
+{
+    public string Email { get; set; }
+    public string Password { get; set; }
+}
+
+// (Pass in Type) Codegen De/Serialization extending SerializerForExternalClass using ExternClass.
+[JsonSerializable(typeof(ExternalClass))]
+public static partial class SerializerForExternalClass { }
+
+// (WIP) High level usage of serialization using contexts.
+using (var context = new MyJsonSerializerContext(options))
+{
+    SerializableClass obj = context.SerializableClass.Deserialize(json);
+}
+```
+
+#### Validations and Tests
+
+Validations for this phase will be mostly burdened by Roslyn's API error handling where if something goes wrong in the first two phases, it won't include any generated code into the final compilation along with the validations mentioned in the previous phases. However, there will be end to end tests that verify the error handling, generated source code and types that were generated given source codes.
+
 ## Future Considerations

 * **Versioning**: This will be needed in order to determine compatibility and to be able to detect the bugs related to the different releases of this feature.
-* **Error Handling**: Currently if something goes wrong in the source generation or code generation, Roslyn's SourceGenerator default message is shown to the user. This needs to be handled to show compilation errors from the source generated code to the user to be more verbose.
\ No newline at end of file
+* **Error Handling**: Currently if something goes wrong in the source generation or code generation, Roslyn's SourceGenerator default message is shown to the user. This needs to be handled to show compilation errors from the source generated code to the user to be more verbose.
+* **Linker Trimming**: Adding linker trimming test to ensure we have everything for both generated code and application code will be necessary.
\ No newline at end of file

From f8eae8b1491aab094406f3ae1c75c89255887de1 Mon Sep 17 00:00:00 2001
From: Kyungtak Woo
Date: Sat, 18 Jul 2020 16:38:18 -0500
Subject: [PATCH 3/5] Update design doc with new high level programming model

---
 .../docs/JsonSerializationSourceGeneration.md | 66 +++++++++++--------
 1 file changed, 38 insertions(+), 28 deletions(-)

diff --git a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
index 025c02c7d4b5..00880f8d8e1e 100644
--- a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
+++ b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md
@@ -5,7 +5,7 @@
 There are comprehensive [documents](https://github.com/dotnet/designs/pull/113) detailing the needs and benefits of generating JSON serializers at compile time. Some of these benefits are faster throughput, **improved startup time**, and **reduction in private memory usage** for serialization/deserialization. After discussing some approaches and pros/cons of some of them we decided to go ahead and implement this feature using [Roslyn Source Generators](https://github.com/dotnet/roslyn/blob/master/docs/features/source-generators.cookbook.md). This document will outline the roadmap for the initial experiment and highlight actionable items for the base prototype.

 ## Feature Behavior
-There are 3 main points in this project: type discovery, source code generation, generated source code integration (with user applications):
+There are 3 main points in this project: type discovery, source code generation and generated source code integration (with user applications):

 ### Type Discovery

@@ -17,28 +17,26 @@ Various implicit approaches have been discussed such as source generating all pa
 #### Proposed Explicit Model
 There are two scenarios within the proposed explicit model:

-1. Base case: Code generates a partial class to the attribute target class/struct.
+1. Base case: Code generates a facade ```JsonClassInfo``` for the attribute target class/struct.

 ```c#
- // (Base Case) Codegen De/Serialization as extensions of SerializableClass.
+ // (Base Case) Codegen (De)Serialization ExampleLoginClassInfo.
 [JsonSerializable]
-public partial class SerializableClass
+public class ExampleLogin
 {
     public string Email { get; set; }
     public string Password { get; set; }
 }
 ```
-2. Pass in type: Code generates a partial class to the attribute target class/struct using the passed in type. This scenario can be used if you don't want to make your serializable class partial or you don't own the serializable class.
+2. Pass in type: Code generates a ```JsonClassInfo``` for the attribute target class/struct using the passed in type. This scenario can be used if you don't own the serializable class.
 ```c#
-// (Pass in Type) Codegen De/Serialization extending SerializerForExternalClass using ExternClass.
+// (Pass in Type) Codegen (De)Serialization to create SerializerForExternalClassInfo using ExternClass.
 [JsonSerializable(typeof(ExternalClass))]
-public static class SerializerForExternalClass { }
+public static class ExampleExternal { }
 ```

-The proposed approach for source code generation requires JSON serializable types defined by the user to be partial since source generation does not change the user’s code and we want to extend it and to allow the serialization of non-public members for owned types.
-
-The output of this phase would be a list of reflection-type-like model where we can iterate through the type's members in order to codegen recursively. The scope of this phase is to only find the root serializable types instead of the whole type-graph since we want to recursively codegen without storing the whole type-graph in memory.
+The output of this phase would be a list of reflection-type-like models where we can iterate through the type's members in order to codegen recursively. The scope of this phase is to only find the root serializable types instead of the whole type-graph since we want to recursively codegen without storing the whole type-graph in-memory.

 We believe that an explicit model using attributes would be a simple first-approach to the problem given that the source code generation needs the user to declare their type class as partial anyway. We can then use Roslyn tree to find the JsonSerializable attribute for both types the user owns and doesn’t own to source generate using Roslyn Source Generators.

@@ -57,12 +55,12 @@ namespace System.Text.Json.Serialization
     public sealed class JsonSerializableAttribute : JsonAttribute
     {
         ///
-        /// Takes target class/struct to construct a facade Serializer class as TargetNameSerializer.
+        /// Takes target class/struct to construct a facade JsonClassInfo as TargetNameClassInfo.
/// public JsonSerializableAttribute() { } /// - /// Takes type as an argument and uses it to create a facade Serializer class as TargetNameSerializer. + /// Takes type as an argument and uses it to create a facade JsonClassInfo as TargetNameClassInfo. /// public JsonSerializableAttribute(Type type) { } } @@ -76,7 +74,7 @@ For validations we will handle cases where the type representation is missing re Testing for this phase will consist of unit tests where given different source code and referenced assemblies, we verify that the source generation pass detects and creates all of the type representation with necessary data. ### Source Code Generation -This phase consists of taking the discovered types and recursively codegenerating the serialization methods. +This phase consist of taking the discovered types and recursively codegenerating JsonClassInfo along with its registration. #### Proposed Approach The expected code generation has been already been [tackled](https://github.com/dotnet/runtimelab/compare/master...steveharter:ApiAdds) by @steveharter focusing mainly on performance gains and extendibility to the current codebase. This approach increases performance drastically in 2 different ways. The first would be during the first-time/warm-up performance for both CPU and memory by avoiding costly reflection to build up a Type metadata cache mentioned here. The second would be throughput improvement by avoiding the initial metadata-dictionary lookup on calls to the serializer by generating ```CreateObjectFunc```, ```SerializeFunc``` and ```DeserializeFunc``` when creating its ```JsonClassInfo``` (metadata). 
@@ -85,7 +83,7 @@ The proposed approach consist of an initialization phase where generated code wi #### Sketch of SourceGenerated Code (for simple POCO using SerializableClass) -Class variables for code generated SerializableClassSerializer: +Class variables for code generated ExampleLoginClassInfo: ```c# private static bool _s_isInitiated; private static JsonClassInfo _s_classInfo; @@ -100,7 +98,7 @@ These functions would used to create a JsonClassInfo: // CreateObjectFunc private static object CreateObjectFunc() { - return new SerializableClass(); + return new ExampleLogin(); } ``` @@ -108,7 +106,7 @@ private static object CreateObjectFunc() // SerializeFunc private static void SerializeFunc(Utf8JsonWriter writer, object value, ref WriteStack writeStack, JsonSerializerOptions options) { - SerializableClassSerializer obj = (SerializableClassSerializer)value; + ExampleLogin obj = (ExampleLogin)value; _s_Property_Email.WriteValue(obj.Email, writer); _s_Property_Password.WriteValue(obj.Password, writer); @@ -117,7 +115,7 @@ private static void SerializeFunc(Utf8JsonWriter writer, object value, ref Write ```c# // DeserializeFunc -private static SerializableClassSerializer DeserializeFunc(ref Utf8JsonReader reader, ref ReadStack readStack, JsonSerializerOptions options) +private static ExampleLogin DeserializeFunc(ref Utf8JsonReader reader, ref ReadStack readStack, JsonSerializerOptions options) { bool ReadPropertyName(ref Utf8JsonReader reader) { @@ -125,7 +123,7 @@ private static SerializableClassSerializer DeserializeFunc(ref Utf8JsonReader re } ReadOnlySpan propertyName; - SerializableClassSerializer obj = new SerializableClassSerializer(); + ExampleLogin obj = new ExampleLogin(); if (!ReadPropertyName(ref reader)) goto Done; propertyName = reader.ValueSpan; @@ -161,9 +159,9 @@ private static SerializableClassSerializer DeserializeFunc(ref Utf8JsonReader re User faced methods for (de)serialization (assuming SerializableClassSerializer is initialized): ```c# -public 
static SerializableClassSerializer Deserialize(string json) +public static ExampleLogin Deserialize(string json) { - return JsonSerializer.Deserialize(json, this._s_classInfo); + return JsonSerializer.Deserialize(json, this._s_classInfo); } public static string Serialize() @@ -172,7 +170,7 @@ public static string Serialize() } ``` -It is also important to notice that in case there are nested types within the root type we are recursing over, a new class with name ```FoundTypeNameSerializer``` will have to be created in order to completely serialize and deserialize. +It is also important to notice that in case there are nested types within the root type we are recursing over, a new class with name ```FoundTypeNameClassInfo``` will have to be created in order to completely serialize and deserialize. Even if the source generation fails, we can always fallback to the slower status quo by using ```Reflection```. @@ -183,14 +181,25 @@ Validations for this phase will happen for each type where we will use Roslyn's There will be unit tests that checks the source code generation by checking output source code generation given multiple types that could be discovered in the Type Discovery phase. #### Alternatives -An alternative approach involving the creation of individual JsonConverter for each type was talked about. However, we believe that the current design provides the potential perf benefits of that approach in a way that is more serviceable, scalable, and has better integration with the serializer (to utilize support for more complex features) +An alternative approach involving the creation of individual JsonConverter for each type was talked about. However, we believe that the current design provides the potential perf benefits of that approach in a way that is more serviceable, scalable, and has better integration with the serializer (to utilize support for more complex features). 
### Generated Source Code Integration There are [discussions](https://gist.github.com/steveharter/d71cdfc25df53a8f60f1a3563d13cf0f) regarding integration of the approach mentioned above. -The high level registration for the generated source code implies that the Json options class is modified by calling generated code where we use de/serialization API entry points for the extended class that auto-registers itself. This also implies the use of a shared options for all the types in a project that can be circumvented by creating a feature that moves the options to be class level like mentioned [here](https://github.com/dotnet/runtime/issues/36671). +The high level registration for the generated source code implies that the Json options class is modified by calling generated code where we use (de)serialization API entry points for the extended class that auto-registers itself. This also implies the use of a shared options for all the types in a project that can be circumvented by creating a feature that moves the options to be class level like mentioned [here](https://github.com/dotnet/runtime/issues/36671). Because of this, the usage wrapper for this feature would consist in creating and using a ```JsonSerializerContext``` class by passing in the ```JsonSerializerOptions``` as seen in the following examples and usages: + +```c# +public class JsonSerializerContext : IDisposable +{ + public JsonSerializerContext(); + public JsonSerializerContext(JsonSerializerOptions options); + public JsonSerializerOptions JsonSerializerOptions { get; }; -For the most part the source code generation approach mentioned above solves this problem since the serialize and deserialize functions would live within the type class and once the serializer is initialized, the serializer can be used calling ```JsonSerializer.Serialize()``` or ```SourceGeneratedType.Serialize()```. 
In order to continue with this approach, an extension of the type class would be needed where the users would have to explicitly declare their types as partial classes for it to be extended by the source generator while the types they don’t own would be entirely created. + // Generated JsonClassInfos. + public JsonClassInfo ExampleLoginClassInfo { get; } + public JsonClassInfo SerializerForExternalClassInfo { get; } +} +``` For cases where users may not have enough context to call more specific overloads proposed (such as ASP.NET) we are considering ways of looking up the type's metadata that points to the type's JsonClassInfo so part of this feature's benefits could be received. @@ -201,7 +210,7 @@ Even if this initialization isn't performed, we can always fallback to the slowe ```C# // (Base Case) Codegen De/Serialization as extensions of SerializableClass. [JsonSerializable] -public partial class SerializableClass +public class ExampleLogin { public string Email { get; set; } public string Password { get; set; } @@ -209,12 +218,13 @@ public partial class SerializableClass // (Pass in Type) Codegen De/Serialization extending SerializerForExternalClass using ExternClass. [JsonSerializable(typeof(ExternalClass))] -public static partial class SerializerForExternalClass { } +public static class ExampleExternal { } -// (WIP) High level usage of serialization using contexts. +// High level usage of serialization using context. 
using (var context = new MyJsonSerializerContext(options)) { - SerializableClass obj = context.SerializableClass.Deserialize(json); + ExampleLogin obj = context.ExampleLoginClassInfo.Deserialize(json); + ExternalClass externalObj = context.ExampleExternalClassInfo.Deserialize(json); } ``` From 68ebe311b9cd36d54d22e4b8581c30066be9e626 Mon Sep 17 00:00:00 2001 From: Kyungtak Woo Date: Sat, 18 Jul 2020 18:02:59 -0500 Subject: [PATCH 4/5] Code changes to the design document --- .../System.Text.Json/docs/JsonSerializationSourceGeneration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md index 00880f8d8e1e..c999b607cd1d 100644 --- a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md +++ b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md @@ -197,7 +197,7 @@ public class JsonSerializerContext : IDisposable // Generated JsonClassInfos. public JsonClassInfo ExampleLoginClassInfo { get; } - public JsonClassInfo SerializerForExternalClassInfo { get; } + public JsonClassInfo ExampleExternalClassInfo { get; } } ``` From 8b0ebd865ac03393c126a77ca02ef679ecde65fc Mon Sep 17 00:00:00 2001 From: Kyungtak Woo Date: Thu, 23 Jul 2020 14:25:58 -0500 Subject: [PATCH 5/5] Port changes from one pager doc to design doc --- .../docs/JsonSerializationSourceGeneration.md | 31 +++++++++---------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md index c999b607cd1d..7cfab1fbf9d0 100644 --- a/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md +++ b/src/libraries/System.Text.Json/docs/JsonSerializationSourceGeneration.md @@ -31,7 +31,7 @@ public class ExampleLogin 2.
Pass in type: Code generates a ```JsonClassInfo``` for the attribute target class/struct using the passed in type. This scenario can be used if you don't own the serializable class. ```c# -// (Pass in Type) Codegen (De)Serialization to create SerializerForExternalClassInfo using ExternClass. +// (Pass in Type) Codegen (De)Serialization to create ExampleExternalClassInfo using ExternalClass. [JsonSerializable(typeof(ExternalClass))] public static class ExampleExternal { } ``` @@ -46,7 +46,7 @@ We believe that an explicit model using attributes would be a simple first-appro namespace System.Text.Json.Serialization { /// - /// When placed on a type, will source generate de/serialization for the specified type and all types in it's object graph. + /// When placed on a type, will source generate (de)serialization for the specified type and all types in its object graph. /// /// /// Must take into account that type discovery using this attribute is at compile time using Source Generators. @@ -69,21 +69,22 @@ namespace System.Text.Json.Serialization #### Validation and Testing -For validations we will handle cases where the type representation is missing required fields to source generate. This can be done in the current phase or the Source Code Generation phase but must be handled in both. +For validations we will handle cases where the type representation has missing required fields to source generate. This can be done in the current phase or the Source Code Generation phase but must be handled in both. Testing for this phase will consist of unit tests where given different source code and referenced assemblies, we verify that the source generation pass detects and creates all of the type representation with necessary data. ### Source Code Generation -This phase consist of taking the discovered types and recursively codegenerating JsonClassInfo along with its registration.
+This phase consists of taking the discovered types and recursively codegenerating ```JsonClassInfo``` along with its registration. #### Proposed Approach -The expected code generation has been already been [tackled](https://github.com/dotnet/runtimelab/compare/master...steveharter:ApiAdds) by @steveharter focusing mainly on performance gains and extendibility to the current codebase. This approach increases performance drastically in 2 different ways. The first would be during the first-time/warm-up performance for both CPU and memory by avoiding costly reflection to build up a Type metadata cache mentioned here. The second would be throughput improvement by avoiding the initial metadata-dictionary lookup on calls to the serializer by generating ```CreateObjectFunc```, ```SerializeFunc``` and ```DeserializeFunc``` when creating its ```JsonClassInfo``` (metadata). + +The design for the generated source focuses mainly on performance gains and extendibility to the current codebase. This approach improves performance in two ways. The first is first-time/warm-up performance, for both CPU and memory, gained by avoiding the costly reflection that builds up the Type metadata cache mentioned [here](https://github.com/dotnet/runtime/issues/38982). The second is throughput, gained by avoiding the initial metadata-dictionary lookup on calls to the serializer: ```CreateObjectFunc```, ```SerializeFunc``` and ```DeserializeFunc``` are generated when creating the type's ```JsonClassInfo``` (metadata), which is then used to (de)serialize through overloaded ```JsonSerializer``` functions via a wrapper described in the integration phase. The proposed approach consist of an initialization phase where generated code will call an initialization method within the created facade class where a ```JsonClassInfo``` is created with the functions mentioned above and registered into options with the necessary ```JsonPropertyInfo```.
For each call into the serializer using the generated code, the POCO would call a public overload into the ```JsonSerializer``` that also take the metadata ```JsonClassInfo``` created during the initialization method. #### Sketch of SourceGenerated Code (for simple POCO using SerializableClass) -Class variables for code generated ExampleLoginClassInfo: +Class variables for code-generated ```ExampleLoginClassInfo```: ```c# private static bool _s_isInitiated; private static JsonClassInfo _s_classInfo; @@ -92,7 +93,7 @@ private static JsonPropertyInfo _s_Property_Email; private static JsonPropertyInfo _s_Property_Password; ``` -These functions would used to create a JsonClassInfo: +These functions would be used to create a ```JsonClassInfo```: ```c# // CreateObjectFunc @@ -157,7 +158,7 @@ private static ExampleLogin DeserializeFunc(ref Utf8JsonReader reader, ref ReadS } ``` -User faced methods for (de)serialization (assuming SerializableClassSerializer is initialized): +User-facing methods for (de)serialization (assuming ExampleLoginClassInfo is initialized): ```c# public static ExampleLogin Deserialize(string json) { @@ -172,7 +173,7 @@ public static string Serialize() It is also important to notice that in case there are nested types within the root type we are recursing over, a new class with name ```FoundTypeNameClassInfo``` will have to be created in order to completely serialize and deserialize. -Even if the source generation fails, we can always fallback to the slower status quo by using ```Reflection```. +Even if the source generation fails, we can always fall back to the slower status quo by using ```Reflection``` at runtime. #### Validations and Tests @@ -186,7 +187,7 @@ An alternative approach involving the creation of individual JsonConverter for e ### Generated Source Code Integration There are [discussions](https://gist.github.com/steveharter/d71cdfc25df53a8f60f1a3563d13cf0f) regarding integration of the approach mentioned above.
-The high level registration for the generated source code implies that the Json options class is modified by calling generated code where we use (de)serialization API entry points for the extended class that auto-registers itself. This also implies the use of a shared options for all the types in a project that can be circumvented by creating a feature that moves the options to be class level like mentioned [here](https://github.com/dotnet/runtime/issues/36671). Because of this, the usage wrapper for this feature would consist in creating and using a ```JsonSerializerContext``` class by passing in the ```JsonSerializerOptions``` as seen in the following examples and usages: +The proposed approach consists of the generator creating a context class (```JsonSerializerContext```) which takes an options instance that contains references to the generated ```JsonClassInfo```s for each type seen above. This relies on the creation of new overloads to the current serializer that take ```JsonClassInfo```s that can be retrieved from the context. An example of the overload and usage can be seen [here](https://github.com/dotnet/runtimelab/compare/master...steveharter:ApiAdds) while examples and details of the end to end approach can be seen as follows: ```c# public class JsonSerializerContext : IDisposable @@ -201,14 +202,12 @@ public class JsonSerializerContext : IDisposable } ``` -For cases where users may not have enough context to call more specific overloads proposed (such as ASP.NET) we are considering ways of looking up the type's metadata that points to the type's JsonClassInfo so part of this feature's benefits could be received. - -Even if this initialization isn't performed, we can always fallback to the slower status quo ```JsonSerializer``` methods. 
+For callers that may not have enough context to use the more specific overloads proposed (such as ASP.NET), we are considering ways of looking up the type's metadata that points to the type's ```JsonClassInfo```, so that they still receive part of this feature's benefits. #### Example Usage ```C# - // (Base Case) Codegen De/Serialization as extensions of SerializableClass. + // (Base Case) Codegen (De)Serialization of ExampleLogin into ExampleLoginClassInfo. [JsonSerializable] public class ExampleLogin { @@ -216,11 +215,11 @@ public class ExampleLogin public string Password { get; set; } } -// (Pass in Type) Codegen De/Serialization extending SerializerForExternalClass using ExternClass. +// (Pass in Type) Codegen (De)Serialization of ExampleExternal into ExampleExternalClassInfo using ExternalClass. [JsonSerializable(typeof(ExternalClass))] public static class ExampleExternal { } -// High level usage of serialization using context. +// High level usage of serialization using JsonSerializerContext. using (var context = new MyJsonSerializerContext(options)) { ExampleLogin obj = context.ExampleLoginClassInfo.Deserialize(json);