Update serialization.md
Fixes warning: [19-02-06 08:40:50.533]Warning:[BuildCommand.BuildCore.Build Document.CompilePhaseHandlerWithIncremental.ConceptualDocumentProcessor.Build.BuildConceptualDocument](articles/networking/serialization.md#L197)Inline code snippet is only allowed inside tables.
jdsartori committed Feb 6, 2019
1 parent e31288a commit 51c84c9
Showing 1 changed file with 90 additions and 85 deletions.
175 changes: 90 additions & 85 deletions docs/articles/networking/serialization.md
@@ -5,30 +5,24 @@ title: Serialization

# Serialization

One of the core concepts of any actor system like Akka.NET is the notion of
message passing between actors. Since Akka.NET is network transparent, these
actors can be located locally or remotely. As such the system needs a common
exchange format to package messages into so that it can send them to receiving
actors. In Akka.NET, messages are plain objects and thus easily converted to a
byte array. The process of converting objects into byte arrays is known as
serialization.

Akka.NET itself uses `Protocol Buffers` to serialize internal messages (i.e.
cluster gossip messages). However, the serialization mechanism in Akka.NET allows
you to write custom serializers and to define which serializer to use for what.
As shown in the examples further down this page, these serializers can be mixed
and matched depending on preference or need.

There are many other uses for serialization other than messaging. It's possible
to use these serializers for ad-hoc purposes as well.
One of the core concepts of any actor system like Akka.NET is the notion of message passing between actors.
Since Akka.NET is network transparent, these actors can be located locally or remotely. As such the system needs a common
exchange format to package messages into so that it can send them to receiving actors.
In Akka.NET, messages are plain objects and thus easily converted to a byte array.
The process of converting objects into byte arrays is known as serialization.

Akka.NET itself uses `Protocol Buffers` to serialize internal messages (e.g. cluster gossip messages).
However, the serialization mechanism in Akka.NET allows you to write custom serializers and to define which serializer to use for which message types.
As shown in the examples further down this page, these serializers can be mixed and matched depending on preference or need.

Serialization has many uses besides messaging.
It's possible to use these serializers for ad-hoc purposes as well.

## Usage

### Configuration
For Akka.NET to know which `Serializer` to use when (de-)serializing objects,
two sections need to be defined in the application's configuration. The
`akka.actor.serializers` section is where names are associated to
implementations of the `Serializer` to use.
For Akka.NET to know which `Serializer` to use when (de-)serializing objects, two sections need to be defined in the application's configuration.
The `akka.actor.serializers` section is where names are associated with the `Serializer` implementations to use.

```hocon
akka {
@@ -41,8 +35,7 @@ akka {
}
```

The `akka.actor.serialization-bindings` section is where object types are
associated to a `Serializer` by the names defined in the previous section.
The `akka.actor.serialization-bindings` section is where object types are associated with a `Serializer` by the names defined in the previous section.

```hocon
akka {
@@ -62,29 +55,25 @@ akka {
}
```
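As a rough sketch of how the two sections fit together, the configuration can be parsed from a HOCON string and handed to the actor system at startup. The names `Acme.MyOwnSerializer`, `Acme.IMyMessage` and the `Acme` assembly are hypothetical placeholders, not types that ship with Akka.NET.

```csharp
using Akka.Actor;
using Akka.Configuration;

public static class SerializationConfigExample
{
    // Hypothetical names: replace "Acme.MyOwnSerializer, Acme" and
    // "Acme.IMyMessage, Acme" with types from your own assemblies.
    public const string Hocon = @"
akka.actor {
  serializers {
    myown = ""Acme.MyOwnSerializer, Acme""
  }
  serialization-bindings {
    ""Acme.IMyMessage, Acme"" = myown
  }
}";

    // Parses the HOCON text into a Config object.
    public static Config Load() => ConfigurationFactory.ParseString(Hocon);

    // Starts an actor system with the bindings above. This assumes the
    // placeholder types actually exist and can be loaded at runtime.
    public static ActorSystem Start() => ActorSystem.Create("example", Load());
}
```

The name on the left-hand side of `serializers` (`myown` here) is the same name that `serialization-bindings` refers to on its right-hand side.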

In case of ambiguity, a message implements several of the configured
classes, the most specific configured class will be used, i.e. the one of
which all other candidates are superclasses. If this condition cannot be met,
because e.g. `ISerializable` and `MyOwnSerializable` both apply and neither is a
In case of ambiguity, such as when a message implements several of the configured classes, the most specific configured class will be used, i.e. the one of which all other candidates are superclasses.
If this condition cannot be met, e.g. because `ISerializable` and `MyOwnSerializable` both apply and neither is a subtype of the other, a warning will be issued.

Akka.NET provides serializers for POCO's (Plain Old C# Objects) and for
`Google.Protobuf.IMessage` by default, so normally you don't need to add
configuration for that.
Akka.NET provides serializers for POCOs (Plain Old C# Objects) and for `Google.Protobuf.IMessage` by default, so you don't usually need to add configuration for those.

### Verification
Normally, messages sent between local actors (i.e. same CLR) do not undergo serialization.
For testing, sometimes, it may be desirable to force serialization on all messages
(both remote and local). If you want to do this in order to verify that your messages
are serializable you can enable the following config option:
For testing, it may be desirable to force serialization on all messages, both remote and local.
If you want to do this to verify that your messages are serializable, you can enable the following config option:

```hocon
akka {
actor {
serialize-messages = on
}
}
```
If you want to verify that your `Props` are serializable you can enable the following config option:
If you want to verify that your `Props` are serializable, you can enable the following config option:

```hocon
akka {
@@ -95,13 +84,13 @@ akka {
```

> [!WARNING]
> We recommend having these config options turned on only when you're running tests. Turning these options on in production is pointless, as it would negatively impact the performance of local message passing without giving any gain.
> We recommend having these config options turned on only when you're running tests.
> Turning these options on in production is pointless: it slows down local message passing without providing any benefit.

### Programmatic
As mentioned previously, Akka.NET uses serialization for message passing.
However the system is much more robust than that. To programmatically
(de-)serialize objects using Akka.NET serialization, a reference to the
main serialization class is all that is needed.
However, the serialization system is more general than that.
To programmatically (de-)serialize objects using Akka.NET serialization, a reference to the main serialization class is all that is needed.

```csharp
using Akka.Actor;
@@ -129,42 +118,41 @@ Assert.AreEqual(original, back);
```

## Customization
Akka.NET makes it extremely easy to create custom serializers to handle a wide
variety of scenarios. All serializers in Akka.NET inherit from
`Akka.Serialization.Serializer`. So to create a custom serializer, all that is
needed is a class that inherits from this base class.
Akka.NET makes it extremely easy to create custom serializers to handle a wide variety of scenarios.
All serializers in Akka.NET inherit from `Akka.Serialization.Serializer`.
So, to create a custom serializer, all that is needed is a class that inherits from this base class.

### Creating new Serializers
A custom `Serializer` has to inherit from `Akka.Serialization.Serializer` and can be defined like the following:
A custom `Serializer` has to inherit from `Akka.Serialization.Serializer` and can be defined like this:

[!code-csharp[Main](../../examples/DocsExamples/Networking/Serialization/CreateCustomSerializer.cs?range=7-42)]

The only thing left to do for this class would be to fill in the serialization
logic in the ``ToBinary(object)`` method and the deserialization logic in the
``FromBinary(byte[], Type)``. Afterwards the configuration would need to be
updated to reflect which name to bind to and the classes that use this
The only thing left to do for this class would be to fill in the serialization logic in the ``ToBinary(object)`` method and the deserialization logic in the ``FromBinary(byte[], Type)`` method.
Afterwards, the configuration would need to be updated to reflect which name to bind to and which classes use this serializer.
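To make the two methods concrete, here is a minimal sketch of a filled-in serializer. It is not the example file referenced above: `Utf8StringSerializer` is a hypothetical class that only handles `string` payloads, and the identifier value is an arbitrary assumption.

```csharp
using System;
using System.Text;
using Akka.Actor;
using Akka.Serialization;

// Hypothetical serializer that only handles strings, encoded as UTF-8.
public class Utf8StringSerializer : Serializer
{
    public Utf8StringSerializer(ExtendedActorSystem system) : base(system) { }

    // Must be unique among all serializers in the system; the value here is
    // arbitrary. Low numbers are reserved for Akka.NET's own serializers.
    public override int Identifier => 1234567;

    // No type hint is needed because the payload is always a string.
    public override bool IncludeManifest => false;

    public override byte[] ToBinary(object obj)
    {
        if (obj is string text)
            return Encoding.UTF8.GetBytes(text);

        throw new ArgumentException($"Cannot serialize {obj?.GetType()}", nameof(obj));
    }

    public override object FromBinary(byte[] bytes, Type type)
    {
        // The type argument is ignored because IncludeManifest is false.
        return Encoding.UTF8.GetString(bytes);
    }
}
```

A serializer like this would then be registered under a name in `akka.actor.serializers` and bound to `System.String` in `akka.actor.serialization-bindings`.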

### Serializer with String Manifest
The `Serializer` illustrated above supports a class based manifest (type hint). For serialization of data that need to evolve over time the `SerializerWithStringManifest` is recommended instead of `Serializer` because the manifest (type hint) is a `String` instead of a `Type`. That means that the class can be moved/removed and the serializer can still deserialize old data by matching on the String. This is especially useful for `Persistence`.
The `Serializer` illustrated above supports a class-based manifest (type hint).
For serialization of data that need to evolve over time, the `SerializerWithStringManifest` is recommended instead of `Serializer` because the manifest (type hint) is a `String` instead of a `Type`.
This means that the class can be moved/removed and the serializer can still deserialize old data by matching on the String.
This is especially useful for `Persistence`.

The manifest string can also encode a version number that can be used in `FromBinary` to deserialize in different ways to migrate old data to new domain objects.

If the data was originally serialized with `Serializer` and in a later version of the system you change to `SerializerWithStringManifest` the manifest string will be the full class name if you used `IncludeManifest=true`, otherwise it will be the empty string.
If the data was originally serialized with `Serializer`, and in a later version of the system you change to `SerializerWithStringManifest`, the manifest string will be the full class name if you used `IncludeManifest=true`, otherwise it will be the empty string.

This is how a `SerializerWithStringManifest` looks like:
This is how a `SerializerWithStringManifest` looks:
[!code-csharp[Main](../../examples/DocsExamples/Networking/Serialization/MyOwnSerializer2.cs?range=9-66)]

You must also bind it to a name in your `Configuration` and then list which classes should be serialized using it.

It's recommended to throw `SerializationException` in `FromBinary` if the manifest is unknown. This makes it possible to introduce new message types and send them to nodes that don't know about them. This is typically needed when performing rolling upgrades, i.e. running a cluster with mixed versions for a while. `SerializationException` is treated as a transient problem in the TCP-based remoting layer: the problem is logged and the message is dropped. Other exceptions will tear down the TCP connection because they can be an indication of corrupt bytes from the underlying transport.
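The versioned-manifest idea and the unknown-manifest rule can be sketched together as follows. Everything specific here is an assumption made for illustration: the `Greeting` type, the manifest strings `greeting-v1` and `greeting-v2`, the identifier, and the premise that version 1 payloads were written as UTF-16 while version 2 uses UTF-8.

```csharp
using System;
using System.Runtime.Serialization;
using System.Text;
using Akka.Actor;
using Akka.Serialization;

// Hypothetical message type used only for this sketch.
public sealed class Greeting
{
    public Greeting(string text) { Text = text; }
    public string Text { get; }
}

public class GreetingSerializer : SerializerWithStringManifest
{
    // Manifests carry a version suffix so that old data stays readable.
    private const string GreetingV1 = "greeting-v1"; // assumed legacy format: UTF-16
    private const string GreetingV2 = "greeting-v2"; // current format: UTF-8

    public GreetingSerializer(ExtendedActorSystem system) : base(system) { }

    // Arbitrary value; must be unique among the serializers in the system.
    public override int Identifier => 7654321;

    // New data is always written with the latest manifest.
    public override string Manifest(object o)
    {
        if (o is Greeting) return GreetingV2;
        throw new ArgumentException($"Cannot serialize {o?.GetType()}", nameof(o));
    }

    public override byte[] ToBinary(object obj)
    {
        if (obj is Greeting greeting)
            return Encoding.UTF8.GetBytes(greeting.Text);

        throw new ArgumentException($"Cannot serialize {obj?.GetType()}", nameof(obj));
    }

    public override object FromBinary(byte[] bytes, string manifest)
    {
        switch (manifest)
        {
            case GreetingV2:
                return new Greeting(Encoding.UTF8.GetString(bytes));
            case GreetingV1:
                // Migrate legacy payloads to the current domain object.
                return new Greeting(Encoding.Unicode.GetString(bytes));
            default:
                // Unknown manifest: signal a transient problem, not corrupt data.
                throw new SerializationException($"Unknown manifest '{manifest}'");
        }
    }
}
```

Because deserialization matches on the manifest string rather than on a `Type`, the `Greeting` class could later be renamed or moved without breaking previously stored data.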

### Serializing ActorRefs
All actors are serializable using the default protobuf serializer, but in cases were
custom serializers are used, we need to know how to (de-)serialize them
properly. In the general case, the local address to be used depends on the
type of remote address which shall be the recipient of the serialized
information. Use `Serialization.SerializedActorPath(actorRef)` like this:
All actors are serializable using the default protobuf serializer, but in cases where custom serializers are used, we need to know how to (de-)serialize them properly.
In the general case, the local address to be used depends on the type of remote address which shall be the recipient of the serialized
information.
Use `Serialization.SerializedActorPath(actorRef)` like this:

```csharp
using Akka.Actor;
@@ -181,41 +169,58 @@ IActorRef deserializedActorRef = extendedSystem.Provider.ResolveActorRef(id);
// Then just use the IActorRef
```

This assumes that serialization happens in the context of sending a message
through the remote transport. There are other uses of serialization, though,
e.g. storing actor references outside of an actor application (database, etc.).
In this case, it is important to keep in mind that the address part of an
actor's path determines how that actor is communicated with. Storing a local
actor path might be the right choice if the retrieval happens in the same
logical context, but it is not enough when deserializing it on a different
network host: for that it would need to include the system's remote transport
address. An actor system is not limited to having just one remote transport
per se, which makes this question a bit more interesting. To find out the
appropriate address to use when sending to `remoteAddr` you can use
`IActorRefProvider.GetExternalAddressFor(remoteAddr)` like this:

.[!code-csharp[Main](../../examples/DocsExamples/Networking/Serialization/ExternalAddressProvider.cs?range=7-66)]
This assumes that serialization happens in the context of sending a message through the remote transport.
There are other uses of serialization, though, e.g. storing actor references outside of an actor application (database, etc.).
In this case, it is important to keep in mind that the address part of an actor's path determines how that actor is communicated with. Storing a local actor path might be the right choice if the retrieval happens in the same logical context, but it is not enough when deserializing it on a different network host: for that it would need to include the system's remote transport address.
An actor system is not limited to having just one remote transport per se, which makes this question a bit more interesting.
To find out the appropriate address to use when sending to `remoteAddr` you can use `IActorRefProvider.GetExternalAddressFor(remoteAddr)` like this:

```csharp
using System;
using Akka.Actor;

// Extension id used to look up (or lazily create) the extension for a system.
public class ExternalAddress : ExtensionIdProvider<ExternalAddressExtension>
{
    public override ExternalAddressExtension CreateExtension(ExtendedActorSystem system) =>
        new ExternalAddressExtension(system);
}

public class ExternalAddressExtension : IExtension
{
    private readonly ExtendedActorSystem _system;

    public ExternalAddressExtension(ExtendedActorSystem system)
    {
        _system = system;
    }

    // Returns the local address through which remoteAddr can reach this system.
    public Address AddressFor(Address remoteAddr)
    {
        return _system.Provider.GetExternalAddressFor(remoteAddr)
            ?? throw new InvalidOperationException($"cannot send to {remoteAddr}");
    }
}

public class Test
{
    // Created once and reused for every call.
    private readonly ExtendedActorSystem _extendedSystem =
        (ExtendedActorSystem)ActorSystem.Create("test");

    public string SerializeTo(IActorRef actorRef, Address remote)
    {
        return actorRef.Path.ToSerializationFormatWithAddress(
            new ExternalAddress().Get(_extendedSystem).AddressFor(remote));
    }
}
```

> [!NOTE]
> `ActorPath.ToSerializationFormatWithAddress` differs from `ToString` if the address
does not already have `host` and `port` components, i.e. it only inserts address
information for local addresses.
> `ToSerializationFormatWithAddress` also adds the unique id of the actor, which
will change when the actor is stopped and then created again with the same name.
Sending messages to a reference pointing the old actor will not be delivered to
the new actor. If you do not want this behavior, e.g. in case of long term
storage of the reference, you can use `ToStringWithAddress`, which does not
include the unique id.

This requires that you know at least which type of address will be supported by
the system which will deserialize the resulting actor reference; if you have no
concrete address handy you can create a dummy one for the right protocol using
`new Address(protocol, "", "", 0)` (assuming that the actual transport used is as
lenient as Akka's `RemoteActorRefProvider`).
> `ActorPath.ToSerializationFormatWithAddress` differs from `ToString` if the address does not already have `host` and `port` components, i.e. it only inserts address information for local addresses.
> `ToSerializationFormatWithAddress` also adds the unique id of the actor, which will change when the actor is stopped and then created again with the same name.
> Messages sent to a reference pointing to the old actor will not be delivered to the new actor. If you do not want this behavior, e.g. in case of long-term storage of the reference, you can use `ToStringWithAddress`, which does not include the unique id.

This requires that you know at least which type of address will be supported by the system which will deserialize the resulting actor reference; if you have no concrete address handy you can create a dummy one for the right protocol using `new Address(protocol, "", "", 0)` (assuming that the actual transport used is as lenient as Akka's `RemoteActorRefProvider`).
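A minimal sketch of the dummy-address approach might look like the following. `ActorRefStorage` and `ToStorableString` are hypothetical names, and the sketch assumes that remoting is enabled with a transport for the given protocol (for example `akka.tcp`); with a purely local provider no external address is found and the method throws.

```csharp
using System;
using Akka.Actor;

// Hypothetical helper for producing a long-lived, storable actor path string.
public static class ActorRefStorage
{
    public static string ToStorableString(
        ExtendedActorSystem system, IActorRef actorRef, string protocol)
    {
        // Only the protocol matters here; system name, host and port are dummies.
        var dummy = new Address(protocol, "", "", 0);

        // Ask the provider which of its own addresses serves that protocol.
        Address external = system.Provider.GetExternalAddressFor(dummy)
            ?? throw new InvalidOperationException(
                $"no transport available for protocol '{protocol}'");

        // ToStringWithAddress omits the actor's unique id, which suits
        // long-term storage of the reference.
        return actorRef.Path.ToStringWithAddress(external);
    }
}
```

For references that are sent over the wire rather than stored, `ToSerializationFormatWithAddress` would take the place of `ToStringWithAddress` in the last line.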

### Deep serialization of Actors
The recommended approach to do deep serialization of internal actor state is
to use [Akka Persistence](xref:persistence-architecture).
The recommended approach to do deep serialization of internal actor state is to use [Akka Persistence](xref:persistence-architecture).

## How to set up Hyperion as the default serializer
