Consolidate serialization docs
Summary: Consolidate serialization docs in one place and make one of them public (the other one is already public). There is some overlap in content to be addressed separately.

Reviewed By: prasad223

Differential Revision: D46497076

fbshipit-source-id: 953e16e146bf98b422dcd068bc084e9b8d614d8f
vitaut authored and facebook-github-bot committed Jun 7, 2023
1 parent dc2227b commit e19ba27
Showing 6 changed files with 88 additions and 14 deletions.
2 changes: 1 addition & 1 deletion thrift/doc/features/field-mask.md
@@ -25,7 +25,7 @@ union Mask {
const Mask allMask = {"excludes": {}}; // Masks all fields/whole field.
const Mask noneMask = {"includes": {}}; // Masks no fields.`
```
[Debug protocol](../spec/protocol/data/#debug-protocol) can be used to convert Mask to a human readable string.
[Debug protocol](/features/serialization/protocols.md#debug-protocol) can be used to convert Mask to a human readable string.

## APIs

2 changes: 1 addition & 1 deletion thrift/doc/features/index.md
@@ -21,7 +21,7 @@ feature in different languages.

| Thrift feature | C++ | Hack | Java | Python |
| :------------- | :-: | :--: | :--: | :----: |
| [Serialization](/fb/features/serialization.md) | <Supported/> | <Supported/> | <Supported/> | <Supported/> |
| [Serialization](/features/serialization/index.md) | <Supported/> | <Supported/> | <Supported/> | <Supported/> |
| [Universal names](/features/universal-name.md) | <Supported/> | <Supported/> | <Supported/> | <Supported/> |
| [Streaming](/fb/features/streaming/index.md) | <Supported/> | <Partial>Client[^1]</Partial> | <Partial>Client[^1]</Partial> | <Supported/> |
| [Interactions](/fb/features/interactions.md) | <Supported/> | <Partial>Client[^1]</Partial> | <Partial>Client[^1]</Partial> | <Partial>Client[^1]</Partial> |
70 changes: 70 additions & 0 deletions thrift/doc/features/serialization/index.md
@@ -0,0 +1,70 @@
# Serialization

<!-- https://www.internalfb.com/intern/wiki/Thrift/Overview/Serialization/?noredirect -->

## Protocols

There are two approaches that may be used for serialization:

* Serialization by field id
  * In this case, the serialized data contains the id, the type, and the value for every field that is serialized. The name is not included in the serialized data. If the field is of an enumeration type, the integer value of the enumerator is included in the serialized data, and the enumerator (the named constant) is not included.
  * The Compact, Binary, and Frozen protocols use this approach, and most Thrift services at Meta use these protocols.
* Serialization by field name
  * In this case, the serialized data contains the name, the type, and the value for every field that is serialized. The id is not included in the serialized data. If the field is of an enumeration type, the enumerator (the named constant) is included in the serialized data, and the integer value of the enumerator is not included.
  * The JSON protocol uses this approach. The most notable use cases at Meta are Configerator and Tupperware, which generate configs in a human-readable JSON format.
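
As a rough illustration of the two approaches, the toy Python sketch below encodes the same record both ways. It is not the Thrift wire format or any Thrift API: `SCHEMA`, `Color`, `serialize_by_id`, and `serialize_by_name` are hypothetical names made up for this example, and the "serialized data" is just a list of tuples.

```python
from enum import IntEnum


class Color(IntEnum):
    RED = 1
    GREEN = 2


# Hypothetical schema: field id -> (field name, type tag).
SCHEMA = {1: ("title", "string"), 2: ("color", "enum")}


def serialize_by_id(values):
    """Emit (id, type, value) triples; enumerators become their integer value."""
    out = []
    for field_id, (name, type_tag) in SCHEMA.items():
        if name in values:
            value = values[name]
            out.append((field_id, type_tag, int(value) if isinstance(value, Color) else value))
    return out


def serialize_by_name(values):
    """Emit (name, type, value) triples; enumerators become their named constant."""
    out = []
    for _field_id, (name, type_tag) in SCHEMA.items():
        if name in values:
            value = values[name]
            out.append((name, type_tag, value.name if isinstance(value, Color) else value))
    return out


record = {"title": "sky", "color": Color.GREEN}
by_id = serialize_by_id(record)      # carries ids and the integer 2, no names
by_name = serialize_by_name(record)  # carries names and "GREEN", no ids
```

In this toy model, renaming a field leaves `by_id` unchanged but changes `by_name`, while renumbering a field does the opposite, which is one way to see why the two approaches should not be mixed within one use case.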

*Caution: Never mix serialization by field id and serialization by field name within the same use case.*

## Qualifiers

Serialization behavior differs slightly by qualifier:

* An optional field is serialized only when it is set. (Note: in C++, for historical reasons, deprecated APIs such as `value_unchecked()` make it possible to change the underlying value without setting the field.)
* An unqualified field is always serialized. If it is not set, its default value is serialized. Use unqualified fields unless you need to distinguish an empty field from one holding the default value.
* A required field is always serialized because it is always set and cannot be unset. **This qualifier is deprecated and most languages ignore it**; don't use it in new code.

```
struct ThriftStruct {
1: string unqual_field; // unqualified
2: optional string opt_field; // optional
3: required string req_field; // required
}
```

The representation of required, optional, and unqualified fields in serialized data is identical.

## Class type

The serialized data for structs, unions, and exceptions are indistinguishable.

---

# Deserialization

## Protocols

When serialized data is deserialized into a struct object, the fields in the serialized data are *matched* to fields in the struct object:

* If serialized by id, fields with the same id are matched.
* If serialized by name, fields with the same name are matched.

The value of the field in the serialized data is assigned to the matching field in the struct object. As a result:

1. After deserialization, the matched field in the struct object is *present* with that value.
2. Any unmatched fields in the serialized data are ignored.
3. Any matched fields whose types in the serialized data do not match are ignored.
4. A field in the struct object that has no matching field in the serialized data remains untouched.
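
The four rules above can be sketched with a toy by-id deserializer in Python. This is only an illustration under simplifying assumptions: `SCHEMA`, `deserialize_by_id`, the type tags, and the tuple-based "serialized data" are hypothetical and are not part of Thrift.

```python
# Hypothetical target schema: field id -> (attribute name, expected type tag).
SCHEMA = {1: ("name", "string"), 2: ("age", "i32"), 3: ("email", "string")}


def deserialize_by_id(serialized, obj):
    """Apply (id, type, value) triples to obj (a dict) using the matching rules."""
    for field_id, type_tag, value in serialized:
        if field_id not in SCHEMA:
            continue  # rule 2: unmatched field in the data is ignored
        attr, expected_tag = SCHEMA[field_id]
        if type_tag != expected_tag:
            continue  # rule 3: matched id with a mismatched type is ignored
        obj[attr] = value  # rule 1: matched field becomes present with that value
    return obj  # rule 4: fields with no match in the data were never touched


target = {"email": "old@example.com"}
data = [
    (1, "string", "Ada"),    # matches field 1
    (2, "string", "forty"),  # id matches field 2, but the type does not
    (9, "i32", 7),           # no field with id 9 in the schema
]
result = deserialize_by_id(data, target)
```

Here `result` keeps the pre-existing `email`, gains `name`, and never receives `age`.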

## Qualifiers

Deserialization behaves the same for all qualifiers. However, initialization behavior differs slightly:

* An optional field is empty. Attempting to access it results in an exception.
* An unqualified field is empty. Accessing it yields the default value.
* A required field is set to its default value.
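
A minimal Python model of these initialization rules follows. The `Field` class and the choice of `LookupError` are assumptions made for this sketch; the actual accessor and exception type depend on the target language's generated code.

```python
class Field:
    """Toy model of a field's state right after the object is initialized."""

    def __init__(self, qualifier, default):
        self.qualifier = qualifier
        self.default = default
        # Only a required field starts out set (to its default value).
        self.is_set = qualifier == "required"
        self.value = default if self.is_set else None

    def get(self):
        if self.is_set:
            return self.value
        if self.qualifier == "optional":
            # Hypothetical exception type chosen for this sketch.
            raise LookupError("optional field is empty")
        # Unqualified: the field is empty, but reading it yields the default.
        return self.default


opt = Field("optional", "")
unq = Field("unqualified", "")
req = Field("required", "")
```

Reading `unq.get()` or `req.get()` returns the default, but only `req` reports itself as set; `opt.get()` raises until a value is assigned.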

For RPC, we not only deserialize the payload but also initialize the Thrift object, so different qualifiers change RPC behavior when sending and receiving Thrift requests.

## Class type

Deserialization into a union object fails when there is more than one match between the serialized data and the union object. Otherwise, structs, unions, and exceptions behave the same.
@@ -1,11 +1,6 @@
---
state: draft
sidebar_position: 1
---
# Serialization Protocols

# Data Protocols

A data protocol in Thrift is a format that defines how data is serialized into a sequence of bytes and deserialized from it.
A serialization protocol in Thrift is a format that defines how data is serialized into a sequence of bytes and deserialized from it.

## Thrift Types

6 changes: 3 additions & 3 deletions thrift/doc/spec/protocol/interface/index.md
@@ -9,15 +9,15 @@ This document describes the layer immediately preceding the transport protocol (

## Request

The client **must** take the user provided arguments to the Interface method and serialize the request as described in Serialization Details below. The client **may** compress the serialized request as described in [Request Compression](#request-compression). The name of the Interface method being requested as well as the [data protocol](../data.md) that was used to serialize the request **must** be included in the metadata associated with the request. The client may detect an exception while attempting to perform the request before a valid response is received from the server, in which case, it **must** be raised as a [Client Detected Exception](#client-detected-exceptions).
The client **must** take the user provided arguments to the Interface method and serialize the request as described in Serialization Details below. The client **may** compress the serialized request as described in [Request Compression](#request-compression). The name of the Interface method being requested as well as the [serialization protocol](/features/serialization/protocols.md) that was used to serialize the request **must** be included in the metadata associated with the request. The client may detect an exception while attempting to perform the request before a valid response is received from the server, in which case, it **must** be raised as a [Client Detected Exception](#client-detected-exceptions).

The client **must** specify the method name in the request as follows:
- For methods inside an interaction, the method name is `‹InteractionName›.‹MethodName›` where `‹InteractionName›` and `‹MethodName›` are the names in the IDL of the interaction and method respectively.
- For all other methods (including factory methods for interactions), the method name matches the name in the IDL.

### Serialization Details

The parameters to an Interface method **must** be treated as fields of a Thrift struct with an empty name (`""`). The Field IDs **must** be the same as those specified in the IDL. If the Interface method has no parameters then the struct **must** have no fields. To prepare for sending the request through one of the underlying transport protocols, this unnamed struct **must** be serialized with one of Thrift’s [data protocols](../data.md).
The parameters to an Interface method **must** be treated as fields of a Thrift struct with an empty name (`""`). The Field IDs **must** be the same as those specified in the IDL. If the Interface method has no parameters then the struct **must** have no fields. To prepare for sending the request through one of the underlying transport protocols, this unnamed struct **must** be serialized with one of Thrift [serialization protocols](/features/serialization/protocols.md).

For example, this method:

@@ -43,7 +43,7 @@ struct ‹Anonymous› {
}
```

passing the `‹Anonymous›` to a serializer for the [Compact Protocol](../data.md#compact-protocol) and passing the resulting string to a compressor for [zstd](https://facebook.github.io/zstd/).
passing the `‹Anonymous›` to a serializer for the [Compact Protocol](/features/serialization/protocols.md#compact-protocol) and passing the resulting string to a compressor for [zstd](https://facebook.github.io/zstd/).

## Response

13 changes: 11 additions & 2 deletions thrift/website/sidebars.js
@@ -101,7 +101,6 @@ module.exports = {
id: 'spec/protocol/index',
},
items: [
"spec/protocol/data",
{
type: 'category',
label: 'Interface Protocol',
@@ -130,7 +129,17 @@ module.exports = {
// affecting their URLs.

// Released features:
'fb/features/serialization',
{
type: 'category',
label: 'Serialization',
link: {
type: 'doc',
id: "features/serialization/index",
},
items: [
'features/serialization/protocols'
]
},
'features/operators',
'features/universal-name',
{