When protocols use protobuf as a serialisation format, alloy also proposes a set of semantics for how smithy shapes translate to protobuf-specific concepts. This document describes these semantics by showing how smithy code translates to proto code representing the equivalent data.
alloy defines a number of traits that capture semantics specific to protobuf (https://protobuf.dev/overview/) serialisation and that are not part of the smithy standard library.
For full documentation on what each of these traits does, see the smithy specification here.
Note that for convenience, alloy provides a module containing protobuf definitions that can be used downstream to ensure that the semantics described in this document are respected:
com.disneystreaming.alloy:alloy-protocol:x.y.z
alloy comes with validators that verify that shapes abide by the rules described below. Note that these validators are protocol-specific, and only verify shapes that belong to the transitive closure of shapes annotated with either alloy.proto#grpc or alloy.proto#protoEnabled.
Below is a table describing how smithy shapes translate to proto constructs.
Protobuf supports a number of scalar types that do not have first-class support in smithy. To allow some of these to be expressed in smithy, alloy provides an alloy.proto#protoNumType trait that can refine the meaning of Integer or Long types in protobuf semantics.
Smithy type | @protoNumType | @protoTimestampFormat | Proto |
---|---|---|---|
boolean | N/A | N/A | bool |
bigDecimal | N/A | N/A | string |
bigInteger | N/A | N/A | string |
blob | N/A | N/A | bytes |
double | N/A | N/A | double |
float | N/A | N/A | float |
string | N/A | N/A | string |
integer, byte, short | N/A | N/A | int32 |
integer, byte, short | FIXED | N/A | fixed32 |
integer, byte, short | FIXED_SIGNED | N/A | sfixed32 |
integer, byte, short | SIGNED | N/A | sint32 |
integer, byte, short | UNSIGNED | N/A | uint32 |
long | N/A | N/A | int64 |
long | FIXED | N/A | fixed64 |
long | FIXED_SIGNED | N/A | sfixed64 |
long | SIGNED | N/A | sint64 |
long | UNSIGNED | N/A | uint64 |
timestamp | N/A | none or PROTOBUF | message { int64 seconds = 1; int32 nanos = 2; } |
timestamp | N/A | EPOCH_MILLIS | message { int64 milliseconds = 1; } |
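The scalar rows of the table above can be read as a lookup keyed on the smithy type and the optional @protoNumType value. The following is a minimal sketch of that lookup in Python; the names `PROTO_SCALARS` and `proto_scalar` are hypothetical and are not part of alloy or smithy-translate.

```python
from typing import Optional

# Hypothetical lookup table transcribed from the scalar rows above.
# Keys are (smithy type, @protoNumType value or None when the trait is absent).
PROTO_SCALARS = {
    ("boolean", None): "bool",
    ("bigDecimal", None): "string",
    ("bigInteger", None): "string",
    ("blob", None): "bytes",
    ("double", None): "double",
    ("float", None): "float",
    ("string", None): "string",
}

# Each @protoNumType value selects a proto scalar family; the width (32 or 64)
# depends on whether the smithy shape is integer/byte/short or long.
_NUM_TYPE_PREFIX = {
    None: "int",
    "FIXED": "fixed",
    "FIXED_SIGNED": "sfixed",
    "SIGNED": "sint",
    "UNSIGNED": "uint",
}

for _num_type, _prefix in _NUM_TYPE_PREFIX.items():
    for _smithy_type in ("integer", "byte", "short"):
        PROTO_SCALARS[(_smithy_type, _num_type)] = _prefix + "32"
    PROTO_SCALARS[("long", _num_type)] = _prefix + "64"


def proto_scalar(smithy_type: str, num_type: Optional[str] = None) -> str:
    """Return the proto scalar for a smithy type, or raise if the pair is not in the table."""
    key = (smithy_type, num_type)
    if key not in PROTO_SCALARS:
        raise ValueError(f"no proto scalar for {smithy_type} with protoNumType={num_type}")
    return PROTO_SCALARS[key]
```

For instance, `proto_scalar("long", "FIXED_SIGNED")` yields `sfixed64`, while combining `string` with any @protoNumType value is rejected because the table has no such row.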
Additionally, in the context of alloy, the presence of the @protoWrapped trait is interpreted as requiring the primitive to be wrapped in a one-field message.
For instance:
@protoWrapped
string MyString
would be converted to
message MyString {
string value = 1;
}
When converting .smithy IDL files to .proto IDL, types from the google.protobuf library and the alloy.protobuf library can be used.
Using protoWrapped is useful because it makes it possible to distinguish the absence of a value from the presence of a default value.
Smithy type | @protoNumType | Proto |
---|---|---|
float | N/A | google.protobuf.FloatValue |
blob | N/A | google.protobuf.BytesValue |
boolean | N/A | google.protobuf.BoolValue |
double | N/A | google.protobuf.DoubleValue |
bigDecimal | N/A | alloy.protobuf.BigDecimalValue |
bigInteger | N/A | alloy.protobuf.BigIntegerValue |
string | N/A | google.protobuf.StringValue |
integer, byte, short | N/A | google.protobuf.Int32Value |
integer, byte, short | FIXED | alloy.protobuf.FixedInt32Value |
integer, byte, short | FIXED_SIGNED | alloy.protobuf.SFixedInt32Value |
integer, byte, short | SIGNED | alloy.protobuf.SInt32Value |
integer, byte, short | UNSIGNED | google.protobuf.UInt32Value |
long | N/A | google.protobuf.Int64Value |
long | FIXED | alloy.protobuf.Fixed64Value |
long | FIXED_SIGNED | alloy.protobuf.SFixed64Value |
long | SIGNED | alloy.protobuf.SInt64Value |
long | UNSIGNED | google.protobuf.UInt64Value |
Integer and Long shapes can be annotated with the @alloy.proto#protoNumType trait in order to signal which encoding should be used during protobuf serialisation. Possible values are:
- SIGNED
- UNSIGNED
- FIXED
- FIXED_SIGNED
See here for documentation about these encodings.
Timestamp shapes can be annotated with the @alloy.proto#protoTimestampFormat trait in order to signal what type of encoding should be used for timestamps in proto serialisation/deserialisation.
Possible values are PROTOBUF and EPOCH_MILLIS. PROTOBUF is the default that is used in the absence of this trait. When EPOCH_MILLIS is specified, the timestamp is represented as a wrapped int64 in the corresponding proto definition.
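To make the difference between the two formats concrete, here is a minimal Python sketch that converts a timezone-aware `datetime` into each representation. The function names are hypothetical. The seconds/nanos split follows the convention of google.protobuf.Timestamp (whole seconds plus a non-negative nanosecond remainder); truncating sub-millisecond precision in the EPOCH_MILLIS case is an assumption of this sketch.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_protobuf_format(ts: datetime) -> dict:
    """PROTOBUF format: seconds since the epoch plus a nanosecond remainder.

    `ts` must be timezone-aware; subtracting a naive datetime raises TypeError.
    """
    delta = ts - EPOCH
    # timedelta normalises so that seconds and microseconds are non-negative,
    # which keeps nanos in [0, 999_999_999] even for instants before the epoch.
    seconds = delta.days * 86_400 + delta.seconds
    nanos = delta.microseconds * 1_000
    return {"seconds": seconds, "nanos": nanos}


def to_epoch_millis(ts: datetime) -> int:
    """EPOCH_MILLIS format: a single integer count of milliseconds since the epoch."""
    delta = ts - EPOCH
    # Integer arithmetic avoids float rounding; sub-millisecond precision is dropped.
    return (delta.days * 86_400 + delta.seconds) * 1_000 + delta.microseconds // 1_000
```

The PROTOBUF format keeps nanosecond resolution at the cost of two fields, whereas EPOCH_MILLIS fits in a single int64 but only resolves milliseconds.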
By default, string shapes annotated with @uuidFormat are serialised as protobuf strings. However, an alloy.proto#protoCompactUUID trait is provided, which signals that the serialised form should be a message containing two int64 values:
Smithy:
use alloy#uuidFormat
use alloy.proto#protoCompactUUID
@protoCompactUUID
@uuidFormat
string MyUUID
structure Foo {
uuid: MyUUID
}
Proto:
message MyUUID {
int64 upper_bits = 1;
int64 lower_bits = 2;
}
message Foo {
MyUUID uuid = 1;
}
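As an illustration of what the compact form carries, the following Python sketch splits a 128-bit UUID into two signed 64-bit integers and reassembles it. It assumes that upper_bits holds the most significant 64 bits and that both halves are interpreted as two's-complement int64 values; the text above does not spell out these details, and the function names are hypothetical.

```python
import uuid
from typing import Tuple

_MASK64 = (1 << 64) - 1


def _to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's-complement int64."""
    return value - (1 << 64) if value >= (1 << 63) else value


def compact_uuid(u: uuid.UUID) -> Tuple[int, int]:
    """Split a UUID into (upper_bits, lower_bits).

    Assumption: upper_bits is the most significant half of the 128-bit value.
    """
    upper = _to_signed64(u.int >> 64)
    lower = _to_signed64(u.int & _MASK64)
    return upper, lower


def expand_uuid(upper_bits: int, lower_bits: int) -> uuid.UUID:
    """Rebuild the UUID from the two signed halves."""
    return uuid.UUID(int=((upper_bits & _MASK64) << 64) | (lower_bits & _MASK64))
```

Two int64 fields occupy at most 20 bytes on the wire with varint encoding, compared to the 36 characters of the canonical string form, which is where the "compact" in the trait name comes from.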
Documents should be serialised using a protobuf message equivalent to the google.protobuf.Value type, which is commonly used in the protobuf ecosystem to represent JSON values.
Timestamps should be serialised using a protobuf message equivalent to the google.protobuf.Timestamp type, which is commonly used in the protobuf ecosystem to represent Timestamp values.
In the absence of explicit @protoIndex traits on their members, the following rules are applied for structures/unions/string enumerations:
- In the case of structure and union members, the members should be treated as having an implicit protobuf field value starting from 1 for the first member, and increasing monotonically (by 1) for each subsequent member.
- In the case of string enumerations, the members should be treated as having an implicit protobuf field value starting from 0 for the first member, and increasing monotonically (by 1) for each subsequent member.
Smithy:
structure Testing {
myString: String,
myInt: Integer
}
Proto:
import "google/protobuf/wrappers.proto";
message Testing {
google.protobuf.StringValue myString = 1;
google.protobuf.Int32Value myInt = 2;
}
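The implicit numbering rules can be sketched in a few lines of Python. The function `implicit_indexes` and its `kind` parameter are hypothetical names used only for illustration; members are assumed to be given in declaration order.

```python
from typing import Dict, Sequence


def implicit_indexes(member_names: Sequence[str], kind: str) -> Dict[str, int]:
    """Assign implicit proto indexes to members lacking @protoIndex.

    kind is "structure", "union" or "enum" (a string enumeration).
    Structure and union members start at 1; enumeration members start at 0.
    """
    if kind not in ("structure", "union", "enum"):
        raise ValueError(f"unsupported shape kind: {kind}")
    start = 0 if kind == "enum" else 1
    # Indexes increase by exactly 1 per member, following declaration order.
    return {name: start + offset for offset, name in enumerate(member_names)}
```

Applied to the Testing structure above, this gives myString the index 1 and myInt the index 2. Because the numbering depends on declaration order, reordering members without explicit indexes changes the wire format.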
Unions in Smithy are tricky to translate to Protobuf because of the nature of oneof: unions are first-class citizens in Smithy, whereas a oneof can only exist inside a message in proto. Therefore, the default encoding for a union in protobuf is a proto message that contains a single oneof field named definition. For example:
Smithy:
structure Union {
@required
value: TestUnion
}
union TestUnion {
num: Integer,
txt: String
}
Proto:
message Union {
foo.TestUnion value = 1;
}
message TestUnion {
oneof definition {
int32 num = 1;
string txt = 2;
}
}
The alloy.proto#protoInlinedOneOf trait can be used to indicate that a union should be encoded as if the corresponding oneof was directly inlined in the containing message. This is subject to additional constraints, as the oneof's field indices are flattened into the containing message's field indices. However, this encoding is more compact.
For example:
Smithy:
use alloy.proto#protoInlinedOneOf
structure Union {
value: TestUnion
}
@protoInlinedOneOf
union TestUnion {
num: Integer,
txt: String
}
Proto:
syntax = "proto3";
package foo;
message Union {
oneof value {
int32 num = 1;
string txt = 2;
}
}
Protobuf doesn't allow oneof members to be repeated or map fields. As a result, a smithy union member targeting a collection shape MUST either have the @protoWrapped trait itself or target a collection shape that has the @protoWrapped trait.
The alloy.proto#protoInlinedOneOf trait can be used to inline the corresponding oneof in a protobuf message. A union with this trait MUST be used exactly once, by a structure member.
For example, this is valid:
structure Test {
myUnion: MyUnion
}
@protoInlinedOneOf
union MyUnion {
a: String,
b: Integer
}
But this is not, because MyUnion is used in multiple shapes:
structure Test {
myUnion: MyUnion
}
structure OtherStruct {
aUnion: MyUnion
}
@protoInlinedOneOf
union MyUnion {
a: String,
b: Integer
}
This is also invalid, because MyUnion is never used:
@protoInlinedOneOf
union MyUnion {
a: String,
b: Integer
}
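The "exactly once" rule illustrated by the three cases above can be sketched as a usage count. The Python below models a smithy model as a plain dictionary from structure name to its members and their targets; this representation and the function names are assumptions made for illustration, not how the alloy validators are implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical model representation: structure name -> {member name: target shape name}.
Structures = Dict[str, Dict[str, str]]


def union_usages(union_name: str, structures: Structures) -> List[Tuple[str, str]]:
    """List every (structure, member) pair that targets the given union."""
    return [
        (structure_name, member_name)
        for structure_name, members in structures.items()
        for member_name, target in members.items()
        if target == union_name
    ]


def is_valid_inlined_union(union_name: str, structures: Structures) -> bool:
    """A union with @protoInlinedOneOf must be targeted by exactly one structure member."""
    return len(union_usages(union_name, structures)) == 1
```

With this model, `{"Test": {"myUnion": "MyUnion"}}` is accepted, while adding a second structure that targets MyUnion, or passing an empty model, is rejected.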
Smithy:
list StringList {
member: String
}
structure Struct {
value: StringList
}
Proto:
message Struct {
repeated string value = 1;
}
Smithy:
map StringStringMapType {
key: String,
value: String
}
structure StringStringMap {
value: StringStringMapType
}
Proto:
message StringStringMap {
map<string, string> value = 1;
}
Smithy:
enum Color {
RED
GREEN
BLUE
}
Proto:
enum Color {
RED = 0;
GREEN = 1;
BLUE = 2;
}
Open string enumerations are treated as raw strings when serialised to protobuf:
Smithy:
@alloy#openEnum
enum Color {
RED
GREEN
BLUE
}
structure Foo {
color: Color
}
Proto:
message Foo {
string color = 1;
}
Each intEnum value translates directly to the corresponding proto enum number. Because of this, one of the values MUST be 0, as protobuf requires every enumeration to have a value set to 0.
Smithy:
intEnum Color {
RED = 0
GREEN = 5
BLUE = 6
}
Proto:
enum Color {
RED = 0;
GREEN = 5;
BLUE = 6;
}
Open int enumerations are treated as raw integers when serialised to protobuf:
Smithy:
@alloy#openEnum
intEnum Color {
RED = 6
GREEN = 7
BLUE = 8
}
structure Foo {
color: Color
}
Proto:
message Foo {
int32 color = 1;
}
The alloy.proto#protoIndex trait marks an explicit index to be used for a member when it gets serialised to protobuf. For example, the following
structure Test {
@protoIndex(2)
str: String
}
has the following meaning in protobuf semantics:
message Test {
string str = 2;
}
When one member is annotated with @protoIndex, all members have to be annotated with it. This includes the members of:
- structures
- unions
- (closed) enumerations
Members of closed enumerations (whether string or int) can be annotated with alloy.proto#protoIndex in smithy to customise the corresponding proto index that should be used during serialisation. An additional constraint is that when users elect to specify alloy.proto#protoIndex, they are required to assign the 0 value to one of the enumeration members, as it is a requirement of protobuf.
On the other hand, members of open enumerations MUST NOT be annotated with alloy.proto#protoIndex, as open enumerations in Smithy translate to the raw string/int in protobuf, allowing unknown values to be captured regardless of how the target language generates enumerations.
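The @protoIndex constraints for structures, unions and closed enumerations (all members annotated or none, and a 0 index for closed enumerations) can be sketched as a small check. The function below is a hypothetical illustration in Python, with `None` standing for a member that has no @protoIndex; it does not reproduce the actual alloy validators, and it leaves out the open-enumeration rule.

```python
from typing import Dict, List, Optional


def check_proto_indexes(members: Dict[str, Optional[int]], is_enum: bool = False) -> List[str]:
    """Return human-readable violations of the @protoIndex rules for one shape.

    members maps each member name to its @protoIndex value, or None if unannotated.
    """
    errors: List[str] = []
    annotated = [index for index in members.values() if index is not None]
    if not annotated:
        # No explicit indexes at all: the implicit numbering applies instead.
        return errors
    if len(annotated) != len(members):
        missing = sorted(name for name, index in members.items() if index is None)
        errors.append("members missing @protoIndex: " + ", ".join(missing))
    if is_enum and 0 not in annotated:
        errors.append("one enumeration member must have @protoIndex(0)")
    return errors
```

An empty list means the shape passes both rules. For example, an enumeration annotated with indexes 1, 2 and 3 fails only the second check.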
The alloy.proto#protoEnabled trait can be used by tooling to select the shapes that should be taken into consideration when performing some protobuf-related validation or processing. This is used, for example, by the smithy-translate tool.
Reserved fields can be used by tooling to mark some fields as reserved, which can help prevent backward/forward compatibility problems when using smithy to describe protobuf/gRPC interactions.
Reserving makes certain field indexes unusable by the smithy specification. For example, if a range of 1 to 10 is provided, then the proto indexes of all fields in that structure must fall outside of that range. Ranges are inclusive.
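The inclusive-range rule is easy to get wrong at the boundaries, so here is a minimal Python sketch of the check. The function name and the representation of ranges as `(start, end)` tuples are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def reserved_conflicts(
    field_indexes: Dict[str, int],
    reserved_ranges: List[Tuple[int, int]],
) -> List[str]:
    """Return the names of fields whose proto index falls inside a reserved range.

    Ranges are inclusive on both ends, so (1, 10) reserves 1, 10 and everything between.
    """
    return [
        name
        for name, index in field_indexes.items()
        if any(start <= index <= end for start, end in reserved_ranges)
    ]
```

With the range 1 to 10 reserved, a field at index 10 is reported as a conflict, whereas a field at index 11 is accepted.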