Support oneof declaration in protobuf (#2546)
With `@ProtoOneOf` annotation for sealed classes and interfaces.
Inheritors of such an interface are expected to have one property with @ProtoNumber, encoded and decoded with special `OneOfEncoder/Decoder`. See documentation for design details.

Fixes #2538 
Fixes #67
xiaozhikang0916 authored Apr 25, 2024
1 parent f525f1a commit 251bca7
Showing 25 changed files with 1,646 additions and 290 deletions.
123 changes: 113 additions & 10 deletions docs/formats.md
@@ -18,6 +18,9 @@ stable, these are currently experimental features of Kotlin Serialization.
* [Integer types](#integer-types)
* [Lists as repeated fields](#lists-as-repeated-fields)
* [Packed fields](#packed-fields)
* [Oneof field (experimental)](#oneof-field-experimental)
* [Usage](#usage)
* [Alternative](#alternative)
* [ProtoBuf schema generator (experimental)](#protobuf-schema-generator-experimental)
* [Properties (experimental)](#properties-experimental)
* [Custom formats (experimental)](#custom-formats-experimental)
@@ -435,6 +438,106 @@ Per the standard packed fields can only be used on primitive numeric types. The
Per the [format description](https://developers.google.com/protocol-buffers/docs/encoding#packed) the parser ignores
the annotation, but rather reads list in either packed or repeated format.

### Oneof field (experimental)

The Kotlin Serialization `ProtoBuf` format supports [oneof](https://protobuf.dev/programming-guides/proto2/#oneof) fields,
based on the [Polymorphism](polymorphism.md) functionality.

#### Usage

Given a protobuf message defined like:

```proto
message Data {
required string name = 1;
oneof phone {
string home_phone = 2;
string work_phone = 3;
}
}
```

You can define a Kotlin class semantically equal to this message by following these steps:

* Declare a sealed interface or abstract class to represent the `oneof` group, called *the oneof interface*. In our example, the oneof interface is `IPhoneType`.
* Declare a Kotlin class as usual to represent the whole message (`class Data` in our example). In this class, add a property of the oneof interface type, annotated with `@ProtoOneOf`. Do not use `@ProtoNumber` for that property.
* Declare subclasses of the oneof interface, one per oneof group element. Each class must have **exactly one property** with the corresponding oneof element type. In our example, these classes are `HomePhone` and `WorkPhone`.
* Annotate the properties in the subclasses with `@ProtoNumber`, according to the original `oneof` definition. In our example, `val number: String` in `HomePhone` has the `@ProtoNumber(2)` annotation because of the field `string home_phone = 2;` in `oneof phone`.

<!--- INCLUDE
import kotlinx.serialization.*
import kotlinx.serialization.protobuf.*
-->

```kotlin
// The outer class
@Serializable
data class Data(
@ProtoNumber(1) val name: String,
@ProtoOneOf val phone: IPhoneType?,
)

// The oneof interface
@Serializable sealed interface IPhoneType

// Message holder for home_phone
@Serializable @JvmInline value class HomePhone(@ProtoNumber(2) val number: String): IPhoneType

// Message holder for work_phone. Can also be a value class, but we leave it as `data` to demonstrate that both variants can be used.
@Serializable data class WorkPhone(@ProtoNumber(3) val number: String): IPhoneType

fun main() {
val dataTom = Data("Tom", HomePhone("123"))
val stringTom = ProtoBuf.encodeToHexString(dataTom)
val dataJerry = Data("Jerry", WorkPhone("789"))
val stringJerry = ProtoBuf.encodeToHexString(dataJerry)
println(stringTom)
println(stringJerry)
println(ProtoBuf.decodeFromHexString<Data>(stringTom))
println(ProtoBuf.decodeFromHexString<Data>(stringJerry))
}
```

> You can get the full code [here](../guide/example/example-formats-08.kt).
```text
0a03546f6d1203313233
0a054a657272791a03373839
Data(name=Tom, phone=HomePhone(number=123))
Data(name=Jerry, phone=WorkPhone(number=789))
```

<!--- TEST -->

In [ProtoBuf diagnostic mode](https://protogen.marcgravell.com/decode) the first 2 lines in the output are equivalent to

```
Field #1: 0A String Length = 3, Hex = 03, UTF8 = "Tom" Field #2: 12 String Length = 3, Hex = 03, UTF8 = "123"
Field #1: 0A String Length = 5, Hex = 05, UTF8 = "Jerry" Field #3: 1A String Length = 3, Hex = 03, UTF8 = "789"
```
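
As a rough cross-check of the diagnostic output, the tag bytes can be split by hand: in the protobuf wire format the low three bits of a tag hold the wire type, and the remaining bits hold the field number. The sketch below is illustrative only. `decodeTag` is a hypothetical helper, not part of kotlinx.serialization, and it assumes single-byte tags (field numbers below 16).

```kotlin
/**
 * Splits a single-byte protobuf tag into (field number, wire type).
 * Hypothetical helper for illustration; only valid for field numbers below 16.
 */
fun decodeTag(tag: Int): Pair<Int, Int> = (tag ushr 3) to (tag and 0x7)

fun main() {
    // Tag bytes taken from the two hex strings above: 0a, 12 and 1a.
    for (tag in listOf(0x0a, 0x12, 0x1a)) {
        val (fieldNumber, wireType) = decodeTag(tag)
        // Wire type 2 means "length-delimited", which is how strings are encoded.
        println("tag=$tag field=$fieldNumber wireType=$wireType")
    }
}
```

The tags `0a`, `12` and `1a` map to fields 1, 2 and 3 respectively, all with wire type 2, which matches `name`, `home_phone` and `work_phone` in the message definition.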

Note that each group of `oneof` types should be tied to exactly one data class; avoid reusing it in
another data class. Otherwise, you may get ID conflicts or an `IllegalArgumentException` at runtime.

#### Alternative

You don't always need to use the `@ProtoOneOf` form for messages with `oneof` fields if the class is only used for deserialization.

For example, the following class:

```
@Serializable
data class Data2(
@ProtoNumber(1) val name: String,
@ProtoNumber(2) val homeNumber: String? = null,
@ProtoNumber(3) val workNumber: String? = null,
)
```

is also compatible with the `message Data` given above, which means the same input can be deserialized into it instead of `Data`, in case you don't want to deal with sealed hierarchies.

Note, however, that there are no exclusivity checks. If an instance of `Data2` has both (or neither) of `homeNumber` and `workNumber` set to non-null values and is serialized to protobuf, the output no longer complies with the original schema. If you send such data to another parser, one of the fields may be dropped, leading to unexpected behavior.
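
If you still prefer the flat form, one way to guard against this is to add the check yourself. The following is a minimal sketch, not something the library provides: `Data2Checked` is a hypothetical class, and the `@Serializable` and `@ProtoNumber` annotations are omitted so that the snippet runs without any dependencies. It only rejects the case where both fields are set.

```kotlin
// Hypothetical variant of Data2 with a hand-written exclusivity check.
// Serialization annotations are left out to keep the sketch self-contained.
data class Data2Checked(
    val name: String,
    val homeNumber: String? = null,
    val workNumber: String? = null,
) {
    init {
        // At most one member of the oneof group may be present.
        require(homeNumber == null || workNumber == null) {
            "homeNumber and workNumber are mutually exclusive"
        }
    }
}

fun main() {
    println(Data2Checked("Tom", homeNumber = "123"))
    // Setting both fields throws IllegalArgumentException from the init block.
    val result = runCatching { Data2Checked("Jerry", homeNumber = "123", workNumber = "789") }
    println(result.isFailure)
}
```

To require exactly one of the two fields instead, the condition could be changed to `(homeNumber == null) != (workNumber == null)`.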

### ProtoBuf schema generator (experimental)

As mentioned above, when working with protocol buffers you usually use a ".proto" file and a code generator for your
@@ -467,15 +570,15 @@ fun main() {
println(schemas)
}
```
> You can get the full code [here](../guide/example/example-formats-08.kt).
> You can get the full code [here](../guide/example/example-formats-09.kt).
Which would output as follows.

```text
syntax = "proto2";
// serial name 'example.exampleFormats08.SampleData'
// serial name 'example.exampleFormats09.SampleData'
message SampleData {
required int64 amount = 1;
optional string description = 2;
@@ -519,7 +622,7 @@ fun main() {
}
```

> You can get the full code [here](../guide/example/example-formats-09.kt).
> You can get the full code [here](../guide/example/example-formats-10.kt).
The resulting map has dot-separated keys representing keys of the nested objects.

@@ -599,7 +702,7 @@ fun main() {
}
```

> You can get the full code [here](../guide/example/example-formats-10.kt).
> You can get the full code [here](../guide/example/example-formats-11.kt).
As a result, we got all the primitive values in our object graph visited and put into a list
in _serial_ order.
@@ -701,7 +804,7 @@ fun main() {
}
```

> You can get the full code [here](../guide/example/example-formats-11.kt).
> You can get the full code [here](../guide/example/example-formats-12.kt).
Now we can convert a list of primitives back to an object tree.

@@ -792,7 +895,7 @@ fun main() {
}
-->

> You can get the full code [here](../guide/example/example-formats-12.kt).
> You can get the full code [here](../guide/example/example-formats-13.kt).
<!--- TEST
[kotlinx.serialization, kotlin, 9000]
@@ -899,7 +1002,7 @@ fun main() {
}
```

> You can get the full code [here](../guide/example/example-formats-13.kt).
> You can get the full code [here](../guide/example/example-formats-14.kt).
We see the size of the list added to the result, letting the decoder know where to stop.

@@ -1011,7 +1114,7 @@ fun main() {

```

> You can get the full code [here](../guide/example/example-formats-14.kt).
> You can get the full code [here](../guide/example/example-formats-15.kt).
In the output we see how not-null`!!` and `NULL` marks are used.

@@ -1139,7 +1242,7 @@ fun main() {
}
```
> You can get the full code [here](../guide/example/example-formats-15.kt).
> You can get the full code [here](../guide/example/example-formats-16.kt).
As we can see, the result is a dense binary format that only contains the data that is being serialized.
It can be easily tweaked for any kind of domain-specific compact encoding.
@@ -1333,7 +1436,7 @@ fun main() {
}
```
> You can get the full code [here](../guide/example/example-formats-16.kt).
> You can get the full code [here](../guide/example/example-formats-17.kt).
As we can see, our custom byte array format is being used, with the compact encoding of its size in one byte.

3 changes: 3 additions & 0 deletions docs/serialization-guide.md
@@ -152,6 +152,9 @@ Once the project is set up, we can start serializing some classes.
* <a name='integer-types'></a>[Integer types](formats.md#integer-types)
* <a name='lists-as-repeated-fields'></a>[Lists as repeated fields](formats.md#lists-as-repeated-fields)
* <a name='packed-fields'></a>[Packed fields](formats.md#packed-fields)
* <a name='oneof-field-experimental'></a>[Oneof field (experimental)](formats.md#oneof-field-experimental)
* <a name='usage'></a>[Usage](formats.md#usage)
* <a name='alternative'></a>[Alternative](formats.md#alternative)
* <a name='protobuf-schema-generator-experimental'></a>[ProtoBuf schema generator (experimental)](formats.md#protobuf-schema-generator-experimental)
* <a name='properties-experimental'></a>[Properties (experimental)](formats.md#properties-experimental)
* <a name='custom-formats-experimental'></a>[Custom formats (experimental)](formats.md#custom-formats-experimental)
7 changes: 7 additions & 0 deletions formats/protobuf/api/kotlinx-serialization-protobuf.api
@@ -39,6 +39,13 @@ public synthetic class kotlinx/serialization/protobuf/ProtoNumber$Impl : kotlinx
public final synthetic fun number ()I
}

public abstract interface annotation class kotlinx/serialization/protobuf/ProtoOneOf : java/lang/annotation/Annotation {
}

public synthetic class kotlinx/serialization/protobuf/ProtoOneOf$Impl : kotlinx/serialization/protobuf/ProtoOneOf {
public fun <init> ()V
}

public abstract interface annotation class kotlinx/serialization/protobuf/ProtoPacked : java/lang/annotation/Annotation {
}

3 changes: 3 additions & 0 deletions formats/protobuf/api/kotlinx-serialization-protobuf.klib.api
@@ -33,6 +33,9 @@ open annotation class kotlinx.serialization.protobuf/ProtoNumber : kotlin/Annota
final val number // kotlinx.serialization.protobuf/ProtoNumber.number|{}number[0]
final fun <get-number>(): kotlin/Int // kotlinx.serialization.protobuf/ProtoNumber.number.<get-number>|<get-number>(){}[0]
}
open annotation class kotlinx.serialization.protobuf/ProtoOneOf : kotlin/Annotation { // kotlinx.serialization.protobuf/ProtoOneOf|null[0]
constructor <init>() // kotlinx.serialization.protobuf/ProtoOneOf.<init>|<init>(){}[0]
}
open annotation class kotlinx.serialization.protobuf/ProtoPacked : kotlin/Annotation { // kotlinx.serialization.protobuf/ProtoPacked|null[0]
constructor <init>() // kotlinx.serialization.protobuf/ProtoPacked.<init>|<init>(){}[0]
}
Original file line number Diff line number Diff line change
@@ -53,3 +53,14 @@ public annotation class ProtoType(public val type: ProtoIntegerType)
@Target(AnnotationTarget.PROPERTY)
@ExperimentalSerializationApi
public annotation class ProtoPacked

/**
* Instructs that a particular property should be written as a [oneof](https://protobuf.dev/programming-guides/proto2/#oneof).
*
* The type of the annotated property should be polymorphic (interface or abstract class).
* Inheritors of this type would represent `one of` choices, and each inheritor should have exactly one property, annotated with [ProtoNumber].
*/
@SerialInfo
@Target(AnnotationTarget.PROPERTY)
@ExperimentalSerializationApi
public annotation class ProtoOneOf
Original file line number Diff line number Diff line change
@@ -8,6 +8,7 @@ package kotlinx.serialization.protobuf.internal

import kotlinx.serialization.*
import kotlinx.serialization.descriptors.*
import kotlinx.serialization.modules.*
import kotlinx.serialization.protobuf.*

internal typealias ProtoDesc = Long
@@ -16,19 +17,17 @@ internal const val i64 = 1
internal const val SIZE_DELIMITED = 2
internal const val i32 = 5

private const val INTTYPEMASK = (Int.MAX_VALUE.toLong() shr 1) shl 33
private const val PACKEDMASK = 1L shl 32
internal const val ID_HOLDER_ONE_OF = -2

@Suppress("NOTHING_TO_INLINE")
internal inline fun ProtoDesc(protoId: Int, type: ProtoIntegerType, packed: Boolean): ProtoDesc {
val packedBits = if (packed) 1L shl 32 else 0L
val signature = type.signature or packedBits
return signature or protoId.toLong()
}
private const val ONEOFMASK = 1L shl 36
private const val INTTYPEMASK = 3L shl 33
private const val PACKEDMASK = 1L shl 32

@Suppress("NOTHING_TO_INLINE")
internal inline fun ProtoDesc(protoId: Int, type: ProtoIntegerType): ProtoDesc {
return type.signature or protoId.toLong()
internal inline fun ProtoDesc(protoId: Int, type: ProtoIntegerType, packed: Boolean = false, oneOf: Boolean = false): ProtoDesc {
val packedBits = if (packed) PACKEDMASK else 0L
val oneOfBits = if (oneOf) ONEOFMASK else 0L
return packedBits or oneOfBits or type.signature or protoId.toLong()
}

internal inline val ProtoDesc.protoId: Int get() = (this and Int.MAX_VALUE.toLong()).toInt()
@@ -51,11 +50,19 @@ internal val SerialDescriptor.isPackable: Boolean
internal val ProtoDesc.isPacked: Boolean
get() = (this and PACKEDMASK) != 0L

internal val ProtoDesc.isOneOf: Boolean
get() = (this and ONEOFMASK) != 0L

internal fun ProtoDesc.overrideId(protoId: Int): ProtoDesc {
return this and (0xFFFFFFF00000000L) or protoId.toLong()
}

internal fun SerialDescriptor.extractParameters(index: Int): ProtoDesc {
val annotations = getElementAnnotations(index)
var protoId: Int = index + 1
var format: ProtoIntegerType = ProtoIntegerType.DEFAULT
var protoPacked = false
var isOneOf = false

for (i in annotations.indices) { // Allocation-friendly loop
val annotation = annotations[i]
@@ -65,23 +72,61 @@ internal fun SerialDescriptor.extractParameters(index: Int): ProtoDesc {
format = annotation.type
} else if (annotation is ProtoPacked) {
protoPacked = true
} else if (annotation is ProtoOneOf) {
isOneOf = true
}
}
return ProtoDesc(protoId, format, protoPacked)
if (isOneOf) {
// reset protoId to index-based for oneOf field,
// Decoder will restore the real proto id then from [ProtobufDecoder.index2IdMap]
// See [kotlinx.serialization.protobuf.internal.ProtobufDecoder.decodeElementIndex] for detail
protoId = index + 1
}
return ProtoDesc(protoId, format, protoPacked, isOneOf)
}

/**
* Get the proto id from the descriptor of [index] element,
* or return [ID_HOLDER_ONE_OF] if such element is marked with [ProtoOneOf]
*/
internal fun extractProtoId(descriptor: SerialDescriptor, index: Int, zeroBasedDefault: Boolean): Int {
val annotations = descriptor.getElementAnnotations(index)
var result = if (zeroBasedDefault) index else index + 1
for (i in annotations.indices) { // Allocation-friendly loop
val annotation = annotations[i]
if (annotation is ProtoNumber) {
return annotation.number
if (annotation is ProtoOneOf) {
// Fast return for one of field
return ID_HOLDER_ONE_OF
} else if (annotation is ProtoNumber) {
result = annotation.number
}
}
return if (zeroBasedDefault) index else index + 1
return result
}

internal class ProtobufDecodingException(message: String) : SerializationException(message)

internal expect fun Int.reverseBytes(): Int
internal expect fun Long.reverseBytes(): Long


internal fun SerialDescriptor.getAllOneOfSerializerOfField(
serializersModule: SerializersModule,
): List<SerialDescriptor> {
return when (this.kind) {
PolymorphicKind.OPEN -> serializersModule.getPolymorphicDescriptors(this)
PolymorphicKind.SEALED -> getElementDescriptor(1).elementDescriptors.toList()
else -> throw IllegalArgumentException("Class ${this.serialName} should be abstract or sealed or interface to be used as @ProtoOneOf property.")
}.onEach { desc ->
if (desc.getElementAnnotations(0).none { anno -> anno is ProtoNumber }) {
throw IllegalArgumentException("${desc.serialName} implementing oneOf type ${this.serialName} should have @ProtoNumber annotation in its single property.")
}
}
}

internal fun SerialDescriptor.getActualOneOfSerializer(
serializersModule: SerializersModule,
protoId: Int
): SerialDescriptor? {
return getAllOneOfSerializerOfField(serializersModule).find { it.extractParameters(0).protoId == protoId }
}
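
For readers who want to experiment with the descriptor bit layout touched by this change, here is a standalone sketch. It reuses the mask values visible in the diff (`1L shl 32` for packed, `3L shl 33` for the integer type, `1L shl 36` for oneof) and the `overrideId` mask, but it is a simplified copy built on assumptions: `protoDesc` and its `typeSignature` parameter (a raw `Long` standing in for `ProtoIntegerType.signature`) are hypothetical names, and only non-negative ids are considered.

```kotlin
// Mask values copied from the diff; the names are adapted for this sketch.
const val PACKED_MASK = 1L shl 32
const val INT_TYPE_MASK = 3L shl 33
const val ONE_OF_MASK = 1L shl 36

// Hypothetical stand-in for the internal ProtoDesc factory.
// typeSignature is assumed to already sit inside the INT_TYPE_MASK bits.
fun protoDesc(
    protoId: Int,
    typeSignature: Long = 0L,
    packed: Boolean = false,
    oneOf: Boolean = false,
): Long {
    val packedBits = if (packed) PACKED_MASK else 0L
    val oneOfBits = if (oneOf) ONE_OF_MASK else 0L
    return packedBits or oneOfBits or typeSignature or protoId.toLong()
}

// The low 31 bits carry the proto id; the flags live above bit 31.
val Long.protoId: Int get() = (this and Int.MAX_VALUE.toLong()).toInt()
val Long.isPacked: Boolean get() = (this and PACKED_MASK) != 0L
val Long.isOneOf: Boolean get() = (this and ONE_OF_MASK) != 0L

// Keeps the flag bits and swaps in a new id, using the same mask as the diff.
fun Long.overrideId(protoId: Int): Long =
    (this and 0xFFFFFFF00000000L) or protoId.toLong()

fun main() {
    val desc = protoDesc(protoId = 2, oneOf = true)
    val overridden = desc.overrideId(3)
    println("${desc.protoId} ${desc.isOneOf} ${desc.isPacked}")
    println("${overridden.protoId} ${overridden.isOneOf}")
}
```

Because the flags occupy bits 32 and above, replacing the id leaves the packed, integer-type and oneof bits untouched.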