[Java] Optimize collection serialization protocol by homogenization (#923)

* add codegen invocation annotation

* optimize collection serialization protocol by homogeneous info

* implement interpreter optimized collection read/write

* refine jit if/comparator exprs

* implement jit collection optimization

* add tests

* update depth to make generics push work

* fix collection opt jit

* add collection nested opt tests

* write decl class for meta share

* use walkpath to reuse classinfo/holder

* fix get classinfo

* inline classinfo to get smaller code size

* split methods into small methods

* add non final object type tests

* misc fix

* add missing header

* fix class resolver test

* fix jit method split

* update classinfo only for not decl type

* Fix method split for collection jit

* add map with set elements test

* Optimize StringBuilder/StringBuffer serialization (#908)

* Optimize StringBuilder/StringBuffer serialization

* try to optimize StringBuilder

* first to Check code Style

* hidden

* hidden

* bug fix and check code style

* delete excess code and add buffers to try testing

* fix

* try to fix problem

* fix function

* code fix

* code fix again

* Update java/fury-core/src/main/java/io/fury/serializer/Serializers.java

commit

Co-authored-by: Shawn <[email protected]>

* Update java/fury-core/src/main/java/io/fury/serializer/Serializers.java

commit

Co-authored-by: Shawn <[email protected]>

---------

Co-authored-by: pankoli <[email protected]>
Co-authored-by: Shawn <[email protected]>

* Bump release version to 0.1.2 (#924)

* [Doc] add basic type java format doc (#928)

add basic type java format doc

* [Java] speed test codegen speed by avoid duplicate codegen (#929)

* speed test codegen speed by avoid duplicate codegen

* fix cache

* fix class gc

* use a standalone lock for every key

* refine gc trigger

* skip cache for furyGC tests

* fix gc tests

* lint code

* add collection serialization java design doc

* update doc

* update doc

* debug ci

* Workaround G1ParScanThreadState::copy_to_survivor_space crash

* add iterate array bench results

* add benchmark suite

* fix jvm g1 workaround

* add CollectionSuite header

* fix crash

* skip unnecessary compress number

---------

Co-authored-by: PAN <[email protected]>
Co-authored-by: pankoli <[email protected]>
3 people authored Oct 3, 2023
1 parent 15af063 commit e8d26d3
Showing 28 changed files with 1,753 additions and 502 deletions.
6 changes: 3 additions & 3 deletions docs/protocols/README.md
@@ -1,4 +1,4 @@
# Serialization Protocols
- For Java Object Graph Protocol, see [java_object_graph_guide](java_object_graph.md) doc.
- For Cross Language Object Graph Protocol, see [xlang_object_graph_guide](./xlang_object_graph.md) doc.
- For Row Format Protocol, see [row format_guide](./row_format.md) doc.
- For Java Object Graph Protocol, see [java_object_graph_format](java_object_graph.md) doc.
- For Cross Language Object Graph Protocol, see [xlang_object_graph_format](./xlang_object_graph.md) doc.
- For Row Format Protocol, see [row format](./row_format.md) doc.
30 changes: 30 additions & 0 deletions docs/protocols/java_object_graph.md
@@ -47,8 +47,38 @@ Which encoding to choose:
- For JDK9+: fury use `coder` in `String` object for encoding, `ascii`/`utf-16` will be used for encoding.
- If the string is encoded by `utf-8`, then fury will use `utf-8` to decode the data. But currently fury doesn't enable utf-8 encoding by default for java. Cross-language string serialization of fury use `utf-8` by default.

## Array

## Collection
> All collection serializers must extend `io.fury.serializer.CollectionSerializers.CollectionSerializer`.

Format:
```java
length(positive varint) | collection header | elements header | elements data
```
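
The `length` prefix is described only as a positive varint. As a rough illustration, the sketch below uses one common variable-length scheme: 7 data bits per byte, with the high bit marking that more bytes follow. Both the scheme and the class name `PositiveVarIntSketch` are assumptions made for this example; this is not Fury's `MemoryBuffer` implementation.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper: a generic little-endian base-128 varint for non-negative ints.
final class PositiveVarIntSketch {
  static byte[] encode(int value) {
    if (value < 0) {
      throw new IllegalArgumentException("value must be non-negative");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream(5);
    // Emit 7 bits at a time; set the high bit while more bits remain.
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
    return out.toByteArray();
  }

  static int decode(byte[] bytes) {
    int result = 0;
    int shift = 0;
    for (byte b : bytes) {
      result |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        break; // high bit clear: this was the last byte
      }
      shift += 7;
    }
    return result;
  }
}
```

Under this scheme, a collection with fewer than 128 elements needs a single length byte.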

### collection header
- For `ArrayList/LinkedArrayList/HashSet/LinkedHashSet`, this will be empty.
- For `TreeSet`, this will be the `Comparator`.
- For subclasses of `ArrayList`, this may be extra object field info.

### elements header
In most cases, all collection elements are the same type and not null. The elements header encodes this homogeneous
information to avoid the cost of writing it for every element. Specifically, the elements header encodes four kinds
of information, each using one bit:
- Whether to track element refs: the first bit `0b1` of the header flags it.
- Whether the collection has null elements: the second bit `0b10` of the header flags it. If ref tracking is enabled
for this element type, this flag is invalid.
- Whether the element type is not the declared type: the 3rd bit `0b100` of the header flags it.
- Whether the element types differ: the 4th bit `0b1000` of the header flags it.

By default, all bits are unset, which means that no element tracks refs, all elements are the same type and not null,
and the actual element type is the declared type in the custom class field.
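
The four flags above can be sketched as a single pass over the elements. This is a minimal illustration and not the code in `CollectionSerializer`: the class name `ElementsHeaderSketch`, the constant names and the `computeHeader` signature are all hypothetical, and treating "any element whose class differs from the declared type" as the trigger for the 3rd bit is an assumption.

```java
import java.util.Collection;

// Hypothetical sketch of deriving the elements header bits described above.
final class ElementsHeaderSketch {
  static final int TRACKING_REF = 0b1;
  static final int HAS_NULL = 0b10;
  static final int NOT_DECL_ELEMENT_TYPE = 0b100;
  static final int NOT_SAME_TYPE = 0b1000;

  static int computeHeader(Collection<?> elements, Class<?> declaredType, boolean trackRef) {
    int header = trackRef ? TRACKING_REF : 0;
    Class<?> first = null;
    for (Object e : elements) {
      if (e == null) {
        // The null bit only matters when refs are not tracked.
        if (!trackRef) {
          header |= HAS_NULL;
        }
        continue;
      }
      Class<?> cls = e.getClass();
      if (first == null) {
        first = cls;
      } else if (first != cls) {
        header |= NOT_SAME_TYPE;
      }
      if (cls != declaredType) {
        header |= NOT_DECL_ELEMENT_TYPE; // assumption: any mismatch sets the bit
      }
    }
    return header;
  }
}
```

For a `List<Integer>` field holding only non-null `Integer` values with ref tracking disabled, this sketch yields `0`, which matches the default described above.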

### elements data
Based on the elements header, the serialization of elements data may skip `ref flag`/`null flag`/`element class info`.

See `io.fury.serializer.CollectionSerializers.CollectionSerializer#write/read` for an example.
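
To make the skipping concrete, here is a minimal sketch that covers only the null flag, using `java.nio.ByteBuffer` instead of Fury's `MemoryBuffer`. The class `ElementsDataSketch`, the flag byte values, and the fixed-width `int` length (used here in place of the varint) are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: per-element null flags are written only if the header says nulls exist.
final class ElementsDataSketch {
  static final int HAS_NULL = 0b10;
  static final byte NULL_FLAG = 0; // illustrative value, not Fury's constant
  static final byte NOT_NULL_FLAG = 1; // illustrative value, not Fury's constant

  static void write(ByteBuffer buffer, List<Integer> list) {
    buffer.putInt(list.size());
    int header = 0;
    for (Integer e : list) {
      if (e == null) {
        header |= HAS_NULL;
        break;
      }
    }
    buffer.put((byte) header);
    boolean hasNull = (header & HAS_NULL) != 0;
    for (Integer e : list) {
      if (hasNull) {
        if (e == null) {
          buffer.put(NULL_FLAG);
          continue;
        }
        buffer.put(NOT_NULL_FLAG);
      }
      buffer.putInt(e); // no flag byte at all when the list has no nulls
    }
  }

  static List<Integer> read(ByteBuffer buffer) {
    int size = buffer.getInt();
    boolean hasNull = (buffer.get() & HAS_NULL) != 0;
    List<Integer> list = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
      if (hasNull && buffer.get() == NULL_FLAG) {
        list.add(null);
      } else {
        list.add(buffer.getInt());
      }
    }
    return list;
  }
}
```

In this sketch a null-free list pays one header byte in total, while a list containing a null pays one extra flag byte per element.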

## Map

@@ -16,7 +16,11 @@

package io.fury.benchmark;

import io.fury.memory.MemoryBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import org.openjdk.jmh.Main;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Level;
@@ -44,24 +48,91 @@ public void setup() {
}
}

@Benchmark
// @Benchmark
public Object clearObjectArray(ArrayState state) {
Arrays.fill(state.objects, null);
return state.objects;
}

@Benchmark
// @Benchmark
public Object clearObjectArrayByCopy(ArrayState state) {
System.arraycopy(state.nilArray, 0, state.objects, 0, state.objects.length);
return state.objects;
}

@Benchmark
// @Benchmark
public Object clearIntArray(ArrayState state) {
Arrays.fill(state.ints, 0);
return state.ints;
}

private static Integer[] array = new Integer[100];
private static List<Integer> list = new ArrayList<>(100);

private static MemoryBuffer buffer = MemoryBuffer.newHeapBuffer(32);

static {
Random random = new Random(7);
for (int i = 0; i < 100; i++) {
int x = random.nextInt();
array[i] = x;
list.add(i, x);
}
}

// Benchmark                  Mode  Cnt         Score          Error  Units
// ArraySuite.iterateArray   thrpt    3  18107614.727 ± 25969433.513  ops/s
// ArraySuite.iterateList    thrpt    3   9448162.588 ± 13139664.082  ops/s
// ArraySuite.iterateList2   thrpt    3  14678631.109 ± 14579521.954  ops/s
// ArraySuite.serializeList  thrpt    3   1659718.571 ±  1323226.629  ops/s
@Benchmark
public Object iterateArray() {
int count = 0;
for (Integer o : array) {
if (o != null) {
count += o;
}
}
return count;
}

@Benchmark
public Object iterateList() {
int count = 0;
for (Integer o : list) {
if (o != null) {
count += o;
}
}
return count;
}

@Benchmark
public Object iterateList2() {
int count = 0;
int size = list.size();
for (int i = 0; i < size; i++) {
Integer o = list.get(i);
if (o != null) {
count += o;
}
}
return count;
}

@Benchmark
public Object serializeList() {
buffer.writerIndex(0);
int size = list.size();
for (int i = 0; i < size; i++) {
Integer o = list.get(i);
if (o != null) {
buffer.writeVarInt(o);
}
}
return buffer;
}

// Mac Monterey 12.1: 2.6 GHz 6-Core Intel Core i7
// JDK11
// Benchmark (arraySize) Mode Cnt Score Error Units
@@ -0,0 +1,68 @@
/*
* Copyright 2023 The Fury Authors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package io.fury.benchmark;

import io.fury.Fury;
import java.util.ArrayList;
import java.util.List;
import org.openjdk.jmh.Main;
import org.openjdk.jmh.annotations.Benchmark;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
* Test suite for collection.
*
* @author chaokunyang
*/
public class CollectionSuite {
private static final Logger LOG = LoggerFactory.getLogger(CollectionSuite.class);

public static void main(String[] args) throws Exception {
if (args.length == 0) {
String commandLine = "io.*CollectionSuite.* -f 3 -wi 5 -i 5 -t 1 -w 2s -r 2s -rf csv";
System.out.println(commandLine);
args = commandLine.split(" ");
}
Main.main(args);
}

private static Fury fury = Fury.builder().build();
private static List<Integer> list1 = new ArrayList<>(1024);
private static byte[] list1Bytes;

static {
for (int i = 0; i < 1024; i++) {
list1.add(i % 255);
}
list1Bytes = fury.serialize(list1);
LOG.info("Size: {}", list1Bytes.length);
}

@Benchmark
public Object serializeArrayList() {
return fury.serialize(list1);
}

@Benchmark
public Object deserializeArrayList() {
return fury.deserialize(list1Bytes);
}
// Benchmark                              Mode  Cnt       Score        Error  Units
// CollectionSuite.deserializeArrayList  thrpt    3  175281.624 ± 142913.891  ops/s
// CollectionSuite.serializeArrayList    thrpt    3  137648.540 ± 158192.786  ops/s
}
25 changes: 25 additions & 0 deletions java/fury-core/src/main/java/io/fury/Fury.java
@@ -336,6 +336,17 @@ public <T> void writeRef(MemoryBuffer buffer, T obj, Serializer<T> serializer) {
}
}

/** Write object class and data without tracking ref. */
public void writeNullable(MemoryBuffer buffer, Object obj) {
if (obj == null) {
buffer.writeByte(Fury.NULL_FLAG);
} else {
buffer.writeByte(Fury.NOT_NULL_VALUE_FLAG);
writeNonRef(buffer, obj);
}
}

/** Write object class and data without tracking ref. */
public void writeNullable(MemoryBuffer buffer, Object obj, ClassInfoCache classInfoCache) {
if (obj == null) {
buffer.writeByte(Fury.NULL_FLAG);
@@ -781,6 +792,16 @@ public Object readNonRef(MemoryBuffer buffer, ClassInfoCache classInfoCache) {
return readDataInternal(buffer, classResolver.readClassInfo(buffer, classInfoCache));
}

/** Read object class and data without tracking ref. */
public Object readNullable(MemoryBuffer buffer) {
byte headFlag = buffer.readByte();
if (headFlag == Fury.NULL_FLAG) {
return null;
} else {
return readNonRef(buffer);
}
}

/** Class should be read already. */
public Object readData(MemoryBuffer buffer, ClassInfo classInfo) {
depth++;
@@ -1197,6 +1218,10 @@ public void setDepth(int depth) {
this.depth = depth;
}

public void incDepth(int diff) {
this.depth += diff;
}

// Invoked by jit
public StringSerializer getStringSerializer() {
return stringSerializer;
29 changes: 29 additions & 0 deletions java/fury-core/src/main/java/io/fury/annotation/CodegenInvoke.java
@@ -0,0 +1,29 @@
/*
* Copyright 2023 The Fury Authors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package io.fury.annotation;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/**
* An annotation to mark a method will be invoked by generated method. This annotation is used for
* documentation only.
*
* @author chaokunyang
*/
@Retention(RetentionPolicy.SOURCE)
public @interface CodegenInvoke {}