From f0cdc4c8e4ab76b937ff7384ec773d3e443570d8 Mon Sep 17 00:00:00 2001
From: imbajin
Date: Sat, 9 Sep 2023 12:51:05 +0000
Subject: [PATCH] doc: sync cn and en contribution doc with server CONTRIBUTION.md (#279)

// what is the name of the brother and the name of the place?
g.V(pluto).out('brother').as('god').out('lives').as('place').select('god','place').by('name')

We recommend running the code above visually with HugeGraph-Studio. It can also be executed via HugeGraph-Client, HugeApi, GremlinConsole, GremlinDriver, and other means.

3.2 Summary

HugeGraph currently supports the Gremlin syntax; users can implement various query requirements via Gremlin / REST-API.
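Besides the GUI tools above, the same query can be sent to HugeGraph-Server's Gremlin endpoint over HTTP. A minimal sketch (the URL and the local deployment below are assumptions, not part of the original docs; adjust to your server):

```python
import json
from urllib import request

# Assumed default address of a local HugeGraph-Server; adjust as needed.
GREMLIN_URL = "http://localhost:8080/gremlin"

def build_gremlin_request(query: str) -> request.Request:
    """Wrap a Gremlin query into the JSON POST body the endpoint expects."""
    body = json.dumps({"gremlin": query}).encode("utf-8")
    return request.Request(GREMLIN_URL, data=body,
                           headers={"Content-Type": "application/json"})

req = build_gremlin_request(
    "g.V(pluto).out('brother').as('god').out('lives').as('place')"
    ".select('god','place').by('name')"
)
# request.urlopen(req) would return the JSON result once a server is running.
```

The request body format mirrors what GremlinConsole sends under the hood; only the transport differs.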

8 - PERFORMANCE

8.1 - HugeGraph BenchMark Performance

1 Test Environment

1.1 Hardware

| CPU                                          | Memory | NIC       | Disk      |
|----------------------------------------------|--------|-----------|-----------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G   | 10000Mbps | 750GB SSD |

1.2 Software

1.2.1 Test cases

The tests use graphdb-benchmark, a benchmark suite for graph databases. It covers four categories of tests:

  • Massive Insertion: batch-insert vertices and edges, committing a certain number of vertices or edges at a time

  • Single Insertion: insert one record at a time, committing each vertex or edge immediately

  • Query: the basic query operations of a graph database:

    • Find Neighbors: query the neighbors of all vertices
    • Find Adjacent Nodes: query the adjacent vertices of all edges
    • Find Shortest Path: query the shortest paths from the first vertex to 100 random vertices
  • Clustering: community detection based on the Louvain Method

1.2.2 Test datasets

The tests use both synthetic and real-world data.

Dataset sizes used in this test:

| Name                    | Vertices  | Edges      | File size |
|-------------------------|-----------|------------|-----------|
| email-enron.txt         | 36,691    | 367,661    | 4MB       |
| com-youtube.ungraph.txt | 1,157,806 | 2,987,624  | 38.7MB    |
| amazon0601.txt          | 403,393   | 3,387,388  | 47.9MB    |
| com-lj.ungraph.txt      | 3,997,961 | 34,681,189 | 479MB     |
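For a rough sense of how dense these datasets are, the average degree (edges per vertex) follows directly from the table above (a quick derivation, figures taken from the table):

```python
# (vertices, edges) per dataset, copied from the table above
datasets = {
    "email-enron.txt": (36_691, 367_661),
    "com-youtube.ungraph.txt": (1_157_806, 2_987_624),
    "amazon0601.txt": (403_393, 3_387_388),
    "com-lj.ungraph.txt": (3_997_961, 34_681_189),
}

for name, (vertices, edges) in datasets.items():
    print(f"{name}: average degree ~ {edges / vertices:.1f}")
```

email-enron is the densest (about 10 edges per vertex), which helps explain why the later per-dataset results do not scale linearly with edge count.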

1.3 Service configuration

  • HugeGraph version: 0.5.6; RestServer, Gremlin Server and the backends all run on the same server

    • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, using thrift+Cassandra mode

    • Cassandra version: cassandra-3.10, commit-log and data share the SSD
  • Neo4j version: 2.0.1

The Titan version that graphdb-benchmark is adapted to is 0.5.4.

2 Test Results

2.1 Batch insertion performance

| Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w) |
|-----------|------------------|------------------|---------------------------|-----------------------|
| HugeGraph | 0.629            | 5.711            | 5.243                     | 67.033                |
| Titan     | 10.15            | 108.569          | 150.266                   | 1217.944              |
| Neo4j     | 3.884            | 18.938           | 24.890                    | 281.537               |

Notes

  • The figures in "()" in the header are the data scale, measured in edges
  • The values in the table are batch-insertion times in seconds
  • For example, HugeGraph with RocksDB inserts the 300w edges of the amazon0601 dataset in 5.711s
Conclusion
  • Batch insertion performance: HugeGraph(RocksDB) > Neo4j > Titan(thrift+Cassandra)
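The implied insertion rates can be derived from the table (taking the 300w-edge amazon0601 column as an example; a quick sanity check, not part of the original report):

```python
edges = 3_000_000  # amazon0601 column scale: 300w edges
batch_insert_seconds = {
    "HugeGraph(RocksDB)": 5.711,
    "Titan(thrift+Cassandra)": 108.569,
    "Neo4j": 18.938,
}

for backend, seconds in batch_insert_seconds.items():
    print(f"{backend}: {edges / seconds:,.0f} edges/s")
```

HugeGraph's roughly 525,000 edges/s here is about 19x Titan's rate on the same column.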

2.2 Traversal performance

2.2.1 Terminology
  • FN (Find Neighbor): traverse all vertices; for each vertex find its adjacent edges, then find the other vertex via the edge and vertex
  • FA (Find Adjacent): traverse all edges; for each edge obtain its source vertex and target vertex
2.2.2 FN performance
| Backend   | email-enron(3.6w) | amazon0601(40w) | com-youtube.ungraph(120w) | com-lj.ungraph(400w) |
|-----------|-------------------|-----------------|---------------------------|----------------------|
| HugeGraph | 4.072             | 45.118          | 66.006                    | 609.083              |
| Titan     | 8.084             | 92.507          | 184.543                   | 1099.371             |
| Neo4j     | 2.424             | 10.537          | 11.609                    | 106.919              |

Notes

  • The figures in "()" in the header are the data scale, measured in vertices
  • The values in the table are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all vertices of amazon0601, finding their adjacent edges and the other vertex, in 45.118s in total
2.2.3 FA performance
| Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w) |
|-----------|------------------|------------------|---------------------------|-----------------------|
| HugeGraph | 1.540            | 10.764           | 11.243                    | 151.271               |
| Titan     | 7.361            | 93.344           | 169.218                   | 1085.235              |
| Neo4j     | 1.673            | 4.775            | 4.284                     | 40.507                |

Notes

  • The figures in "()" in the header are the data scale, measured in edges
  • The values in the table are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all edges of amazon0601, querying both vertices of each edge, in 10.764s in total
Conclusion
  • Traversal performance: Neo4j > HugeGraph(RocksDB) > Titan(thrift+Cassandra)

2.3 Performance of common graph analysis methods in HugeGraph

Terminology
  • FS (Find Shortest Path): find the shortest path
  • K-neighbor: all vertices reachable from the start vertex within K hops, i.e., those reachable in 1, 2, 3…(K-1), K hops
  • K-out: vertices reachable from the start vertex via exactly K outgoing hops
FS performance
| Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w) |
|-----------|------------------|------------------|---------------------------|-----------------------|
| HugeGraph | 0.494            | 0.103            | 3.364                     | 8.155                 |
| Titan     | 11.818           | 0.239            | 377.709                   | 575.678               |
| Neo4j     | 1.719            | 1.800            | 1.956                     | 8.530                 |

Notes

  • The figures in "()" in the header are the data scale, measured in edges
  • The values in the table are the times, in seconds, to find the shortest paths from the first vertex to 100 randomly chosen vertices
  • For example, HugeGraph with the RocksDB backend finds the shortest paths from the first vertex of amazon0601 to 100 random vertices in 0.103s in total
Conclusion
  • With small data scale or sparsely connected vertices, HugeGraph outperforms Neo4j and Titan
  • As the data scale grows and vertex connectivity increases, HugeGraph and Neo4j converge, both far ahead of Titan
K-neighbor performance
| Vertex | Depth | 1      | 2      | 3      | 4      | 5      | 6   |
|--------|-------|--------|--------|--------|--------|--------|-----|
| v1     | time  | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM |
| v111   | time  | 0.027s | 0.034s | 0.115s | 1.36s  | OOM    |     |
| v1111  | time  | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM |

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
K-out performance
| Vertex | Depth | 1      | 2      | 3      | 4       | 5         | 6   |
|--------|-------|--------|--------|--------|---------|-----------|-----|
| v1     | time  | 0.054s | 0.057s | 0.109s | 0.526s  | 3.77s     | OOM |
|        | count | 10     | 133    | 2,453  | 50,830  | 1,128,688 |     |
| v111   | time  | 0.032s | 0.042s | 0.136s | 1.25s   | 20.62s    | OOM |
|        | count | 10     | 211    | 4,944  | 113,150 | 2,629,970 |     |
| v1111  | time  | 0.039s | 0.045s | 0.053s | 1.10s   | 2.92s     | OOM |
|        | count | 10     | 140    | 2,555  | 50,825  | 1,070,230 |     |

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
Conclusion
  • In FS scenarios, HugeGraph outperforms Neo4j and Titan
  • In K-neighbor and K-out scenarios, HugeGraph returns results within seconds for depths up to 5
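In practice a K-out query is issued against HugeGraph-Server's RESTful traverser API. A minimal URL-building sketch (the base path, graph name, and parameter names below are assumptions based on typical HugeGraph deployments; check your server version's API docs):

```python
from urllib.parse import urlencode

# Assumed server address and graph name for illustration only.
BASE = "http://localhost:8080/apis/graphs/hugegraph/traversers/kout"

def kout_url(source_id: str, max_depth: int, direction: str = "OUT") -> str:
    """Build the GET URL for a K-out traversal from one source vertex."""
    params = {"source": f'"{source_id}"',  # string ids are quoted in the API
              "max_depth": max_depth,
              "direction": direction}
    return f"{BASE}?{urlencode(params)}"

print(kout_url("v1", 3))
```

The depth-5 OOM rows above correspond to passing `max_depth=5` on the densest start vertices.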

2.4 Comprehensive graph performance test - CW

| Database        | Scale 1000 | Scale 5000 | Scale 10000 | Scale 20000 |
|-----------------|------------|------------|-------------|-------------|
| HugeGraph(core) | 20.804     | 242.099    | 744.780     | 1700.547    |
| Titan           | 45.790     | 820.633    | 2652.235    | 9568.623    |
| Neo4j           | 5.913      | 50.267     | 142.354     | 460.880     |

Notes

  • "Scale" is measured in vertices
  • The values in the table are the times, in seconds, for community detection to complete; for example, HugeGraph with the RocksDB backend takes 744.780s on the scale-10000 dataset for community aggregation to stop changing
  • The CW test is a comprehensive evaluation of CRUD
  • In this test HugeGraph, like Titan, operates on core directly rather than going through the client
Conclusion
  • Community clustering performance: Neo4j > HugeGraph > Titan

8.2 - HugeGraph-API Performance

The HugeGraph API performance tests mainly measure HugeGraph-Server's capacity for handling concurrent RESTful API requests, including:

  • single insertion of vertices/edges
  • batch insertion of vertices/edges
  • queries of vertices/edges

For the RESTful API performance results of each HugeGraph release, see:

Coming soon, stay tuned!

8.2.1 - v0.5.6 Stand-alone(RocksDB)

1 Test Environment

Machine under test

| CPU                                          | Memory | NIC       | Disk                |
|----------------------------------------------|--------|-----------|---------------------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G   | 10000Mbps | 750GB SSD, 2.7T HDD |
  • Load generator: same configuration as the machine under test
  • Test tool: apache-Jmeter-2.5.1

Note: the load generator and the machine under test are in the same data center

2 Test Description

2.1 Definitions (all times in ms)

  • Samples – the total number of threads completed in this scenario
  • Average – average response time
  • Median – the statistical median of response times
  • 90% Line – the response time below which 90% of all threads fall
  • Min – minimum response time
  • Max – maximum response time
  • Error – error rate
  • Throughput – throughput
  • KB/sec – throughput measured in traffic

2.2 Underlying storage

RocksDB is used as the backend storage; HugeGraph and RocksDB run on the same machine. The server configuration files keep their defaults except for the host and port.

3 Performance summary

  1. HugeGraph single-inserts vertices and edges at roughly 1w per second
  2. Batch insertion of vertices and edges is far faster than single insertion
  3. Querying vertices and edges by id sustains a concurrency above 13000, with average request latency under 50ms

4 Results and analysis

4.1 Batch insertion

4.1.1 Stress-limit test
Method

Keep raising the concurrency to find the maximum load under which the server still serves normally

Stress parameters

Duration: 5min

Maximum vertex insertion rate:
image

Conclusion:

  • At concurrency 2200, vertex throughput is 2026.8; records processed per second: 2026.8*200=405360/s
Maximum edge insertion rate
image

Conclusion:

  • At concurrency 900, edge throughput is 776.9; records processed per second: 776.9*500=388450/s
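Those per-second figures come from multiplying request throughput by batch size (each batch request carries 200 vertices or 500 edges in these tests); the arithmetic can be checked directly:

```python
# Batch sizes implied by the conclusions above: 200 vertices / 500 edges per request
vertex_rps, vertices_per_request = 2026.8, 200
edge_rps, edges_per_request = 776.9, 500

print(round(vertex_rps * vertices_per_request))  # vertices inserted per second
print(round(edge_rps * edges_per_request))       # edges inserted per second
```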

4.2 Single insertion

4.2.1 Stress-limit test
Method

Keep raising the concurrency to find the maximum load under which the server still serves normally

Stress parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Single insertion of vertices
image

Conclusion:

  • At concurrency 11500, throughput is 10730; the single-insert concurrency capacity for vertices is 11500
Single insertion of edges
image

Conclusion:

  • At concurrency 9000, throughput is 8418; the single-insert concurrency capacity for edges is 9000

4.3 Query by id

4.3.1 Stress-limit test
Method

Keep raising the concurrency to find the maximum load under which the server still serves normally

Stress parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Vertex query by id
image

Conclusion:

  • At concurrency 14000, throughput is 12663; the by-id query concurrency capacity for vertices is 14000, with 44ms average latency
Edge query by id
image

Conclusion:

  • At concurrency 13000, throughput is 12225; the by-id query concurrency capacity for edges is 13000, with 12ms average latency

8.2.2 - v0.5.6 Cluster(Cassandra)

1 Test Environment

Machine under test

| CPU                                          | Memory | NIC       | Disk                |
|----------------------------------------------|--------|-----------|---------------------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G   | 10000Mbps | 750GB SSD, 2.7T HDD |
  • Load generator: same configuration as the machine under test
  • Test tool: apache-Jmeter-2.5.1

Note: the load generator and the machine under test are in the same data center

2 Test Description

2.1 Definitions (all times in ms)

  • Samples – the total number of threads completed in this scenario
  • Average – average response time
  • Median – the statistical median of response times
  • 90% Line – the response time below which 90% of all threads fall
  • Min – minimum response time
  • Max – maximum response time
  • Error – error rate
  • Throughput – throughput
  • KB/sec – throughput measured in traffic

2.2 Underlying storage

A 15-node Cassandra cluster is used as the backend storage; HugeGraph and the Cassandra cluster run on different servers. The server configuration files keep their defaults except for the host and port.

3 Performance summary

  1. HugeGraph single-inserts vertices at 9000 per second and edges at 4500 per second
  2. Batch insertion reaches 5w/s for vertices and 15w/s for edges, far faster than single insertion
  3. Querying vertices and edges by id sustains a concurrency above 12000, with average request latency under 70ms

4 Results and analysis

4.1 Batch insertion

4.1.1 Stress-limit test
Method

Keep raising the concurrency to find the maximum load under which the server still serves normally

Stress parameters

Duration: 5min

Maximum vertex insertion rate:
image

Conclusion:

  • At concurrency 3500, vertex throughput is 261; records processed per second: 261*200=52200/s
Maximum edge insertion rate
image

Conclusion:

  • At concurrency 1000, edge throughput is 323; records processed per second: 323*500=161500/s

4.2 Single insertion

4.2.1 Stress-limit test
Method

Keep raising the concurrency to find the maximum load under which the server still serves normally

Stress parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Single insertion of vertices
image

Conclusion:

  • At concurrency 9000, throughput is 8400; the single-insert concurrency capacity for vertices is 9000
Single insertion of edges
image

Conclusion:

  • At concurrency 4500, throughput is 4160; the single-insert concurrency capacity for edges is 4500

4.3 Query by id

4.3.1 Stress-limit test
Method

Keep raising the concurrency to find the maximum load under which the server still serves normally

Stress parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Vertex query by id
image

Conclusion:

  • At concurrency 14500, throughput is 13576; the by-id query concurrency capacity for vertices is 14500, with 11ms average latency
Edge query by id
image

Conclusion:

  • At concurrency 12000, throughput is 10688; the by-id query concurrency capacity for edges is 12000, with 63ms average latency

8.3 - HugeGraph-Loader Performance

Usage scenarios

When the graph data (vertices and edges) to batch-import is at the billion scale or below, or the total data volume is under a TB, the HugeGraph-Loader tool can be used to import graph data continuously at high speed

Performance

All tests use the edge data of the web-URL dataset

RocksDB standalone performance

  • With label index disabled: 22.8w edges/s
  • With label index enabled: 15.3w edges/s

Cassandra cluster performance

  • With label index enabled (the default): 6.3w edges/s
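From the quoted rates, a rough time-to-import estimate is straightforward (a back-of-the-envelope sketch; 22.8w means 228,000 edges/s, and real throughput varies with schema and hardware):

```python
def import_eta_hours(edge_count: int, edges_per_second: float) -> float:
    """Estimated wall-clock hours to import edge_count edges at a steady rate."""
    return edge_count / edges_per_second / 3600

# 1 billion edges via RocksDB with label index disabled (22.8w = 228,000 edges/s)
print(f"{import_eta_hours(1_000_000_000, 228_000):.1f} hours")
```

At the Cassandra-cluster rate of 63,000 edges/s the same billion-edge import would take roughly 4.4 hours instead.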

8.4 -

1 Test Environment

1.1 Hardware

| CPU                                          | Memory | NIC       | Disk      |
|----------------------------------------------|--------|-----------|-----------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G   | 10000Mbps | 750GB SSD |

1.2 Software

1.2.1 Test cases

The tests use graphdb-benchmark, a benchmark suite for graph databases. It covers four categories of tests:

  • Massive Insertion: batch-insert vertices and edges, committing a certain number of vertices or edges at a time

  • Single Insertion: insert one record at a time, committing each vertex or edge immediately

  • Query: the basic query operations of a graph database:

    • Find Neighbors: query the neighbors of all vertices
    • Find Adjacent Nodes: query the adjacent vertices of all edges
    • Find Shortest Path: query the shortest paths from the first vertex to 100 random vertices
  • Clustering: community detection based on the Louvain Method

1.2.2 Test datasets

The tests use both synthetic and real-world data.

Dataset sizes used in this test:

| Name                    | Vertices  | Edges     | File size |
|-------------------------|-----------|-----------|-----------|
| email-enron.txt         | 36,691    | 367,661   | 4MB       |
| com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB    |
| amazon0601.txt          | 403,393   | 3,387,388 | 47.9MB    |

1.3 Service configuration

  • HugeGraph version: 0.4.4; RestServer, Gremlin Server and the backends all run on the same server
  • Cassandra version: cassandra-3.10, commit-log and data share the SSD
  • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, using thrift+Cassandra mode

The Titan version that graphdb-benchmark is adapted to is 0.5.4.

2 Test Results

2.1 Batch insertion performance

| Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) |
|-----------|------------------|------------------|---------------------------|
| Titan     | 9.516            | 88.123           | 111.586                   |
| RocksDB   | 2.345            | 14.076           | 16.636                    |
| Cassandra | 11.930           | 108.709          | 101.959                   |
| Memory    | 3.077            | 15.204           | 13.841                    |

Notes

  • The figures in "()" in the header are the data scale, measured in edges
  • The values in the table are batch-insertion times in seconds
  • For example, HugeGraph with RocksDB inserts the 300w edges of the amazon0601 dataset in 14.076s, about 21w edges/s
Conclusion
  • The RocksDB and Memory backends outperform Cassandra for insertion
  • With Cassandra as the backend, HugeGraph's insertion performance is close to Titan's

2.2 Traversal performance

2.2.1 Terminology
  • FN (Find Neighbor): traverse all vertices; for each vertex find its adjacent edges, then find the other vertex via the edge and vertex
  • FA (Find Adjacent): traverse all edges; for each edge obtain its source vertex and target vertex
2.2.2 FN performance
| Backend   | email-enron(3.6w) | amazon0601(40w) | com-youtube.ungraph(120w) |
|-----------|-------------------|-----------------|---------------------------|
| Titan     | 7.724             | 70.935          | 128.884                   |
| RocksDB   | 8.876             | 65.852          | 63.388                    |
| Cassandra | 13.125            | 126.959         | 102.580                   |
| Memory    | 22.309            | 207.411         | 165.609                   |

Notes

  • The figures in "()" in the header are the data scale, measured in vertices
  • The values in the table are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all vertices of amazon0601, finding their adjacent edges and the other vertex, in 65.852s in total
2.2.3 FA performance
| Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) |
|-----------|------------------|------------------|---------------------------|
| Titan     | 7.119            | 63.353           | 115.633                   |
| RocksDB   | 6.032            | 64.526           | 52.721                    |
| Cassandra | 9.410            | 102.766          | 94.197                    |
| Memory    | 12.340           | 195.444          | 140.89                    |

Notes

  • The figures in "()" in the header are the data scale, measured in edges
  • The values in the table are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all edges of amazon0601, querying both vertices of each edge, in 64.526s in total
Conclusion
  • HugeGraph RocksDB > Titan thrift+Cassandra > HugeGraph Cassandra > HugeGraph Memory

2.3 Performance of common graph analysis methods in HugeGraph

Terminology
  • FS (Find Shortest Path): find the shortest path
  • K-neighbor: all vertices reachable from the start vertex within K hops, i.e., those reachable in 1, 2, 3…(K-1), K hops
  • K-out: vertices reachable from the start vertex via exactly K outgoing hops
FS performance
| Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) |
|-----------|------------------|------------------|---------------------------|
| Titan     | 11.333           | 0.313            | 376.06                    |
| RocksDB   | 44.391           | 2.221            | 268.792                   |
| Cassandra | 39.845           | 3.337            | 331.113                   |
| Memory    | 35.638           | 2.059            | 388.987                   |

Notes

  • The figures in "()" in the header are the data scale, measured in edges
  • The values in the table are the times, in seconds, to find the shortest paths from the first vertex to 100 randomly chosen vertices
  • For example, HugeGraph with RocksDB finds the shortest paths from the first vertex to 100 random vertices in 2.059s in total
Conclusion
  • With small data scale or sparsely connected vertices, Titan's shortest-path performance is better than HugeGraph's
  • As the data scale grows and vertex connectivity increases, HugeGraph's shortest-path performance surpasses Titan's
K-neighbor performance
| Vertex | Depth | 1      | 2      | 3      | 4      | 5      | 6   |
|--------|-------|--------|--------|--------|--------|--------|-----|
| v1     | time  | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM |
| v111   | time  | 0.027s | 0.034s | 0.115s | 1.36s  | OOM    |     |
| v1111  | time  | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM |

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
K-out performance
| Vertex | Depth | 1      | 2      | 3      | 4       | 5         | 6   |
|--------|-------|--------|--------|--------|---------|-----------|-----|
| v1     | time  | 0.054s | 0.057s | 0.109s | 0.526s  | 3.77s     | OOM |
|        | count | 10     | 133    | 2,453  | 50,830  | 1,128,688 |     |
| v111   | time  | 0.032s | 0.042s | 0.136s | 1.25s   | 20.62s    | OOM |
|        | count | 10     | 211    | 4,944  | 113,150 | 2,629,970 |     |
| v1111  | time  | 0.039s | 0.045s | 0.053s | 1.10s   | 2.92s     | OOM |
|        | count | 10     | 140    | 2,555  | 50,825  | 1,070,230 |     |

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
Conclusion
  • In FS scenarios, HugeGraph outperforms Titan
  • In K-neighbor and K-out scenarios, HugeGraph returns results within seconds for depths up to 5

2.4 Comprehensive graph performance test - CW

| Database        | Scale 1000 | Scale 5000 | Scale 10000 | Scale 20000 |
|-----------------|------------|------------|-------------|-------------|
| Titan           | 45.943     | 849.168    | 2737.117    | 9791.46     |
| Memory(core)    | 41.077     | 1825.905   | *           | *           |
| Cassandra(core) | 39.783     | 862.744    | 2423.136    | 6564.191    |
| RocksDB(core)   | 33.383     | 199.894    | 763.869     | 1677.813    |

Notes

  • "Scale" is measured in vertices
  • The values in the table are the times, in seconds, for community detection to complete; for example, HugeGraph with the RocksDB backend takes 763.869s on the scale-10000 dataset for community aggregation to stop changing
  • "*" means not finished within 10000s
  • The CW test is a comprehensive evaluation of CRUD
  • The last three rows are different HugeGraph backends; in this test HugeGraph, like Titan, operates on core directly rather than going through the client
Conclusion
  • With the Cassandra backend, HugeGraph slightly outperforms Titan, and the advantage grows with data scale: at scale 20000 it is 30% faster than Titan
  • With the RocksDB backend, HugeGraph far outperforms both Titan and HugeGraph's own Cassandra backend, roughly 6x and 4x faster respectively

9 - CHANGELOGS

9.1 - HugeGraph 1.0.0 Release Notes

OLTP API & Client Updates

API/Client interface updates

  • Support the /exception/trace API for hot-updating the trace switch.
  • Support the Cypher graph query language API.
  • Support browsing the list of provided APIs via the Swagger UI.
  • Change the type of the 'limit' parameter in each algorithm from long to int.
  • Support writing data to HBase from the Client side, bypassing the Server (Beta).

Core & Server

Feature updates

  • Support Java 11.
  • Support 2 new OLTP algorithms: adamic-adar and resource-allocation.
  • Support hashed RowKeys for the HBase backend, and allow pre-initializing HBase tables.
  • Support the Cypher graph query language.
  • Support automatic management and failover of the cluster Master role.
  • Support 16 OLAP algorithms, including LPA, Louvain, PageRank, BetweennessCentrality, RingsDetect, etc.
  • Comply with the Apache Foundation's release requirements, including license compliance, release process, code style, etc., supporting Apache releases.

Bug fixes

  • Fix the inability to query edges by multiple Labels and properties.
  • Add a maximum depth limit to the rings-detection algorithm.
  • Fix abnormal results returned by the tree() statement.
  • Fix the check exception when batch-updating edges with Ids passed in.
  • Resolve unexpected Task status issues.
  • Resolve the edge cache not being cleared when updating vertices.
  • Fix an error when executing g.V() on the MySQL backend.
  • Fix a problem caused by server-info being unable to time out.
  • Export the ConditionP type for users to use in Gremlin.
  • Fix the within + Text.contains query issue.
  • Fix a race condition in the addIndexLabel/removeIndexLabel interface.
  • Restrict graph-instance export to Admin only.
  • Fix the check issue in the Profile API.
  • Fix the Empty Graph issue in count().is(0) queries.
  • Fix the service failing to shut down on exceptions.
  • Fix the JNA UnsatisfiedLinkError on Apple M1 systems.
  • Fix the NPE when starting RpcServer.
  • Fix the ACTION_CLEARED argument-count issue.
  • Fix the RpcServer startup issue.
  • Fix a potential numeric-conversion hazard with user-supplied parameters.
  • Remove the Word tokenizer dependency.
  • Fix the Cassandra and MySQL backends not closing iterators gracefully on exceptions.

Configuration updates

  • Move the raft.endpoint option from the Graph scope to the Server scope.

Other changes

  • refact(core): enhance schema job module.
  • refact(raft): improve raft module & test & install snapshot and add peer.
  • refact(core): remove early cycle detection & limit max depth.
  • cache: fix assert node.next==empty.
  • fix apache license conflicts: jnr-posix and jboss-logging.
  • chore: add logo in README & remove outdated log4j version.
  • refact(core): improve CachedGraphTransaction perf.
  • chore: update CI config & support ci robot & add codeQL SEC-check & graph option.
  • refact: ignore security check api & fix some bugs & clean code.
  • doc: enhance CONTRIBUTING.md & README.md.
  • refact: add checkstyle plugin & clean/format the code.
  • refact(core): improve decode string empty bytes & avoid array-construct columns in BackendEntry.
  • refact(cassandra): translate ipv4 to ipv6 metrics & update cassandra dependency version.
  • chore: use .asf.yaml for apache workflow & replace APPLICATION_JSON with TEXT_PLAIN.
  • feat: add system schema store.
  • refact(rocksdb): update rocksdb version to 6.22 & improve rocksdb code.
  • refact: update mysql scope to test & clean protobuf style/configs.
  • chore: upgrade Dockerfile server to 0.12.0 & add editorconfig & improve ci.
  • chore: upgrade grpc version.
  • feat: support updateIfPresent/updateIfAbsent operation.
  • chore: modify abnormal logs & upgrade netty-all to 4.1.44.
  • refact: upgrade dependencies & adopt new analyzer & clean code.
  • chore: improve .gitignore & update ci configs & add RAT/flatten plugin.
  • chore(license): add dependencies-check ci & 3rd-party dependency licenses.
  • refact: Shutdown log when shutdown process & fix tx leak & enhance the file path.
  • refact: rename package to apache & dependency in all modules (Breaking Change).
  • chore: add license checker & update antrun plugin & fix building problem in windows.
  • feat: support one-step script for apache release v1.0.0 release.

Computer (OLAP)

Algorithm Changes

  • Support the PageRank algorithm.
  • Support the WCC algorithm.
  • Support the degree centrality algorithm.
  • Support the triangle count algorithm.
  • Support the rings detection algorithm.
  • Support the LPA algorithm.
  • Support the k-core algorithm.
  • Support the closeness centrality algorithm.
  • Support the betweenness centrality algorithm.
  • Support the cluster coefficient algorithm.

Platform Changes

  • feat: init module computer-core & computer-algorithm & etcd dependency.
  • feat: add Id as base type of vertex id.
  • feat: init Vertex/Edge/Properties & JsonStructGraphOutput.
  • feat: load data from hugegraph server.
  • feat: init basic combiner, Bsp4Worker, Bsp4Master.
  • feat: init sort & transport interface & basic FileInput/Output Stream.
  • feat: init computation & ComputerOutput/Driver interface.
  • feat: init Partitioner and HashPartitioner
  • feat: init Master/WorkerService module.
  • feat: init Heap/LoserTree sorting.
  • feat: init rpc module.
  • feat: init transport server, client, en/decode, flowControl, heartbeat.
  • feat: init DataDirManager & PointerCombiner.
  • feat: init aggregator module & add copy() and assign() methods to Value class.
  • feat: add startAsync and finishAsync on client side, add onStarted and onFinished on server side.
  • feat: init store/sort module.
  • feat: link managers in worker sending end.
  • feat: implement data receiver of worker.
  • feat: implement StreamGraphInput and EntryInput.
  • feat: add Sender and Receiver to process compute message.
  • feat: add seqfile format.
  • feat: add ComputeManager.
  • feat: add computer-k8s and computer-k8s-operator.
  • feat: add startup and make docker image code.
  • feat: sort different type of message use different combiner.
  • feat: add HDFS output format.
  • feat: mount config-map and secret to container.
  • feat: support java11.
  • feat: support partition concurrent compute.
  • refact: abstract computer-api from computer-core.
  • refact: optimize data receiving.
  • fix: release file descriptor after input and compute.
  • doc: add operator deploy readme.
  • feat: prepare for Apache release.

Toolchain (loader, tools, hubble)

  • Support Loader selecting which data to import from relational databases via SQL.
  • Support Loader importing data from Spark (including via JDBC).
  • Add a Flink-CDC mode to Loader.
  • Fix the NPE when Loader imports ORC-format data.
  • Fix Loader not caching the Schema in Spark/Flink mode.
  • Fix Loader's Json deserialization issue.
  • Fix Loader's Jackson version conflicts and dependency issues.
  • Support a UI for Hubble's advanced algorithm APIs.
  • Support syntax highlighting of Gremlin statements in Hubble.
  • Support deploying Hubble from a Docker image.
  • Support outputting build logs.
  • Fix the port input box issue in Hubble.
  • Adapt to Apache project releases.

Commons (common,rpc)

  • Support the assert-throws method returning a Future.
  • Add the Cnm and Anm methods to CollectionUtil.
  • Support user-defined content-type.
  • Adapt to Apache project releases.

Release Details

For more detailed change information, see the links of each sub-repository:

9.2 - HugeGraph 0.11 Release Notes

API & Client

Feature updates

  • Support the fusiform similarity algorithm (hugegraph #671, hugegraph-client #62)
  • Record the creation time when creating a Schema (hugegraph #746, hugegraph-client #69)
  • Support property-based range queries of vertices/edges in the RESTful API (hugegraph #782, hugegraph-client #73)
  • Support TTL for vertices and edges (hugegraph #794, hugegraph-client #83)
  • Unify the date format of the RESTful API Server and Gremlin Server as strings (hugegraph #1014, hugegraph-client #82)
  • Support 5 traversal algorithms: same neighbors, Jaccard similarity, all shortest paths, weighted shortest path and single-source shortest path (hugegraph #936, hugegraph-client #80)
  • Support user authentication and fine-grained access control (hugegraph #749, hugegraph #985, hugegraph-client #81)
  • Support vertex counting in the traversal APIs (hugegraph #995, hugegraph-client #84)
  • Support the HTTPS protocol (hugegraph #1036, hugegraph-client #85)
  • Support controlling whether to rebuild an index when creating it (hugegraph #1106, hugegraph-client #91)
  • Support 5 traversal algorithms: customized kout/kneighbor, multi-vertex shortest path, most-similar Jaccard vertices, and template paths (hugegraph #1174, hugegraph-client #100, hugegraph-client #106)

Internal changes

  • Fail fast when HugeGraphServer hits an exception at startup (hugegraph #748)
  • Define a LOADING mode to speed up imports (hugegraph-client #101)

Core

Feature updates

  • Support paged queries of vertices/edges with multiple properties (hugegraph #759)
  • Optimize aggregate-query performance (hugegraph #813)
  • Support off-heap caching (hugegraph #846)
  • Support property permission management (hugegraph #971)
  • Support sharding for the MySQL and Memory backends, and improve HBase sharding (hugegraph #974)
  • Support the Raft-based distributed consensus protocol (hugegraph #1020)
  • Support metadata copying (hugegraph #1024)
  • Support cluster-wide async task scheduling (hugegraph #1030)
  • Support printing heap info on OOM (hugegraph #1093)
  • Support Raft state-machine cache updates (hugegraph #1119)
  • Support Raft node management (hugegraph #1137)
  • Support rate limiting of query requests (hugegraph #1158)
  • Support default values for vertex/edge properties (hugegraph #1182)
  • Support the pluggable query-acceleration mechanism RamTable (hugegraph #1183)
  • Support marking an index INVALID when rebuilding fails (hugegraph #1226)
  • Support Kerberos authentication for HBase (hugegraph #1234)

Bug fixes

  • Fix the start-hugegraph.sh timeout when configuring permissions (hugegraph #761)
  • Fix the MySQL connection failure when executing gremlin in studio (hugegraph #765)
  • Fix the TableNotFoundException when truncating on the HBase backend (hugegraph #771)
  • Fix the rate-limit option value not being checked (hugegraph #773)
  • Fix inaccurate exception messages returned by Unique Index (hugegraph #797)
  • Fix the OOM when executing g.V().hasLabel().count() on the RocksDB backend (hugegraph-798)
  • Fix the wrong paging setup in traverseByLabel() (hugegraph #805)
  • Fix edges being mistakenly created when updating edge properties by ID and SortKeys (hugegraph #819)
  • Fix the overwrite issue of some storage backends (hugegraph #820)
  • Fix failed async tasks being uncancellable once saved (hugegraph #827)
  • Fix the MySQL backend failing to open the database in SSL mode (hugegraph #842)
  • Fix offset not taking effect in index queries (hugegraph #866)
  • Fix the security issue of absolute-path leakage in Gremlin (hugegraph #871)
  • Fix the NPE in reconnectIfNeeded() (hugegraph #874)
  • Fix PostgreSQL's JDBC_URL missing the "/" prefix (hugegraph #891)
  • Fix RocksDB memory-statistics issues (hugegraph #937)
  • Fix two-vertex rings being undetectable in rings detection (hugegraph #939)
  • Fix counters not being cleaned up after the fusiform-similarity algorithm completes (hugegraph #947)
  • Fix gremlin-console not working (hugegraph #1027)
  • Fix conditional filtering of adjacent edges with a limit (hugegraph #1057)
  • Fix the auto-commit issue when MySQL executes SQL (hugegraph #1064)
  • Fix the timeout over the 80w limit when querying via two indexes (hugegraph #1088)
  • Fix the wrong range-index check rule (hugegraph #1090)
  • Fix errors when deleting leftover indexes (hugegraph #1101)
  • Fix transaction close hanging when the current thread is task-worker (hugegraph #1111)
  • Fix the NoSuchElementException in shortest-path queries (hugegraph #1116)
  • Fix async tasks occasionally being submitted twice (hugegraph #1130)
  • Fix deserialization of very small date values (hugegraph #1152)
  • Fix traversal algorithms not checking that the start or end vertex exists (hugegraph #1156)
  • Fix argument parsing errors in bin/start-hugegraph.sh (hugegraph #1178)
  • Fix the log4j error messages when running gremlin-console (hugegraph #1229)

Internal changes

  • Defer non-null property checks (hugegraph #756)
  • Add the ability to view cluster-node info for storage backends (hugegraph #821)
  • Add advanced compaction options for the RocksDB backend (hugegraph #825)
  • Add the vertex.check_adjacent_vertex_exist option (hugegraph #837)
  • Check that primary-key properties are not allowed to be empty (hugegraph #847)
  • Add validity checks for graph names (hugegraph #854)
  • Add queries for unexpected SysProps (hugegraph #862)
  • Use disableTableAsync to speed up data clearing on the HBase backend (hugegraph #868)
  • Allow the Gremlin environment to trigger system async tasks (hugegraph #892)
  • Encode the type ID in character-type indexes (hugegraph #894)
  • Let the security module allow Cassandra to create threads on demand when executing CQL (hugegraph #896)
  • Set GremlinServer's default channel to WsAndHttpChannelizer (hugegraph #903)
  • Export Direction and the traversal-algorithm classes to the Gremlin environment (hugegraph #904)
  • Add vertex-property cache limits (hugegraph #941, hugegraph #942)
  • Optimize reads of list properties (hugegraph #943)
  • Add L1 and L2 cache configuration (hugegraph #945)
  • Optimize the EdgeId.asString() method (hugegraph #946)
  • Skip backend-store queries when a vertex has no properties (hugegraph #951)
  • Throw ExistedException when creating metadata with the same name but different properties (hugegraph #1009)
  • Close transactions on demand after querying vertices and edges (hugegraph #1039)
  • Clear caches when a graph is closed (hugegraph #1078)
  • Lock on graph close to avoid race conditions (hugegraph #1104)
  • Improve vertex/edge deletion efficiency, skipping the query when Label+ID are provided (hugegraph #1150)
  • Use IntObjectMap to improve metadata-cache efficiency (hugegraph #1185)
  • Use a single Raft node to manage the current three stores (hugegraph #1187)
  • Release the index-deletion lock early when rebuilding indexes (hugegraph #1193)
  • Use LZ4 instead of Gzip when compressing and decompressing async-task results (hugegraph #1198)
  • Make RocksDB's delete-CF operation exclusive to avoid races (hugegraph #1202)
  • Change the CSV reporter's output directory, and disable its output by default (hugegraph #1233)

Others

  • Cherry-pick the bug-fix code from version 0.10.4 (hugegraph #785, hugegraph #1047)
  • Upgrade Jackson to 2.10.2 (hugegraph #859)
  • Add thanks to Titan in the Thanks message (hugegraph #906)
  • Adapt the TinkerPop tests (hugegraph #1048)
  • Change the lowest allowed log level to TRACE (hugegraph #1050)
  • Add an IDEA format configuration file (hugegraph #1060)
  • Fix the excessive error messages in Travis CI (hugegraph #1098)

Loader

Feature updates

  • Support reading Hadoop configuration files (hugegraph-loader #105)
  • Support specifying the time zone of Date properties (hugegraph-loader #107)
  • Support importing data from ORC compressed files (hugegraph-loader #113)
  • Support choosing whether to check vertices when inserting single edges (hugegraph-loader #117)
  • Support importing data from Snappy-raw compressed files (hugegraph-loader #119)
  • Support version 2.0 of the import mapping file (hugegraph-loader #121)
  • Add a command-line tool to convert utf8-bom to utf8 (hugegraph-loader #128)
  • Support clearing metadata before an import task starts (hugegraph-loader #140)
  • Support storing the id column as a property (hugegraph-loader #143)
  • Support configuring a username for import tasks (hugegraph-loader #146)
  • Support importing data from Parquet files (hugegraph-loader #153)
  • Support specifying the maximum number of lines to read per file (hugegraph-loader #159)
  • Support the HTTPS protocol (hugegraph-loader #161)
  • Support timestamps as a date format (hugegraph-loader #164)

Bug fixes

  • Fix the row retainAll() method not modifying the names and values arrays (hugegraph-loader #110)
  • Fix the NPE when reloading JSON files (hugegraph-loader #112)

Internal changes

  • Print insertion errors only once to avoid flooding (hugegraph-loader #118)
  • Split the threads for batch insertion and single insertion (hugegraph-loader #120)
  • Switch the CSV parser to SimpleFlatMapper (hugegraph-loader #124)
  • Encode number and date fields in primary keys (hugegraph-loader #136)
  • Ensure primary-key columns are valid or mapped (hugegraph-loader #141)
  • Skip vertices whose primary-key properties are all empty (hugegraph-loader #166)
  • Switch to LOADING mode before an import task starts, and restore the original mode afterwards (hugegraph-loader #169)
  • Improve the implementation of stopping import tasks (hugegraph-loader #170)

Tools

Feature updates

  • Support backup for the Memory backend (hugegraph-tools #53)
  • Support the HTTPS protocol (hugegraph-tools #58)
  • Support configuring a username and password for the migrate subcommand (hugegraph-tools #61)
  • Support specifying types and filtering properties when backing up vertices and edges (hugegraph-tools #63)

Bug fixes

  • Fix the NPE in the dump command (hugegraph-tools #49)

Internal changes

  • Clear shard files before backup/dump (hugegraph-tools #53)
  • Improve HugeGraph-tools' error messages (hugegraph-tools #67)
  • Improve the migrate subcommand, removing unsupported sub-configs (hugegraph-tools #68)

9.3 - HugeGraph 0.12 Release Notes

API & Client

API updates

  • Support connecting to the graph service in https + auth mode (hugegraph-client #109 #110)
  • Unify parameter naming and default values of kout/kneighbor and other OLTP APIs (hugegraph-client #122 #123)
  • Support full-text property search via P.textcontains() in the RESTful API (hugegraph #1312)
  • Add the graph_read_mode API to switch between OLTP and OLAP read modes (hugegraph #1332)
  • Support list/set-typed aggregate properties (hugegraph #1332)
  • Add the METRICS resource type to the auth API (hugegraph #1355, hugegraph-client #114)
  • Add the SCHEMA resource type to the auth API (hugegraph #1362, hugegraph-client #117)
  • Add a manual compact API, supporting the rocksdb/cassandra/hbase backends (hugegraph #1378)
  • Add login/logout APIs to the auth API, supporting issuing and revoking Tokens (hugegraph #1500, hugegraph-client #125)
  • Add the project API to the auth API (hugegraph #1504, hugegraph-client #127)
  • Add an OLAP write-back API, supporting the cassandra/rocksdb backends (hugegraph #1506, hugegraph-client #129)
  • Add an API returning all Schemas of a graph (hugegraph #1567, hugegraph-client #134)
  • Change the HTTP return code of the property-key create and update APIs to 202 (hugegraph #1584)
  • Enhance Text.contains() to support 3 formats: "word", "(word)", "(word1|word2|word3)" (hugegraph #1652)
  • Unify the behavior of special characters in properties (hugegraph #1670 #1684)
  • Support dynamically creating, cloning, and deleting graph instances (hugegraph-client #135)

Other changes

  • Fix the IndexLabelV56 id loss when restoring an index label (hugegraph-client #118)
  • Add a name() method to the Edge class (hugegraph-client #121)

Core & Server

Feature updates

  • Support dynamic creation of graph instances (hugegraph #1065)
  • Support invoking OLTP algorithms via Gremlin (hugegraph #1289)
  • Support multiple clusters sharing one graph auth service to share permission info (hugegraph #1350)
  • Support cache synchronization across multiple nodes (hugegraph #1357)
  • Support OLTP algorithms using native collections to reduce GC pressure and improve performance (hugegraph #1409)
  • Support snapshotting and snapshot restore for newly added Raft nodes (hugegraph #1439)
  • Support Secondary Index on collection properties (hugegraph #1474)
  • Support audit logs, with compression, rate limiting, etc. (hugegraph #1492 #1493)
  • Support OLTP algorithms using high-performance parallel lock-free native collections to improve performance (hugegraph #1552)

Bug fixes

  • Fix the NPE in the weighted shortest path algorithm (hugegraph #1250)
  • Add a whitelist of safe operations related to Raft (hugegraph #1257)
  • Fix the RocksDB instance not being closed properly (hugegraph #1264)
  • Explicitly trigger a Raft Snapshot after a truncate operation (hugegraph #1275)
  • Fix the Raft Leader not updating the cache when receiving requests forwarded by Followers (hugegraph #1279)
  • Fix unstable results of the weighted shortest path algorithm (hugegraph #1280)
  • Fix the rays algorithm's limit parameter not taking effect (hugegraph #1284)
  • Fix the neighborrank algorithm's capacity parameter not being checked (hugegraph #1290)
  • Fix PostgreSQL initialization failing when no database with the user's name exists (hugegraph #1293)
  • Fix HBase backend initialization failing when Kerberos is enabled (hugegraph #1294)
  • Fix the wrong shard-end check in the HBase/RocksDB backends (hugegraph #1306)
  • Fix the weighted shortest path algorithm not checking that the target vertex exists (hugegraph #1307)
  • Fix non-String-typed ids in the personalrank/neighborrank algorithms (hugegraph #1310)
  • Check that only the master node may schedule gremlin jobs (hugegraph #1314)
  • Fix partially inaccurate results of g.V().hasLabel().limit(n) caused by index coverage (hugegraph #1316)
  • Fix the jaccardsimilarity algorithm reporting NaN when the union is empty (hugegraph #1324)
  • Fix Schema data not syncing across nodes after Raft Follower operations (hugegraph #1325)
  • Fix TTL not taking effect because of an unclosed tx (hugegraph #1330)
  • Fix exception handling when a gremlin job's result exceeds the Cassandra limit but is under the task limit (hugegraph #1334)
  • Check that the graph exists for auth-delete and role-get API operations (hugegraph #1338)
  • Fix abnormal serialization of async-task results containing path/tree (hugegraph #1351)
  • Fix the NPE when initializing the admin user (hugegraph #1360)
  • Fix async-task atomicity, ensuring update/get fields and re-schedule are atomic (hugegraph #1361)
  • Fix the NONE resource type in permissions (hugegraph #1362)
  • Fix the SecurityException on truncate and the loss of admin info when auth is enabled (hugegraph #1365)
  • Fix permission exceptions being ignored when parsing data with auth enabled (hugegraph #1380)
  • Fix AuthManager trying to connect to other nodes during initialization (hugegraph #1381)
  • Fix base64 decoding errors caused by specific shard info (hugegraph #1383)
  • Fix creator being empty during permission checks with consistent-hash LB when auth is enabled (hugegraph #1385)
  • Make the VAR resource no longer depend on the VERTEX resource (hugegraph #1386)
  • Make Schema operations depend only on their specific resources when auth is enabled (hugegraph #1387)
  • Change some operations from depending on the STATUS resource to the ANY resource when auth is enabled (hugegraph #1391)
  • Forbid initializing an empty admin password when auth is enabled (hugegraph #1400)
  • Check that username/password are not empty when creating users (hugegraph #1402)
  • Fix PrimaryKey or SortKey being settable as nullable when updating a Label (hugegraph #1406)
  • Fix ScyllaDB losing paged results (hugegraph #1407)
  • Fix the weight property of the weighted shortest path algorithm being force-cast to double (hugegraph #1432)
  • Unify the degree parameter naming across OLTP algorithms (hugegraph #1433)
  • Fix the fusiformsimilarity algorithm returning all vertices when similars is empty (hugegraph #1434)
  • Improve the paths algorithm to return an empty path when the start and target vertices are the same (hugegraph #1435)
  • Change the default limit of kout/kneighbor from 10 to 10000000 (hugegraph #1436)
  • Fix '+' in paging info being URL-decoded as a space (hugegraph #1437)
  • Improve the error message of the edge-update API (hugegraph #1443)
  • Fix kout's degree not applying across all labels (hugegraph #1459)
  • Improve kneighbor/kout so the start vertex never appears in the result set (hugegraph #1459 #1463)
  • Unify the behavior of the Get and Post versions of kout/kneighbor (hugegraph #1470)
  • Improve the error message for vertex-type mismatches when creating edges (hugegraph #1477)
  • Fix leftover Range Index entries (hugegraph #1498)
  • Fix permission operations not invalidating the cache (hugegraph #1528)
  • Change the default limit of sameneighbor from 10 to 10000000 (hugegraph #1530)
  • Fix the clear API calling create snapshot on all backends when it should not (hugegraph #1532)
  • Fix Index Label creation blocking in loading mode (hugegraph #1548)
  • Fix adding graphs to or removing graphs from a project (hugegraph #1562)
  • Improve some error messages of permission operations (hugegraph #1563)
  • Support setting floating-point properties to Infinity/NaN values (hugegraph #1578)
  • Fix the quorum-read issue when Raft enables safe_read (hugegraph #1618)
  • Fix the unit of the token expiration-time config (hugegraph #1625)
  • Fix the MySQL Statement resource leak (hugegraph #1627)
  • Fix Schema.getIndexLabel returning no data under race conditions (hugegraph #1629)
  • Fix HugeVertex4Insert being unserializable (hugegraph #1630)
  • Fix the MySQL count Statement not being closed (hugegraph #1640)
  • Fix state desync when deleting an Index Label throws an exception (hugegraph #1642)
  • Fix statements not being closed when MySQL gremlin execution times out (hugegraph #1643)
  • Improve the Search Index to tolerate special Unicode characters: \u0000 to \u0003 (hugegraph #1659)
  • Fix the Char-not-converted-to-String issue introduced by #1659 (hugegraph #1664)
  • Fix abnormal results of has() + within() queries (hugegraph #1680)
  • Upgrade Log4j to 2.17 to fix security vulnerabilities (hugegraph #1686 #1698 #1702)
  • Fix the NPE when startkey contains an empty string in HBase shard scans (hugegraph #1691)
  • Fix performance degradation of the paths algorithm in deep ring traversals (hugegraph #1694)
  • Improve the default parameter values and error checks of the personalrank algorithm (hugegraph #1695)
  • Fix the P.within condition not taking effect in the RESTful API (hugegraph #1704)
  • Fix being unable to dynamically create graphs when auth is enabled (hugegraph #1708)

Configuration changes:

  • Share the naming of SSL-related config options (hugegraph #1260)
  • Support the RocksDB option rocksdb.level_compaction_dynamic_level_bytes (hugegraph #1262)
  • Remove the RESTful Server protocol option restserver.protocol, extracting the scheme from the URL automatically (hugegraph #1272)
  • Add the PostgreSQL option jdbc.postgresql.connect_database (hugegraph #1293)
  • Add the option vertex.encode_primary_key_number to control whether vertex primary keys are encoded (hugegraph #1323)
  • Add the option query.optimize_aggregate_by_index to enable index optimization for aggregate queries (hugegraph #1549)
  • Change the default value of cache_type from l1 to l2 (hugegraph #1681)
  • Add the JDBC forced-reconnect option jdbc.forced_auto_reconnect (hugegraph #1710)

Other changes

  • Add a default SSL Certificate file (hugegraph #1254)
  • Let parallel OLTP requests share a thread pool instead of one pool per request (hugegraph #1258)
  • Fix the Example issues (hugegraph #1308)
  • Use jraft version 1.3.5 (hugegraph #1313)
  • Disable RocksDB's WAL when Raft mode is enabled (hugegraph #1318)
  • Use TarLz4Util to improve Snapshot compression performance (hugegraph #1336)
  • Bump the store version, since property key gained a read frequency field (hugegraph #1341)
  • Make the vertex/edge Get APIs use the queryVertex/queryEdge methods instead of the iterator methods (hugegraph #1345)
  • Support BFS-optimized multi-degree queries (hugegraph #1359)
  • Mitigate the query-performance impact of RocksDB deleteRange() (hugegraph #1375)
  • Fix the travis-ci "cannot find symbol Namifiable" issue (hugegraph #1376)
  • Ensure RocksDB snapshots live on the same disk as the configured data path (hugegraph #1392)
  • Fix inaccurate free_memory calculation on MacOS (hugegraph #1396)
  • Add a Raft onBusy callback to cooperate with rate limiting (hugegraph #1401)
  • Upgrade netty-all from 4.1.13.Final to 4.1.42.Final (hugegraph #1403)
  • Support pausing the TaskScheduler when set to loading mode (hugegraph #1414)
  • Fix the raft-tools script (hugegraph #1416)
  • Fix license params issues (hugegraph #1420)
  • Improve permission-log write performance via batch flush & async write (hugegraph #1448)
  • Log the MySQL connection URL (hugegraph #1451)
  • Improve user-info verification performance (hugegraph #1460)
  • Fix TTL errors caused by the start time (hugegraph #1478)
  • Support hot reloading of log configuration and compression of audit logs (hugegraph #1492)
  • Support per-user rate limiting of audit logs (hugegraph #1493)
  • Let the RamCache support user-defined expiration times (hugegraph #1494)
  • Cache the login role on the auth client side to avoid repeated RPC calls (hugegraph #1507)
  • Fix IdSet.contains() not overriding AbstractCollection.contains() (hugegraph #1511)
  • Fix the missing rollback when commitPartOfEdgeDeletions() fails (hugegraph #1513)
  • Improve Cache metrics performance (hugegraph #1515)
  • Log exceptions when license operation errors occur (hugegraph #1522)
  • Improve the SimilarsMap implementation (hugegraph #1523)
  • Use a tokenless way to update coverage (hugegraph #1529)
  • Improve the code of the project update API (hugegraph #1537)
  • Allow accessing GRAPH_STORE from option() (hugegraph #1546)
  • Optimize kout/kneighbor count queries to avoid copying collections (hugegraph #1550)
  • Optimize shortestpath traversal to start from the side with less data (hugegraph #1569)
  • Improve the allowed-keys hint of the rocksdb.data_disks option (hugegraph #1585)
  • Optimize the performance of the id2code method in OLTP traversals for number ids (hugegraph #1623)
  • Optimize HugeElement.getProperties() to return Collection<Property> (hugegraph #1624)
  • Add the APACHE PROPOSAL file (hugegraph #1644)
  • Improve the close-tx flow (hugegraph #1655)
  • Catch all exception types for MySQL close during reset() (hugegraph #1661)
  • Improve the OLAP property module code (hugegraph #1675)
  • Improve query-module execution performance (hugegraph #1711)

Loader

  • 支持导入 Parquet 格式文件(hugegraph-loader #174)
  • 支持 HDFS Kerberos 权限验证(hugegraph-loader #176)
  • 支持 HTTPS 协议连接到服务端导入数据(hugegraph-loader #183)
  • 修复 trust store file 路径问题(hugegraph-loader #186)
  • 处理 loading mode 重置的异常(hugegraph-loader #187)
  • 增加在插入数据时对非空属性的检查(hugegraph-loader #190)
  • 修复客户端与服务端时区不同导致的时间判断问题(hugegraph-loader #192)
  • 优化数据解析性能(hugegraph-loader #194)
  • 当用户指定了文件头时,检查其必须不为空(hugegraph-loader #195)
  • 修复示例程序中 MySQL struct.json 格式问题(hugegraph-loader #198)
  • 修复顶点边导入速度不精确的问题(hugegraph-loader #200 #205)
  • 当导入启用 check-vertex 时,确保先导入顶点再导入边(hugegraph-loader #206)
  • 修复边 Json 数据导入格式不统一时数组溢出的问题(hugegraph-loader #211)
  • 修复因边 mapping 文件不存在导致的 NPE 问题(hugegraph-loader #213)
  • 修复读取时间可能出现负数的问题(hugegraph-loader #215)
  • 改进目录文件的日志打印(hugegraph-loader #223)
  • 改进 loader 的 Schema 处理流程(hugegraph-loader #230)

Tools

  • 支持 HTTPS 协议(hugegraph-tools #71)
  • 移除 --protocol 参数,直接从 URL 中自动提取(hugegraph-tools #72)
  • 支持将数据 dump 到 HDFS 文件系统(hugegraph-tools #73)
  • 修复 trust store file 路径问题(hugegraph-tools #75)
  • 支持权限信息的备份恢复(hugegraph-tools #76)
  • 支持无参数的 Printer 打印(hugegraph-tools #79)
  • 修复 MacOS free_memory 计算问题(hugegraph-tools #82)
  • 支持备份恢复时指定线程数(hugegraph-tools #83)
  • 支持动态创建图、克隆图、删除图等命令(hugegraph-tools #95)

9.4 - HugeGraph 0.10 Release Notes

API & Client

功能更新

  • 支持 HugeGraphServer 服务端内存紧张时返回错误拒绝请求 (hugegraph #476)
  • 支持 API 白名单和 HugeGraphServer GC 频率控制功能 (hugegraph #522)
  • 支持 Rings API 的 source_in_ring 参数 (hugegraph #528,hugegraph-client #48)
  • 支持批量按策略更新属性接口 (hugegraph #493,hugegraph-client #46)
  • 支持 Shard Index 前缀与范围检索索引 (hugegraph #574,hugegraph-client #56)
  • 支持顶点的 UUID ID 类型 (hugegraph #618,hugegraph-client #59)
  • 支持唯一性约束索引(Unique Index) (hugegraph #636,hugegraph-client #60)
  • 支持 API 请求超时功能 (hugegraph #674)
  • 支持根据名称列表查询 schema (hugegraph #686,hugegraph-client #63)
  • 支持按分页方式获取异步任务 (hugegraph #720)

内部修改

  • 保持 traverser 的参数与 server 端一致 (hugegraph-client #44)
  • 支持在 Shard 内使用分页方式遍历顶点或者边的方法 (hugegraph-client #47)
  • 支持 Gremlin 查询结果持有 GraphManager (hugegraph-client #49)
  • 改进 RestClient 的连接参数 (hugegraph-client #52)
  • 增加 Date 类型属性的测试 (hugegraph-client #55)
  • 适配 HugeGremlinException 异常 (hugegraph-client #57)
  • 增加新功能的版本匹配检查 (hugegraph-client #66)
  • 适配 UUID 的序列化 (hugegraph-client #67)

Core

功能更新

  • 支持 PostgreSQL 和 CockroachDB 存储后端 (hugegraph #484)
  • 支持负数索引 (hugegraph #513)
  • 支持边的 Vertex + SortKeys 的前缀范围查询 (hugegraph #574)
  • 支持顶点的邻接边按分页方式查询 (hugegraph #659)
  • 禁止通过 Gremlin 进行敏感操作 (hugegraph #176)
  • 支持 Lic 校验功能 (hugegraph #645)
  • 支持 Search Index 查询结果按匹配度排序的功能 (hugegraph #653)
  • 升级 tinkerpop 至版本 3.4.3 (hugegraph #648)

BUG修复

  • 修复按分页方式查询边时剩余数目(remaining count)错误 (hugegraph #515)
  • 修复清空后端时边缓存未清空的问题 (hugegraph #488)
  • 修复无法插入 List 类型的属性问题 (hugegraph #534)
  • 修复 PostgreSQL 后端的 existDatabase(), clearBackend() 和 rollback()功能 (hugegraph #531)
  • 修复程序关闭时 HugeGraphServer 和 GremlinServer 残留问题 (hugegraph #554)
  • 修复在 LockTable 中重复抓锁的问题 (hugegraph #566)
  • 修复从 Edge 中获取的 Vertex 没有属性的问题 (hugegraph #604)
  • 修复交叉关闭 RocksDB 的连接池问题 (hugegraph #598)
  • 修复在超级点查询时 limit 失效问题 (hugegraph #607)
  • 修复使用 Equal 条件和分页的情况下查询 Range Index 只返回第一页的问题 (hugegraph #614)
  • 修复查询 limit 在删除部分数据后失效的问题 (hugegraph #610)
  • 修复 Example1 的查询错误 (hugegraph #638)
  • 修复 HBase 的批量提交部分错误问题 (hugegraph #634)
  • 修复索引搜索时 compareNumber() 方法的空指针问题 (hugegraph #629)
  • 修复更新属性值为已经删除的顶点或边的属性时失败问题 (hugegraph #679)
  • 修复 system 类型残留索引无法清除问题 (hugegraph #675)
  • 修复 HBase 在 Metrics 信息中的单位问题 (hugegraph #713)
  • 修复存储后端未初始化问题 (hugegraph #708)
  • 修复按 Label 删除边时导致的 IN 边残留问题 (hugegraph #727)
  • 修复 init-store 会生成多份 backend_info 问题 (hugegraph #723)

内部修改

  • 抑制因 PostgreSQL 后端 database 不存在时的报警信息 (hugegraph #527)
  • 删除 PostgreSQL 后端的无用配置项 (hugegraph #533)
  • 改进错误信息中的 HugeType 为易读字符串 (hugegraph #546)
  • 增加 jdbc.storage_engine 配置项指定存储引擎 (hugegraph #555)
  • 增加使用后端链接时按需重连功能 (hugegraph #562)
  • 避免打印空的查询条件 (hugegraph #583)
  • 缩减 Variable 的字符串长度 (hugegraph #581)
  • 增加 RocksDB 后端的 cache 配置项 (hugegraph #567)
  • 改进异步任务的异常信息 (hugegraph #596)
  • 将 Range Index 拆分成 INT,LONG,FLOAT,DOUBLE 四个表存储 (hugegraph #574)
  • 改进顶点和边 API 的 Metrics 名字 (hugegraph #631)
  • 增加 G1GC 和 GC Log 的配置项 (hugegraph #616)
  • 拆分顶点和边的 Label Index 表 (hugegraph #635)
  • 减少顶点和边的属性存储空间 (hugegraph #650)
  • 支持对 Secondary Index 和 Primary Key 中的数字进行编码 (hugegraph #676)
  • 减少顶点和边的 ID 存储空间 (hugegraph #661)
  • 支持 Cassandra 后端存储的二进制序列化存储 (hugegraph #680)
  • 放松对最小内存的限制 (hugegraph #689)
  • 修复 RocksDB 后端批量写时的 Invalid column family 问题 (hugegraph #701)
  • 更新异步任务状态时删除残留索引 (hugegraph #719)
  • 删除 ScyllaDB 的 Label Index 表 (hugegraph #717)
  • 启动时使用多线程方式打开 RocksDB 后端存储多个数据目录 (hugegraph #721)
  • RocksDB 版本从 v5.17.2 升级至 v6.3.6 (hugegraph #722)

其它

  • 增加 API tests 到 codecov 统计中 (hugegraph #711)
  • 改进配置文件的默认配置项 (hugegraph #575)
  • 改进 README 中的致谢信息 (hugegraph #548)

Loader

功能更新

  • 支持 JSON 数据源的 selected 字段 (hugegraph-loader #62)
  • 支持定制化 List 元素之间的分隔符 (hugegraph-loader #66)
  • 支持值映射 (hugegraph-loader #67)
  • 支持通过文件后缀过滤文件 (hugegraph-loader #82)
  • 支持对导入进度进行记录和断点续传 (hugegraph-loader #70,hugegraph-loader #87)
  • 支持从不同的关系型数据库中读取 Header 信息 (hugegraph-loader #79)
  • 支持属性为 Unsigned Long 类型值 (hugegraph-loader #91)
  • 支持顶点的 UUID ID 类型 (hugegraph-loader #98)
  • 支持按照策略批量更新属性 (hugegraph-loader #97)

BUG修复

  • 修复 nullable key 在 mapping field 不工作的问题 (hugegraph-loader #64)
  • 修复 Parse Exception 无法捕获的问题 (hugegraph-loader #74)
  • 修复在等待异步任务完成时获取信号量数目错误的问题 (hugegraph-loader #86)
  • 修复空表时 hasNext() 返回 true 的问题 (hugegraph-loader #90)
  • 修复布尔值解析错误问题 (hugegraph-loader #92)

内部修改

  • 增加 HTTP 连接参数 (hugegraph-loader #81)
  • 改进导入完成的总结信息 (hugegraph-loader #80)
  • 改进一行数据缺少列或者有多余列的处理逻辑 (hugegraph-loader #93)

Tools

功能更新

  • 支持 0.8 版本 server 备份的数据恢复至 0.9 版本的 server 中 (hugegraph-tools #34)
  • 增加 timeout 全局参数 (hugegraph-tools #44)
  • 增加 migrate 子命令支持迁移图 (hugegraph-tools #45)

BUG修复

  • 修复 dump 命令不支持 split size 参数的问题 (hugegraph-tools #32)

内部修改

  • 删除 Hadoop 对 Jersey 1.19的依赖 (hugegraph-tools #31)
  • 优化子命令在 help 信息中的排序 (hugegraph-tools #37)
  • 使用 log4j2 清除 log4j 的警告信息 (hugegraph-tools #39)

9.5 - HugeGraph 0.9 Release Notes

API & Client

功能更新

  • 增加 personal rank API 和 neighbor rank API (hugegraph #274)
  • Shortest path API 增加 skip_degree 参数跳过超级点(hugegraph #433,hugegraph-client #42)
  • vertex/edge 的 scan API 支持分页机制 (hugegraph #428,hugegraph-client #35)
  • VertexAPI 使用简化的属性序列化器 (hugegraph #332,hugegraph-client #37)
  • 增加 customized paths API 和 customized crosspoints API (hugegraph #306,hugegraph-client #40)
  • 在 server 端所有线程忙时返回503错误 (hugegraph #343)
  • 保持 API 的 depth 和 degree 参数一致 (hugegraph #252,hugegraph-client #30)

BUG修复

  • 增加属性的时候验证 Date 而非 Timestamp 的值 (hugegraph-client #26)

内部修改

  • RestClient 支持重用连接 (hugegraph-client #33)
  • 使用 JsonUtil 替换冗余的 ObjectMapper (hugegraph-client #41)
  • Edge 直接引用 Vertex 使得批量插入更友好 (hugegraph-client #29)
  • 使用 JaCoCo 替换 Cobertura 统计代码覆盖率 (hugegraph-client #39)
  • 改进 Shard 反序列化机制 (hugegraph-client #34)

Core

功能更新

  • 支持 Cassandra 的 NetworkTopologyStrategy (hugegraph #448)
  • 元数据删除和索引重建使用分页机制 (hugegraph #417)
  • 支持将 HugeGraphServer 作为系统服务 (hugegraph #170)
  • 单一索引查询支持分页机制 (hugegraph #328)
  • 在初始化图库时支持定制化插件 (hugegraph #364)
  • 为HBase后端增加 hbase.zookeeper.znode.parent 配置项 (hugegraph #333)
  • 支持异步 Gremlin 任务的进度更新 (hugegraph #325)
  • 使用异步任务的方式删除残留索引 (hugegraph #285)
  • 支持按 sortKeys 范围查找功能 (hugegraph #271)

BUG修复

  • 修复二级索引删除时 Cassandra 后端的 batch 超过65535限制的问题 (hugegraph #386)
  • 修复 RocksDB 磁盘利用率的 metrics 不正确问题 (hugegraph #326)
  • 修复异步索引删除的错误 (hugegraph #336)
  • 修复 BackendSessionPool.close() 的竞争条件问题 (hugegraph #330)
  • 修复保留的系统 ID 不工作问题 (hugegraph #315)
  • 修复 cache 的 metrics 信息丢失问题 (hugegraph #321)
  • 修复使用 hasId() 按 id 查询顶点时不支持数字 id 问题 (hugegraph #302)
  • 修复重建索引时的 80w 限制问题和 Cassandra 后端的 batch 65535问题 (hugegraph #292)
  • 修复残留索引删除无法处理未展开(none-flatten)查询的问题 (hugegraph #281)

内部修改

  • 迭代器变量统一命名为 ‘iter’(hugegraph #438)
  • 增加 PageState.page() 方法统一获取分页信息接口 (hugegraph #429)
  • 为基于 mapdb 的内存版后端调整代码结构,增加测试用例 (hugegraph #357)
  • 支持代码覆盖率统计 (hugegraph #376)
  • 设置 tx capacity 的下限为 COMMIT_BATCH(默认为500) (hugegraph #379)
  • 增加 shutdown hook 来自动关闭线程池 (hugegraph #355)
  • PerfExample 的统计时间排除环境初始化时间 (hugegraph #329)
  • 改进 BinarySerializer 中的 schema 序列化 (hugegraph #316)
  • 避免对 primary key 的属性创建多余的索引 (hugegraph #317)
  • 限制 Gremlin 异步任务的名字小于256字节 (hugegraph #313)
  • 使用 multi-get 优化 HBase 后端的按 id 查询 (hugegraph #279)
  • 支持更多的日期数据类型 (hugegraph #274)
  • 修改 Cassandra 和 HBase 的 port 范围为(1,65535) (hugegraph #263)

其它

  • 增加 travis API 测试 (hugegraph #299)
  • 删除 rest-server.properties 中的 GremlinServer 相关的默认配置项 (hugegraph #290)

Loader

功能更新

  • 支持从 HDFS 和 关系型数据库导入数据 (hugegraph-loader #14)
  • 支持传递权限 token 参数(hugegraph-loader #46)
  • 支持通过 regex 指定要跳过的行 (hugegraph-loader #43)
  • 支持导入 TEXT 文件时的 List/Set 属性(hugegraph-loader #38)
  • 支持自定义的日期格式 (hugegraph-loader #28)
  • 支持从指定目录导入数据 (hugegraph-loader #33)
  • 支持忽略最后多余的列或者 null 值的列 (hugegraph-loader #23)

BUG修复

  • 修复 Example 问题(hugegraph-loader #57)
  • 修复当 vertex 是 customized ID 策略时边解析问题(hugegraph-loader #24)

内部修改

  • URL regex 改进 (hugegraph-loader #47)

Tools

功能更新

  • 支持海量数据备份和恢复到本地和 HDFS,并支持压缩 (hugegraph-tools #21)
  • 支持异步任务取消和清理功能 (hugegraph-tools #20)
  • 改进 graph-clear 命令的提示信息 (hugegraph-tools #23)

BUG修复

  • 修复 restore 命令总是使用 ‘hugegraph’ 作为目标图的问题,支持指定图 (hugegraph-tools #26)

9.6 - HugeGraph 0.8 Release Notes

API & Client

功能更新

  • 服务端增加 rays 和 rings 的 RESTful API(hugegraph #45)
  • 使创建 IndexLabel 返回异步任务(hugegraph #95,hugegraph-client #9)
  • 客户端增加恢复模式相关的 API(hugegraph-client #10)
  • 让 task-list API 不返回 task_input 和 task_result(hugegraph #143)
  • 增加取消异步任务的API(hugegraph #167,hugegraph-client #15)
  • 增加获取后端 metrics 的 API(hugegraph #155)

BUG修复

  • 分页获取时最后一页的 page 应该为 null 而非 “null”(hugegraph #168)
  • 分页迭代获取服务端已经没有下一页了应该停止获取(hugegraph-client #16)
  • 添加顶点使用自定义 Number Id 时报类型无法转换(hugegraph-client #21)

内部修改

  • 增加持续集成测试(hugegraph-client #19)

Core

功能更新

  • 取消异步任务通过 label 查询时 80w 的限制(hugegraph #93)
  • 允许 cardinality 为 set 时传入 Json List 形式的属性值(hugegraph #109)
  • 支持在恢复模式和合并模式来恢复图(hugegraph #114)
  • RocksDB 后端支持多个图指定为同一个存储目录(hugegraph #123)
  • 支持用户自定义权限认证器(hugegraph-loader #133)
  • 当服务重启后重新开始未完成的任务(hugegraph #188)
  • 当顶点的 Id 策略为自定义时,检查是否已存在相同 Id 的顶点(hugegraph #189)

BUG修复

  • 增加对 HasContainer 的 predicate 不为 null 的检查(hugegraph #16)
  • RocksDB 后端由于数据目录和日志目录错误导致 init-store 失败(hugegraph #25)
  • 启动 hugegraph 时由于 logs 目录不存在导致提示超时但实际可访问(hugegraph #38)
  • ScyllaDB 后端遗漏注册顶点表(hugegraph #47)
  • 使用 hasLabel 查询传入多个 label 时失败(hugegraph #50)
  • Memory 后端未初始化 task 相关的 schema(hugegraph #100)
  • 当使用 hasLabel 查询时,如果元素数量超过 80w,即使加上 limit 也会报错(hugegraph #104)
  • 任务在运行之后没有保存过状态(hugegraph #113)
  • 检查后端版本信息时直接强转 HugeGraphAuthProxy 为 HugeGraph(hugegraph #127)
  • 配置项 batch.max_vertices_per_batch 未生效(hugegraph #130)
  • 配置文件 rest-server.properties 有错误时 HugeGraphServer 启动不报错,但是无法访问(hugegraph #131)
  • MySQL 后端某个线程的提交对其他线程不可见(hugegraph #163)
  • 使用 union(branch) + has(date) 查询时提示 String 无法转换为 Date(hugegraph #181)
  • 使用 RocksDB 后端带 limit 查询顶点时会返回不完整的结果(hugegraph #197)
  • 提示其他线程无法操作 tx(hugegraph #204)

内部修改

  • 拆分 graph.cache_xx 配置项为 vertex.cache_xx 和 edge.cache_xx 两类(hugegraph #56)
  • 去除 hugegraph-dist 对 hugegraph-api 的依赖(hugegraph #61)
  • 优化集合取交集和取差集的操作(hugegraph #85)
  • 优化 transaction 的缓存处理和索引及 Id 查询(hugegraph #105)
  • 给各线程池的线程命名(hugegraph #124)
  • 增加并优化了一些 metrics 统计(hugegraph #138)
  • 增加了对未完成任务的 metrics 记录(hugegraph #141)
  • 让索引更新以分批方式提交,而不是全量提交(hugegraph #150)
  • 在添加顶点/边时一直持有 schema 的读锁,直到提交/回滚完成(hugegraph #180)
  • 加速 Tinkerpop 测试(hugegraph #19)
  • 修复 Tinkerpop 测试在 resource 目录下找不到 filter 文件的 BUG(hugegraph #26)
  • 开启 Tinkerpop 测试中 supportCustomIds 特性(hugegraph #69)
  • 持续集成中添加 HBase 后端的测试(hugegraph #41)
  • 避免持续集成的 deploy 脚本运行多次(hugegraph #170)
  • 修复 cache 单元测试跑不过的问题(hugegraph #177)
  • 持续集成中修改部分后端的存储为 tmpfs 以加快测试速度(hugegraph #206)

其它

  • 增加 issue 模版(hugegraph #42)
  • 增加 CONTRIBUTING 文件(hugegraph #59)

Loader

功能更新

  • 支持忽略源文件某些特定列(hugegraph-loader #2)
  • 支持导入 cardinality 为 Set 的属性数据(hugegraph-loader #10)
  • 单条插入也使用多个线程执行,解决了错误多时最后单条导入慢的问题(hugegraph-loader #12)

BUG修复

  • 导入过程可能统计出错(hugegraph-loader #4)
  • 顶点使用自定义 Number Id 导入出错(hugegraph-loader #6)
  • 顶点使用联合主键时导入出错(hugegraph-loader #18)

内部修改

  • 增加持续集成测试(hugegraph-loader #8)
  • 优化检测到文件不存在时的提示信息(hugegraph-loader #16)

Tools

功能更新

  • 增加 KgDumper (hugegraph-tools #6)
  • 支持在恢复模式和合并模式中恢复图(hugegraph-tools #9)

BUG修复

  • 脚本中的工具函数 get_ip 在系统未安装 ifconfig 时报错(hugegraph-tools #13)

9.7 - HugeGraph 0.7 Release Notes

API & Java Client

功能更新

  • 支持异步删除元数据和重建索引(HugeGraph-889)
  • 加入监控API,并与Gremlin的监控框架集成(HugeGraph-1273)

BUG修复

  • EdgeAPI更新属性时会将属性值也置为属性键(HugeGraph-81)
  • 当删除顶点或边时,如果id非法应该返回400错误而非404(HugeGraph-1337)

Core

功能更新

  • 支持HBase后端存储(HugeGraph-1280)
  • 增加异步API框架,耗时操作可通过调用异步API实现(HugeGraph-387)
  • 支持对长属性列建立二级索引,取消目前索引列长度256字节的限制(HugeGraph-1314)
  • 支持顶点属性的“创建或更新”操作(HugeGraph-1303)
  • 支持全文检索功能(HugeGraph-1322)
  • 支持数据库表的版本号检查(HugeGraph-1328)
  • 删除顶点时,如果遇到超级点的时候报错 "Batch too large" 或 "Batch 65535 statements"(HugeGraph-1354)
  • 支持异步删除元数据和重建索引(HugeGraph-889)
  • 支持异步长时间执行Gremlin任务(HugeGraph-889)

BUG修复

  • 防止超级点访问时查询过多下一层顶点而阻塞服务(HugeGraph-1302)
  • HBase初始化时报错连接已经关闭(HugeGraph-1318)
  • 按照date属性过滤顶点报错String无法转为Date(HugeGraph-1319)
  • 残留索引删除,对range索引的判断存在错误(HugeGraph-1291)
  • 支持组合索引后,残留索引清理没有考虑索引组合的情况(HugeGraph-1311)
  • 根据otherV的条件来删除边时,可能会因为边的顶点不存在导致错误(HugeGraph-1347)
  • label索引对offset和limit结果错误(HugeGraph-1329)
  • vertex label或者edge label没有开启label index,删除label会导致数据无法删除(HugeGraph-1355)

内部修改

  • hbase后端代码引入较新版本的Jackson-databind包,导致HugeGraphServer启动异常(HugeGraph-1306)
  • Core和Client都自己持有一个shard类,而不是依赖于common模块(HugeGraph-1316)
  • 去掉rebuild index和删除vertex label和edge label时的80w的capacity限制(HugeGraph-1297)
  • 所有schema操作需要考虑同步问题(HugeGraph-1279)
  • 拆分Cassandra的索引表,把element id每条一行,避免聚合高时,导入速度非常慢甚至卡住(HugeGraph-1304)
  • 将hugegraph-test中关于common的测试用例移动到hugegraph-common中(HugeGraph-1297)
  • 异步任务支持保存任务参数,以支持任务恢复(HugeGraph-1344)
  • 支持通过脚本部署文档到GitHub(HugeGraph-1351)
  • RocksDB和Hbase后端索引删除实现(HugeGraph-1317)

Loader

功能更新

  • HugeLoader支持用户手动创建schema,以文件的方式传入(HugeGraph-1295)

BUG修复

  • HugeLoader导数据时未区分输入文件的编码,导致可能产生乱码(HugeGraph-1288)
  • HugeLoader打包的example目录的三个子目录下没有文件(HugeGraph-1288)
  • 导入的CSV文件中如果数据列本身包含逗号会解析出错(HugeGraph-1320)
  • 批量插入避免单条失败导致整个batch都无法插入(HugeGraph-1336)
  • 异常信息作为模板打印异常(HugeGraph-1345)
  • 导入边数据,当列数不对时导致程序退出(HugeGraph-1346)
  • HugeLoader的自动创建schema失败(HugeGraph-1363)
  • ID长度检查应该检查字节长度而非字符串长度(HugeGraph-1374)

内部修改

  • 添加测试用例(HugeGraph-1361)

Tools

功能更新

  • backup/restore使用多线程加速,并增加retry机制(HugeGraph-1307)
  • 一键部署支持传入路径以存放包(HugeGraph-1325)
  • 实现dump图功能(内存构建顶点及关联边)(HugeGraph-1339)
  • 增加backup-scheduler功能,支持定时备份且保留一定数目最新备份(HugeGraph-1326)
  • 增加异步任务查询和异步执行Gremlin的功能(HugeGraph-1357)

BUG修复

  • hugegraph-tools的backup和restore编码为UTF-8(HugeGraph-1321)
  • hugegraph-tools设置默认JVM堆大小和发布版本号(HugeGraph-1340)

Studio

BUG修复

  • HugeStudio中顶点id包含换行符时g.V()会导致groovy解析出错(HugeGraph-1292)
  • 限制返回的顶点及边的数量(HugeGraph-1333)
  • 加载note出现消失或者卡住情况(HugeGraph-1353)
  • HugeStudio打包时,编译失败但没有报错,导致发布包无法启动(HugeGraph-1368)

9.8 - HugeGraph 0.6 Release Notes

API & Java Client

功能更新

  • 增加RESTFul API paths和crosspoints,找出source到target顶点间多条路径或包含交叉点的路径(HugeGraph-1210)
  • 在API层添加批量插入并发数的控制,避免出现全部的线程都用于写而无法查询的情况(HugeGraph-1228)
  • 增加scan-API,允许客户端并发地获取顶点和边(HugeGraph-1197)
  • Client支持传入用户名密码访问带权限控制的HugeGraph(HugeGraph-1256)
  • 为顶点及边的list API添加offset参数(HugeGraph-1261)
  • RESTful API的顶点/边的list不允许同时传入page 和 [label,属性](HugeGraph-1262)
  • k-out、K-neighbor、paths、shortestpath等API增加degree、capacity和limit(HugeGraph-1176)
  • 增加restore status的set/get/clear接口(HugeGraph-1272)

BUG修复

  • 使 RestClient的basic auth使用Preemptive模式(HugeGraph-1257)
  • HugeGraph-Client中由ResultSet获取多次迭代器,除第一次外其他的无法迭代(HugeGraph-1278)

Core

功能更新

  • RocksDB实现scan特性(HugeGraph-1198)
  • Schema userdata 提供删除 key 功能(HugeGraph-1195)
  • 支持date类型属性的范围查询(HugeGraph-1208)
  • limit下沉到backend,尽可能不进行多余的索引读取(HugeGraph-1234)
  • 增加 API 权限与访问控制(HugeGraph-1162)
  • 禁止多个后端配置store为相同的值(HugeGraph-1269)

BUG修复

  • RocksDB的Range查询时如果只指定上界或下界会查出其他IndexLabel的记录(HugeGraph-1211)
  • RocksDB带limit查询时,graphTransaction查询返回的结果多一个(HugeGraph-1234)
  • init-store在CentOS上依赖通用的io.netty有时会卡住,改为使用netty-transport-native-epoll(HugeGraph-1255)
  • Cassandra后端in语句(按id查询)元素个数最大65535(HugeGraph-1239)
  • 主键加索引(或普通属性)作为查询条件时报错(HugeGraph-1276)
  • init-store.sh在Centos平台上初始化失败或者卡住(HugeGraph-1255)

测试

内部修改

  • 将compareNumber方法搬移至common模块(HugeGraph-1208)
  • 修复HugeGraphServer无法在Ubuntu机器上启动的Bug(HugeGraph-1154)
  • 修复init-store.sh无法在bin目录下执行的BUG(HugeGraph-1223)
  • 修复HugeGraphServer启动过程中无法通过CTRL+C终止的BUG(HugeGraph-1223)
  • HugeGraphServer启动前检查端口是否被占用(HugeGraph-1223)
  • HugeGraphServer启动前检查系统JDK是否安装以及版本是否为1.8(HugeGraph-1223)
  • 给HugeConfig类增加getMap()方法(HugeGraph-1236)
  • 修改默认配置项,后端使用RocksDB,注释重要的配置项(HugeGraph-1240)
  • 重命名userData为userdata(HugeGraph-1249)
  • CentOS 4.3 系统上 HugeGraphServer 进程使用 jps 命令查不到
  • 增加配置项ALLOW_TRACE,允许设置是否返回exception stack trace(HugeGraph-81)

Tools

功能更新

  • 增加自动化部署工具以安装所有组件(HugeGraph-1267)
  • 增加clear的脚本,并拆分deploy和start-all(HugeGraph-1274)
  • 对hugegraph服务进行监控以提高可用性(HugeGraph-1266)
  • 增加backup/restore功能和命令(HugeGraph-1272)
  • 增加graphs API对应的命令(HugeGraph-1272)

BUG修复

Loader

功能更新

  • 默认添加csv及json的示例(HugeGraph-1259)

BUG修复

9.9 - HugeGraph 0.5 Release Notes

API & Java Client

功能更新

  • VertexLabel与EdgeLabel增加bool参数enable_label_index表述是否构建label索引(HugeGraph-1085)
  • 增加RESTful API来支持高效shortest path,K-out和K-neighbor查询(HugeGraph-944)
  • 增加RESTful API支持按id列表批量查询顶点(HugeGraph-1153)
  • 支持迭代获取全部的顶点和边,使用分页实现(HugeGraph-1166)
  • 顶点id中包含 / % 等 URL 保留字符时通过 VertexAPI 查不出来(HugeGraph-1127)
  • 批量插入边时是否检查vertex的RESTful API参数从checkVertex改为check_vertex (HugeGraph-81)

BUG修复

  • hasId()无法正确匹配LongId(HugeGraph-1083)

Core

功能更新

  • RocksDB支持常用配置项(HugeGraph-1068)
  • 支持插入、删除、更新等操作的限速(HugeGraph-1071)
  • 支持RocksDB导入sst文件方案(HugeGraph-1077)
  • 增加MySQL后端存储(HugeGraph-1091)
  • 增加Palo后端存储(HugeGraph-1092)
  • 增加开关:支持是否构建顶点/边的label index(HugeGraph-1085)
  • 支持API分页获取数据(HugeGraph-1105)
  • RocksDB配置的数据存放目录如果不存在则自动创建(HugeGraph-1135)
  • 增加高级遍历函数shortest path、K-neighbor,K-out和按id列表批量查询顶点(HugeGraph-944)
  • init-store.sh增加超时重试机制(HugeGraph-1150)
  • 将边表拆分两个表:OUT表、IN表(HugeGraph-1002)
  • 限制顶点ID最大长度为128字节(HugeGraph-1168)
  • Cassandra通过压缩数据(可配置snappy、lz4)进行优化(HugeGraph-428)
  • 支持IN和OR操作(HugeGraph-137)
  • 支持RocksDB并行写多个磁盘(HugeGraph-1177)
  • MySQL通过批量插入进行性能优化(HugeGraph-1188)

BUG修复

  • Kryo序列化多线程时异常(HugeGraph-1066)
  • RocksDB索引内容中重复写了两次elem-id(HugeGraph-1094)
  • SnowflakeIdGenerator.instance在多线程环境下可能会初始化多个实例(HugeGraph-1095)
  • 如果查询边的顶点但顶点不存在时,异常信息不够明确(HugeGraph-1101)
  • RocksDB配置了多个图时,init-store失败(HugeGraph-1151)
  • 无法支持 Date 类型的属性值(HugeGraph-1165)
  • 创建了系统内部索引,但无法根据其进行搜索(HugeGraph-1167)
  • 拆表后根据label删除边时,edge-in表中的记录未被删除成功(HugeGraph-1182)

测试

  • 增加配置项:vertex.force_id_string,跑 tinkerpop 测试时打开(HugeGraph-1069)

内部修改

  • common库OptionChecker增加allowValues()函数用于枚举值(HugeGraph-1075)
  • 清理无用、版本老旧的依赖包,减少打包的压缩包的大小(HugeGraph-1078)
  • HugeConfig通过文件路径构造时,无法检查多次配置的配置项的值(HugeGraph-1079)
  • Server启动时可以支持智能分配最大内存(HugeGraph-1154)
  • 修复Mac OS因为不支持free命令导致无法启动server的问题(HugeGraph-1154)
  • 修改配置项的注册方式为字符串式,避免直接依赖Backend包(HugeGraph-1171)
  • 增加StoreDumper工具以查看后端存储的数据内容(HugeGraph-1172)
  • Jenkins把所有与内部服务器有关的构建机器信息都参数化传入(HugeGraph-1179)
  • 将RestClient移到common模块,令server和client都依赖common(HugeGraph-1183)
  • 增加配置项dump工具ConfDumper(HugeGraph-1193)

9.10 - HugeGraph 0.4.4 Release Notes

API & Java Client

功能更新

  • HugeGraph-Server支持WebSocket,能用Gremlin-Console连接使用;并支持直接编写groovy脚本调用Core的代码(HugeGraph-977)
  • 适配Schema-id(HugeGraph-1038)

BUG修复

  • hugegraph-0.3.3:删除vertex的属性,body中properties=null,返回500,空指针(HugeGraph-950)
  • hugegraph-0.3.3: graph.schema().getVertexLabel() 空指针(HugeGraph-955)
  • HugeGraph-Client 中顶点和边的属性集合不是线程安全的(HugeGraph-1013)
  • 批量操作的异常信息无法打印(HugeGraph-1013)
  • 异常message提示可读性太差,都是用propertyKey的id显示,对于用户来说无法立即识别(HugeGraph-1055)
  • 批量新增vertex实体,有一个body体为null,返回500,空指针(HugeGraph-1056)
  • 追加属性body体中只包含properties,功能出现回退,抛出异常The label of vertex can’t be null(HugeGraph-1057)
  • HugeGraph-Client适配:PropertyKey的DateType中Timestamp替换成Date(HugeGraph-1059)
  • 创建IndexLabel时baseValue为空会报出500错误(HugeGraph-1061)

Core

功能更新

  • 实现上层独立事务管理,并兼容tinkerpop事务规范(HugeGraph-918、HugeGraph-941)
  • 完善memory backend,可以通过API正确访问,且适配了tinkerpop事务(HugeGraph-41)
  • 增加RocksDB后端存储驱动框架(HugeGraph-929)
  • RocksDB数字索引range-query实现(HugeGraph-963)
  • 为所有的schema增加了id,并将各表原依赖name的列也换成id(HugeGraph-589)
  • 填充query key-value条件时,value的类型如果不匹配key定义的类型时需要转换为该类型(HugeGraph-964)
  • 统一各后端的offset、limit实现(HugeGraph-995)
  • 查询顶点、边时,Core支持迭代方式返回结果,而非一次性载入内存(HugeGraph-203)
  • memory backend支持range query(HugeGraph-967)
  • memory backend的secondary的支持方式从遍历改为IdQuery(HugeGraph-996)
  • 联合索引支持复杂的(只要逻辑上可以查都支持)多种索引组合查询(HugeGraph-903)
  • Schema中增加存储用户数据的域(map)(HugeGraph-902)
  • 统一ID的解析及序列化(包括API及Backend)(HugeGraph-965)
  • RocksDB没有keyspace概念,需要完善对多图实例的支持(HugeGraph-973)
  • 支持Cassandra设置连接用户名密码(HugeGraph-999)
  • Schema缓存支持缓存所有元数据(get-all-schema)(HugeGraph-1037)
  • 目前依然保持schema对外暴露name,暂不直接使用schema id(HugeGraph-1032)
  • 用户传入ID的策略的修改为支持String和Number(HugeGraph-956)

BUG修复

  • 删除旧的前缀indexLabel时数据库中的schemaLabel对象还有残留(HugeGraph-969)
  • HugeConfig解析时共用了公共的Option,导致不同graph的配置项有覆盖(HugeGraph-984)
  • 数据库数据不兼容时,提示更加友好的异常信息(HugeGraph-998)
  • 支持Cassandra设置连接用户名密码(HugeGraph-999)
  • RocksDB deleteRange end溢出后触发RocksDB assert错误(HugeGraph-971)
  • 允许根据null值id进行查询顶点/边,返回结果为空集合(HugeGraph-1045)
  • 内存中存在部分更新数据未提交时,搜索结果不对(HugeGraph-1046)
  • g.V().hasLabel(XX)传入不存在的label时报错: Internal Server Error and Undefined property key: ‘~label’(HugeGraph-1048)
  • gremlin获取的schema只剩下名称字符串(HugeGraph-1049)
  • 大量数据情况下无法进行count操作(HugeGraph-1051)
  • RocksDB持续插入6~8千万条边时卡住(HugeGraph-1053)
  • 整理属性类型的支持,并在BinarySerializer中使用二进制格式序列化属性值(HugeGraph-1062)

测试

  • 增加tinkerpop的performance测试(HugeGraph-987)

内部修改

  • HugeFactory打开同一个图(name相同者)时,共用HugeGraph对象即可(HugeGraph-983)
  • 规范索引类型命名secondary、range、search(HugeGraph-991)
  • 数据库数据不兼容时,提示更加友好的异常信息(HugeGraph-998)
  • IO部分的 gryo 和 graphson 的module分开(HugeGraph-1041)
  • 增加query性能测试到PerfExample中(HugeGraph-1044)
  • 关闭gremlin-server的metric日志(HugeGraph-1050)

9.11 - HugeGraph 0.3.3 Release Notes

API & Java Client

功能更新

  • 为vertex-label和edge-label增加可空属性集合,允许在create和append时指定(HugeGraph-245)
  • 配合core的功能为用户提供tinkerpop variables RESTful API(HugeGraph-396)
  • 支持顶点/边属性的更新和删除(HugeGraph-894)
  • 支持顶点/边的条件查询(HugeGraph-919)

BUG修复

  • HugeGraph-API接收的RequestBody为null或""时抛出空指针异常(HugeGraph-795)
  • 为HugeGraph-API添加输入参数检查,避免抛出空指针异常(HugeGraph-796 ~ HugeGraph-798,HugeGraph-802,HugeGraph-808 ~ HugeGraph-814,HugeGraph-817,HugeGraph-823,HugeGraph-860)
  • 创建缺失outV-label 或者 inV-label的实体边,依然能够被创建成功,不符合需求(HugeGraph-835)
  • 创建vertex-label和edge-label时可以任意传入index-names(HugeGraph-837)
  • 创建index,base-type=“VERTEX”等值(期望VL、EL),返回500(HugeGraph-846)
  • 创建index,base-type和base-value不匹配,提示不友好(HugeGraph-848)
  • 删除已经不存在的两个实体之间的关系,schema返回204,顶点和边类型的则返回404(期望统一为404)(HugeGraph-853,HugeGraph-854)
  • 给vertex-label追加属性,缺失id-strategy,返回信息有误(HugeGraph-861)
  • 给edge-label追加属性,name缺失,提示信息有误(HugeGraph-862)
  • 给edge-label追加属性,source-label为“null”,提示信息有误(HugeGraph-863)
  • 查询时的StringId如果为空字符串应该抛出异常(HugeGraph-868)
  • 通过Rest API创建两个顶点之间的边后,在studio中通过g.V()查询时新创建的边不显示,g.E()则能够显示新创建的边(HugeGraph-869)
  • HugeGraph-Server的内部错误500,不应该将stack trace返回给Client(HugeGraph-879)
  • addEdge传入空的id字符串时会抛出非法参数异常(HugeGraph-885)
  • HugeGraph-Client 的 Gremlin 查询结果在解析 Path 时,如果不包含Vertex/Edge会反序列化异常(HugeGraph-891)
  • 枚举HugeKeys的字符串变成小写字母加下划线,导致API序列化时字段名与类中变量名不一致,进而序列化失败(HugeGraph-896)
  • 增加边到不存在的顶点时返回404(期望400)(HugeGraph-922)

Core

功能更新

  • 支持对顶点/边属性(包括索引列)的更新操作(HugeGraph-369)
  • 索引field为空或者空字符串的支持(hugegraph-553和hugegraph-288)
  • vertex/edge的属性一致性保证推迟到实际要访问属性时(hugegraph-763)
  • 增加ScyllaDB后端驱动(HugeGraph-772)
  • 支持tinkerpop的hasKey、hasValue查询(HugeGraph-826)
  • 支持tinkerpop的variables功能(HugeGraph-396)
  • 以“~”开头的属性为系统隐藏属性,用户不可以创建(HugeGraph-842)
  • 增加Backend Features以兼容不同后端的特性(HugeGraph-844)
  • 对mutation的update可能出现的操作不直接抛错,进行细化处理(HugeGraph-887)
  • 对append到vertex-label/edge-label的property检查,必须是nullable的(HugeGraph-890)
  • 对于按照id查询,当有的id不存在时,返回其余存在的对象,而非直接抛异常(HugeGraph-900)

BUG修复

  • Vertex.edges(Direction.BOTH,…) assert error(HugeGraph-661)
  • 无法支持在addVertex函数中对同一property(single)多次赋值(HugeGraph-662)
  • 更新属性时不涉及更新的索引列会丢失(HugeGraph-801)
  • GraphTransaction中的ConditionQuery需要索引查询时,没有触发commit,导致查询失败(HugeGraph-805)
  • Cassandra不支持query offset,查询时limit=offset+limit取回所有记录后过滤(HugeGraph-851)
  • 多个插入操作加上一个删除操作,插入操作会覆盖删除操作(HugeGraph-857)
  • 查询时的StringId如果为空字符串应该抛出异常(HugeGraph-868)
  • 元数据schema方法只返回 hidden 信息(HugeGraph-912)

测试

  • tinkerpop的structure和process测试使用不同的keyspace(HugeGraph-763)
  • 将tinkerpop测试和unit测试添加到流水线release-after-merge中(HugeGraph-763)
  • jenkins脚本分离各阶段子脚本,修改项目中的子脚本即可生效构建(HugeGraph-800)
  • 增加clear backends功能,在tinkerpop suite运行完成后清除后端(HugeGraph-852)
  • 增加BackendMutation的测试(HugeGraph-801)
  • 多线程操作图时可能抛出NoHostAvailableException异常(HugeGraph-883)

内部修改

  • 调整HugeGraphServer和HugeGremlinServer启动时JVM的堆内存初始为256M,最大为2048M(HugeGraph-218)
  • 创建Cassandra Table时,使用schemaBuilder代替字符串拼接(hugegraph-773)
  • 运行测试用例时如果初始化图失败(比如数据库连接不上),clear()报错(HugeGraph-910)
  • Example抛异常 Need to specify a readable config file rather than…(HugeGraph-921)
  • HugeGraphServer和HugeGreminServer的缓存保持同步(HugeGraph-569)

9.12 - HugeGraph 0.2 Release Notes

API & Java Client

功能更新

0.2版实现了图数据库基本功能,提供如下功能:

元数据(Schema)

顶点类型(Vertex Label)

  • 创建顶点类型
  • 删除顶点类型
  • 查询顶点类型
  • 增加顶点类型的属性

边类型(Edge Label)

  • 创建边类型
  • 删除边类型
  • 查询边类型
  • 增加边类型的属性

属性(Property Key)

  • 创建属性
  • 删除属性
  • 查询属性

索引(Index Label)

  • 创建索引
  • 删除索引
  • 查询索引

元数据检查

  • 元数据依赖的其它元数据检查(如Vertex Label依赖Property Key)
  • 数据依赖的元数据检查(如Vertex依赖Vertex Label)

图数据

顶点(Vertex)

  • 增加顶点

  • 删除顶点

  • 增加顶点属性

  • 删除顶点属性(必须为非索引列)

  • 批量插入顶点

  • 查询

  • 批量查询

  • 顶点ID策略

    • 用户指定ID(字符串)
    • 用户指定某些属性组合作为ID(拼接为可见字符串)
    • 自动生成ID
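
上面三种顶点 ID 策略中,“属性组合作为 ID”是把主键属性值拼接为可见字符串。下面用一段 Python 示意三种策略的区别(仅为原理示意,分隔符与生成方式均为假设,并非 HugeGraph 的实际实现):

```python
import uuid

def customized_id(id_str):
    """策略一:用户直接指定字符串 ID"""
    return id_str

def primary_key_id(label, primary_values):
    """策略二:主键属性组合拼接为可见字符串(示意:用 ':' 连接,实际分隔符以实现为准)"""
    return label + ":" + ":".join(str(v) for v in primary_values)

def auto_id():
    """策略三:自动生成 ID(示意:这里用 UUID 代替实际的生成算法)"""
    return str(uuid.uuid4())

print(primary_key_id("person", ["marko", 29]))  # person:marko:29
```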

边(Edge)

  • 增加边
  • 增加多条同类型边到指定的两个节点(SortKey)
  • 删除边
  • 增加边属性
  • 删除边属性(必须为非索引列)
  • 批量插入边
  • 查询
  • 批量查询

顶点/边属性

  • 属性类型支持

    • text
    • boolean
    • byte、blob
    • int、long
    • float、double
    • timestamp
    • uuid
  • 支持单值属性

  • 支持多值属性:List、Set(注意:非嵌套属性)

事务

  • 原子性级别保证(依赖后端)
  • 自动提交事务
  • 手动提交事务
  • 并行事务

索引

索引类型

  • 二级索引
  • 范围索引(数字类型)

索引操作

  • 为指定类型的顶点/边创建单列索引(不支持List或Set列创建索引)
  • 为指定类型的顶点/边创建复合索引(不支持List或Set列创建索引,复合索引为前缀索引)
  • 删除指定类型顶点/边的索引(部分或全部索引均可)
  • 重建指定类型顶点/边的索引(部分或全部索引均可)
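
上面提到“复合索引为前缀索引”,即多列复合索引只支持按索引列从左到右的前缀组合进行查询。下面用一段 Python 示意这种前缀匹配规则(仅为原理示意,并非 HugeGraph 的实现):

```python
def composite_index_supports(index_fields, query_fields):
    """复合(前缀)索引:仅当查询条件列恰好是索引列的非空前缀时才能命中。"""
    return len(query_fields) > 0 and query_fields == index_fields[:len(query_fields)]

idx = ["city", "age"]
print(composite_index_supports(idx, ["city"]))         # True:命中前缀
print(composite_index_supports(idx, ["city", "age"]))  # True:完整匹配
print(composite_index_supports(idx, ["age"]))          # False:非前缀,无法命中
```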

查询/遍历

  • 列出所有元数据、图数据(支持Limit,不支持分页)

  • 根据ID查询元数据、图数据

  • 根据指定属性的值查询图数据

  • 根据指定属性的值范围查询图数据(属性必须为数字类型)

  • 根据指定顶点/边类型、指定属性的值查询顶点/边

  • 根据指定顶点/边类型、指定属性的值范围查询顶点(属性必须为数字类型)

  • 根据顶点类型(Vertex Label)查询顶点

  • 根据边类型(Edge Label)查询边

  • 根据顶点查询边

    • 查询顶点的所有边
    • 查询顶点的指定方向边(出边、入边)
    • 查询顶点的指定方向、指定类型边
    • 查询两个顶点的同类型边中的某条边(SortKey)
  • 标准Gremlin遍历

缓存

可缓存内容

  • 元数据缓存
  • 顶点缓存

缓存特性

  • LRU策略
  • 高性能并发访问
  • 支持超时过期机制
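
上述“LRU 策略 + 超时过期”组合的工作方式,可以用下面的最小 Python 示意来理解(仅为原理示意,线程安全等并发细节从略,并非 HugeGraph 缓存的实际实现):

```python
import time
from collections import OrderedDict

class LruTtlCache:
    """LRU 淘汰 + 超时过期的最小示意缓存。"""
    def __init__(self, capacity, ttl_seconds):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.data = OrderedDict()  # key -> (value, expire_at)

    def get(self, key):
        item = self.data.get(key)
        if item is None:
            return None
        value, expire_at = item
        if time.time() >= expire_at:   # 超时过期:惰性删除
            del self.data[key]
            return None
        self.data.move_to_end(key)     # 命中后移到队尾(最近使用)
        return value

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = (value, time.time() + self.ttl)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # 淘汰最久未使用的条目

cache = LruTtlCache(capacity=2, ttl_seconds=60)
cache.put("v1", "vertex-1")
cache.put("v2", "vertex-2")
cache.get("v1")              # 访问 v1,使其成为最近使用
cache.put("v3", "vertex-3")  # 超出容量,触发 LRU 淘汰 v2
```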

接口(RESTful API)

  • 版本号接口
  • 图实例接口
  • 元数据接口
  • 图数据接口
  • Gremlin接口

更多细节详见API文档

后端支持

支持Cassandra后端

  • 持久化
  • CQL3
  • 集群

支持Memory后端(仅用于测试)

  • 非持久化
  • 部分特性无法支持(如:更新边属性、根据边类型查询边)

其它

支持配置项

  • 后端存储类型
  • 序列化方式
  • 缓存参数

支持多图实例

  • 静态方式(增加多个图配置文件)

版本检查

  • 内部依赖包匹配版本检查
  • API匹配版本检查

9.13 - HugeGraph 0.2.4 Release Notes

API & Java Client

功能更新

元数据(Schema)相关

BUG修复

  • Vertex Label为非primary-key id策略应该允许属性为空(HugeGraph-651)
  • Gremlin-Server 序列化的 EdgeLabel 仅有一个directed 属性,应该打印完整的schema描述(HugeGraph-680)
  • 创建IndexLabel时使用不存在的属性抛出空指针异常,应该抛非法参数异常(HugeGraph-682)
  • 创建schema如果已经存在并指定了ifNotExist时,结果应该返回原来的对象(HugeGraph-694)
  • 由于EdgeLabel的Frequency默认为null以及不允许修改特性,导致Append操作传递null值在API层反序列化失败(HugeGraph-729)
  • 增加对schema名称的正则检查配置项,默认不允许为全空白字符(HugeGraph-727)
  • 中文名的schema在前端显示为乱码(HugeGraph-711)

图数据(Vertex、Edge)相关

功能更新

  • DataType支持Array,并且List类型除了一个一个添加object,也需要支持直接赋值List对象(HugeGraph-719)
  • 自动生成的顶点id由十进制改为十六进制(字符串存储时)(HugeGraph-785)

BUG修复

  • HugeGraph-API的VertexLabel/EdgeLabel API未提供eliminate接口(HugeGraph-614)
  • 增加非primary-key id策略的顶点时,如果属性为空无法插入到数据库中(HugeGraph-652)
  • 使用HugeGraph-Client的gremlin发送无返回值groovy请求时,由于gremlin-server将无返回值序列化为null,导致前端迭代结果集时出现空指针异常(HugeGraph-664)
  • RESTful API在没有找到对应id的vertex/edge时返回500(HugeGraph-734)
  • HugeElement/HugeProperty的equals()与tinkerpop不兼容(HugeGraph-653)
  • HugeEdgeProperty的property的equals函数与tinkerpop兼容 (HugeGraph-740)
  • HugeElement/HugeVertexProperty的hashcode函数与tinkerpop不兼容(HugeGraph-728)
  • HugeVertex/HugeEdge的toString函数与tinkerpop不兼容(HugeGraph-665)
  • 与tinkerpop的异常不兼容,包括IllegalArgumentsException和UnsupportedOperationException(HugeGraph-667)
  • 通过id无法找到element时,抛出的异常类型与tinkerpop不兼容(HugeGraph-689)
  • vertex.addEdge没有检查properties的数目是否为2的倍数(HugeGraph-716)
  • vertex.addEdge()时,assignId调用时机太晚,导致vertex的Set中有重复的edge(HugeGraph-666)
  • 查询时包含大于等于三层逻辑嵌套时,会抛出ClassCastException,现改成抛出非法参数异常(HugeGraph-481)
  • 边查询如果同时包含source-vertex/direction和property作为条件,查询结果错误(HugeGraph-749)
  • HugeGraph-Server 在运行时如果 cassandra 宕掉,插入或查询操作时会抛出DataStax的异常以及详细的调用栈(HugeGraph-771)
  • 删除不存在的 indexLabel 时会抛出异常,而删除其他三种元数据(不存在的)则不会(HugeGraph-782)
  • 当传给EdgeApi的源顶点或目标顶点的id非法时,会因为查询不到该顶点向客户端返回404状态码(HugeGraph-784)
  • 提供内部使用获取元数据的接口,使SchemaManager仅为外部使用,当获取不存在的schema时抛出NotFoundException异常(HugeGraph-743)
  • HugeGraph-Client 创建/添加/移除 元数据都应该返回来自服务端的结果(HugeGraph-760)
  • 创建HugeGraph-Client时如果输入了错误的主机会导致进程阻塞,无法响应(HugeGraph-718)

查询、索引、缓存相关

功能更新

  • 缓存更新更加高效的锁方案(HugeGraph-555)
  • 索引查询增加支持只有一个元素的IN语句(原来仅支持EQ)(HugeGraph-739)

BUG修复

  • 防止请求数据量过大时服务本身hang住(HugeGraph-777)

其它

功能更新

  • 使Init-Store仅用于初始化数据库,清空后端由独立脚本实现(HugeGraph-650)

BUG修复

  • 单元测试跑完后在测试机上遗留了临时的keyspace(HugeGraph-611)
  • Cassandra的info日志信息过多,将大部分修改为debug级别(HugeGraph-722)
  • EventHub.containsListener(String event)判断逻辑有遗漏(HugeGraph-732)
  • EventHub.listeners/unlisten(String event)当没有对应event的listener时会抛空指针异常(HugeGraph-733)

测试

Tinkerpop合规测试

  • 增加自定义ignore机制,规避掉暂时不需要加入持续集成的测试用例(HugeGraph-647)
  • 为TestGraph注册GraphSon和Kryo序列化器,实现 IdGenerator$StringId 的 graphson-v1、graphson-v2 和 Kryo的序列化与反序列化(HugeGraph-660)
  • 增加了可配置的测试用例过滤器,使得tinkerpop测试可以用在开发分支和发布分支的回归测试中
  • 将tinkerpop测试通过配置文件,加入到回归测试中

单元测试

  • 增加Cache及Event的单元测试(HugeGraph-659)
  • HugeGraph-Client 增加API的测试(99个)
  • HugeGraph-Client 增加单元测试,包括RestResult反序列化的单测(12个)

内部修改

  • 改进LOG变量方面代码(HugeGraph-623/HugeGraph-631)
  • License格式调整(HugeGraph-625)
  • 将序列化器中持有的graph抽离,要用到graph的函数通过传参数实现 (HugeGraph-750)

10 - Contribution Guidelines

10.1 - 如何参与 HugeGraph 社区

TODO: translate this article to Chinese

Thanks for taking the time to contribute! As an open source project, HugeGraph is looking forward to be contributed from everyone, and we are also grateful to all the contributors.

The following is a contribution guide for HugeGraph:

image

1. Preparation

We can contribute by reporting issues, submitting code patches or any other feedback.

Before submitting the code, we need to do some preparation:

  1. Sign up or login to GitHub: https://github.com

  2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

  3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

    # clone code from remote to local repo
    git clone https://github.com/${GITHUB_USER_NAME}/hugegraph

推荐使用 HugeGraph-Studio 通过可视化的方式来执行上述代码。另外也可以通过 HugeGraph-Client、HugeApi、GremlinConsole 和 GremlinDriver 等多种方式执行上述代码。

3.2 总结

HugeGraph 目前支持 Gremlin 的语法,用户可以通过 Gremlin / REST-API 实现各种查询需求。

8 - PERFORMANCE

8.1 - HugeGraph BenchMark Performance

1 Test environment

1.1 Hardware

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD

1.2 Software

1.2.1 Test cases

The tests use graphdb-benchmark, a benchmark suite for graph databases. It contains four kinds of tests:

  • Massive Insertion: batch-insert vertices and edges, committing a batch of vertices or edges at a time

  • Single Insertion: insert one at a time, committing each vertex or edge immediately

  • Query: the basic query operations of a graph database:

    • Find Neighbors: query the neighbors of all vertices
    • Find Adjacent Nodes: query the adjacent vertices of all edges
    • Find Shortest Path: query the shortest paths from the first vertex to 100 random vertices
  • Clustering: community detection based on the Louvain Method

1.2.2 Test datasets

The tests use both synthetic and real data.

Dataset sizes used in this test:
Name | Vertices | Edges | File size
email-enron.txt | 36,691 | 367,661 | 4MB
com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB
amazon0601.txt | 403,393 | 3,387,388 | 47.9MB
com-lj.ungraph.txt | 3,997,961 | 34,681,189 | 479MB

1.3 Service configuration

  • HugeGraph version: 0.5.6; RestServer, Gremlin Server and the backends all run on the same server

    • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, thrift+Cassandra mode

    • Cassandra version: cassandra-3.10; commit-log and data share the SSD
  • Neo4j version: 2.0.1

graphdb-benchmark is adapted to Titan version 0.5.4

2 Test results

2.1 Batch insertion performance

Backend | email-enron (300k) | amazon0601 (3M) | com-youtube.ungraph (3M) | com-lj.ungraph (30M)
HugeGraph | 0.629 | 5.711 | 5.243 | 67.033
Titan | 10.15 | 108.569 | 150.266 | 1217.944
Neo4j | 3.884 | 18.938 | 24.890 | 281.537

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in edges
  • The table values are batch-insertion times in seconds
  • For example, HugeGraph (RocksDB) takes 5.711s to insert the 3M edges of the amazon0601 dataset
Conclusion
  • Batch insertion performance: HugeGraph(RocksDB) > Neo4j > Titan(thrift+Cassandra)
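As a sanity check on the batch-insertion numbers, the throughput implied by the table can be computed directly from the dataset sizes and times above; a minimal sketch (the helper name is ours):

```python
def insert_rate(edges: int, seconds: float) -> int:
    """Implied insertion throughput in edges per second."""
    return round(edges / seconds)

# HugeGraph(RocksDB) on amazon0601: 3,387,388 edges in 5.711 s
print(insert_rate(3_387_388, 5.711))  # roughly 593,000 edges/s
```

The same calculation applied to Titan's 108.569 s gives about 31,000 edges/s, which matches the roughly 19x gap visible in the table.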

2.2 Traversal performance

2.2.1 Terms
  • FN (Find Neighbor): traverse all vertices; for each vertex find its adjacent edges, then find the other vertex through the edge and vertex
  • FA (Find Adjacent): traverse all edges; for each edge get its source vertex and target vertex
2.2.2 FN performance
Backend | email-enron (36k) | amazon0601 (400k) | com-youtube.ungraph (1.2M) | com-lj.ungraph (4M)
HugeGraph | 4.072 | 45.118 | 66.006 | 609.083
Titan | 8.084 | 92.507 | 184.543 | 1099.371
Neo4j | 2.424 | 10.537 | 11.609 | 106.919

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in vertices
  • The table values are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all vertices of amazon0601, looking up the adjacent edges and the other vertex, in 45.118s in total
2.2.3 FA performance
Backend | email-enron (300k) | amazon0601 (3M) | com-youtube.ungraph (3M) | com-lj.ungraph (30M)
HugeGraph | 1.540 | 10.764 | 11.243 | 151.271
Titan | 7.361 | 93.344 | 169.218 | 1085.235
Neo4j | 1.673 | 4.775 | 4.284 | 40.507

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in edges
  • The table values are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all edges of amazon0601, querying both vertices of each edge, in 10.764s in total
Conclusion
  • Traversal performance: Neo4j > HugeGraph(RocksDB) > Titan(thrift+Cassandra)

2.3 Performance of common graph analysis methods in HugeGraph

Terms
  • FS (Find Shortest Path): find shortest paths
  • K-neighbor: all vertices reachable within K hops (1, 2, 3… (K-1), K hops) from a start vertex
  • K-out: vertices reachable in exactly K hops along out-edges from a start vertex
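To make the K-neighbor and K-out definitions concrete, here is a minimal breadth-first sketch over an in-memory adjacency map; this is our own illustration of the two definitions, not HugeGraph's implementation:

```python
def k_neighbor(adj, start, k):
    """All vertices reachable within 1..k hops of start (excluding start)."""
    seen = {start}
    frontier = [start]
    result = set()
    for _ in range(k):
        nxt = []
        for v in frontier:
            for w in adj.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    result.add(w)
                    nxt.append(w)
        frontier = nxt
    return result

def k_out(adj, start, k):
    """Vertices first reached in exactly k hops along out-edges from start."""
    seen = {start}
    frontier = {start}
    for _ in range(k):
        frontier = {w for v in frontier for w in adj.get(v, ())
                    if w not in seen}
        seen |= frontier
    return frontier
```

For example, on the directed graph {1: [2, 3], 2: [4], 3: [4]}, k_neighbor(adj, 1, 2) returns {2, 3, 4} while k_out(adj, 1, 2) returns only {4}.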
FS performance
Backend | email-enron (300k) | amazon0601 (3M) | com-youtube.ungraph (3M) | com-lj.ungraph (30M)
HugeGraph | 0.494 | 0.103 | 3.364 | 8.155
Titan | 11.818 | 0.239 | 377.709 | 575.678
Neo4j | 1.719 | 1.800 | 1.956 | 8.530

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in edges
  • The table values are the times, in seconds, to find the shortest paths from the first vertex to 100 randomly chosen vertices
  • For example, HugeGraph with the RocksDB backend finds the shortest paths from the first vertex of amazon0601 to 100 random vertices in 0.103s in total
Conclusion
  • On small datasets, or with sparsely connected vertices, HugeGraph outperforms Neo4j and Titan
  • As the data size grows and vertex connectivity increases, HugeGraph and Neo4j converge, both far ahead of Titan
K-neighbor performance
Vertex\Depth | 1 | 2 | 3 | 4 | 5 | 6
v1 time | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM
v111 time | 0.027s | 0.034s | 0.115s | 1.36s | OOM | –
v1111 time | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
K-out performance
Vertex\Depth | 1 | 2 | 3 | 4 | 5 | 6
v1 time | 0.054s | 0.057s | 0.109s | 0.526s | 3.77s | OOM
v1 size | 10 | 133 | 2,453 | 50,830 | 1,128,688 | –
v111 time | 0.032s | 0.042s | 0.136s | 1.25s | 20.62s | OOM
v111 size | 10 | 211 | 4,944 | 113,150 | 2,629,970 | –
v1111 time | 0.039s | 0.045s | 0.053s | 1.10s | 2.92s | OOM
v1111 size | 10 | 140 | 2,555 | 50,825 | 1,070,230 | –

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
Conclusion
  • In the FS scenario, HugeGraph outperforms Neo4j and Titan
  • In the K-neighbor and K-out scenarios, HugeGraph returns results within seconds for up to 5 hops

2.4 Comprehensive performance test - CW

Database | scale 1000 | scale 5000 | scale 10000 | scale 20000
HugeGraph(core) | 20.804 | 242.099 | 744.780 | 1700.547
Titan | 45.790 | 820.633 | 2652.235 | 9568.623
Neo4j | 5.913 | 50.267 | 142.354 | 460.880

Notes

  • "scale" is counted in vertices
  • The table values are the times, in seconds, for community detection to complete; for example, HugeGraph with the RocksDB backend takes 744.780s on a dataset of scale 10000 until the community aggregation no longer changes
  • The CW test is a comprehensive CRUD evaluation
  • In this test HugeGraph, like Titan, operates on core directly without going through the client
Conclusion
  • Community clustering performance: Neo4j > HugeGraph > Titan

8.2 - HugeGraph-API Performance

The HugeGraph API performance tests measure HugeGraph-Server's ability to handle concurrent RESTful API requests, covering:

  • single insertion of vertices/edges
  • batch insertion of vertices/edges
  • queries of vertices/edges

For the RESTful API performance of each HugeGraph release, see:

Coming soon, stay tuned!

8.2.1 - v0.5.6 Stand-alone(RocksDB)

1 Test environment

Machine under test

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD, 2.7T HDD
  • Load-generator machine: same configuration as the machine under test
  • Test tool: apache-Jmeter-2.5.1

Note: the load generator and the machine under test are in the same datacenter

2 Test description

2.1 Definitions (all times in ms)

  • Samples: the total number of threads completed in this scenario
  • Average: average response time
  • Median: the statistical median of the response times
  • 90% Line: 90% of all threads have a response time below this value
  • Min: minimum response time
  • Max: maximum response time
  • Error: error rate
  • Throughput: throughput
  • KB/sec: throughput measured by traffic

2.2 Underlying storage

RocksDB is used as the backend store; HugeGraph and RocksDB run on the same machine. The server configuration files keep their defaults except for the host and port.

3 Performance summary

  1. HugeGraph inserts single vertices and edges at about 10k per second
  2. Batch insertion of vertices and edges is far faster than single insertion
  3. Query-by-id concurrency for vertices and edges reaches 13000 or more, with an average request latency below 50ms
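The single-insertion load in these tests corresponds to RESTful vertex/edge insert requests. As a rough illustration of what one such request body looks like (the endpoint path follows HugeGraph's RESTful API; the graph name, label and property values here are made up for the example):

```python
import json

# Hypothetical request body for single-vertex insertion via
# POST /apis/graphs/{graph}/graph/vertices
def vertex_payload(label: str, properties: dict) -> str:
    return json.dumps({"label": label, "properties": properties})

print(vertex_payload("person", {"name": "marko", "age": 29}))
```

Each sampled request in the JMeter scenarios above sends one payload of this shape; batch insertion sends many vertices or edges per request instead.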

4 Test results and analysis

4.1 batch insertion

4.1.1 Stress ceiling test
Method

Keep raising the concurrency to find the maximum load at which the server still serves normally

Parameters

Duration: 5min

Maximum vertex insertion rate:
image

Conclusion:

  • At concurrency 2200, vertex throughput is 2026.8; records processed per second: 2026.8*200=405360/s
Maximum edge insertion rate
image

Conclusion:

  • At concurrency 900, edge throughput is 776.9; records processed per second: 776.9*500=388450/s

4.2 single insertion

4.2.1 Stress ceiling test
Method

Keep raising the concurrency to find the maximum load at which the server still serves normally

Parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Single vertex insertion
image

Conclusion:

  • At concurrency 11500 the throughput is 10730, so the single-insert concurrency capacity for vertices is 11500
Single edge insertion
image

Conclusion:

  • At concurrency 9000 the throughput is 8418, so the single-insert concurrency capacity for edges is 9000

4.3 Query by id

4.3.1 Stress ceiling test
Method

Keep raising the concurrency to find the maximum load at which the server still serves normally

Parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Vertex query by id
image

Conclusion:

  • At concurrency 14000 the throughput is 12663, so the query-by-id concurrency capacity for vertices is 14000, with an average latency of 44ms
Edge query by id
image

Conclusion:

  • At concurrency 13000 the throughput is 12225, so the query-by-id concurrency capacity for edges is 13000, with an average latency of 12ms

8.2.2 - v0.5.6 Cluster(Cassandra)

1 Test environment

Machine under test

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD, 2.7T HDD
  • Load-generator machine: same configuration as the machine under test
  • Test tool: apache-Jmeter-2.5.1

Note: the load generator and the machine under test are in the same datacenter

2 Test description

2.1 Definitions (all times in ms)

  • Samples: the total number of threads completed in this scenario
  • Average: average response time
  • Median: the statistical median of the response times
  • 90% Line: 90% of all threads have a response time below this value
  • Min: minimum response time
  • Max: maximum response time
  • Error: error rate
  • Throughput: throughput
  • KB/sec: throughput measured by traffic

2.2 Underlying storage

A 15-node Cassandra cluster is used as the backend store; HugeGraph and the Cassandra cluster run on different servers. The server configuration files keep their defaults except for the host and port.

3 Performance summary

  1. HugeGraph inserts single vertices at about 9000/s and single edges at about 4500/s
  2. Batch insertion reaches about 50k/s for vertices and 150k/s for edges, far faster than single insertion
  3. Query-by-id concurrency for vertices and edges reaches 12000 or more, with an average request latency below 70ms

4 Test results and analysis

4.1 batch insertion

4.1.1 Stress ceiling test
Method

Keep raising the concurrency to find the maximum load at which the server still serves normally

Parameters

Duration: 5min

Maximum vertex insertion rate:
image

Conclusion:

  • At concurrency 3500, vertex throughput is 261; records processed per second: 261*200=52200/s
Maximum edge insertion rate
image

Conclusion:

  • At concurrency 1000, edge throughput is 323; records processed per second: 323*500=161500/s

4.2 single insertion

4.2.1 Stress ceiling test
Method

Keep raising the concurrency to find the maximum load at which the server still serves normally

Parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Single vertex insertion
image

Conclusion:

  • At concurrency 9000 the throughput is 8400, so the single-insert concurrency capacity for vertices is 9000
Single edge insertion
image

Conclusion:

  • At concurrency 4500 the throughput is 4160, so the single-insert concurrency capacity for edges is 4500

4.3 Query by id

4.3.1 Stress ceiling test
Method

Keep raising the concurrency to find the maximum load at which the server still serves normally

Parameters
  • Duration: 5min
  • Service failure criterion: error rate greater than 0.00%
Vertex query by id
image

Conclusion:

  • At concurrency 14500 the throughput is 13576, so the query-by-id concurrency capacity for vertices is 14500, with an average latency of 11ms
Edge query by id
image

Conclusion:

  • At concurrency 12000 the throughput is 10688, so the query-by-id concurrency capacity for edges is 12000, with an average latency of 63ms

8.3 - HugeGraph-Loader Performance

Use cases

When the graph data (vertices and edges) to be batch-inserted is at the billion level or below, or the total data size is under a TB, the HugeGraph-Loader tool can be used for continuous, high-speed import of graph data

Performance

All tests use the edge data of a URL dataset

RocksDB single-machine performance

  • With label index disabled: 228k edges/s
  • With label index enabled: 153k edges/s

Cassandra cluster performance

  • With label index enabled (the default): 63k edges/s

8.4 -

1 Test environment

1.1 Hardware

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD

1.2 Software

1.2.1 Test cases

The tests use graphdb-benchmark, a benchmark suite for graph databases. It contains four kinds of tests:

  • Massive Insertion: batch-insert vertices and edges, committing a batch of vertices or edges at a time

  • Single Insertion: insert one at a time, committing each vertex or edge immediately

  • Query: the basic query operations of a graph database:

    • Find Neighbors: query the neighbors of all vertices
    • Find Adjacent Nodes: query the adjacent vertices of all edges
    • Find Shortest Path: query the shortest paths from the first vertex to 100 random vertices
  • Clustering: community detection based on the Louvain Method

1.2.2 Test datasets

The tests use both synthetic and real data.

Dataset sizes used in this test:
Name | Vertices | Edges | File size
email-enron.txt | 36,691 | 367,661 | 4MB
com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB
amazon0601.txt | 403,393 | 3,387,388 | 47.9MB

1.3 Service configuration

  • HugeGraph version: 0.4.4; RestServer, Gremlin Server and the backends all run on the same server
  • Cassandra version: cassandra-3.10; commit-log and data share the SSD
  • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, thrift+Cassandra mode

graphdb-benchmark is adapted to Titan version 0.5.4

2 Test results

2.1 Batch insertion performance

Backend | email-enron (300k) | amazon0601 (3M) | com-youtube.ungraph (3M)
Titan | 9.516 | 88.123 | 111.586
RocksDB | 2.345 | 14.076 | 16.636
Cassandra | 11.930 | 108.709 | 101.959
Memory | 3.077 | 15.204 | 13.841

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in edges
  • The table values are batch-insertion times in seconds
  • For example, HugeGraph (RocksDB) takes 14.076s to insert the 3M edges of the amazon0601 dataset, about 210k edges/s
Conclusion
  • The RocksDB and Memory backends outperform Cassandra for insertion
  • With Cassandra as the backend, HugeGraph's insertion performance is close to Titan's

2.2 Traversal performance

2.2.1 Terms
  • FN (Find Neighbor): traverse all vertices; for each vertex find its adjacent edges, then find the other vertex through the edge and vertex
  • FA (Find Adjacent): traverse all edges; for each edge get its source vertex and target vertex
2.2.2 FN performance
Backend | email-enron (36k) | amazon0601 (400k) | com-youtube.ungraph (1.2M)
Titan | 7.724 | 70.935 | 128.884
RocksDB | 8.876 | 65.852 | 63.388
Cassandra | 13.125 | 126.959 | 102.580
Memory | 22.309 | 207.411 | 165.609

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in vertices
  • The table values are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all vertices of amazon0601, looking up the adjacent edges and the other vertex, in 65.852s in total
2.2.3 FA performance
Backend | email-enron (300k) | amazon0601 (3M) | com-youtube.ungraph (3M)
Titan | 7.119 | 63.353 | 115.633
RocksDB | 6.032 | 64.526 | 52.721
Cassandra | 9.410 | 102.766 | 94.197
Memory | 12.340 | 195.444 | 140.89

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in edges
  • The table values are traversal times in seconds
  • For example, HugeGraph with the RocksDB backend traverses all edges of amazon0601, querying both vertices of each edge, in 64.526s in total
Conclusion
  • HugeGraph RocksDB > Titan thrift+Cassandra > HugeGraph Cassandra > HugeGraph Memory

2.3 Performance of common graph analysis methods in HugeGraph

Terms
  • FS (Find Shortest Path): find shortest paths
  • K-neighbor: all vertices reachable within K hops (1, 2, 3… (K-1), K hops) from a start vertex
  • K-out: vertices reachable in exactly K hops along out-edges from a start vertex
FS performance
Backend | email-enron (300k) | amazon0601 (3M) | com-youtube.ungraph (3M)
Titan | 11.333 | 0.313 | 376.06
RocksDB | 44.391 | 2.221 | 268.792
Cassandra | 39.845 | 3.337 | 331.113
Memory | 35.638 | 2.059 | 388.987

Notes

  • The numbers in "()" in the header are the dataset sizes, counted in edges
  • The table values are the times, in seconds, to find the shortest paths from the first vertex to 100 randomly chosen vertices
  • For example, HugeGraph (RocksDB) finds the shortest paths from the first vertex to 100 random vertices in 2.059s in total
Conclusion
  • On small datasets, or with sparsely connected vertices, Titan's shortest-path performance beats HugeGraph's
  • As the data size grows and vertex connectivity increases, HugeGraph's shortest-path performance beats Titan's
K-neighbor performance
Vertex\Depth | 1 | 2 | 3 | 4 | 5 | 6
v1 time | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM
v111 time | 0.027s | 0.034s | 0.115s | 1.36s | OOM | –
v1111 time | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
K-out performance
Vertex\Depth | 1 | 2 | 3 | 4 | 5 | 6
v1 time | 0.054s | 0.057s | 0.109s | 0.526s | 3.77s | OOM
v1 size | 10 | 133 | 2,453 | 50,830 | 1,128,688 | –
v111 time | 0.032s | 0.042s | 0.136s | 1.25s | 20.62s | OOM
v111 size | 10 | 211 | 4,944 | 113,150 | 2,629,970 | –
v1111 time | 0.039s | 0.045s | 0.053s | 1.10s | 2.92s | OOM
v1111 size | 10 | 140 | 2,555 | 50,825 | 1,070,230 | –

Notes

  • HugeGraph-Server's JVM memory is set to 32GB; OOM occurs when the data volume is too large
Conclusion
  • In the FS scenario, HugeGraph outperforms Titan
  • In the K-neighbor and K-out scenarios, HugeGraph returns results within seconds for up to 5 hops

2.4 Comprehensive performance test - CW

Database | scale 1000 | scale 5000 | scale 10000 | scale 20000
Titan | 45.943 | 849.168 | 2737.117 | 9791.46
Memory(core) | 41.077 | 1825.905 | * | *
Cassandra(core) | 39.783 | 862.744 | 2423.136 | 6564.191
RocksDB(core) | 33.383 | 199.894 | 763.869 | 1677.813

Notes

  • "scale" is counted in vertices
  • The table values are the times, in seconds, for community detection to complete; for example, HugeGraph with the RocksDB backend takes 763.869s on a dataset of scale 10000 until the community aggregation no longer changes
  • "*" means it did not finish within 10000s
  • The CW test is a comprehensive CRUD evaluation
  • The last three rows are HugeGraph's different backends; in this test HugeGraph, like Titan, operates on core directly without going through the client
Conclusion
  • With the Cassandra backend, HugeGraph slightly outperforms Titan, and the advantage grows with data size; at scale 20000 it is 30% faster than Titan
  • With the RocksDB backend, HugeGraph far outperforms both Titan and HugeGraph's Cassandra backend, being 6x and 4x faster respectively

9 - CHANGELOGS

9.1 - HugeGraph 1.0.0 Release Notes

OLTP API & Client updates

API/Client interface updates

  • Support the /exception/trace API for hot-updating the trace switch.
  • Support the Cypher graph query language API.
  • Support viewing the list of provided APIs via the Swagger UI.
  • Change the type of the 'limit' parameter in the algorithms from long to int.
  • Support writing data to HBase from the Client side, bypassing the Server (Beta).

Core & Server

Feature updates

  • Support Java 11.
  • Support 2 new OLTP algorithms: adamic-adar and resource-allocation.
  • Support hashed RowKeys for the HBase backend and allow pre-initializing HBase tables.
  • Support the Cypher graph query language.
  • Support automatic management and failover of the cluster Master role.
  • Support 16 OLAP algorithms, including LPA, Louvain, PageRank, BetweennessCentrality, RingsDetect, etc.
  • Adapted to the Apache Foundation's release requirements, including license compliance, release process and code style, supporting Apache releases.

Bug fixes

  • Fixed being unable to query edges by multiple Labels and properties.
  • Added a maximum depth limit to the ring detection algorithm.
  • Fixed abnormal results returned by the tree() statement.
  • Fixed a check exception when batch-updating edges with Ids passed in.
  • Resolved unexpected Task status issues.
  • Resolved the edge cache not being cleared when updating a vertex.
  • Fixed an error when executing g.V() on the MySQL backend.
  • Fixed an issue caused by server-info being unable to time out.
  • Exported the ConditionP type for users in Gremlin.
  • Fixed a within + Text.contains query issue.
  • Fixed a race condition in the addIndexLabel/removeIndexLabel interface.
  • Restricted exporting graph instances to Admin only.
  • Fixed a check issue in the Profile API.
  • Fixed an Empty Graph issue in count().is(0) queries.
  • Fixed the service being unable to shut down on exceptions.
  • Fixed the JNA UnsatisfiedLinkError on Apple M1 systems.
  • Fixed an NPE when starting the RpcServer.
  • Fixed the ACTION_CLEARED parameter count issue.
  • Fixed an RpcServer startup issue.
  • Fixed a potential number-conversion risk with user-supplied parameters.
  • Removed the Word tokenizer dependency.
  • Fixed iterators not being closed gracefully on exceptions in the Cassandra and MySQL backends.

Configuration changes

  • Moved the config option raft.endpoint from the Graph scope to the Server scope.

Other changes

  • refact(core): enhance schema job module.
  • refact(raft): improve raft module & test & install snapshot and add peer.
  • refact(core): remove early cycle detection & limit max depth.
  • cache: fix assert node.next==empty.
  • fix apache license conflicts: jnr-posix and jboss-logging.
  • chore: add logo in README & remove outdated log4j version.
  • refact(core): improve CachedGraphTransaction perf.
  • chore: update CI config & support ci robot & add codeQL SEC-check & graph option.
  • refact: ignore security check api & fix some bugs & clean code.
  • doc: enhance CONTRIBUTING.md & README.md.
  • refact: add checkstyle plugin & clean/format the code.
  • refact(core): improve decode string empty bytes & avoid array-construct columns in BackendEntry.
  • refact(cassandra): translate ipv4 to ipv6 metrics & update cassandra dependency version.
  • chore: use .asf.yaml for apache workflow & replace APPLICATION_JSON with TEXT_PLAIN.
  • feat: add system schema store.
  • refact(rocksdb): update rocksdb version to 6.22 & improve rocksdb code.
  • refact: update mysql scope to test & clean protobuf style/configs.
  • chore: upgrade Dockerfile server to 0.12.0 & add editorconfig & improve ci.
  • chore: upgrade grpc version.
  • feat: support updateIfPresent/updateIfAbsent operation.
  • chore: modify abnormal logs & upgrade netty-all to 4.1.44.
  • refact: upgrade dependencies & adopt new analyzer & clean code.
  • chore: improve .gitignore & update ci configs & add RAT/flatten plugin.
  • chore(license): add dependencies-check ci & 3rd-party dependency licenses.
  • refact: Shutdown log when shutdown process & fix tx leak & enhance the file path.
  • refact: rename package to apache & dependency in all modules (Breaking Change).
  • chore: add license checker & update antrun plugin & fix building problem in windows.
  • feat: support one-step script for apache release v1.0.0 release.

Computer (OLAP)

Algorithm Changes

  • Support the PageRank algorithm.
  • Support the WCC algorithm.
  • Support the degree centrality algorithm.
  • Support the triangle count algorithm.
  • Support the rings detection algorithm.
  • Support the LPA algorithm.
  • Support the k-core algorithm.
  • Support the closeness centrality algorithm.
  • Support the betweenness centrality algorithm.
  • Support the clustering coefficient algorithm.

Platform Changes

  • feat: init module computer-core & computer-algorithm & etcd dependency.
  • feat: add Id as base type of vertex id.
  • feat: init Vertex/Edge/Properties & JsonStructGraphOutput.
  • feat: load data from hugegraph server.
  • feat: init basic combiner, Bsp4Worker, Bsp4Master.
  • feat: init sort & transport interface & basic FileInput/Output Stream.
  • feat: init computation & ComputerOutput/Driver interface.
  • feat: init Partitioner and HashPartitioner
  • feat: init Master/WorkerService module.
  • feat: init Heap/LoserTree sorting.
  • feat: init rpc module.
  • feat: init transport server, client, en/decode, flowControl, heartbeat.
  • feat: init DataDirManager & PointerCombiner.
  • feat: init aggregator module & add copy() and assign() methods to Value class.
  • feat: add startAsync and finishAsync on client side, add onStarted and onFinished on server side.
  • feat: init store/sort module.
  • feat: link managers in worker sending end.
  • feat: implement data receiver of worker.
  • feat: implement StreamGraphInput and EntryInput.
  • feat: add Sender and Receiver to process compute message.
  • feat: add seqfile format.
  • feat: add ComputeManager.
  • feat: add computer-k8s and computer-k8s-operator.
  • feat: add startup and make docker image code.
  • feat: sort different type of message use different combiner.
  • feat: add HDFS output format.
  • feat: mount config-map and secret to container.
  • feat: support java11.
  • feat: support partition concurrent compute.
  • refact: abstract computer-api from computer-core.
  • refact: optimize data receiving.
  • fix: release file descriptor after input and compute.
  • doc: add operator deploy readme.
  • feat: prepare for Apache release.

Toolchain (loader, tools, hubble)

  • Support selecting which data to import from relational databases via SQL in Loader.
  • Support importing data from Spark in Loader (including via JDBC).
  • Added a Flink-CDC mode to Loader.
  • Fixed an NPE when Loader imports ORC-format data.
  • Fixed Loader not caching the Schema in Spark/Flink mode.
  • Fixed a Json deserialization issue in Loader.
  • Fixed Jackson version conflicts and dependency issues in Loader.
  • Support a UI for Hubble's advanced algorithm interfaces.
  • Support syntax highlighting of Gremlin statements in Hubble.
  • Support deploying Hubble with a Docker image.
  • Support outputting build logs.
  • Fixed the port input box issue in Hubble.
  • Adapted to Apache project releases.

Commons (common, rpc)

  • Support assert-throws methods returning Future.
  • Added Cnm and Anm methods to CollectionUtil.
  • Support user-defined content-type.
  • Adapted to Apache project releases.

Release Details

For more detailed version changes, see the links of the individual sub-repositories:

9.2 - HugeGraph 0.11 Release Notes

API & Client

Feature updates

  • Support the fusiform similarity algorithm (hugegraph #671, hugegraph-client #62)
  • Support recording the creation time when creating a Schema (hugegraph #746, hugegraph-client #69)
  • Support property-based range queries of vertices/edges in the RESTful API (hugegraph #782, hugegraph-client #73)
  • Support TTL for vertices and edges (hugegraph #794, hugegraph-client #83)
  • Unified the date format of the RESTful API Server and Gremlin Server to strings (hugegraph #1014, hugegraph-client #82)
  • Support 5 traversal algorithms: same neighbors, Jaccard similarity, all shortest paths, weighted shortest path and single-source shortest path (hugegraph #936, hugegraph-client #80)
  • Support user authentication and fine-grained access control (hugegraph #749, hugegraph #985, hugegraph-client #81)
  • Support vertex counting in the traversal APIs (hugegraph #995, hugegraph-client #84)
  • Support the HTTPS protocol (hugegraph #1036, hugegraph-client #85)
  • Support controlling whether to rebuild the index when creating an index (hugegraph #1106, hugegraph-client #91)
  • Support 5 traversal algorithms: customized kout/kneighbor, multi-vertex shortest path, most similar Jaccard vertices and template paths (hugegraph #1174, hugegraph-client #100, hugegraph-client #106)

Internal changes

  • Fail fast when HugeGraphServer hits an exception at startup (hugegraph #748)
  • Defined a LOADING mode to speed up imports (hugegraph-client #101)

Core

Feature updates

  • Support paged queries for multi-property vertices/edges (hugegraph #759)
  • Performance optimization of aggregate operations (hugegraph #813)
  • Support off-heap cache (hugegraph #846)
  • Support property permission management (hugegraph #971)
  • Support sharding for the MySQL and Memory backends and improved the HBase sharding method (hugegraph #974)
  • Support the Raft-based distributed consensus protocol (hugegraph #1020)
  • Support metadata copying (hugegraph #1024)
  • Support cluster-wide async task scheduling (hugegraph #1030)
  • Support printing heap info on OOM (hugegraph #1093)
  • Support updating caches from the Raft state machine (hugegraph #1119)
  • Support Raft node management (hugegraph #1137)
  • Support rate limiting of query requests (hugegraph #1158)
  • Support default values for vertex/edge properties (hugegraph #1182)
  • Support the pluggable query-acceleration mechanism RamTable (hugegraph #1183)
  • Support setting an index to the INVALID state when rebuilding fails (hugegraph #1226)
  • Support Kerberos authentication for HBase (hugegraph #1234)

Bug fixes

  • Fixed the start-hugegraph.sh timeout when configuring permissions (hugegraph #761)
  • Fixed MySQL connection failures when executing gremlin in studio (hugegraph #765)
  • Fixed a TableNotFoundException during truncate on the HBase backend (hugegraph #771)
  • Fixed rate-limit config values not being checked (hugegraph #773)
  • Fixed inaccurate exception messages returned for unique indexes (Unique Index) (hugegraph #797)
  • Fixed an OOM when executing g.V().hasLabel().count() on the RocksDB backend (hugegraph-798)
  • Fixed a wrong paging setting in traverseByLabel() (hugegraph #805)
  • Fixed edges being wrongly created when updating edge properties by ID and SortKeys (hugegraph #819)
  • Fixed an overwrite issue in some storage backends (hugegraph #820)
  • Fixed failed async tasks being uncancelable after being saved (hugegraph #827)
  • Fixed the MySQL backend being unable to open the database in SSL mode (hugegraph #842)
  • Fixed offset not working in index queries (hugegraph #866)
  • Fixed a security issue leaking absolute paths in Gremlin (hugegraph #871)
  • Fixed an NPE in the reconnectIfNeeded() method (hugegraph #874)
  • Fixed the PostgreSQL JDBC_URL config missing the "/" prefix (hugegraph #891)
  • Fixed RocksDB memory statistics (hugegraph #937)
  • Fixed ring detection being unable to detect two-vertex rings (hugegraph #939)
  • Fixed counters not being cleaned up after the fusiform similarity algorithm finishes (hugegraph #947)
  • Fixed gremlin-console not working (hugegraph #1027)
  • Fixed conditionally filtering adjacent edges with a limit (hugegraph #1057)
  • Fixed the auto-commit issue when MySQL executes SQL (hugegraph #1064)
  • Fixed a timeout hitting the 800k limit when querying via two indexes (hugegraph #1088)
  • Fixed a wrong range-index check rule (hugegraph #1090)
  • Fixed an error when deleting leftover indexes (hugegraph #1101)
  • Fixed closing a transaction getting stuck when the current thread is a task-worker (hugegraph #1111)
  • Fixed a NoSuchElementException in shortest-path queries (hugegraph #1116)
  • Fixed async tasks sometimes being submitted twice (hugegraph #1130)
  • Fixed deserialization of very small date values (hugegraph #1152)
  • Fixed traversal algorithms not checking whether the start or end vertex exists (hugegraph #1156)
  • Fixed an argument-parsing error in bin/start-hugegraph.sh (hugegraph #1178)
  • Fixed log4j error messages when running gremlin-console (hugegraph #1229)

Internal changes

  • Lazily check non-null properties (hugegraph #756)
  • Added viewing cluster node info for storage backends (hugegraph #821)
  • Added advanced compaction config options for the RocksDB backend (hugegraph #825)
  • Added the vertex.check_adjacent_vertex_exist config option (hugegraph #837)
  • Check that primary-key properties must not be empty (hugegraph #847)
  • Added validity checks for graph names (hugegraph #854)
  • Added queries for unexpected SysProps (hugegraph #862)
  • Use disableTableAsync to speed up data clearing on the HBase backend (hugegraph #868)
  • Allow the Gremlin environment to trigger system async tasks (hugegraph #892)
  • Encode the type ID in character-type indexes (hugegraph #894)
  • The security module allows Cassandra to create threads on demand when executing CQL (hugegraph #896)
  • Set GremlinServer's default channel to WsAndHttpChannelizer (hugegraph #903)
  • Export the Direction and traversal algorithm classes to the Gremlin environment (hugegraph #904)
  • Added a vertex property cache limit (hugegraph #941, hugegraph #942)
  • Optimized reads of list properties (hugegraph #943)
  • Added L1 and L2 cache configuration (hugegraph #945)
  • Optimized the EdgeId.asString() method (hugegraph #946)
  • Skip the backend store query when a vertex has no properties (hugegraph #951)
  • Throw ExistedException when creating metadata with the same name but different properties (hugegraph #1009)
  • Close transactions on demand after querying vertices and edges (hugegraph #1039)
  • Clear caches when a graph is closed (hugegraph #1078)
  • Lock when closing a graph to avoid race conditions (hugegraph #1104)
  • Optimized vertex/edge deletion efficiency: skip the query when Label+ID is provided (hugegraph #1150)
  • Use IntObjectMap to improve metadata cache efficiency (hugegraph #1185)
  • Use a single Raft node to manage the current three stores (hugegraph #1187)
  • Release the index-deletion lock early when rebuilding indexes (hugegraph #1193)
  • Use LZ4 instead of Gzip when compressing and decompressing async task results (hugegraph #1198)
  • Made RocksDB's delete-CF operation exclusive to avoid races (hugegraph #1202)
  • Changed the CSV reporter's output directory and disabled output by default (hugegraph #1233)

Others

  • Cherry-picked the bug-fix code of version 0.10.4 (hugegraph #785, hugegraph #1047)
  • Upgraded Jackson to version 2.10.2 (hugegraph #859)
  • Added thanks to Titan in the Thanks section (hugegraph #906)
  • Adapted the TinkerPop tests (hugegraph #1048)
  • Changed the minimum allowed log level to TRACE (hugegraph #1050)
  • Added an IDEA format configuration file (hugegraph #1060)
  • Fixed too many error messages in Travis CI (hugegraph #1098)

Loader

Feature updates

  • Support reading Hadoop configuration files (hugegraph-loader #105)
  • Support specifying the timezone of Date properties (hugegraph-loader #107)
  • Support importing data from ORC compressed files (hugegraph-loader #113)
  • Support setting whether to check vertices on single-edge insertion (hugegraph-loader #117)
  • Support importing data from Snappy-raw compressed files (hugegraph-loader #119)
  • Support mapping-file version 2.0 (hugegraph-loader #121)
  • Added a command-line tool converting utf8-bom to utf8 (hugegraph-loader #128)
  • Support clearing metadata before an import task starts (hugegraph-loader #140)
  • Support storing the id column as a property (hugegraph-loader #143)
  • Support configuring a username for import tasks (hugegraph-loader #146)
  • Support importing data from Parquet files (hugegraph-loader #153)
  • Support specifying the maximum number of lines to read from a file (hugegraph-loader #159)
  • Support the HTTPS protocol (hugegraph-loader #161)
  • Support timestamps as a date format (hugegraph-loader #164)

Bug fixes

  • Fixed the row retainAll() method not modifying the names and values arrays (hugegraph-loader #110)
  • Fixed an NPE when reloading JSON files (hugegraph-loader #112)

Internal changes

  • Print insert errors only once to avoid excessive error messages (hugegraph-loader #118)
  • Split the threads for batch insertion and single insertion (hugegraph-loader #120)
  • Switched the CSV parser to SimpleFlatMapper (hugegraph-loader #124)
  • Encode number and date fields in primary keys (hugegraph-loader #136)
  • Ensure primary-key columns are valid or mapped (hugegraph-loader #141)
  • Skip vertices whose primary-key properties are all empty (hugegraph-loader #166)
  • Set LOADING mode before an import task starts and restore the previous mode afterwards (hugegraph-loader #169)
  • Improved the implementation of stopping import tasks (hugegraph-loader #170)

Tools

Feature updates

  • Support backup for the Memory backend (hugegraph-tools #53)
  • Support the HTTPS protocol (hugegraph-tools #58)
  • Support configuring username and password for the migrate subcommand (hugegraph-tools #61)
  • Support specifying types and filtering properties when backing up vertices and edges (hugegraph-tools #63)

Bug fixes

  • Fixed an NPE in the dump command (hugegraph-tools #49)

Internal changes

  • Clear shard files before backup/dump (hugegraph-tools #53)
  • Improved HugeGraph-tools error messages (hugegraph-tools #67)
  • Improved the migrate subcommand by removing unsupported sub-configs (hugegraph-tools #68)

9.3 - HugeGraph 0.12 Release Notes

API & Client

Interface updates

  • Support connecting to the graph service in https + auth mode (hugegraph-client #109 #110)
  • Unified the parameter names and default values of OLTP interfaces such as kout/kneighbor (hugegraph-client #122 #123)
  • Support full-text property search via P.textcontains() in the RESTful interface (hugegraph #1312)
  • Added the graph_read_mode API to switch between OLTP and OLAP read modes (hugegraph #1332)
  • Support list/set typed aggregate properties (hugegraph #1332)
  • Added the METRICS resource type to the auth interface (hugegraph #1355, hugegraph-client #114)
  • Added the SCHEMA resource type to the auth interface (hugegraph #1362, hugegraph-client #117)
  • Added a manual compact API supporting the rocksdb/cassandra/hbase backends (hugegraph #1378)
  • Added login/logout APIs to the auth interface to issue and revoke Tokens (hugegraph #1500, hugegraph-client #125)
  • Added the project API to the auth interface (hugegraph #1504, hugegraph-client #127)
  • Added an OLAP write-back interface supporting the cassandra/rocksdb backends (hugegraph #1506, hugegraph-client #129)
  • Added an API returning all Schemas of a graph (hugegraph #1567, hugegraph-client #134)
  • Changed the HTTP return code of the property key create and update APIs to 202 (hugegraph #1584)
  • Enhanced Text.contains() to support 3 formats: "word", "(word)", "(word1|word2|word3)" (hugegraph #1652)
  • Unified the behavior of special characters in properties (hugegraph #1670 #1684)
  • Support dynamically creating, cloning and dropping graph instances (hugegraph-client #135)
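The three Text.contains() formats listed above can be read as: a plain word, a single parenthesized word, and a parenthesized OR of several words. A rough Python sketch of that matching rule over a set of words (our own illustration under that reading, not the server's implementation):

```python
def text_contains(query: str, words: set) -> bool:
    """Match 'word', '(word)' or '(word1|word2|word3)' against a word set."""
    if query.startswith("(") and query.endswith(")"):
        # "(word1|word2|word3)": match if any listed word is present
        return any(w in words for w in query[1:-1].split("|"))
    return query in words
```

For example, text_contains("(w1|w2)", {"w2", "w3"}) is True, while the plain form text_contains("w1", {"w2"}) is False.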

Other changes

  • Fixed the IndexLabelV56 id being lost when restoring an index label (hugegraph-client #118)
  • Added a name() method to the Edge class (hugegraph-client #121)

Core & Server

Feature updates

  • Support dynamically creating graph instances (hugegraph #1065)
  • Support calling OLTP algorithms from Gremlin (hugegraph #1289)
  • Support multiple clusters sharing one graph auth service to share permission info (hugegraph #1350)
  • Support cache synchronization across multiple nodes (hugegraph #1357)
  • Support native collections in OLTP algorithms to reduce GC pressure and improve performance (hugegraph #1409)
  • Support taking or restoring snapshots for newly added Raft nodes (hugegraph #1439)
  • Support secondary indexes (Secondary Index) on collection properties (hugegraph #1474)
  • Support audit logs, with compression and rate limiting (hugegraph #1492 #1493)
  • Support high-performance parallel lock-free native collections in OLTP algorithms (hugegraph #1552)

Bug fixes

  • Fixed an NPE in the weighted shortest path algorithm (hugegraph #1250)
  • Added a whitelist of safe Raft-related operations (hugegraph #1257)
  • Fixed RocksDB instances not being closed properly (hugegraph #1264)
  • Explicitly trigger a Raft Snapshot write after a truncate operation (hugegraph #1275)
  • Fixed the Raft Leader not updating its cache when receiving a forwarded request from a Follower (hugegraph #1279)
  • Fixed unstable results of the weighted shortest path algorithm (hugegraph #1280)
  • Fixed the limit parameter not working in the rays algorithm (hugegraph #1284)
  • Fixed the capacity parameter not being checked in the neighborrank algorithm (hugegraph #1290)
  • Fixed PostgreSQL initialization failing when no database with the same name as the user exists (hugegraph #1293)
  • Fixed HBase backend initialization failing with Kerberos enabled (hugegraph #1294)
  • Fixed a wrong shard-end judgment on the HBase/RocksDB backends (hugegraph #1306)
  • Fixed the weighted shortest path algorithm not checking that the target vertex exists (hugegraph #1307)
  • Fixed non-String typed ids in the personalrank/neighborrank algorithms (hugegraph #1310)
  • Check that only the master node may schedule gremlin jobs (hugegraph #1314)
  • Fixed partially inaccurate results of g.V().hasLabel().limit(n) caused by index covering (hugegraph #1316)
  • Fixed a NaN error in the jaccardsimilarity algorithm when the union is empty (hugegraph #1324)
  • Fixed Schema data not syncing across nodes when operated on a Raft Follower (hugegraph #1325)
  • Fixed TTL not taking effect because a tx was not closed (hugegraph #1330)
  • Fixed exception handling when a gremlin job result exceeds the Cassandra limit but is below the task limit (hugegraph #1334)
  • Check that the graph must exist for the auth-delete and role-get API operations (hugegraph #1338)
  • Fixed abnormal serialization of async task results containing path/tree (hugegraph #1351)
  • Fixed an NPE when initializing the admin user (hugegraph #1360)
  • Fixed atomicity of async task operations, ensuring update/get fields and re-schedule are atomic (hugegraph #1361)
  • Fixed the NONE resource type in auth (hugegraph #1362)
  • Fixed a SecurityException on truncate and admin info loss with auth enabled (hugegraph #1365)
  • Fixed auth exceptions being ignored when parsing data with auth enabled (hugegraph #1380)
  • Fixed AuthManager trying to connect to other nodes during initialization (hugegraph #1381)
  • Fixed base64 decoding errors caused by specific shard info (hugegraph #1383)
  • Fixed an empty creator when verifying permissions with consistent-hash LB and auth enabled (hugegraph #1385)
  • Improved auth so the VAR resource no longer depends on the VERTEX resource (hugegraph #1386)
  • With auth enabled, Schema operations now depend only on the specific resources (hugegraph #1387)
  • With auth enabled, some operations now depend on the ANY resource instead of the STATUS resource (hugegraph #1391)
  • With auth enabled, forbid initializing an empty admin password (hugegraph #1400)
  • Check that username/password must not be empty when creating a user (hugegraph #1402)
  • Fixed PrimaryKey or SortKey being settable as nullable properties when updating a Label (hugegraph #1406)
  • Fixed ScyllaDB losing paged results (hugegraph #1407)
  • Fixed the weight property being force-cast to double in the weighted shortest path algorithm (hugegraph #1432)
  • Unified the degree parameter naming in OLTP algorithms (hugegraph #1433)
  • Fixed the fusiformsimilarity algorithm returning all vertices when similars is empty (hugegraph #1434)
  • Improved the paths algorithm to return an empty path when start and target are the same (hugegraph #1435)
  • Changed the default kout/kneighbor limit from 10 to 10000000 (hugegraph #1436)
  • Fixed '+' in paging info being URL-encoded as a space (hugegraph #1437)
  • Improved the error message of the edge-update interface (hugegraph #1443)
  • Fixed the kout degree not applying across all labels (hugegraph #1459)
  • Improved kneighbor/kout so the start vertex may not appear in the result set (hugegraph #1459 #1463)
  • Unified the behavior of the Get and Post versions of kout/kneighbor (hugegraph #1470)
  • Improved the error message for mismatched vertex types when creating an edge (hugegraph #1477)
  • Fixed leftover Range Index entries (hugegraph #1498)
  • Fixed auth operations not invalidating the cache (hugegraph #1528)
  • Changed the default sameneighbor limit from 10 to 10000000 (hugegraph #1530)
  • Fixed the clear API calling create snapshot on all backends when it should not (hugegraph #1532)
  • Fixed creating an Index Label blocking in loading mode (hugegraph #1548)
  • Fixed adding a graph to or removing a graph from a project (hugegraph #1562)
  • Improved some error messages of auth operations (hugegraph #1563)
  • Support setting float properties to Infinity/NaN values (hugegraph #1578)
  • Fixed the quorum read issue when Raft safe_read is enabled (hugegraph #1618)
  • Fixed the unit of the token expiration time config (hugegraph #1625)
  • Fixed a MySQL Statement resource leak (hugegraph #1627)
  • Fixed Schema.getIndexLabel returning no data under race conditions (hugegraph #1629)
  • Fixed HugeVertex4Insert being unserializable (hugegraph #1630)
  • Fixed the MySQL count Statement not being closed (hugegraph #1640)
  • Fixed state desynchronization when deleting an Index Label throws an exception (hugegraph #1642)
  • Fixed statements not being closed when MySQL gremlin execution times out (hugegraph #1643)
  • Improved Search Index compatibility with the special Unicode characters \u0000 to \u0003 (hugegraph #1659)
  • Fixed Char not being converted to String, introduced by #1659 (hugegraph #1664)
  • Fixed abnormal results of has() + within() queries (hugegraph #1680)
  • Upgraded Log4j to version 2.17 to fix security vulnerabilities (hugegraph #1686 #1698 #1702)
  • Fixed an NPE in HBase shard scans when startkey contains an empty string (hugegraph #1691)
  • Fixed performance degradation of the paths algorithm on deep ring traversals (hugegraph #1694)
  • Improved the default parameters and error checking of the personalrank algorithm (hugegraph #1695)
  • Fixed the P.within condition not working in the RESTful interface (hugegraph #1704)
  • Fixed being unable to dynamically create graphs with auth enabled (hugegraph #1708)

Configuration changes:

  • Shared naming of SSL-related config options (hugegraph #1260)
  • Support the RocksDB config option rocksdb.level_compaction_dynamic_level_bytes (hugegraph #1262)
  • Removed the RESTful Server protocol config option restserver.protocol; the scheme is now extracted from the URL automatically (hugegraph #1272)
  • Added the PostgreSQL config option jdbc.postgresql.connect_database (hugegraph #1293)
  • Added the config option vertex.encode_primary_key_number controlling whether vertex primary keys are encoded (hugegraph #1323)
  • Added the config option query.optimize_aggregate_by_index controlling index optimization for aggregate queries (hugegraph #1549)
  • Changed the default cache_type from l1 to l2 (hugegraph #1681)
  • Added the JDBC forced-reconnect config option jdbc.forced_auto_reconnect (hugegraph #1710)

Other changes

  • Added a default SSL Certificate file (hugegraph #1254)
  • OLTP parallel requests share a thread pool instead of one thread pool per request (hugegraph #1258)
  • Fixed the Example issue (hugegraph #1308)
  • Use jraft version 1.3.5 (hugegraph #1313)
  • Disable RocksDB's WAL when Raft mode is enabled (hugegraph #1318)
  • Use TarLz4Util to improve Snapshot compression performance (hugegraph #1336)
  • Bumped the store version because property key gained read frequency (hugegraph #1341)
  • The vertex/edge Get APIs use the queryVertex/queryEdge methods instead of the iterator methods (hugegraph #1345)
  • Support BFS-optimized multi-hop queries (hugegraph #1359)
  • Improved the query performance impact of RocksDB deleteRange() (hugegraph #1375)
  • Fixed the travis-ci "cannot find symbol Namifiable" issue (hugegraph #1376)
  • Ensure the RocksDB snapshot disk matches the one specified by data path (hugegraph #1392)
  • Fixed inaccurate free_memory calculation on MacOS (hugegraph #1396)
  • Added a Raft onBusy callback to cooperate with rate limiting (hugegraph #1401)
  • Upgraded netty-all from version 4.1.13.Final to 4.1.42.Final (hugegraph #1403)
  • Support pausing the TaskScheduler when set to loading mode (hugegraph #1414)
  • Fixed the raft-tools script (hugegraph #1416)
  • Fixed the license params issue (hugegraph #1420)
  • Improved auth-log writing performance via batch flush & async write (hugegraph #1448)
  • Added logging of the MySQL connection URL (hugegraph #1451)
  • Improved user-info verification performance (hugegraph #1460)
  • Fixed a TTL error caused by the start time (hugegraph #1478)
  • Support hot-reloading the log configuration and compressing audit logs (hugegraph #1492)
  • Support per-user rate limiting of audit logs (hugegraph #1493)
  • RamCache supports user-defined expiration times (hugegraph #1494)
  • Cache the login role on the auth client side to avoid repeated RPC calls (hugegraph #1507)
  • Fixed IdSet.contains() not overriding AbstractCollection.contains() (hugegraph #1511)
  • Fixed missing rollback when commitPartOfEdgeDeletions() fails (hugegraph #1513)
  • Improved Cache metrics performance (hugegraph #1515)
  • Log exceptions when license operation errors occur (hugegraph #1522)
  • Improved the SimilarsMap implementation (hugegraph #1523)
  • Use a tokenless way to update coverage (hugegraph #1529)
  • Improved the code of the project update interface (hugegraph #1537)
  • Allow accessing GRAPH_STORE from option() (hugegraph #1546)
  • Optimized kout/kneighbor count queries to avoid copying collections (hugegraph #1550)
  • Optimized shortestpath traversal to start from the end with less data (hugegraph #1569)
  • Improved the allowed-keys hint of the rocksdb.data_disks config option (hugegraph #1585)
  • Optimized the performance of the id2code method in OLTP traversals for number ids (hugegraph #1623)
  • Optimized HugeElement.getProperties() to return Collection<Property> (hugegraph #1624)
  • Added the APACHE PROPOSAL file (hugegraph #1644)
  • Improved the close-tx flow (hugegraph #1655)
  • Catch all exception types for MySQL close on reset() (hugegraph #1661)
  • Improved the OLAP property module code (hugegraph #1675)
  • Improved the execution performance of the query module (hugegraph #1711)

Loader

  • 支持导入 Parquet 格式文件(hugegraph-loader #174)
  • 支持 HDFS Kerberos 权限验证(hugegraph-loader #176)
  • 支持 HTTPS 协议连接到服务端导入数据(hugegraph-loader #183)
  • 修复 trust store file 路径问题(hugegraph-loader #186)
  • 处理 loading mode 重置的异常(hugegraph-loader #187)
  • 增加在插入数据时对非空属性的检查(hugegraph-loader #190)
  • 修复客户端与服务端时区不同导致的时间判断问题(hugegraph-loader #192)
  • 优化数据解析性能(hugegraph-loader #194)
  • 当用户指定了文件头时,检查其必须不为空(hugegraph-loader #195)
  • 修复示例程序中 MySQL struct.json 格式问题(hugegraph-loader #198)
  • 修复顶点边导入速度不精确的问题(hugegraph-loader #200 #205)
  • 当导入启用 check-vertex 时,确保先导入顶点再导入边(hugegraph-loader #206)
  • 修复边 Json 数据导入格式不统一时数组溢出的问题(hugegraph-loader #211)
  • 修复因边 mapping 文件不存在导致的 NPE 问题(hugegraph-loader #213)
  • 修复读取时间可能出现负数的问题(hugegraph-loader #215)
  • 改进目录文件的日志打印(hugegraph-loader #223)
  • 改进 loader 的 Schema 处理流程(hugegraph-loader #230)

Tools

  • 支持 HTTPS 协议(hugegraph-tools #71)
  • 移除 --protocol 参数,直接从 URL 中自动提取(hugegraph-tools #72)
  • 支持将数据 dump 到 HDFS 文件系统(hugegraph-tools #73)
  • 修复 trust store file 路径问题(hugegraph-tools #75)
  • 支持权限信息的备份恢复(hugegraph-tools #76)
  • 支持无参数的 Printer 打印(hugegraph-tools #79)
  • 修复 MacOS free_memory 计算问题(hugegraph-tools #82)
  • 支持备份恢复时指定线程数(hugegraph-tools #83)
  • 支持动态创建图、克隆图、删除图等命令(hugegraph-tools #95)

9.4 - HugeGraph 0.10 Release Notes

API & Client

功能更新

  • 支持 HugeGraphServer 服务端内存紧张时返回错误拒绝请求 (hugegraph #476)
  • 支持 API 白名单和 HugeGraphServer GC 频率控制功能 (hugegraph #522)
  • 支持 Rings API 的 source_in_ring 参数 (hugegraph #528,hugegraph-client #48)
  • 支持批量按策略更新属性接口 (hugegraph #493,hugegraph-client #46)
  • 支持 Shard Index 前缀与范围检索索引 (hugegraph #574,hugegraph-client #56)
  • 支持顶点的 UUID ID 类型 (hugegraph #618,hugegraph-client #59)
  • 支持唯一性约束索引(Unique Index) (hugegraph #636,hugegraph-client #60)
  • 支持 API 请求超时功能 (hugegraph #674)
  • 支持根据名称列表查询 schema (hugegraph #686,hugegraph-client #63)
  • 支持按分页方式获取异步任务 (hugegraph #720)

内部修改

  • 保持 traverser 的参数与 server 端一致 (hugegraph-client #44)
  • 支持在 Shard 内使用分页方式遍历顶点或者边的方法 (hugegraph-client #47)
  • 支持 Gremlin 查询结果持有 GraphManager (hugegraph-client #49)
  • 改进 RestClient 的连接参数 (hugegraph-client #52)
  • 增加 Date 类型属性的测试 (hugegraph-client #55)
  • 适配 HugeGremlinException 异常 (hugegraph-client #57)
  • 增加新功能的版本匹配检查 (hugegraph-client #66)
  • 适配 UUID 的序列化 (hugegraph-client #67)

Core

功能更新

  • 支持 PostgreSQL 和 CockroachDB 存储后端 (hugegraph #484)
  • 支持负数索引 (hugegraph #513)
  • 支持边的 Vertex + SortKeys 的前缀范围查询 (hugegraph #574)
  • 支持顶点的邻接边按分页方式查询 (hugegraph #659)
  • 禁止通过 Gremlin 进行敏感操作 (hugegraph #176)
  • 支持 Lic 校验功能 (hugegraph #645)
  • 支持 Search Index 查询结果按匹配度排序的功能 (hugegraph #653)
  • 升级 tinkerpop 至版本 3.4.3 (hugegraph #648)

BUG修复

  • 修复按分页方式查询边时剩余数目(remaining count)错误 (hugegraph #515)
  • 修复清空后端时边缓存未清空的问题 (hugegraph #488)
  • 修复无法插入 List 类型的属性问题 (hugegraph #534)
  • 修复 PostgreSQL 后端的 existDatabase(), clearBackend() 和 rollback()功能 (hugegraph #531)
  • 修复程序关闭时 HugeGraphServer 和 GremlinServer 残留问题 (hugegraph #554)
  • 修复在 LockTable 中重复抓锁的问题 (hugegraph #566)
  • 修复从 Edge 中获取的 Vertex 没有属性的问题 (hugegraph #604)
  • 修复交叉关闭 RocksDB 的连接池问题 (hugegraph #598)
  • 修复在超级点查询时 limit 失效问题 (hugegraph #607)
  • 修复使用 Equal 条件和分页的情况下查询 Range Index 只返回第一页的问题 (hugegraph #614)
  • 修复查询 limit 在删除部分数据后失效的问题 (hugegraph #610)
  • 修复 Example1 的查询错误 (hugegraph #638)
  • 修复 HBase 的批量提交部分错误问题 (hugegraph #634)
  • 修复索引搜索时 compareNumber() 方法的空指针问题 (hugegraph #629)
  • 修复更新属性值为已经删除的顶点或边的属性时失败问题 (hugegraph #679)
  • 修复 system 类型残留索引无法清除问题 (hugegraph #675)
  • 修复 HBase 在 Metrics 信息中的单位问题 (hugegraph #713)
  • 修复存储后端未初始化问题 (hugegraph #708)
  • 修复按 Label 删除边时导致的 IN 边残留问题 (hugegraph #727)
  • 修复 init-store 会生成多份 backend_info 问题 (hugegraph #723)

内部修改

  • 抑制因 PostgreSQL 后端 database 不存在时的报警信息 (hugegraph #527)
  • 删除 PostgreSQL 后端的无用配置项 (hugegraph #533)
  • 改进错误信息中的 HugeType 为易读字符串 (hugegraph #546)
  • 增加 jdbc.storage_engine 配置项指定存储引擎 (hugegraph #555)
  • 增加使用后端链接时按需重连功能 (hugegraph #562)
  • 避免打印空的查询条件 (hugegraph #583)
  • 缩减 Variable 的字符串长度 (hugegraph #581)
  • 增加 RocksDB 后端的 cache 配置项 (hugegraph #567)
  • 改进异步任务的异常信息 (hugegraph #596)
  • 将 Range Index 拆分成 INT,LONG,FLOAT,DOUBLE 四个表存储 (hugegraph #574)
  • 改进顶点和边 API 的 Metrics 名字 (hugegraph #631)
  • 增加 G1GC 和 GC Log 的配置项 (hugegraph #616)
  • 拆分顶点和边的 Label Index 表 (hugegraph #635)
  • 减少顶点和边的属性存储空间 (hugegraph #650)
  • 支持对 Secondary Index 和 Primary Key 中的数字进行编码 (hugegraph #676)
  • 减少顶点和边的 ID 存储空间 (hugegraph #661)
  • 支持 Cassandra 后端存储的二进制序列化存储 (hugegraph #680)
  • 放松对最小内存的限制 (hugegraph #689)
  • 修复 RocksDB 后端批量写时的 Invalid column family 问题 (hugegraph #701)
  • 更新异步任务状态时删除残留索引 (hugegraph #719)
  • 删除 ScyllaDB 的 Label Index 表 (hugegraph #717)
  • 启动时使用多线程方式打开 RocksDB 后端存储多个数据目录 (hugegraph #721)
  • RocksDB 版本从 v5.17.2 升级至 v6.3.6 (hugegraph #722)

其它

  • 增加 API tests 到 codecov 统计中 (hugegraph #711)
  • 改进配置文件的默认配置项 (hugegraph #575)
  • 改进 README 中的致谢信息 (hugegraph #548)

Loader

功能更新

  • 支持 JSON 数据源的 selected 字段 (hugegraph-loader #62)
  • 支持定制化 List 元素之间的分隔符 (hugegraph-loader #66)
  • 支持值映射 (hugegraph-loader #67)
  • 支持通过文件后缀过滤文件 (hugegraph-loader #82)
  • 支持对导入进度进行记录和断点续传 (hugegraph-loader #70,hugegraph-loader #87)
  • 支持从不同的关系型数据库中读取 Header 信息 (hugegraph-loader #79)
  • 支持属性为 Unsigned Long 类型值 (hugegraph-loader #91)
  • 支持顶点的 UUID ID 类型 (hugegraph-loader #98)
  • 支持按照策略批量更新属性 (hugegraph-loader #97)

BUG修复

  • 修复 nullable key 在 mapping field 不工作的问题 (hugegraph-loader #64)
  • 修复 Parse Exception 无法捕获的问题 (hugegraph-loader #74)
  • 修复在等待异步任务完成时获取信号量数目错误的问题 (hugegraph-loader #86)
  • 修复空表时 hasNext() 返回 true 的问题 (hugegraph-loader #90)
  • 修复布尔值解析错误问题 (hugegraph-loader #92)

内部修改

  • 增加 HTTP 连接参数 (hugegraph-loader #81)
  • 改进导入完成的总结信息 (hugegraph-loader #80)
  • 改进一行数据缺少列或者有多余列的处理逻辑 (hugegraph-loader #93)

Tools

功能更新

  • 支持 0.8 版本 server 备份的数据恢复至 0.9 版本的 server 中 (hugegraph-tools #34)
  • 增加 timeout 全局参数 (hugegraph-tools #44)
  • 增加 migrate 子命令支持迁移图 (hugegraph-tools #45)

BUG修复

  • 修复 dump 命令不支持 split size 参数的问题 (hugegraph-tools #32)

内部修改

  • 删除 Hadoop 对 Jersey 1.19的依赖 (hugegraph-tools #31)
  • 优化子命令在 help 信息中的排序 (hugegraph-tools #37)
  • 使用 log4j2 清除 log4j 的警告信息 (hugegraph-tools #39)

9.5 - HugeGraph 0.9 Release Notes

API & Client

功能更新

  • 增加 personal rank API 和 neighbor rank API (hugegraph #274)
  • Shortest path API 增加 skip_degree 参数跳过超级点(hugegraph #433,hugegraph-client #42)
  • vertex/edge 的 scan API 支持分页机制 (hugegraph #428,hugegraph-client #35)
  • VertexAPI 使用简化的属性序列化器 (hugegraph #332,hugegraph-client #37)
  • 增加 customized paths API 和 customized crosspoints API (hugegraph #306,hugegraph-client #40)
  • 在 server 端所有线程忙时返回503错误 (hugegraph #343)
  • 保持 API 的 depth 和 degree 参数一致 (hugegraph #252,hugegraph-client #30)

BUG修复

  • 增加属性的时候验证 Date 而非 Timestamp 的值 (hugegraph-client #26)

内部修改

  • RestClient 支持重用连接 (hugegraph-client #33)
  • 使用 JsonUtil 替换冗余的 ObjectMapper (hugegraph-client #41)
  • Edge 直接引用 Vertex 使得批量插入更友好 (hugegraph-client #29)
  • 使用 JaCoCo 替换 Cobertura 统计代码覆盖率 (hugegraph-client #39)
  • 改进 Shard 反序列化机制 (hugegraph-client #34)

Core

功能更新

  • 支持 Cassandra 的 NetworkTopologyStrategy (hugegraph #448)
  • 元数据删除和索引重建使用分页机制 (hugegraph #417)
  • 支持将 HugeGraphServer 作为系统服务 (hugegraph #170)
  • 单一索引查询支持分页机制 (hugegraph #328)
  • 在初始化图库时支持定制化插件 (hugegraph #364)
  • 为HBase后端增加 hbase.zookeeper.znode.parent 配置项 (hugegraph #333)
  • 支持异步 Gremlin 任务的进度更新 (hugegraph #325)
  • 使用异步任务的方式删除残留索引 (hugegraph #285)
  • 支持按 sortKeys 范围查找功能 (hugegraph #271)

BUG修复

  • 修复二级索引删除时 Cassandra 后端的 batch 超过65535限制的问题 (hugegraph #386)
  • 修复 RocksDB 磁盘利用率的 metrics 不正确问题 (hugegraph #326)
  • 修复异步索引删除的错误 (hugegraph #336)
  • 修复 BackendSessionPool.close() 的竞争条件问题 (hugegraph #330)
  • 修复保留的系统 ID 不工作问题 (hugegraph #315)
  • 修复 cache 的 metrics 信息丢失问题 (hugegraph #321)
  • 修复使用 hasId() 按 id 查询顶点时不支持数字 id 问题 (hugegraph #302)
  • 修复重建索引时的 80w 限制问题和 Cassandra 后端的 batch 65535问题 (hugegraph #292)
  • 修复残留索引删除无法处理未展开(none-flatten)查询的问题 (hugegraph #281)

内部修改

  • 迭代器变量统一命名为 ‘iter’(hugegraph #438)
  • 增加 PageState.page() 方法统一获取分页信息接口 (hugegraph #429)
  • 为基于 mapdb 的内存版后端调整代码结构,增加测试用例 (hugegraph #357)
  • 支持代码覆盖率统计 (hugegraph #376)
  • 设置 tx capacity 的下限为 COMMIT_BATCH(默认为500) (hugegraph #379)
  • 增加 shutdown hook 来自动关闭线程池 (hugegraph #355)
  • PerfExample 的统计时间排除环境初始化时间 (hugegraph #329)
  • 改进 BinarySerializer 中的 schema 序列化 (hugegraph #316)
  • 避免对 primary key 的属性创建多余的索引 (hugegraph #317)
  • 限制 Gremlin 异步任务的名字小于256字节 (hugegraph #313)
  • 使用 multi-get 优化 HBase 后端的按 id 查询 (hugegraph #279)
  • 支持更多的日期数据类型 (hugegraph #274)
  • 修改 Cassandra 和 HBase 的 port 范围为(1,65535) (hugegraph #263)

其它

  • 增加 travis API 测试 (hugegraph #299)
  • 删除 rest-server.properties 中的 GremlinServer 相关的默认配置项 (hugegraph #290)

Loader

功能更新

  • 支持从 HDFS 和 关系型数据库导入数据 (hugegraph-loader #14)
  • 支持传递权限 token 参数(hugegraph-loader #46)
  • 支持通过 regex 指定要跳过的行 (hugegraph-loader #43)
  • 支持导入 TEXT 文件时的 List/Set 属性(hugegraph-loader #38)
  • 支持自定义的日期格式 (hugegraph-loader #28)
  • 支持从指定目录导入数据 (hugegraph-loader #33)
  • 支持忽略最后多余的列或者 null 值的列 (hugegraph-loader #23)

BUG修复

  • 修复 Example 问题(hugegraph-loader #57)
  • 修复当 vertex 是 customized ID 策略时边解析问题(hugegraph-loader #24)

内部修改

  • URL regex 改进 (hugegraph-loader #47)

Tools

功能更新

  • 支持海量数据备份和恢复到本地和 HDFS,并支持压缩 (hugegraph-tools #21)
  • 支持异步任务取消和清理功能 (hugegraph-tools #20)
  • 改进 graph-clear 命令的提示信息 (hugegraph-tools #23)

BUG修复

  • 修复 restore 命令总是使用 ‘hugegraph’ 作为目标图的问题,支持指定图 (hugegraph-tools #26)

9.6 - HugeGraph 0.8 Release Notes

API & Client

功能更新

  • 服务端增加 rays 和 rings 的 RESTful API(hugegraph #45)
  • 使创建 IndexLabel 返回异步任务(hugegraph #95,hugegraph-client #9)
  • 客户端增加恢复模式相关的 API(hugegraph-client #10)
  • 让 task-list API 不返回 task_input 和 task_result(hugegraph #143)
  • 增加取消异步任务的API(hugegraph #167,hugegraph-client #15)
  • 增加获取后端 metrics 的 API(hugegraph #155)

BUG修复

  • 分页获取时最后一页的 page 应该为 null 而非 “null”(hugegraph #168)
  • 分页迭代获取服务端已经没有下一页了应该停止获取(hugegraph-client #16)
  • 添加顶点使用自定义 Number Id 时报类型无法转换(hugegraph-client #21)

内部修改

  • 增加持续集成测试(hugegraph-client #19)

Core

功能更新

  • 取消异步任务通过 label 查询时 80w 的限制(hugegraph #93)
  • 允许 cardinality 为 set 时传入 Json List 形式的属性值(hugegraph #109)
  • 支持在恢复模式和合并模式来恢复图(hugegraph #114)
  • RocksDB 后端支持多个图指定为同一个存储目录(hugegraph #123)
  • 支持用户自定义权限认证器(hugegraph #133)
  • 当服务重启后重新开始未完成的任务(hugegraph #188)
  • 当顶点的 Id 策略为自定义时,检查是否已存在相同 Id 的顶点(hugegraph #189)

BUG修复

  • 增加对 HasContainer 的 predicate 不为 null 的检查(hugegraph #16)
  • RocksDB 后端由于数据目录和日志目录错误导致 init-store 失败(hugegraph #25)
  • 启动 hugegraph 时由于 logs 目录不存在导致提示超时但实际可访问(hugegraph #38)
  • ScyllaDB 后端遗漏注册顶点表(hugegraph #47)
  • 使用 hasLabel 查询传入多个 label 时失败(hugegraph #50)
  • Memory 后端未初始化 task 相关的 schema(hugegraph #100)
  • 当使用 hasLabel 查询时,如果元素数量超过 80w,即使加上 limit 也会报错(hugegraph #104)
  • 任务的在运行之后没有保存过状态(hugegraph #113)
  • 检查后端版本信息时直接强转 HugeGraphAuthProxy 为 HugeGraph(hugegraph #127)
  • 配置项 batch.max_vertices_per_batch 未生效(hugegraph #130)
  • 配置文件 rest-server.properties 有错误时 HugeGraphServer 启动不报错,但是无法访问(hugegraph #131)
  • MySQL 后端某个线程的提交对其他线程不可见(hugegraph #163)
  • 使用 union(branch) + has(date) 查询时提示 String 无法转换为 Date(hugegraph #181)
  • 使用 RocksDB 后端带 limit 查询顶点时会返回不完整的结果(hugegraph #197)
  • 提示其他线程无法操作 tx(hugegraph #204)

内部修改

  • 拆分 graph.cache_xx 配置项为 vertex.cache_xx 和 edge.cache_xx 两类(hugegraph #56)
  • 去除 hugegraph-dist 对 hugegraph-api 的依赖(hugegraph #61)
  • 优化集合取交集和取差集的操作(hugegraph #85)
  • 优化 transaction 的缓存处理和索引及 Id 查询(hugegraph #105)
  • 给各线程池的线程命名(hugegraph #124)
  • 增加并优化了一些 metrics 统计(hugegraph #138)
  • 增加了对未完成任务的 metrics 记录(hugegraph #141)
  • 让索引更新以分批方式提交,而不是全量提交(hugegraph #150)
  • 在添加顶点/边时一直持有 schema 的读锁,直到提交/回滚完成(hugegraph #180)
  • 加速 Tinkerpop 测试(hugegraph #19)
  • 修复 Tinkerpop 测试在 resource 目录下找不到 filter 文件的 BUG(hugegraph #26)
  • 开启 Tinkerpop 测试中 supportCustomIds 特性(hugegraph #69)
  • 持续集成中添加 HBase 后端的测试(hugegraph #41)
  • 避免持续集成的 deploy 脚本运行多次(hugegraph #170)
  • 修复 cache 单元测试跑不过的问题(hugegraph #177)
  • 持续集成中修改部分后端的存储为 tmpfs 以加快测试速度(hugegraph #206)

其它

  • 增加 issue 模版(hugegraph #42)
  • 增加 CONTRIBUTING 文件(hugegraph #59)

Loader

功能更新

  • 支持忽略源文件某些特定列(hugegraph-loader #2)
  • 支持导入 cardinality 为 Set 的属性数据(hugegraph-loader #10)
  • 单条插入也使用多个线程执行,解决了错误多时最后单条导入慢的问题(hugegraph-loader #12)

BUG修复

  • 导入过程可能统计出错(hugegraph-loader #4)
  • 顶点使用自定义 Number Id 导入出错(hugegraph-loader #6)
  • 顶点使用联合主键时导入出错(hugegraph-loader #18)

内部修改

  • 增加持续集成测试(hugegraph-loader #8)
  • 优化检测到文件不存在时的提示信息(hugegraph-loader #16)

Tools

功能更新

  • 增加 KgDumper (hugegraph-tools #6)
  • 支持在恢复模式和合并模式中恢复图(hugegraph-tools #9)

BUG修复

  • 脚本中的工具函数 get_ip 在系统未安装 ifconfig 时报错(hugegraph-tools #13)

9.7 - HugeGraph 0.7 Release Notes

API & Java Client

功能更新

  • 支持异步删除元数据和重建索引(HugeGraph-889)
  • 加入监控API,并与Gremlin的监控框架集成(HugeGraph-1273)

BUG修复

  • EdgeAPI更新属性时会将属性值也置为属性键(HugeGraph-81)
  • 当删除顶点或边时,如果id非法应该返回400错误而非404(HugeGraph-1337)

Core

功能更新

  • 支持HBase后端存储(HugeGraph-1280)
  • 增加异步API框架,耗时操作可通过调用异步API实现(HugeGraph-387)
  • 支持对长属性列建立二级索引,取消目前索引列长度256字节的限制(HugeGraph-1314)
  • 支持顶点属性的“创建或更新”操作(HugeGraph-1303)
  • 支持全文检索功能(HugeGraph-1322)
  • 支持数据库表的版本号检查(HugeGraph-1328)
  • 删除顶点时,如果遇到超级点会报错 "Batch too large" 或 "Batch 65535 statements"(HugeGraph-1354)
  • 支持异步删除元数据和重建索引(HugeGraph-889)
  • 支持异步长时间执行Gremlin任务(HugeGraph-889)

BUG修复

  • 防止超级点访问时查询过多下一层顶点而阻塞服务(HugeGraph-1302)
  • HBase初始化时报错连接已经关闭(HugeGraph-1318)
  • 按照date属性过滤顶点报错String无法转为Date(HugeGraph-1319)
  • 残留索引删除,对range索引的判断存在错误(HugeGraph-1291)
  • 支持组合索引后,残留索引清理没有考虑索引组合的情况(HugeGraph-1311)
  • 根据otherV的条件来删除边时,可能会因为边的顶点不存在导致错误(HugeGraph-1347)
  • label索引对offset和limit结果错误(HugeGraph-1329)
  • vertex label或者edge label没有开启label index,删除label会导致数据无法删除(HugeGraph-1355)

内部修改

  • hbase后端代码引入较新版本的Jackson-databind包,导致HugeGraphServer启动异常(HugeGraph-1306)
  • Core和Client都自己持有一个shard类,而不是依赖于common模块(HugeGraph-1316)
  • 去掉rebuild index和删除vertex label和edge label时的80w的capacity限制(HugeGraph-1297)
  • 所有schema操作需要考虑同步问题(HugeGraph-1279)
  • 拆分Cassandra的索引表,把element id每条一行,避免聚合高时,导入速度非常慢甚至卡住(HugeGraph-1304)
  • 将hugegraph-test中关于common的测试用例移动到hugegraph-common中(HugeGraph-1297)
  • 异步任务支持保存任务参数,以支持任务恢复(HugeGraph-1344)
  • 支持通过脚本部署文档到GitHub(HugeGraph-1351)
  • RocksDB和Hbase后端索引删除实现(HugeGraph-1317)

Loader

功能更新

  • HugeLoader支持用户手动创建schema,以文件的方式传入(HugeGraph-1295)

BUG修复

  • HugeLoader导数据时未区分输入文件的编码,导致可能产生乱码(HugeGraph-1288)
  • HugeLoader打包的example目录的三个子目录下没有文件(HugeGraph-1288)
  • 导入的CSV文件中如果数据列本身包含逗号会解析出错(HugeGraph-1320)
  • 批量插入避免单条失败导致整个batch都无法插入(HugeGraph-1336)
  • 异常信息作为模板打印异常(HugeGraph-1345)
  • 导入边数据,当列数不对时导致程序退出(HugeGraph-1346)
  • HugeLoader的自动创建schema失败(HugeGraph-1363)
  • ID长度检查应该检查字节长度而非字符串长度(HugeGraph-1374)

内部修改

  • 添加测试用例(HugeGraph-1361)

Tools

功能更新

  • backup/restore使用多线程加速,并增加retry机制(HugeGraph-1307)
  • 一键部署支持传入路径以存放包(HugeGraph-1325)
  • 实现dump图功能(内存构建顶点及关联边)(HugeGraph-1339)
  • 增加backup-scheduler功能,支持定时备份且保留一定数目最新备份(HugeGraph-1326)
  • 增加异步任务查询和异步执行Gremlin的功能(HugeGraph-1357)

BUG修复

  • hugegraph-tools的backup和restore编码为UTF-8(HugeGraph-1321)
  • hugegraph-tools设置默认JVM堆大小和发布版本号(HugeGraph-1340)

Studio

BUG修复

  • HugeStudio中顶点id包含换行符时g.V()会导致groovy解析出错(HugeGraph-1292)
  • 限制返回的顶点及边的数量(HugeGraph-1333)
  • 加载note出现消失或者卡住情况(HugeGraph-1353)
  • HugeStudio打包时,编译失败但没有报错,导致发布包无法启动(HugeGraph-1368)

9.8 - HugeGraph 0.6 Release Notes

API & Java Client

功能更新

  • 增加RESTFul API paths和crosspoints,找出source到target顶点间多条路径或包含交叉点的路径(HugeGraph-1210)
  • 在API层添加批量插入并发数的控制,避免出现全部的线程都用于写而无法查询的情况(HugeGraph-1228)
  • 增加scan-API,允许客户端并发地获取顶点和边(HugeGraph-1197)
  • Client支持传入用户名密码访问带权限控制的HugeGraph(HugeGraph-1256)
  • 为顶点及边的list API添加offset参数(HugeGraph-1261)
  • RESTful API的顶点/边的list不允许同时传入page 和 [label,属性](HugeGraph-1262)
  • k-out、K-neighbor、paths、shortestpath等API增加degree、capacity和limit(HugeGraph-1176)
  • 增加restore status的set/get/clear接口(HugeGraph-1272)

BUG修复

  • 使 RestClient的basic auth使用Preemptive模式(HugeGraph-1257)
  • HugeGraph-Client中由ResultSet获取多次迭代器,除第一次外其他的无法迭代(HugeGraph-1278)

Core

功能更新

  • RocksDB实现scan特性(HugeGraph-1198)
  • Schema userdata 提供删除 key 功能(HugeGraph-1195)
  • 支持date类型属性的范围查询(HugeGraph-1208)
  • limit下沉到backend,尽可能不进行多余的索引读取(HugeGraph-1234)
  • 增加 API 权限与访问控制(HugeGraph-1162)
  • 禁止多个后端配置store为相同的值(HugeGraph-1269)

BUG修复

  • RocksDB的Range查询时如果只指定上界或下界会查出其他IndexLabel的记录(HugeGraph-1211)
  • RocksDB带limit查询时,graphTransaction查询返回的结果多一个(HugeGraph-1234)
  • init-store在CentOS上依赖通用的io.netty有时会卡住,改为使用netty-transport-native-epoll(HugeGraph-1255)
  • Cassandra后端in语句(按id查询)元素个数最大65535(HugeGraph-1239)
  • 主键加索引(或普通属性)作为查询条件时报错(HugeGraph-1276)
  • init-store.sh在Centos平台上初始化失败或者卡住(HugeGraph-1255)

测试

内部修改

  • 将compareNumber方法搬移至common模块(HugeGraph-1208)
  • 修复HugeGraphServer无法在Ubuntu机器上启动的Bug(HugeGraph-1154)
  • 修复init-store.sh无法在bin目录下执行的BUG(HugeGraph-1223)
  • 修复HugeGraphServer启动过程中无法通过CTRL+C终止的BUG(HugeGraph-1223)
  • HugeGraphServer启动前检查端口是否被占用(HugeGraph-1223)
  • HugeGraphServer启动前检查系统JDK是否安装以及版本是否为1.8(HugeGraph-1223)
  • 给HugeConfig类增加getMap()方法(HugeGraph-1236)
  • 修改默认配置项,后端使用RocksDB,注释重要的配置项(HugeGraph-1240)
  • 重命名userData为userdata(HugeGraph-1249)
  • centos 4.3系统HugeGraphServer进程使用jps命令查不到
  • 增加配置项ALLOW_TRACE,允许设置是否返回exception stack trace(HugeGraph-81)

Tools

功能更新

  • 增加自动化部署工具以安装所有组件(HugeGraph-1267)
  • 增加clear的脚本,并拆分deploy和start-all(HugeGraph-1274)
  • 对hugegraph服务进行监控以提高可用性(HugeGraph-1266)
  • 增加backup/restore功能和命令(HugeGraph-1272)
  • 增加graphs API对应的命令(HugeGraph-1272)

BUG修复

Loader

功能更新

  • 默认添加csv及json的示例(HugeGraph-1259)

BUG修复

9.9 - HugeGraph 0.5 Release Notes

API & Java Client

功能更新

  • VertexLabel与EdgeLabel增加bool参数enable_label_index表述是否构建label索引(HugeGraph-1085)
  • 增加RESTful API来支持高效shortest path,K-out和K-neighbor查询(HugeGraph-944)
  • 增加RESTful API支持按id列表批量查询顶点(HugeGraph-1153)
  • 支持迭代获取全部的顶点和边,使用分页实现(HugeGraph-1166)
  • 顶点id中包含 / % 等 URL 保留字符时通过 VertexAPI 查不出来(HugeGraph-1127)
  • 批量插入边时是否检查vertex的RESTful API参数从checkVertex改为check_vertex (HugeGraph-81)

BUG修复

  • hasId()无法正确匹配LongId(HugeGraph-1083)

Core

功能更新

  • RocksDB支持常用配置项(HugeGraph-1068)
  • 支持插入、删除、更新等操作的限速(HugeGraph-1071)
  • 支持RocksDB导入sst文件方案(HugeGraph-1077)
  • 增加MySQL后端存储(HugeGraph-1091)
  • 增加Palo后端存储(HugeGraph-1092)
  • 增加开关:支持是否构建顶点/边的label index(HugeGraph-1085)
  • 支持API分页获取数据(HugeGraph-1105)
  • RocksDB配置的数据存放目录如果不存在则自动创建(HugeGraph-1135)
  • 增加高级遍历函数shortest path、K-neighbor,K-out和按id列表批量查询顶点(HugeGraph-944)
  • init-store.sh增加超时重试机制(HugeGraph-1150)
  • 将边表拆分两个表:OUT表、IN表(HugeGraph-1002)
  • 限制顶点ID最大长度为128字节(HugeGraph-1168)
  • Cassandra通过压缩数据(可配置snappy、lz4)进行优化(HugeGraph-428)
  • 支持IN和OR操作(HugeGraph-137)
  • 支持RocksDB并行写多个磁盘(HugeGraph-1177)
  • MySQL通过批量插入进行性能优化(HugeGraph-1188)

BUG修复

  • Kryo序列化多线程时异常(HugeGraph-1066)
  • RocksDB索引内容中重复写了两次elem-id(HugeGraph-1094)
  • SnowflakeIdGenerator.instance在多线程环境下可能会初始化多个实例(HugeGraph-1095)
  • 如果查询边的顶点但顶点不存在时,异常信息不够明确(HugeGraph-1101)
  • RocksDB配置了多个图时,init-store失败(HugeGraph-1151)
  • 无法支持 Date 类型的属性值(HugeGraph-1165)
  • 创建了系统内部索引,但无法根据其进行搜索(HugeGraph-1167)
  • 拆表后根据label删除边时,edge-in表中的记录未被删除成功(HugeGraph-1182)

测试

  • 增加配置项:vertex.force_id_string,跑 tinkerpop 测试时打开(HugeGraph-1069)

内部修改

  • common库OptionChecker增加allowValues()函数用于枚举值(HugeGraph-1075)
  • 清理无用、版本老旧的依赖包,减少打包的压缩包的大小(HugeGraph-1078)
  • HugeConfig通过文件路径构造时,无法检查多次配置的配置项的值(HugeGraph-1079)
  • Server启动时可以支持智能分配最大内存(HugeGraph-1154)
  • 修复Mac OS因为不支持free命令导致无法启动server的问题(HugeGraph-1154)
  • 修改配置项的注册方式为字符串式,避免直接依赖Backend包(HugeGraph-1171)
  • 增加StoreDumper工具以查看后端存储的数据内容(HugeGraph-1172)
  • Jenkins把所有与内部服务器有关的构建机器信息都参数化传入(HugeGraph-1179)
  • 将RestClient移到common模块,令server和client都依赖common(HugeGraph-1183)
  • 增加配置项dump工具ConfDumper(HugeGraph-1193)

9.10 - HugeGraph 0.4.4 Release Notes

API & Java Client

功能更新

  • HugeGraph-Server支持WebSocket,能用Gremlin-Console连接使用;并支持直接编写groovy脚本调用Core的代码(HugeGraph-977)
  • 适配Schema-id(HugeGraph-1038)

BUG修复

  • hugegraph-0.3.3:删除vertex的属性,body中properties=null,返回500,空指针(HugeGraph-950)
  • hugegraph-0.3.3: graph.schema().getVertexLabel() 空指针(HugeGraph-955)
  • HugeGraph-Client 中顶点和边的属性集合不是线程安全的(HugeGraph-1013)
  • 批量操作的异常信息无法打印(HugeGraph-1013)
  • 异常message提示可读性太差,都是用propertyKey的id显示,对于用户来说无法立即识别(HugeGraph-1055)
  • 批量新增vertex实体,有一个body体为null,返回500,空指针(HugeGraph-1056)
  • 追加属性body体中只包含properties,功能出现回退,抛出异常The label of vertex can’t be null(HugeGraph-1057)
  • HugeGraph-Client适配:PropertyKey的DateType中Timestamp替换成Date(HugeGraph-1059)
  • 创建IndexLabel时baseValue为空会报出500错误(HugeGraph-1061)

Core

功能更新

  • 实现上层独立事务管理,并兼容tinkerpop事务规范(HugeGraph-918、HugeGraph-941)
  • 完善memory backend,可以通过API正确访问,且适配了tinkerpop事务(HugeGraph-41)
  • 增加RocksDB后端存储驱动框架(HugeGraph-929)
  • RocksDB数字索引range-query实现(HugeGraph-963)
  • 为所有的schema增加了id,并将各表原依赖name的列也换成id(HugeGraph-589)
  • 填充query key-value条件时,value的类型如果不匹配key定义的类型时需要转换为该类型(HugeGraph-964)
  • 统一各后端的offset、limit实现(HugeGraph-995)
  • 查询顶点、边时,Core支持迭代方式返回结果,而非一次性载入内存(HugeGraph-203)
  • memory backend支持range query(HugeGraph-967)
  • memory backend的secondary的支持方式从遍历改为IdQuery(HugeGraph-996)
  • 联合索引支持复杂的(只要逻辑上可以查都支持)多种索引组合查询(HugeGraph-903)
  • Schema中增加存储用户数据的域(map)(HugeGraph-902)
  • 统一ID的解析及序列化(包括API及Backend)(HugeGraph-965)
  • RocksDB没有keyspace概念,需要完善对多图实例的支持(HugeGraph-973)
  • 支持Cassandra设置连接用户名密码(HugeGraph-999)
  • Schema缓存支持缓存所有元数据(get-all-schema)(HugeGraph-1037)
  • 目前依然保持schema对外暴露name,暂不直接使用schema id(HugeGraph-1032)
  • 用户传入ID的策略的修改为支持String和Number(HugeGraph-956)

BUG修复

  • 删除旧的前缀indexLabel时数据库中的schemaLabel对象还有残留(HugeGraph-969)
  • HugeConfig解析时共用了公共的Option,导致不同graph的配置项有覆盖(HugeGraph-984)
  • 数据库数据不兼容时,提示更加友好的异常信息(HugeGraph-998)
  • 支持Cassandra设置连接用户名密码(HugeGraph-999)
  • RocksDB deleteRange end溢出后触发RocksDB assert错误(HugeGraph-971)
  • 允许根据null值id进行查询顶点/边,返回结果为空集合(HugeGraph-1045)
  • 内存中存在部分更新数据未提交时,搜索结果不对(HugeGraph-1046)
  • g.V().hasLabel(XX)传入不存在的label时报错: Internal Server Error and Undefined property key: ‘~label’(HugeGraph-1048)
  • gremlin获取的的schema只剩下名称字符串(HugeGraph-1049)
  • 大量数据情况下无法进行count操作(HugeGraph-1051)
  • RocksDB持续插入6~8千万条边时卡住(HugeGraph-1053)
  • 整理属性类型的支持,并在BinarySerializer中使用二进制格式序列化属性值(HugeGraph-1062)

测试

  • 增加tinkerpop的performance测试(HugeGraph-987)

内部修改

  • HugeFactory打开同一个图(name相同者)时,共用HugeGraph对象即可(HugeGraph-983)
  • 规范索引类型命名secondary、range、search(HugeGraph-991)
  • 数据库数据不兼容时,提示更加友好的异常信息(HugeGraph-998)
  • IO部分的 gryo 和 graphson 的module分开(HugeGraph-1041)
  • 增加query性能测试到PerfExample中(HugeGraph-1044)
  • 关闭gremlin-server的metric日志(HugeGraph-1050)

9.11 - HugeGraph 0.3.3 Release Notes

API & Java Client

功能更新

  • 为vertex-label和edge-label增加可空属性集合,允许在create和append时指定(HugeGraph-245)
  • 配合core的功能为用户提供tinkerpop variables RESTful API(HugeGraph-396)
  • 支持顶点/边属性的更新和删除(HugeGraph-894)
  • 支持顶点/边的条件查询(HugeGraph-919)

BUG修复

  • HugeGraph-API接收的RequestBody为null或""时抛出空指针异常(HugeGraph-795)
  • 为HugeGraph-API添加输入参数检查,避免抛出空指针异常(HugeGraph-796 ~ HugeGraph-798,HugeGraph-802,HugeGraph-808 ~ HugeGraph-814,HugeGraph-817,HugeGraph-823,HugeGraph-860)
  • 创建缺失outV-label 或者 inV-label的实体边,依然能够被创建成功,不符合需求(HugeGraph-835)
  • 创建vertex-label和edge-label时可以任意传入index-names(HugeGraph-837)
  • 创建index,base-type=“VERTEX”等值(期望VL、EL),返回500(HugeGraph-846)
  • 创建index,base-type和base-value不匹配,提示不友好(HugeGraph-848)
  • 删除已经不存在的两个实体之间的关系,schema返回204,顶点和边类型的则返回404(期望统一为404)(HugeGraph-853,HugeGraph-854)
  • 给vertex-label追加属性,缺失id-strategy,返回信息有误(HugeGraph-861)
  • 给edge-label追加属性,name缺失,提示信息有误(HugeGraph-862)
  • 给edge-label追加属性,source-label为“null”,提示信息有误(HugeGraph-863)
  • 查询时的StringId如果为空字符串应该抛出异常(HugeGraph-868)
  • 通过 REST API 创建两个顶点之间的边后,在studio中通过g.V()刚新创建的边不显示,g.E()则能够显示新创建的边(HugeGraph-869)
  • HugeGraph-Server的内部错误500,不应该将stack trace返回给Client(HugeGraph-879)
  • addEdge传入空的id字符串时会抛出非法参数异常(HugeGraph-885)
  • HugeGraph-Client 的 Gremlin 查询结果在解析 Path 时,如果不包含Vertex/Edge会反序列化异常(HugeGraph-891)
  • 枚举HugeKeys的字符串变成小写字母加下划线,导致API序列化时字段名与类中变量名不一致,进而序列化失败(HugeGraph-896)
  • 增加边到不存在的顶点时返回404(期望400)(HugeGraph-922)

Core

功能更新

  • 支持对顶点/边属性(包括索引列)的更新操作(HugeGraph-369)
  • 索引field为空或者空字符串的支持(hugegraph-553和hugegraph-288)
  • vertex/edge的属性一致性保证推迟到实际要访问属性时(hugegraph-763)
  • 增加ScyllaDB后端驱动(HugeGraph-772)
  • 支持tinkerpop的hasKey、hasValue查询(HugeGraph-826)
  • 支持tinkerpop的variables功能(HugeGraph-396)
  • 以“~”为开头的为系统隐藏属性,用户不可以创建(HugeGraph-842)
  • 增加Backend Features以兼容不同后端的特性(HugeGraph-844)
  • 对mutation的update可能出现的操作不直接抛错,进行细化处理(HugeGraph-887)
  • 对append到vertex-label/edge-label的property检查,必须是nullable的(HugeGraph-890)
  • 对于按照id查询,当有的id不存在时,返回其余存在的对象,而非直接抛异常(HugeGraph-900)

BUG修复

  • Vertex.edges(Direction.BOTH,…) assert error(HugeGraph-661)
  • 无法支持在addVertex函数中对同一property(single)多次赋值(HugeGraph-662)
  • 更新属性时不涉及更新的索引列会丢失(HugeGraph-801)
  • GraphTransaction中的ConditionQuery需要索引查询时,没有触发commit,导致查询失败(HugeGraph-805)
  • Cassandra不支持query offset,查询时limit=offset+limit取回所有记录后过滤(HugeGraph-851)
  • 多个插入操作加上一个删除操作,插入操作会覆盖删除操作(HugeGraph-857)
  • 查询时的StringId如果为空字符串应该抛出异常(HugeGraph-868)
  • 元数据schema方法只返回 hidden 信息(HugeGraph-912)

测试

  • tinkerpop的structure和process测试使用不同的keyspace(HugeGraph-763)
  • 将tinkerpop测试和unit测试添加到流水线release-after-merge中(HugeGraph-763)
  • jenkins脚本分离各阶段子脚本,修改项目中的子脚本即可生效构建(HugeGraph-800)
  • 增加clear backends功能,在tinkerpop suite运行完成后清除后端(HugeGraph-852)
  • 增加BackendMutation的测试(HugeGraph-801)
  • 多线程操作图时可能抛出NoHostAvailableException异常(HugeGraph-883)

内部修改

  • 调整HugeGraphServer和HugeGremlinServer启动时JVM的堆内存初始为256M,最大为2048M(HugeGraph-218)
  • 创建Cassandra Table时,使用schemaBuilder代替字符串拼接(hugegraph-773)
  • 运行测试用例时如果初始化图失败(比如数据库连接不上),clear()报错(HugeGraph-910)
  • Example抛异常 Need to specify a readable config file rather than…(HugeGraph-921)
  • HugeGraphServer和HugeGremlinServer的缓存保持同步(HugeGraph-569)

9.12 - HugeGraph 0.2 Release Notes

API & Java Client

功能更新

0.2版实现了图数据库基本功能,提供如下功能:

元数据(Schema)

顶点类型(Vertex Label)

  • 创建顶点类型
  • 删除顶点类型
  • 查询顶点类型
  • 增加顶点类型的属性

边类型(Edge Label)

  • 创建边类型
  • 删除边类型
  • 查询边类型
  • 增加边类型的属性

属性(Property Key)

  • 创建属性
  • 删除属性
  • 查询属性

索引(Index Label)

  • 创建索引
  • 删除索引
  • 查询索引

元数据检查

  • 元数据依赖的其它元数据检查(如Vertex Label依赖Property Key)
  • 数据依赖的元数据检查(如Vertex依赖Vertex Label)

图数据

顶点(Vertex)

  • 增加顶点

  • 删除顶点

  • 增加顶点属性

  • 删除顶点属性(必须为非索引列)

  • 批量插入顶点

  • 查询

  • 批量查询

  • 顶点ID策略

    • 用户指定ID(字符串)
    • 用户指定某些属性组合作为ID(拼接为可见字符串)
    • 自动生成ID

边(Edge)

  • 增加边
  • 增加多条同类型边到指定的两个节点(SortKey)
  • 删除边
  • 增加边属性
  • 删除边属性(必须为非索引列)
  • 批量插入边
  • 查询
  • 批量查询

顶点/边属性

  • 属性类型支持

    • text
    • boolean
    • byte、blob
    • int、long
    • float、double
    • timestamp
    • uuid
  • 支持单值属性

  • 支持多值属性:List、Set(注意:非嵌套属性)

事务

  • 原子性级别保证(依赖后端)
  • 自动提交事务
  • 手动提交事务
  • 并行事务

索引

索引类型

  • 二级索引
  • 范围索引(数字类型)

索引操作

  • 为指定类型的顶点/边创建单列索引(不支持List或Set列创建索引)
  • 为指定类型的顶点/边创建复合索引(不支持List或Set列创建索引,复合索引为前缀索引)
  • 删除指定类型顶点/边的索引(部分或全部索引均可)
  • 重建指定类型顶点/边的索引(部分或全部索引均可)

查询/遍历

  • 列出所有元数据、图数据(支持Limit,不支持分页)

  • 根据ID查询元数据、图数据

  • 根据指定属性的值查询图数据

  • 根据指定属性的值范围查询图数据(属性必须为数字类型)

  • 根据指定顶点/边类型、指定属性的值查询顶点/边

  • 根据指定顶点/边类型、指定属性的值范围查询顶点(属性必须为数字类型)

  • 根据顶点类型(Vertex Label)查询顶点

  • 根据边类型(Edge Label)查询边

  • 根据顶点查询边

    • 查询顶点的所有边
    • 查询顶点的指定方向边(出边、入边)
    • 查询顶点的指定方向、指定类型边
    • 查询两个顶点的同类型边中的某条边(SortKey)
  • 标准Gremlin遍历

缓存

可缓存内容

  • 元数据缓存
  • 顶点缓存

缓存特性

  • LRU策略
  • 高性能并发访问
  • 支持超时过期机制

接口(RESTful API)

  • 版本号接口
  • 图实例接口
  • 元数据接口
  • 图数据接口
  • Gremlin接口

更多细节详见API文档

后端支持

支持Cassandra后端

  • 持久化
  • CQL3
  • 集群

支持Memory后端(仅用于测试)

  • 非持久化
  • 部分特性无法支持(如:更新边属性、根据边类型查询边)

其它

支持配置项

  • 后端存储类型
  • 序列化方式
  • 缓存参数

支持多图实例

  • 静态方式(增加多个图配置文件)

版本检查

  • 内部依赖包匹配版本检查
  • API匹配版本检查

9.13 - HugeGraph 0.2.4 Release Notes

API & Java Client

功能更新

元数据(Schema)相关

BUG修复

  • Vertex Label为非primary-key id策略应该允许属性为空(HugeGraph-651)
  • Gremlin-Server 序列化的 EdgeLabel 仅有一个directed 属性,应该打印完整的schema描述(HugeGraph-680)
  • 创建IndexLabel时使用不存在的属性抛出空指针异常,应该抛非法参数异常(HugeGraph-682)
  • 创建schema如果已经存在并指定了ifNotExist时,结果应该返回原来的对象(HugeGraph-694)
  • 由于EdgeLabel的Frequency默认为null以及不允许修改特性,导致Append操作传递null值在API层反序列化失败(HugeGraph-729)
  • 增加对schema名称的正则检查配置项,默认不允许为全空白字符(HugeGraph-727)
  • 中文名的schema在前端显示为乱码(HugeGraph-711)

图数据(Vertex、Edge)相关

功能更新

  • DataType支持Array,并且List类型除了一个一个添加object,也需要支持直接赋值List对象(HugeGraph-719)
  • 自动生成的顶点id由十进制改为十六进制(字符串存储时)(HugeGraph-785)

BUG修复

  • HugeGraph-API的VertexLabel/EdgeLabel API未提供eliminate接口(HugeGraph-614)
  • 增加非primary-key id策略的顶点时,如果属性为空无法插入到数据库中(HugeGraph-652)
  • 使用HugeGraph-Client的gremlin发送无返回值groovy请求时,由于gremlin-server将无返回值序列化为null,导致前端迭代结果集时出现空指针异常(HugeGraph-664)
  • RESTful API在没有找到对应id的vertex/edge时返回500(HugeGraph-734)
  • HugeElement/HugeProperty的equals()与tinkerpop不兼容(HugeGraph-653)
  • HugeEdgeProperty的property的equals函数与tinkerpop兼容 (HugeGraph-740)
  • HugeElement/HugeVertexProperty的hashcode函数与tinkerpop不兼容(HugeGraph-728)
  • HugeVertex/HugeEdge的toString函数与tinkerpop不兼容(HugeGraph-665)
  • 与tinkerpop的异常不兼容,包括IllegalArgumentsException和UnsupportedOperationException(HugeGraph-667)
  • 通过id无法找到element时,抛出的异常类型与tinkerpop不兼容(HugeGraph-689)
  • vertex.addEdge没有检查properties的数目是否为2的倍数(HugeGraph-716)
  • vertex.addEdge()时,assignId调用时机太晚,导致vertex的Set中有重复的edge(HugeGraph-666)
  • 查询时包含大于等于三层逻辑嵌套时,会抛出ClassCastException,现改成抛出非法参数异常(HugeGraph-481)
  • 边查询如果同时包含source-vertex/direction和property作为条件,查询结果错误(HugeGraph-749)
  • HugeGraph-Server 在运行时如果 cassandra 宕掉,插入或查询操作时会抛出DataStax的异常以及详细的调用栈(HugeGraph-771)
  • 删除不存在的 indexLabel 时会抛出异常,而删除其他三种元数据(不存在的)则不会(HugeGraph-782)
  • 当传给EdgeApi的源顶点或目标顶点的id非法时,会因为查询不到该顶点向客户端返回404状态码(HugeGraph-784)
  • 提供内部使用获取元数据的接口,使SchemaManager仅为外部使用,当获取不存在的schema时抛出NotFoundException异常(HugeGraph-743)
  • HugeGraph-Client 创建/添加/移除 元数据都应该返回来自服务端的结果(HugeGraph-760)
  • 创建HugeGraph-Client时如果输入了错误的主机会导致进程阻塞,无法响应(HugeGraph-718)

查询、索引、缓存相关

功能更新

  • 缓存更新更加高效的锁方案(HugeGraph-555)
  • 索引查询增加支持只有一个元素的IN语句(原来仅支持EQ)(HugeGraph-739)

BUG修复

  • 防止请求数据量过大时服务本身hang住(HugeGraph-777)

其它

功能更新

  • 使Init-Store仅用于初始化数据库,清空后端由独立脚本实现(HugeGraph-650)

BUG修复

  • 单元测试跑完后在测试机上遗留了临时的keyspace(HugeGraph-611)
  • Cassandra的info日志信息过多,将大部分修改为debug级别(HugeGraph-722)
  • EventHub.containsListener(String event)判断逻辑有遗漏(HugeGraph-732)
  • EventHub.listeners/unlisten(String event)当没有对应event的listener时会抛空指针异常(HugeGraph-733)

测试

Tinkerpop合规测试

  • 增加自定义ignore机制,规避掉暂时不需要加入持续集成的测试用例(HugeGraph-647)
  • 为TestGraph注册GraphSon和Kryo序列化器,实现 IdGenerator$StringId 的 graphson-v1、graphson-v2 和 Kryo的序列化与反序列化(HugeGraph-660)
  • 增加了可配置的测试用例过滤器,使得tinkerpop测试可以用在开发分支和发布分支的回归测试中
  • 将tinkerpop测试通过配置文件,加入到回归测试中

单元测试

  • 增加Cache及Event的单元测试(HugeGraph-659)
  • HugeGraph-Client 增加API的测试(99个)
  • HugeGraph-Client 增加单元测试,包括RestResult反序列化的单测(12个)

内部修改

  • 改进LOG变量方面代码(HugeGraph-623/HugeGraph-631)
  • License格式调整(HugeGraph-625)
  • 将序列化器中持有的graph抽离,要用到graph的函数通过传参数实现 (HugeGraph-750)

10 - Contribution Guidelines

10.1 - 如何参与 HugeGraph 社区

TODO: translate this article to Chinese

Thanks for taking the time to contribute! As an open source project, HugeGraph welcomes contributions from everyone, and we are grateful to all of our contributors.

The following is a contribution guide for HugeGraph:

image

1. Preparation

Tip: Using GitHub Desktop can greatly simplify and improve the PR/commit workflow, and is especially suitable for newcomers.

We can contribute by reporting issues, submitting code patches or any other feedback.

Before submitting the code, we need to do some preparation:

  1. Sign up or login to GitHub: https://github.com

  2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

  3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

    # clone code from remote to local repo
     git clone https://github.com/${GITHUB_USER_NAME}/hugegraph
     
  4. Configure local HugeGraph repo

    cd hugegraph
     
     # set name and email to push code to github
     git config user.name "{full-name}" # like "Jermy Li"
     git config user.email "{email-address-of-github}" # like "jermy@apache.org"
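The configuration step above is cut short by the diff. A common follow-up (implied by the later `git pull hugegraph` command) is to register the official repository as a second remote named `hugegraph`. The sketch below is hedged: the remote name and URL are assumptions based on the surrounding text, and a throwaway repo stands in for your actual clone so the commands can run anywhere:

```shell
# Sketch only: register the official repo as a remote named "hugegraph".
# A throwaway repo stands in for your local clone of the fork.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
# "hugegraph" is the remote name assumed by `git pull hugegraph` later on
git remote add hugegraph https://github.com/apache/incubator-hugegraph.git
git remote -v
```

With this remote in place, `git pull hugegraph` fetches the latest official code into your local branch while `origin` still points at your fork.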

Optional: You can use GitHub desktop to greatly simplify the commit and update process.

2. Create an Issue on GitHub

If you encounter bugs or have any questions, please go to GitHub Issues to report them and feel free to create an issue.

3. Make changes of code locally

3.1 Create a new branch

Please don’t use master branch for development. We should create a new branch instead:

# checkout master branch
 git checkout master
 # pull the latest code from official hugegraph
 git pull hugegraph
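The snippet above is truncated by the diff right before the branch-creation step it leads into. A minimal self-contained sketch of that step, using a throwaway repo and the `bugfix-branch` name referenced later in this guide:

```shell
# Self-contained demo of creating a topic branch (throwaway repo;
# in the real workflow you run `git checkout -b` inside your clone).
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"
# never develop on master; create a topic branch instead
git checkout -q -b bugfix-branch
git branch --show-current   # prints: bugfix-branch
```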
 

Please remember to fill in the issue id, which was generated by GitHub after issue creation.
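As a hedged illustration of "fill in the issue id", here is a commit in a throwaway repo whose message references an issue number; `#123` and the file name are placeholders, not a real HugeGraph issue:

```shell
# Demo: reference the GitHub issue id in the commit message.
# "#123" and fix.txt are made-up placeholders.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
echo "fix" > fix.txt
git add fix.txt
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Fix some bug (close #123)"
git log -1 --pretty=%s   # prints: Fix some bug (close #123)
```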

3.4 Push commit to GitHub fork repo

Push the local commit to GitHub fork repo:

# push the local commit to fork repo
 git push origin bugfix-branch:bugfix-branch
 

Note that since GitHub requires submitting code through username + token (instead of using username + password directly), you need to create a GitHub token from https://github.com/settings/tokens: image
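One way to make username + token authentication work for `git push` is to embed the token in the HTTPS remote URL. This is only a sketch with placeholder values (`alice`, `ghp_xxxx`); substitute your own GitHub username and a token created at https://github.com/settings/tokens:

```shell
# Sketch: embed username + token in the push URL (throwaway repo).
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
GITHUB_USER_NAME=alice   # placeholder username
GITHUB_TOKEN=ghp_xxxx    # placeholder token
# pushes to this URL authenticate with username + token, not a password
git remote add origin "https://${GITHUB_USER_NAME}:${GITHUB_TOKEN}@github.com/${GITHUB_USER_NAME}/hugegraph.git"
git remote get-url origin
```

Alternatively, a credential helper (e.g. `git config credential.helper store`) keeps the token out of the remote URL.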

4. Create a Pull Request

Go to the web page of the GitHub fork repo; after pushing to a new branch there will be an option to create a Pull Request. Just click the “Compare & pull request” button, then edit the description for the proposed changes, which can simply be copied from the commit message.

Please sign the HugeGraph CLA when contributing code for the first time. You can sign the CLA by just posting a Pull Request Comment same as the below format:

I have read the CLA Document and I hereby sign the CLA

Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to https://github.com/settings/emails: +image

4. Create a Pull Request

Go to the web page of the GitHub fork repo; after pushing to a new branch there will be an option to create a Pull Request. Just click the “Compare & pull request” button, then edit the description for the proposed changes, which can simply be copied from the commit message.

Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to https://github.com/settings/emails: image

5. Code review

Maintainers will start the code review after all the automatic checks are passed:

  • Check: Contributor License Agreement is signed
  • Check: Travis CI build has passed (automatic test and deploy)

The commit will be accepted and merged if there is no problem after review.

Please click on “Details” to find the problem if any check does not pass.

If there are checks not passed or changes requested, then continue to modify the code and push again.

6. More changes after review

If we have not passed the review, don’t be discouraged. Usually a commit needs to be reviewed several times before being accepted! Please follow the review comments and make further changes.

After the further changes, we submit them to the local repo:

# commit all updated files in a new commit,
 # please feel free to enter any appropriate commit message, note that
 # we will squash all commits in the pull request as one commit when
diff --git a/cn/docs/contribution-guidelines/_print/index.html b/cn/docs/contribution-guidelines/_print/index.html
index d93aaeb02..31834bc18 100644
--- a/cn/docs/contribution-guidelines/_print/index.html
+++ b/cn/docs/contribution-guidelines/_print/index.html
@@ -1,6 +1,6 @@
 Contribution Guidelines | HugeGraph
 

1 - How to Contribute to the HugeGraph Community

TODO: translate this article to Chinese

Thanks for taking the time to contribute! As an open source project, HugeGraph is looking forward to be contributed from everyone, and we are also grateful to all the contributors.

The following is a contribution guide for HugeGraph:

image

1. Preparation

We can contribute by reporting issues, submitting code patches or any other feedback.

Before submitting the code, we need to do some preparation:

  1. Sign up or login to GitHub: https://github.com

  2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

  3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

    # clone code from remote to local repo
    +

    Contribution Guidelines

1 - How to Contribute to the HugeGraph Community

TODO: translate this article to Chinese

Thanks for taking the time to contribute! As an open source project, HugeGraph is looking forward to be contributed from everyone, and we are also grateful to all the contributors.

The following is a contribution guide for HugeGraph:

image

1. Preparation

Tip: Using GitHub desktop can greatly simplify and improve your PR/commit workflow; it is especially suitable for newcomers

We can contribute by reporting issues, submitting code patches or any other feedback.

Before submitting the code, we need to do some preparation:

  1. Sign up or login to GitHub: https://github.com

  2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

  3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

    # clone code from remote to local repo
     git clone https://github.com/${GITHUB_USER_NAME}/hugegraph
     
  4. Configure local HugeGraph repo

    cd hugegraph
     
    @@ -10,7 +10,7 @@
     # set name and email to push code to github
     git config user.name "{full-name}" # like "Jermy Li"
     git config user.email "{email-address-of-github}" # like "jermy@apache.org"
    -

Optional: You can use GitHub desktop to greatly simplify the commit and update process.

2. Create an Issue on GitHub

If you encounter bugs or have any questions, please go to GitHub Issues to report them and feel free to create an issue.

3. Make changes of code locally

3.1 Create a new branch

Please don’t use master branch for development. We should create a new branch instead:

# checkout master branch
+

2. Create an Issue on GitHub

If you encounter bugs or have any questions, please go to GitHub Issues to report them and feel free to create an issue.

3. Make changes of code locally

3.1 Create a new branch

Please don’t use master branch for development. We should create a new branch instead:

# checkout master branch
 git checkout master
 # pull the latest code from official hugegraph
 git pull hugegraph
@@ -32,7 +32,7 @@
 

Please remember to fill in the issue id, which was generated by GitHub after issue creation.

3.4 Push commit to GitHub fork repo

Push the local commit to GitHub fork repo:

# push the local commit to fork repo
 git push origin bugfix-branch:bugfix-branch
 

Note that since GitHub requires submitting code through username + token (instead of using username + password directly), you need to create a GitHub token from https://github.com/settings/tokens: -image

4. Create a Pull Request

Go to the web page of the GitHub fork repo; after pushing to a new branch there will be an option to create a Pull Request. Just click the “Compare & pull request” button, then edit the description for the proposed changes, which can simply be copied from the commit message.

Please sign the HugeGraph CLA when contributing code for the first time. You can sign the CLA by just posting a Pull Request Comment same as the below format:

I have read the CLA Document and I hereby sign the CLA

Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to https://github.com/settings/emails: +image

4. Create a Pull Request

Go to the web page of the GitHub fork repo; after pushing to a new branch there will be an option to create a Pull Request. Just click the “Compare & pull request” button, then edit the description for the proposed changes, which can simply be copied from the commit message.

Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to https://github.com/settings/emails: image

5. Code review

Maintainers will start the code review after all the automatic checks are passed:

  • Check: Contributor License Agreement is signed
  • Check: Travis CI build has passed (automatic test and deploy)

The commit will be accepted and merged if there is no problem after review.

Please click on “Details” to find the problem if any check does not pass.

If there are checks not passed or changes requested, then continue to modify the code and push again.

6. More changes after review

If we have not passed the review, don’t be discouraged. Usually a commit needs to be reviewed several times before being accepted! Please follow the review comments and make further changes.

After the further changes, we submit them to the local repo:

# commit all updated files in a new commit,
 # please feel free to enter any appropriate commit message, note that
 # we will squash all commits in the pull request as one commit when
diff --git a/cn/docs/contribution-guidelines/contribute/index.html b/cn/docs/contribution-guidelines/contribute/index.html
index eec0ae726..31bc7be37 100644
--- a/cn/docs/contribution-guidelines/contribute/index.html
+++ b/cn/docs/contribution-guidelines/contribute/index.html
@@ -4,22 +4,25 @@
 Thanks for taking the time to contribute! As an open source project, HugeGraph is looking forward to be …">
 

How to Contribute to the HugeGraph Community

TODO: translate this article to Chinese

Thanks for taking the time to contribute! As an open source project, HugeGraph is looking forward to be contributed from everyone, and we are also grateful to all the contributors.

The following is a contribution guide for HugeGraph:

image

1. Preparation

We can contribute by reporting issues, submitting code patches or any other feedback.

Before submitting the code, we need to do some preparation:

  1. Sign up or login to GitHub: https://github.com

  2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

  3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

    # clone code from remote to local repo
    +

    How to Contribute to the HugeGraph Community

    TODO: translate this article to Chinese

    Thanks for taking the time to contribute! As an open source project, HugeGraph is looking forward to be contributed from everyone, and we are also grateful to all the contributors.

    The following is a contribution guide for HugeGraph:

    image

    1. Preparation

    Tip: Using GitHub desktop can greatly simplify and improve your PR/commit workflow; it is especially suitable for newcomers

    We can contribute by reporting issues, submitting code patches or any other feedback.

    Before submitting the code, we need to do some preparation:

    1. Sign up or login to GitHub: https://github.com

    2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

    3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

      # clone code from remote to local repo
       git clone https://github.com/${GITHUB_USER_NAME}/hugegraph
       
    4. Configure local HugeGraph repo

      cd hugegraph
       
      @@ -29,7 +32,7 @@
       # set name and email to push code to github
       git config user.name "{full-name}" # like "Jermy Li"
       git config user.email "{email-address-of-github}" # like "jermy@apache.org"
      -

    Optional: You can use GitHub desktop to greatly simplify the commit and update process.

    2. Create an Issue on GitHub

    If you encounter bugs or have any questions, please go to GitHub Issues to report them and feel free to create an issue.

    3. Make changes of code locally

    3.1 Create a new branch

    Please don’t use master branch for development. We should create a new branch instead:

    # checkout master branch
    +

2. Create an Issue on GitHub

If you encounter bugs or have any questions, please go to GitHub Issues to report them and feel free to create an issue.

3. Make changes of code locally

3.1 Create a new branch

Please don’t use master branch for development. We should create a new branch instead:

# checkout master branch
 git checkout master
 # pull the latest code from official hugegraph
 git pull hugegraph
@@ -51,7 +54,7 @@
 

Please remember to fill in the issue id, which was generated by GitHub after issue creation.

3.4 Push commit to GitHub fork repo

Push the local commit to GitHub fork repo:

# push the local commit to fork repo
 git push origin bugfix-branch:bugfix-branch
 

Note that since GitHub requires submitting code through username + token (instead of using username + password directly), you need to create a GitHub token from https://github.com/settings/tokens: -image

4. Create a Pull Request

Go to the web page of the GitHub fork repo; after pushing to a new branch there will be an option to create a Pull Request. Just click the “Compare & pull request” button, then edit the description for the proposed changes, which can simply be copied from the commit message.

Please sign the HugeGraph CLA when contributing code for the first time. You can sign the CLA by just posting a Pull Request Comment same as the below format:

I have read the CLA Document and I hereby sign the CLA

Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to https://github.com/settings/emails: +image

4. Create a Pull Request

Go to the web page of the GitHub fork repo; after pushing to a new branch there will be an option to create a Pull Request. Just click the “Compare & pull request” button, then edit the description for the proposed changes, which can simply be copied from the commit message.

Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to https://github.com/settings/emails: image

5. Code review

Maintainers will start the code review after all the automatic checks are passed:

  • Check: Contributor License Agreement is signed
  • Check: Travis CI build has passed (automatic test and deploy)

The commit will be accepted and merged if there is no problem after review.

Please click on “Details” to find the problem if any check does not pass.

If there are checks not passed or changes requested, then continue to modify the code and push again.

6. More changes after review

If we have not passed the review, don’t be discouraged. Usually a commit needs to be reviewed several times before being accepted! Please follow the review comments and make further changes.

After the further changes, we submit them to the local repo:

# commit all updated files in a new commit,
 # please feel free to enter any appropriate commit message, note that
 # we will squash all commits in the pull request as one commit when
@@ -65,7 +68,7 @@
 git rebase -i master
 

And push it to GitHub fork repo again:

# force push the local commit to fork repo
 git push -f origin bugfix-branch:bugfix-branch
-

GitHub will automatically update the Pull Request after we push it, just wait for code review.
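The `git rebase -i master` step opens an editor for interactive squashing; an equivalent non-interactive sketch (soft reset plus a single new commit) in a throwaway repo, for illustration only:

```shell
# sketch: collapse a multi-commit branch into one commit before force-pushing
repo=$(mktemp -d)
cd "$repo" && git init -q .
git config user.name "Test" && git config user.email "test@example.com"
git commit -q --allow-empty -m "init" && git branch -M master
git checkout -q -b bugfix-branch
git commit -q --allow-empty -m "wip 1"           # review round 1
git commit -q --allow-empty -m "wip 2"           # review round 2
git reset -q --soft master                       # keep the changes, drop the commits
git commit -q --allow-empty -m "one squashed commit"
git rev-list --count master..HEAD                # prints: 1
```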


+

GitHub will automatically update the Pull Request after we push it, just wait for code review.


diff --git a/cn/docs/contribution-guidelines/index.xml b/cn/docs/contribution-guidelines/index.xml index 757af8ba7..bbe03726e 100644 --- a/cn/docs/contribution-guidelines/index.xml +++ b/cn/docs/contribution-guidelines/index.xml @@ -6,6 +6,7 @@ <p>The following is a contribution guide for HugeGraph:</p> <img width="884" alt="image" src="https://user-images.githubusercontent.com/9625821/159643158-8bf72c0a-93c3-4a58-8912-7b2ab20ced1d.png"> <h2 id="1-preparation">1. Preparation</h2> +<p><strong>建议</strong>: 使用 <a href="https://desktop.github.com/">GitHub desktop</a> 可以大幅简化和改善你提交 PR/Commit 的过程, 特别适合新人</p> <p>We can contribute by reporting issues, submitting code patches or any other feedback.</p> <p>Before submitting the code, we need to do some preparation:</p> <ol> @@ -32,7 +33,6 @@ </span></span><span style="display:flex;"><span>git config user.email <span style="color:#4e9a06">&#34;{email-address-of-github}&#34;</span> <span style="color:#8f5902;font-style:italic"># like &#34;jermy@apache.org&#34;</span> </span></span></code></pre></div></li> </ol> -<p>Optional: You can use <a href="https://desktop.github.com/">GitHub desktop</a> to greatly simplify the commit and update process.</p> <h2 id="2-create-an-issue-on-github">2. Create an Issue on GitHub</h2> <p>If you encounter bugs or have any questions, please go to <a href="https://github.com/apache/incubator-hugegraph/issues">GitHub Issues</a> to report them and feel free to <a href="https://github.com/apache/hugegraph/issues/new">create an issue</a>.</p> <h2 id="3-make-changes-of-code-locally">3. Make changes of code locally</h2> @@ -89,8 +89,6 @@ <img width="1280" alt="image" src="https://user-images.githubusercontent.com/9625821/163524204-7fe0e6bf-9c8b-4b1a-ac65-6a0ac423eb16.png"></p> <h2 id="4-create-a-pull-request">4. 
Create a Pull Request</h2> <p>Go to the web page of GitHub fork repo, there would be a chance to create a Pull Request after pushing to a new branch, just click button &ldquo;Compare &amp; pull request&rdquo; to do it. Then edit the description for proposed changes, which can just be copied from the commit message.</p> -<p>Please sign the HugeGraph CLA when contributing code for the first time. You can sign the CLA by just posting a Pull Request Comment same as the below format:</p> -<p><code>I have read the CLA Document and I hereby sign the CLA</code></p> <p>Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to <a href="https://github.com/settings/emails">https://github.com/settings/emails</a>: <img width="1280" alt="image" src="https://user-images.githubusercontent.com/9625821/163522445-2a50a72a-dea2-434f-9868-3a0d40d0d037.png"></p> <h2 id="5-code-review">5. Code review</h2> diff --git a/cn/docs/index.xml b/cn/docs/index.xml index bf520a0df..7d9ed5938 100644 --- a/cn/docs/index.xml +++ b/cn/docs/index.xml @@ -2306,6 +2306,7 @@ HugeGraph支持多用户并行操作,用户可输入Gremlin查询语句,并 <p>The following is a contribution guide for HugeGraph:</p> <img width="884" alt="image" src="https://user-images.githubusercontent.com/9625821/159643158-8bf72c0a-93c3-4a58-8912-7b2ab20ced1d.png"> <h2 id="1-preparation">1. 
Preparation</h2> +<p><strong>建议</strong>: 使用 <a href="https://desktop.github.com/">GitHub desktop</a> 可以大幅简化和改善你提交 PR/Commit 的过程, 特别适合新人</p> <p>We can contribute by reporting issues, submitting code patches or any other feedback.</p> <p>Before submitting the code, we need to do some preparation:</p> <ol> @@ -2332,7 +2333,6 @@ HugeGraph支持多用户并行操作,用户可输入Gremlin查询语句,并 </span></span><span style="display:flex;"><span>git config user.email <span style="color:#4e9a06">&#34;{email-address-of-github}&#34;</span> <span style="color:#8f5902;font-style:italic"># like &#34;jermy@apache.org&#34;</span> </span></span></code></pre></div></li> </ol> -<p>Optional: You can use <a href="https://desktop.github.com/">GitHub desktop</a> to greatly simplify the commit and update process.</p> <h2 id="2-create-an-issue-on-github">2. Create an Issue on GitHub</h2> <p>If you encounter bugs or have any questions, please go to <a href="https://github.com/apache/incubator-hugegraph/issues">GitHub Issues</a> to report them and feel free to <a href="https://github.com/apache/hugegraph/issues/new">create an issue</a>.</p> <h2 id="3-make-changes-of-code-locally">3. Make changes of code locally</h2> @@ -2389,8 +2389,6 @@ HugeGraph支持多用户并行操作,用户可输入Gremlin查询语句,并 <img width="1280" alt="image" src="https://user-images.githubusercontent.com/9625821/163524204-7fe0e6bf-9c8b-4b1a-ac65-6a0ac423eb16.png"></p> <h2 id="4-create-a-pull-request">4. Create a Pull Request</h2> <p>Go to the web page of GitHub fork repo, there would be a chance to create a Pull Request after pushing to a new branch, just click button &ldquo;Compare &amp; pull request&rdquo; to do it. Then edit the description for proposed changes, which can just be copied from the commit message.</p> -<p>Please sign the HugeGraph CLA when contributing code for the first time. 
You can sign the CLA by just posting a Pull Request Comment same as the below format:</p> -<p><code>I have read the CLA Document and I hereby sign the CLA</code></p> <p>Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to <a href="https://github.com/settings/emails">https://github.com/settings/emails</a>: <img width="1280" alt="image" src="https://user-images.githubusercontent.com/9625821/163522445-2a50a72a-dea2-434f-9868-3a0d40d0d037.png"></p> <h2 id="5-code-review">5. Code review</h2> diff --git a/cn/sitemap.xml b/cn/sitemap.xml index 2f2860fa4..a651944c5 100644 --- a/cn/sitemap.xml +++ b/cn/sitemap.xml @@ -1 +1 @@ -/cn/docs/guides/architectural/2023-06-25T21:06:07+08:00/cn/docs/config/config-guide/2023-06-21T14:48:04+08:00/cn/docs/language/hugegraph-gremlin/2023-01-01T16:16:43+08:00/cn/docs/performance/hugegraph-benchmark-0.5.6/2022-09-15T15:16:23+08:00/cn/docs/quickstart/hugegraph-server/2023-06-25T21:06:07+08:00/cn/docs/introduction/readme/2023-06-18T14:57:33+08:00/cn/docs/changelog/hugegraph-1.0.0-release-notes/2023-01-09T07:41:46+08:00/cn/docs/clients/restful-api/2023-07-31T23:55:30+08:00/cn/docs/clients/restful-api/schema/2023-05-14T19:35:13+08:00/cn/docs/performance/api-preformance/hugegraph-api-0.5.6-rocksdb/2023-01-01T16:16:43+08:00/cn/docs/contribution-guidelines/contribute/2023-06-26T14:59:53+08:00/cn/docs/config/config-option/2023-02-08T20:56:09+08:00/cn/docs/guides/desgin-concept/2022-04-17T11:36:55+08:00/cn/docs/download/download/2023-06-17T14:43:04+08:00/cn/docs/language/hugegraph-example/2023-02-02T01:21:10+08:00/cn/docs/clients/hugegraph-client/2022-09-15T15:16:23+08:00/cn/docs/performance/api-preformance/2023-06-17T14:43:04+08:00/cn/docs/quickstart/hugegraph-loader/2023-05-17T23:12:35+08:00/cn/docs/clients/restful-api/propertykey/2023-05-19T05:15:56-05:00/cn/docs/changelog/hugegraph-0.11.2-release-notes/2022-04-17T11:36:55+08:00/cn/docs/changelog/hu
gegraph-0.12.0-release-notes/2023-01-01T16:16:43+08:00/cn/docs/performance/api-preformance/hugegraph-api-0.5.6-cassandra/2023-01-01T16:16:43+08:00/cn/docs/contribution-guidelines/subscribe/2023-06-17T14:43:04+08:00/cn/docs/config/config-authentication/2022-04-17T11:36:55+08:00/cn/docs/clients/gremlin-console/2023-06-12T23:52:07+08:00/cn/docs/guides/custom-plugin/2022-09-15T15:16:23+08:00/cn/docs/performance/hugegraph-loader-performance/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-tools/2023-05-09T21:27:34+08:00/cn/docs/quickstart/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.10.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/vertexlabel/2022-04-17T11:36:55+08:00/cn/docs/contribution-guidelines/validate-release/2023-02-15T16:14:21+08:00/cn/docs/guides/backup-restore/2022-04-17T11:36:55+08:00/cn/docs/config/2022-04-17T11:36:55+08:00/cn/docs/config/config-https/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/edgelabel/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.9.2-release-notes/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-hubble/2023-01-01T16:16:43+08:00/cn/docs/contribution-guidelines/hugegraph-server-idea-setup/2023-06-25T21:06:07+08:00/cn/docs/clients/2022-04-17T11:36:55+08:00/cn/docs/config/config-computer/2023-01-01T16:16:43+08:00/cn/docs/guides/faq/2023-01-04T22:59:07+08:00/cn/docs/clients/restful-api/indexlabel/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.8.0-release-notes/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-client/2023-05-18T11:09:55+08:00/cn/docs/guides/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/rebuild/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.7.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-computer/2023-06-25T21:06:46+08:00/cn/docs/language/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.6.1-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/vertex/2023-06-04T23:04:47+08:00/cn/docs/cli
ents/restful-api/edge/2023-06-29T10:17:29+08:00/cn/docs/performance/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.5.6-release-notes/2022-04-17T11:36:55+08:00/cn/docs/changelog/2022-04-17T11:36:55+08:00/cn/docs/contribution-guidelines/2022-12-30T19:57:48+08:00/cn/docs/changelog/hugegraph-0.4.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/traverser/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/rank/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.3.3-release-notes/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.2-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/variable/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/graphs/2022-05-27T09:27:37+08:00/cn/docs/changelog/hugegraph-0.2.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/task/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/gremlin/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/cypher/2023-07-31T23:55:30+08:00/cn/docs/clients/restful-api/auth/2023-07-31T23:55:30+08:00/cn/docs/clients/restful-api/other/2023-07-31T23:55:30+08:00/cn/docs/2022-12-30T19:57:48+08:00/cn/blog/news/2022-04-17T11:36:55+08:00/cn/blog/releases/2022-04-17T11:36:55+08:00/cn/blog/2018/10/06/easy-documentation-with-docsy/2022-04-17T11:36:55+08:00/cn/blog/2018/10/06/the-second-blog-post/2022-04-17T11:36:55+08:00/cn/blog/2018/01/04/another-great-release/2022-04-17T11:36:55+08:00/cn/docs/cla/2022-04-17T11:36:55+08:00/cn/docs/performance/hugegraph-benchmark-0.4.4/2022-09-15T15:16:23+08:00/cn/docs/summary/2023-07-31T23:55:30+08:00/cn/blog/2022-04-17T11:36:55+08:00/cn/categories//cn/community/2022-04-17T11:36:55+08:00/cn/2023-01-04T22:59:07+08:00/cn/search/2022-04-17T11:36:55+08:00/cn/tags/ \ No newline at end of file 
+/cn/docs/guides/architectural/2023-06-25T21:06:07+08:00/cn/docs/config/config-guide/2023-06-21T14:48:04+08:00/cn/docs/language/hugegraph-gremlin/2023-01-01T16:16:43+08:00/cn/docs/performance/hugegraph-benchmark-0.5.6/2022-09-15T15:16:23+08:00/cn/docs/quickstart/hugegraph-server/2023-06-25T21:06:07+08:00/cn/docs/introduction/readme/2023-06-18T14:57:33+08:00/cn/docs/changelog/hugegraph-1.0.0-release-notes/2023-01-09T07:41:46+08:00/cn/docs/clients/restful-api/2023-07-31T23:55:30+08:00/cn/docs/clients/restful-api/schema/2023-05-14T19:35:13+08:00/cn/docs/performance/api-preformance/hugegraph-api-0.5.6-rocksdb/2023-01-01T16:16:43+08:00/cn/docs/contribution-guidelines/contribute/2023-09-09T20:50:32+08:00/cn/docs/config/config-option/2023-02-08T20:56:09+08:00/cn/docs/guides/desgin-concept/2022-04-17T11:36:55+08:00/cn/docs/download/download/2023-06-17T14:43:04+08:00/cn/docs/language/hugegraph-example/2023-02-02T01:21:10+08:00/cn/docs/clients/hugegraph-client/2022-09-15T15:16:23+08:00/cn/docs/performance/api-preformance/2023-06-17T14:43:04+08:00/cn/docs/quickstart/hugegraph-loader/2023-05-17T23:12:35+08:00/cn/docs/clients/restful-api/propertykey/2023-05-19T05:15:56-05:00/cn/docs/changelog/hugegraph-0.11.2-release-notes/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.12.0-release-notes/2023-01-01T16:16:43+08:00/cn/docs/performance/api-preformance/hugegraph-api-0.5.6-cassandra/2023-01-01T16:16:43+08:00/cn/docs/contribution-guidelines/subscribe/2023-06-17T14:43:04+08:00/cn/docs/config/config-authentication/2022-04-17T11:36:55+08:00/cn/docs/clients/gremlin-console/2023-06-12T23:52:07+08:00/cn/docs/guides/custom-plugin/2022-09-15T15:16:23+08:00/cn/docs/performance/hugegraph-loader-performance/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-tools/2023-05-09T21:27:34+08:00/cn/docs/quickstart/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.10.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/vertexlabel/2022-04-17T11:36:55+08:00/cn/docs/con
tribution-guidelines/validate-release/2023-02-15T16:14:21+08:00/cn/docs/guides/backup-restore/2022-04-17T11:36:55+08:00/cn/docs/config/2022-04-17T11:36:55+08:00/cn/docs/config/config-https/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/edgelabel/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.9.2-release-notes/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-hubble/2023-01-01T16:16:43+08:00/cn/docs/contribution-guidelines/hugegraph-server-idea-setup/2023-06-25T21:06:07+08:00/cn/docs/clients/2022-04-17T11:36:55+08:00/cn/docs/config/config-computer/2023-01-01T16:16:43+08:00/cn/docs/guides/faq/2023-01-04T22:59:07+08:00/cn/docs/clients/restful-api/indexlabel/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.8.0-release-notes/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-client/2023-05-18T11:09:55+08:00/cn/docs/guides/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/rebuild/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.7.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/quickstart/hugegraph-computer/2023-06-25T21:06:46+08:00/cn/docs/language/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.6.1-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/vertex/2023-06-04T23:04:47+08:00/cn/docs/clients/restful-api/edge/2023-06-29T10:17:29+08:00/cn/docs/performance/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.5.6-release-notes/2022-04-17T11:36:55+08:00/cn/docs/changelog/2022-04-17T11:36:55+08:00/cn/docs/contribution-guidelines/2022-12-30T19:57:48+08:00/cn/docs/changelog/hugegraph-0.4.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/traverser/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/rank/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.3.3-release-notes/2022-04-17T11:36:55+08:00/cn/docs/changelog/hugegraph-0.2-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/variable/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/graphs/2022-05-27T09
:27:37+08:00/cn/docs/changelog/hugegraph-0.2.4-release-notes/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/task/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/gremlin/2022-04-17T11:36:55+08:00/cn/docs/clients/restful-api/cypher/2023-07-31T23:55:30+08:00/cn/docs/clients/restful-api/auth/2023-07-31T23:55:30+08:00/cn/docs/clients/restful-api/other/2023-07-31T23:55:30+08:00/cn/docs/2022-12-30T19:57:48+08:00/cn/blog/news/2022-04-17T11:36:55+08:00/cn/blog/releases/2022-04-17T11:36:55+08:00/cn/blog/2018/10/06/easy-documentation-with-docsy/2022-04-17T11:36:55+08:00/cn/blog/2018/10/06/the-second-blog-post/2022-04-17T11:36:55+08:00/cn/blog/2018/01/04/another-great-release/2022-04-17T11:36:55+08:00/cn/docs/cla/2022-04-17T11:36:55+08:00/cn/docs/performance/hugegraph-benchmark-0.4.4/2022-09-15T15:16:23+08:00/cn/docs/summary/2023-07-31T23:55:30+08:00/cn/blog/2022-04-17T11:36:55+08:00/cn/categories//cn/community/2022-04-17T11:36:55+08:00/cn/2023-01-04T22:59:07+08:00/cn/search/2022-04-17T11:36:55+08:00/cn/tags/ \ No newline at end of file diff --git a/docs/_print/index.html b/docs/_print/index.html index 2853b6ff1..ec4ef80ee 100644 --- a/docs/_print/index.html +++ b/docs/_print/index.html @@ -6460,7 +6460,7 @@
// what is the name of the brother and the name of the place? g.V(pluto).out('brother').as('god').out('lives').as('place').select('god','place').by('name') -

It is recommended to use HugeGraph-Studio to execute the above code in a visualized way. The code can also be executed via HugeGraph-Client, HugeApi, GremlinConsole, GremlinDriver, and other methods.

3.2 Summary

HugeGraph currently supports the Gremlin syntax, and users can fulfill various query needs through Gremlin / REST-API.

8 - PERFORMANCE

8.1 - HugeGraph BenchMark Performance

1 Test environment

1.1 Hardware information

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD

1.2 Software information

1.2.1 Test cases

Testing is done using the graphdb-benchmark, a benchmark suite for graph databases. This benchmark suite mainly consists of four types of tests:

  • Massive Insertion, which involves batch insertion of vertices and edges, with a certain number of vertices or edges being submitted at once.
  • Single Insertion, which involves the immediate insertion of each vertex or edge, one at a time.
  • Query, which mainly includes the basic query operations of the graph database:
    • Find Neighbors, which queries the neighbors of all vertices.
    • Find Adjacent Nodes, which queries the adjacent vertices of all edges.
    • Find Shortest Path, which queries the shortest path from the first vertex to 100 random vertices.
  • Clustering, which is a community detection algorithm based on the Louvain Method.
1.2.2 Test dataset

Tests are conducted using both synthetic and real data.

The datasets used in this test and their sizes:

Name | Number of Vertices | Number of Edges | File Size
email-enron.txt | 36,691 | 367,661 | 4MB
com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB
amazon0601.txt | 403,393 | 3,387,388 | 47.9MB
com-lj.ungraph.txt | 3,997,961 | 34,681,189 | 479MB

1.3 Service configuration

  • HugeGraph version: 0.5.6; RestServer, Gremlin Server, and the backends are on the same server

    • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, using thrift+Cassandra mode

    • Cassandra version: cassandra-3.10, commit-log and data use SSD together
  • Neo4j version: 2.0.1

The Titan version adapted by graphdb-benchmark is 0.5.4.

2 Test results

2.1 Batch insertion performance

Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w)
HugeGraph | 0.629 | 5.711 | 5.243 | 67.033
Titan | 10.15 | 108.569 | 150.266 | 1217.944
Neo4j | 3.884 | 18.938 | 24.890 | 281.537

Instructions

  • The data in "( )" in the table header is the data scale, in terms of edges
  • The data in the table is the time for batch insertion, in seconds
  • For example, HugeGraph(RocksDB) spent 5.711 seconds to insert 3 million edges of the amazon0601 dataset.
Conclusion
  • The performance of batch insertion: HugeGraph(RocksDB) > Neo4j > Titan(thrift+Cassandra)

2.2 Traversal performance

2.2.1 Explanation of terms
  • FN(Find Neighbor): Traverse all vertices, find the adjacent edges based on each vertex, and use the edges and vertices to find the other vertices adjacent to the original vertex.
  • FA(Find Adjacent): Traverse all edges, get the source vertex and target vertex based on each edge.
2.2.2 FN performance
Backend | email-enron(3.6w) | amazon0601(40w) | com-youtube.ungraph(120w) | com-lj.ungraph(400w)
HugeGraph | 4.072 | 45.118 | 66.006 | 609.083
Titan | 8.084 | 92.507 | 184.543 | 1099.371
Neo4j | 2.424 | 10.537 | 11.609 | 106.919

Instructions

  • The data in the table header “( )” represents the data scale, in terms of vertices.
  • The data in the table represents the time spent traversing vertices, in seconds.
  • For example, HugeGraph uses the RocksDB backend to traverse all vertices in amazon0601, and search for adjacent edges and another vertex, which takes a total of 45.118 seconds.
2.2.3 FA performance
Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w)
HugeGraph | 1.540 | 10.764 | 11.243 | 151.271
Titan | 7.361 | 93.344 | 169.218 | 1085.235
Neo4j | 1.673 | 4.775 | 4.284 | 40.507

Explanation

  • The data size in the header "( )" is based on the number of edges.
  • The data in the table is the time it takes to traverse the edges, in seconds.
  • For example, HugeGraph with the RocksDB backend traverses all edges in the amazon0601 dataset and looks up the two vertices of each edge, taking a total of 10.764 seconds.
Conclusion
  • Traversal performance: Neo4j > HugeGraph(RocksDB) > Titan(thrift+Cassandra)

2.3 Performance of Common Graph Analysis Methods in HugeGraph

Terminology Explanation
  • FS (Find Shortest Path): finding the shortest path between two vertices
  • K-neighbor: all vertices that can be reached by traversing K hops (including 1, 2, 3…(K-1) hops) from the starting vertex
  • K-out: all vertices that can be reached by traversing exactly K out-edges from the starting vertex.
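The K-neighbor and K-out definitions above can be illustrated with a plain adjacency dict; this is only a reading of the stated semantics, not HugeGraph's actual traversal code:

```python
# K-out: vertices reached by walks of exactly k hops along out-edges.
def k_out(adj, start, k):
    frontier = {start}
    for _ in range(k):
        frontier = {w for v in frontier for w in adj.get(v, ())}
    return frontier

# K-neighbor: union of everything reachable within 1..k hops.
def k_neighbor(adj, start, k):
    seen, frontier = {start}, {start}
    for _ in range(k):
        frontier = {w for v in frontier for w in adj.get(v, ())} - seen
        seen |= frontier
    return seen - {start}
```

The frontier sets grow roughly with the vertex degrees, which is why the deeper hops in the tables below take much longer.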
FS performance
Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w)
HugeGraph | 0.494 | 0.103 | 3.364 | 8.155
Titan | 11.818 | 0.239 | 377.709 | 575.678
Neo4j | 1.719 | 1.800 | 1.956 | 8.530

Explanation

  • The data in the header “()” represents the data scale in terms of edges
  • The data in the table is the time it takes to find the shortest path from the first vertex to 100 randomly selected vertices in seconds
  • For example, HugeGraph using the RocksDB backend to find the shortest path from the first vertex to 100 randomly selected vertices in the amazon0601 graph took a total of 0.103s.
Conclusion
  • In scenarios with small data size or few vertex relationships, HugeGraph outperforms Neo4j and Titan.
  • As the data size increases and the degree of vertex association increases, the performance of HugeGraph and Neo4j tends to be similar, both far exceeding Titan.
K-neighbor Performance
Vertex\Depth | Degree 1 | Degree 2 | Degree 3 | Degree 4 | Degree 5 | Degree 6
v1 | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM
v111 | 0.027s | 0.034s | 0.115s | 1.36s | OOM |
v1111 | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM

Explanation

  • HugeGraph-Server’s JVM memory is set to 32GB and may experience OOM when the data is too large.
K-out performance
Vertex\Depth | 1st Degree | 2nd Degree | 3rd Degree | 4th Degree | 5th Degree | 6th Degree
v1 Time | 0.054s | 0.057s | 0.109s | 0.526s | 3.77s | OOM
v1 Degree | 10 | 133 | 2,453 | 50,830 | 1,128,688 |
v111 Time | 0.032s | 0.042s | 0.136s | 1.25s | 20.62s | OOM
v111 Degree | 10 | 211 | 4,944 | 113,150 | 2,629,970 |
v1111 Time | 0.039s | 0.045s | 0.053s | 1.10s | 2.92s | OOM
v1111 Degree | 10 | 140 | 2,555 | 50,825 | 1,070,230 |

Explanation

  • The JVM memory of HugeGraph-Server is set to 32GB, and OOM may occur when the data is too large.
Conclusion
  • In the FS scenario, HugeGraph outperforms Neo4j and Titan in terms of performance.
  • In the K-neighbor and K-out scenarios, HugeGraph can achieve results returned within seconds within 5 degrees.

2.4 Comprehensive Performance Test - CW

Database | Size 1000 | Size 5000 | Size 10000 | Size 20000
HugeGraph(core) | 20.804 | 242.099 | 744.780 | 1700.547
Titan | 45.790 | 820.633 | 2652.235 | 9568.623
Neo4j | 5.913 | 50.267 | 142.354 | 460.880

Explanation

  • The “scale” is based on the number of vertices.
  • The data in the table is the time required to complete community discovery, in seconds. For example, if HugeGraph uses the RocksDB backend and operates on a dataset of 10,000 vertices, and the community aggregation is no longer changing, it takes 744.780 seconds.
  • The CW test is a comprehensive evaluation of CRUD operations.
  • In this test, HugeGraph, like Titan, did not use the client and directly operated on the core.
Conclusion
  • Performance of community detection algorithm: Neo4j > HugeGraph > Titan

8.2 - HugeGraph-API Performance

The HugeGraph API performance test mainly tests HugeGraph-Server’s ability to concurrently process RESTful API requests, including:

  • Single insertion of vertices/edges
  • Batch insertion of vertices/edges
  • Vertex/Edge Queries

For the performance test of the RESTful API of each release version of HugeGraph, please refer to:

Updates coming soon, stay tuned!

8.2.1 - v0.5.6 Stand-alone(RocksDB)

1 Test environment

Machine under test (the machine being stress-tested):

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD, 2.7T HDD
  • Information about the machine used to generate load: configured the same as the machine that is being tested under load.
  • Testing tool: Apache JMeter 2.5.1

Note: The load-generating machine and the machine under test are located in the same local network.

2 Test description

2.1 Definition of terms (the unit of time is ms)

  • Samples: The total number of threads completed in the current scenario.
  • Average: The average response time.
  • Median: The statistical median of the response time.
  • 90% Line: The response time below which 90% of all threads fall.
  • Min: The minimum response time.
  • Max: The maximum response time.
  • Error: The error rate.
  • Throughput: The number of requests processed per unit of time.
  • KB/sec: Throughput measured in terms of data transferred per second.
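The metrics above can all be derived from the raw per-request response times; the sketch below is a minimal illustration (not JMeter's internals, and the percentile indexing is simplified):

```python
# Derive the report metrics from raw response times (ms) and test duration (s).
def summarize(times_ms, duration_s, errors=0):
    s = sorted(times_ms)
    n = len(s)
    return {
        "Samples": n,
        "Average": sum(s) / n,
        "Median": s[n // 2],
        "90% Line": s[int(n * 0.9)],   # simplified percentile index
        "Min": s[0],
        "Max": s[-1],
        "Error": errors / n,           # error rate
        "Throughput": n / duration_s,  # requests per second
    }
```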

2.2 Underlying storage

RocksDB is used for backend storage, HugeGraph and RocksDB are both started on the same machine, and the configuration files related to the server remain as default except for the modification of the host and port.

3 Summary of performance results

  1. The speed of inserting a single vertex or edge in HugeGraph is about 10,000 per second
  2. The batch insertion speed of vertices and edges is much faster than the single insertion speed
  3. The concurrency of querying vertices and edges by id can reach more than 13000, and the average delay of requests is less than 50ms

4 Test results and analysis

4.1 batch insertion

4.1.1 Upper limit stress testing
Test methods

The upper limit of stress testing is to continuously increase the concurrency and test whether the server can still provide services normally.

Stress Parameters

Duration: 5 minutes

Maximum insertion speed for vertices:
image

Conclusion:

  • With a concurrency of 2200, the throughput for vertices is 2026.8. This means that the system can process data at a rate of 405360 per second (2026.8 * 200).
Maximum insertion speed for edges
image

Conclusion:

  • With a concurrency of 900, the throughput for edges is 776.9. This means that the system can process data at a rate of 388450 per second (776.9 * 500).
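The records-per-second figures above are simply the measured request throughput multiplied by the batch size carried in each request (200 vertices or 500 edges per request in this test):

```python
# records/s = requests/s * items per request
def records_per_second(throughput, batch_size):
    return throughput * batch_size

vertices_per_s = records_per_second(2026.8, 200)  # batch of 200 vertices
edges_per_s = records_per_second(776.9, 500)      # batch of 500 edges
```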

4.2 Single insertion

4.2.1 Stress limit testing
Test Methods

Stress limit testing is a process of continuously increasing the concurrency level to test the upper limit of the server’s ability to provide normal service.

Stress parameters
  • Duration: 5 minutes.
  • Service exception indicator: Error rate greater than 0.00%.
Single vertex insertion
image

Conclusion:

  • With a concurrency of 11500, the throughput is 10730. This means that the system can handle a single concurrent insertion of vertices at a concurrency level of 11500.
Single edge insertion
image

Conclusion:

  • With a concurrency of 9000, the throughput is 8418. This means that the system can handle a single concurrent insertion of edges at a concurrency level of 9000.

4.3 Search by ID

4.3.1 Stress test upper limit
Testing method

Continuously increasing the concurrency level to test the upper limit of the server’s ability to provide service under normal conditions.

stress parameters
  • Duration: 5 minutes
  • Service abnormality indicator: error rate greater than 0.00%
Querying vertices by ID
image

Conclusion:

  • Concurrency is 14,000, throughput is 12,663. The concurrency capacity for querying vertices by ID is 14,000, with an average delay of 44ms.
Querying edges by ID
image

Conclusion:

  • Concurrency is 13,000, throughput is 12,225. The concurrency capacity for querying edges by ID is 13,000, with an average delay of 12ms.

8.2.2 - v0.5.6 Cluster(Cassandra)

1 Test environment

Machine under test (the machine being stress-tested):

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD, 2.7T HDD
  • Load-generating machine: configured the same as the machine under test.
  • Testing tool: Apache JMeter 2.5.1.

Note: The machine used to initiate the load and the machine being tested are located in the same data center (or server room)

2 Test Description

2.1 Definition of terms (the unit of time is ms)

  • Samples – The total number of threads completed in this scenario.
  • Average – The average response time.
  • Median – The median response time in statistical terms.
  • 90% Line – The response time below which 90% of all threads fall.
  • Min – The minimum response time.
  • Max – The maximum response time.
  • Error – The error rate.
  • Throughput – The number of transactions processed per unit of time.
  • KB/sec – The throughput measured in terms of data transmitted per second.

2.2 Low-Level Storage

A 15-node Cassandra cluster is used for backend storage. HugeGraph and the Cassandra cluster are located on separate servers. Server-related configuration files are modified only for host and port settings, while the rest remain default.

3 Summary of Performance Results

  1. The speed of single vertex and edge insertion in HugeGraph is 9000 and 4500 per second, respectively.
  2. The speed of bulk vertex and edge insertion is 50,000 and 150,000 per second, respectively, which is much higher than the single insertion speed.
  3. The concurrency for querying vertices and edges by ID can reach more than 12,000, and the average request delay is less than 70ms.

4 Test Results and Analysis

4.1 Batch Insertion

4.1.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to test the upper limit of the server’s ability to provide services.

Pressure Parameters

Duration: 5 minutes.

Maximum Insertion Speed of Vertices:
image
Conclusion:
  • At a concurrency level of 3500, the throughput of vertices is 261, and the amount of data processed per second is 52,200 (261 * 200).
Maximum Insertion Speed of Edges:
image
Conclusion:
  • At a concurrency level of 1000, the throughput of edges is 323, and the amount of data processed per second is 161,500 (323 * 500).

4.2 Single Insertion

4.2.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to test the upper limit of the server’s ability to provide services.

Pressure Parameters
  • Duration: 5 minutes.
  • Service exception mark: Error rate greater than 0.00%.
Single Insertion of Vertices:
image
Conclusion:
  • At a concurrency level of 9000, the throughput is 8400, and the single-insertion concurrency capability for vertices is 9000.
Single Insertion of Edges:
image
Conclusion:
  • At a concurrency level of 4500, the throughput is 4160, and the single-insertion concurrency capability for edges is 4500.

4.3 Query by ID

4.3.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency and test the upper limit of the pressure that the server can still provide services normally.

Pressure Parameters
  • Duration: 5 minutes
  • Service exception flag: error rate greater than 0.00%
Query by ID for vertices
image
Conclusion:
  • The concurrent capacity of the vertex search by ID is 14500, with a throughput of 13576 and an average delay of 11ms.
Edge search by ID
image
Conclusion:
  • For edge ID-based queries, the server’s concurrent capacity is up to 12,000, with a throughput of 10,688 and an average latency of 63ms.

8.3 - HugeGraph-Loader Performance

Use Cases

When the graph data to be batch inserted (including vertices and edges) is at the billion scale or below, or the total data size is less than a terabyte, the HugeGraph-Loader tool can be used to continuously and quickly import graph data.

Performance

The test uses the edge data of a website graph.

RocksDB single-machine performance

  • When label index is turned off, 228k edges/s.
  • When label index is turned on, 153k edges/s.

Cassandra cluster performance

  • When label index is turned on by default, 63k edges/s.

8.4 -

1 Test environment

1.1 Hardware information

CPU | Memory | NIC | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD

1.2 Software information

1.2.1 Test cases

Testing uses graphdb-benchmark, a benchmark suite for graph databases. This suite mainly consists of four types of tests:

  • Massive Insertion: batch insertion of vertices and edges, with a certain number of vertices or edges submitted at once

  • Single Insertion: insertion one at a time, with each vertex or edge committed immediately

  • Query: mainly the basic query operations of a graph database:

    • Find Neighbors: query the neighbors of all vertices
    • Find Adjacent Nodes: query the adjacent vertices of all edges
    • Find Shortest Path: query the shortest path from the first vertex to 100 random vertices
  • Clustering: a community detection algorithm based on the Louvain Method

1.2.2 Test dataset

Tests use both synthetic and real data.

The dataset sizes used in this test are as follows:
Name | Number of Vertices | Number of Edges | File Size
email-enron.txt | 36,691 | 367,661 | 4MB
com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB
amazon0601.txt | 403,393 | 3,387,388 | 47.9MB

1.3 Service configuration

  • HugeGraph version: 0.4.4, RestServer, Gremlin Server, and backends are all on the same server
  • Cassandra version: cassandra-3.10, commit-log and data share the SSD
  • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, using thrift+Cassandra mode

The Titan version adapted by graphdb-benchmark is 0.5.4.

2 Test results

2.1 Batch insertion performance

Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w)
Titan | 9.516 | 88.123 | 111.586
RocksDB | 2.345 | 14.076 | 16.636
Cassandra | 11.930 | 108.709 | 101.959
Memory | 3.077 | 15.204 | 13.841

Explanation

  • The data in "( )" in the table header is the data scale, in terms of edges
  • The data in the table is the batch insertion time, in seconds
  • For example, HugeGraph using RocksDB takes 14.076s to insert the 3 million edges of the amazon0601 dataset, a rate of about 210,000 edges/s
Conclusion
  • The insertion performance of the RocksDB and Memory backends is better than Cassandra
  • When both HugeGraph and Titan use Cassandra as the backend, their insertion performance is similar

2.2 Traversal performance

2.2.1 Explanation of terms
  • FN (Find Neighbor): traverse all vertices, look up the adjacent edges of each vertex, and use the edges and vertices to find the other vertices adjacent to the original vertex
  • FA (Find Adjacent): traverse all edges, and get the source vertex and target vertex of each edge
2.2.2 FN performance
Backend | email-enron(3.6w) | amazon0601(40w) | com-youtube.ungraph(120w)
Titan | 7.724 | 70.935 | 128.884
RocksDB | 8.876 | 65.852 | 63.388
Cassandra | 13.125 | 126.959 | 102.580
Memory | 22.309 | 207.411 | 165.609

Explanation

  • The data in "( )" in the table header is the data scale, in terms of vertices
  • The data in the table is the time spent traversing vertices, in seconds
  • For example, HugeGraph with the RocksDB backend traverses all vertices in amazon0601 and looks up adjacent edges and the other vertex, taking a total of 65.852s
2.2.3 FA performance
Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w)
Titan | 7.119 | 63.353 | 115.633
RocksDB | 6.032 | 64.526 | 52.721
Cassandra | 9.410 | 102.766 | 94.197
Memory | 12.340 | 195.444 | 140.89

Explanation

  • The data in "( )" in the table header is the data scale, in terms of edges
  • The data in the table is the time spent traversing edges, in seconds
  • For example, HugeGraph with the RocksDB backend traverses all edges in amazon0601 and queries the two vertices of each edge, taking a total of 64.526s
Conclusion
  • HugeGraph RocksDB > Titan thrift+Cassandra > HugeGraph Cassandra > HugeGraph Memory

2.3 Performance of common graph analysis methods in HugeGraph

Explanation of terms
  • FS (Find Shortest Path): find the shortest path
  • K-neighbor: all vertices reachable within K hops of the starting vertex, including vertices reachable in 1, 2, 3…(K-1) and K hops
  • K-out: vertices reachable via exactly K out-edge hops from the starting vertex
FS performance
Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w)
Titan | 11.333 | 0.313 | 376.06
RocksDB | 44.391 | 2.221 | 268.792
Cassandra | 39.845 | 3.337 | 331.113
Memory | 35.638 | 2.059 | 388.987

Explanation

  • The data in "( )" in the table header is the data scale, in terms of edges
  • The data in the table is the time to find the shortest paths from the first vertex to 100 randomly selected vertices, in seconds
  • For example, HugeGraph using RocksDB takes a total of 2.221s to find the shortest paths from the first vertex to 100 random vertices
Conclusion
  • In scenarios with a small data size or few vertex relationships, Titan's shortest-path performance is better than HugeGraph's
  • As the data size grows and vertex connectivity increases, HugeGraph's shortest-path performance becomes better than Titan's
K-neighbor performance
Vertex\Depth | Degree 1 | Degree 2 | Degree 3 | Degree 4 | Degree 5 | Degree 6
v1 | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM
v111 | 0.027s | 0.034s | 0.115s | 1.36s | OOM |
v1111 | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM

Explanation

  • HugeGraph-Server's JVM memory is set to 32GB; OOM may occur when the data volume is too large
K-out performance
Vertex\Depth | Degree 1 | Degree 2 | Degree 3 | Degree 4 | Degree 5 | Degree 6
v1 Time | 0.054s | 0.057s | 0.109s | 0.526s | 3.77s | OOM
v1 Degree | 10 | 133 | 2,453 | 50,830 | 1,128,688 |
v111 Time | 0.032s | 0.042s | 0.136s | 1.25s | 20.62s | OOM
v111 Degree | 10 | 211 | 4,944 | 113,150 | 2,629,970 |
v1111 Time | 0.039s | 0.045s | 0.053s | 1.10s | 2.92s | OOM
v1111 Degree | 10 | 140 | 2,555 | 50,825 | 1,070,230 |

Explanation

  • HugeGraph-Server's JVM memory is set to 32GB; OOM may occur when the data volume is too large
Conclusion
  • In the FS scenario, HugeGraph outperforms Titan
  • In the K-neighbor and K-out scenarios, HugeGraph returns results within seconds up to 5 degrees

2.4 Comprehensive graph performance test - CW

Database | Size 1000 | Size 5000 | Size 10000 | Size 20000
Titan | 45.943 | 849.168 | 2737.117 | 9791.46
Memory(core) | 41.077 | 1825.905 | * | *
Cassandra(core) | 39.783 | 862.744 | 2423.136 | 6564.191
RocksDB(core) | 33.383 | 199.894 | 763.869 | 1677.813

Explanation

  • The "size" is in terms of vertices
  • The data in the table is the time needed for community detection to complete, in seconds; for example, HugeGraph with the RocksDB backend on a dataset of 10,000 vertices takes 763.869s until the community aggregation no longer changes
  • "*" means the run did not finish within 10,000s
  • The CW test is a comprehensive evaluation of CRUD operations
  • The last three rows are different HugeGraph backends; in this test HugeGraph, like Titan, did not go through the client and operated directly on the core
Conclusion
  • With the Cassandra backend, HugeGraph performs slightly better than Titan, and the advantage grows with data size; at a size of 20,000 it is 30% faster than Titan
  • With the RocksDB backend, HugeGraph performs far better than Titan and HugeGraph's Cassandra backend, 6x and 4x faster respectively

9 - Contribution Guidelines

9.1 - How to Contribute to HugeGraph

Thanks for taking the time to contribute! As an open-source project, HugeGraph looks forward to contributions from everyone, and we are grateful to all the contributors.

The following is a contribution guide for HugeGraph:

image

1. Preparation

We can contribute by reporting issues, submitting code patches or any other feedback.

Before submitting the code, we need to do some preparation:

  1. Sign up or login to GitHub: https://github.com

  2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

  3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

    # clone code from remote to local repo
    git clone https://github.com/${GITHUB_USER_NAME}/hugegraph

    推荐使用HugeGraph-Studio 通过可视化的方式来执行上述代码。另外也可以通过HugeGraph-Client、HugeApi、GremlinConsole和GremlinDriver等多种方式执行上述代码。

    3.2 总结

    HugeGraph 目前支持 Gremlin 的语法,用户可以通过 Gremlin / REST-API 实现各种查询需求。

8 - PERFORMANCE

8.1 - HugeGraph BenchMark Performance

1 Test environment

1.1 Hardware information

CPUMemory网卡磁盘
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz128G10000Mbps750GB SSD

1.2 Software information

1.2.1 Test cases

Testing is done using the graphdb-benchmark, a benchmark suite for graph databases. This benchmark suite mainly consists of four types of tests:

  • Massive Insertion, which involves batch insertion of vertices and edges, with a certain number of vertices or edges being submitted at once.
  • Single Insertion, which involves the immediate insertion of each vertex or edge, one at a time.
  • Query, which mainly includes the basic query operations of the graph database:
    • Find Neighbors, which queries the neighbors of all vertices.
    • Find Adjacent Nodes, which queries the adjacent vertices of all edges.
    • Find Shortest Path, which queries the shortest path from the first vertex to 100 random vertices.
  • Clustering, which is a community detection algorithm based on the Louvain Method.
1.2.2 Test dataset

Tests are conducted using both synthetic and real data.

The size of the datasets used in this test are not mentioned.

NameNumber of VerticesNumber of EdgesFile Size
email-enron.txt36,691367,6614MB
com-youtube.ungraph.txt1,157,8062,987,62438.7MB
amazon0601.txt403,3933,387,38847.9MB
com-lj.ungraph.txt399796134681189479MB

1.3 Service configuration

  • HugeGraph version: 0.5.6, RestServer and Gremlin Server and backends are on the same server

    • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, using thrift+Cassandra mode

    • Cassandra version: cassandra-3.10, commit-log and data use SSD together
  • Neo4j version: 2.0.1

The Titan version adapted by graphdb-benchmark is 0.5.4.

2 Test results

2.1 Batch insertion performance

Backendemail-enron(30w)amazon0601(300w)com-youtube.ungraph(300w)com-lj.ungraph(3000w)
HugeGraph0.6295.7115.24367.033
Titan10.15108.569150.2661217.944
Neo4j3.88418.93824.890281.537

Instructions

  • The data scale is in the table header in terms of edges
  • The data in the table is the time for batch insertion, in seconds
  • For example, HugeGraph(RocksDB) spent 5.711 seconds to insert 3 million edges of the amazon0601 dataset.
Conclusion
  • The performance of batch insertion: HugeGraph(RocksDB) > Neo4j > Titan(thrift+Cassandra)

2.2 Traversal performance

2.2.1 Explanation of terms
  • FN(Find Neighbor): Traverse all vertices, find the adjacent edges based on each vertex, and use the edges and vertices to find the other vertices adjacent to the original vertex.
  • FA(Find Adjacent): Traverse all edges, get the source vertex and target vertex based on each edge.
2.2.2 FN performance
Backendemail-enron(3.6w)amazon0601(40w)com-youtube.ungraph(120w)com-lj.ungraph(400w)
HugeGraph4.07245.11866.006609.083
Titan8.08492.507184.5431099.371
Neo4j2.42410.53711.609106.919

Instructions

  • The data in the table header “( )” represents the data scale, in terms of vertices.
  • The data in the table represents the time spent traversing vertices, in seconds.
  • For example, HugeGraph uses the RocksDB backend to traverse all vertices in amazon0601, and search for adjacent edges and another vertex, which takes a total of 45.118 seconds.
2.2.3 FA性能
Backendemail-enron(30w)amazon0601(300w)com-youtube.ungraph(300w)com-lj.ungraph(3000w)
HugeGraph1.54010.76411.243151.271
Titan7.36193.344169.2181085.235
Neo4j1.6734.7754.28440.507

Explanation

  • The data size in the header “( )” is based on the number of vertices.
  • The data in the table is the time it takes to traverse the vertices, in seconds.
  • For example, HugeGraph with RocksDB backend traverses all vertices in the amazon0601 dataset, and looks up adjacent edges and other vertices, taking a total of 45.118 seconds.
Conclusion
  • Traversal performance: Neo4j > HugeGraph(RocksDB) > Titan(thrift+Cassandra)

2.3 Performance of Common Graph Analysis Methods in HugeGraph

Terminology Explanation
  • FS (Find Shortest Path): finding the shortest path between two vertices
  • K-neighbor: all vertices that can be reached by traversing K hops (including 1, 2, 3…(K-1) hops) from the starting vertex
  • K-out: all vertices that can be reached by traversing exactly K out-edges from the starting vertex.
FS performance
Backendemail-enron(30w)amazon0601(300w)com-youtube.ungraph(300w)com-lj.ungraph(3000w)
HugeGraph0.4940.1033.3648.155
Titan11.8180.239377.709575.678
Neo4j1.7191.8001.9568.530

Explanation

  • The data in the header “()” represents the data scale in terms of edges
  • The data in the table is the time it takes to find the shortest path from the first vertex to 100 randomly selected vertices in seconds
  • For example, HugeGraph using the RocksDB backend to find the shortest path from the first vertex to 100 randomly selected vertices in the amazon0601 graph took a total of 0.103s.
Conclusion
  • In scenarios with small data size or few vertex relationships, HugeGraph outperforms Neo4j and Titan.
  • As the data size increases and the degree of vertex association increases, the performance of HugeGraph and Neo4j tends to be similar, both far exceeding Titan.
K-neighbor Performance
VertexDepthDegree 1Degree 2Degree 3Degree 4Degree 5Degree 6
v1Time0.031s0.033s0.048s0.500s11.27sOOM
v111Time0.027s0.034s0.115s1.36sOOM
v1111Time0.039s0.027s0.052s0.511s10.96sOOM

Explanation

  • HugeGraph-Server’s JVM memory is set to 32GB and may experience OOM when the data is too large.
K-out performance
VertexDepth1st Degree2nd Degree3rd Degree4th Degree5th Degree6th Degree
v1Time0.054s0.057s0.109s0.526s3.77sOOM
Degree10133245350,8301,128,688
v111Time0.032s0.042s0.136s1.25s20.62sOOM
Degree1021149441131502,629,970
v1111Time0.039s0.045s0.053s1.10s2.92sOOM
Degree101402555508251,070,230

Explanation

  • The JVM memory of HugeGraph-Server is set to 32GB, and OOM may occur when the data is too large.
Conclusion
  • In the FS scenario, HugeGraph outperforms Neo4j and Titan in terms of performance.
  • In the K-neighbor and K-out scenarios, HugeGraph can achieve results returned within seconds within 5 degrees.

2.4 Comprehensive Performance Test - CW

DatabaseSize 1000Size 5000Size 10000Size 20000
HugeGraph(core)20.804242.099744.7801700.547
Titan45.790820.6332652.2359568.623
Neo4j5.91350.267142.354460.880

Explanation

  • The “scale” is based on the number of vertices.
  • The data in the table is the time required to complete community discovery, in seconds. For example, if HugeGraph uses the RocksDB backend and operates on a dataset of 10,000 vertices, and the community aggregation is no longer changing, it takes 744.780 seconds.
  • The CW test is a comprehensive evaluation of CRUD operations.
  • In this test, HugeGraph, like Titan, did not use the client and directly operated on the core.
Conclusion
  • Performance of community detection algorithm: Neo4j > HugeGraph > Titan

8.2 - HugeGraph-API Performance

The HugeGraph API performance test mainly tests HugeGraph-Server’s ability to concurrently process RESTful API requests, including:

  • Single insertion of vertices/edges
  • Batch insertion of vertices/edges
  • Vertex/Edge Queries

For the performance test of the RESTful API of each release version of HugeGraph, please refer to:

Updates coming soon, stay tuned!

8.2.1 - v0.5.6 Stand-alone(RocksDB)

1 Test environment

Compressed machine information:

CPUMemory网卡磁盘
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz128G10000Mbps750GB SSD,2.7T HDD
  • Information about the machine used to generate load: configured the same as the machine that is being tested under load.
  • Testing tool: Apache JMeter 2.5.1

Note: The load-generating machine and the machine under test are located in the same local network.

2 Test description

2.1 Definition of terms (the unit of time is ms)

  • Samples: The total number of threads completed in the current scenario.
  • Average: The average response time.
  • Median: The statistical median of the response time.
  • 90% Line: The response time below which 90% of all threads fall.
  • Min: The minimum response time.
  • Max: The maximum response time.
  • Error: The error rate.
  • Throughput: The number of requests processed per unit of time.
  • KB/sec: Throughput measured in terms of data transferred per second.

2.2 Underlying storage

RocksDB is used for backend storage, HugeGraph and RocksDB are both started on the same machine, and the configuration files related to the server remain as default except for the modification of the host and port.

3 Summary of performance results

  1. The speed of inserting a single vertex and edge in HugeGraph is about 1w per second
  2. The batch insertion speed of vertices and edges is much faster than the single insertion speed
  3. The concurrency of querying vertices and edges by id can reach more than 13000, and the average delay of requests is less than 50ms

4 Test results and analysis

4.1 batch insertion

4.1.1 Upper limit stress testing
Test methods

The upper limit of stress testing is to continuously increase the concurrency and test whether the server can still provide services normally.

Stress Parameters

Duration: 5 minutes

Maximum insertion speed for vertices:
image

####### in conclusion:

  • With a concurrency of 2200, the throughput for vertices is 2026.8. This means that the system can process data at a rate of 405360 per second (2026.8 * 200).
Maximum insertion speed for edges
image

####### Conclusion:

  • With a concurrency of 900, the throughput for edges is 776.9. This means that the system can process data at a rate of 388450 per second (776.9 * 500).

4.2 Single insertion

4.2.1 Stress limit testing
Test Methods

Stress limit testing is a process of continuously increasing the concurrency level to test the upper limit of the server’s ability to provide normal service.

Stress parameters
  • Duration: 5 minutes.
  • Service exception indicator: Error rate greater than 0.00%.
Single vertex insertion
image

####### Conclusion:

  • With a concurrency of 11500, the throughput is 10730. This means that the system can handle a single concurrent insertion of vertices at a concurrency level of 11500.
Single edge insertion
image

####### Conclusion:

  • With a concurrency of 9000, the throughput is 8418. This means that the system can handle a single concurrent insertion of edges at a concurrency level of 9000.

4.3 Search by ID

4.3.1 Stress test upper limit
Testing method

Continuously increasing the concurrency level to test the upper limit of the server’s ability to provide service under normal conditions.

stress parameters
  • Duration: 5 minutes
  • Service abnormality indicator: error rate greater than 0.00%
Querying vertices by ID
image

####### Conclusion:

  • Concurrency is 14,000, throughput is 12,663. The concurrency capacity for querying vertices by ID is 14,000, with an average delay of 44ms.
Querying edges by ID
image

####### Conclusion:

  • Concurrency is 13,000, throughput is 12,225. The concurrency capacity for querying edges by ID is 13,000, with an average delay of 12ms.

8.2.2 - v0.5.6 Cluster(Cassandra)

1 Test environment

Compressed machine information

CPUMemory网卡磁盘
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz128G10000Mbps750GB SSD,2.7T HDD
  • Starting Pressure Machine Information: Configured the same as the compressed machine.
  • Testing tool: Apache JMeter 2.5.1.

Note: The machine used to initiate the load and the machine being tested are located in the same data center (or server room)

2 Test Description

2.1 Definition of terms (the unit of time is ms)

  • Samples – The total number of threads completed in this scenario.
  • Average – The average response time.
  • Median – The median response time in statistical terms.
  • 90% Line – The response time below which 90% of all threads fall.
  • Min – The minimum response time.
  • Max – The maximum response time.
  • Error – The error rate.
  • Throughput – The number of transactions processed per unit of time.
  • KB/sec – The throughput measured in terms of data transmitted per second.

2.2 Low-Level Storage

A 15-node Cassandra cluster is used for backend storage. HugeGraph and the Cassandra cluster are located on separate servers. Server-related configuration files are modified only for host and port settings, while the rest remain default.

3 Summary of Performance Results

  1. The speed of single vertex and edge insertion in HugeGraph is 9000 and 4500 per second, respectively.
  2. The speed of bulk vertex and edge insertion is 50,000 and 150,000 per second, respectively, which is much higher than the single insertion speed.
  3. The concurrency for querying vertices and edges by ID can reach more than 12,000, and the average request delay is less than 70ms.

4 Test Results and Analysis

4.1 Batch Insertion

4.1.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to test the upper limit of the server’s ability to provide services.

Pressure Parameters

Duration: 5 minutes.

Maximum Insertion Speed of Vertices:
image
Conclusion:
  • At a concurrency level of 3500, the throughput of vertices is 261, and the amount of data processed per second is 52,200 (261 * 200).
Maximum Insertion Speed of Edges:
image
Conclusion:
  • At a concurrency level of 1000, the throughput of edges is 323, and the amount of data processed per second is 161,500 (323 * 500).

4.2 Single Insertion

4.2.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to test the upper limit of the server’s ability to provide services.

Pressure Parameters
  • Duration: 5 minutes.
  • Service-exception criterion: error rate greater than 0.00%.
Single Insertion of Vertices:
image
Conclusion:
  • At a concurrency level of 9000, the throughput is 8400, and the single-insertion concurrency capability for vertices is 9000.
Single Insertion of Edges:
image
Conclusion:
  • At a concurrency level of 4500, the throughput is 4160, and the single-insertion concurrency capability for edges is 4500.

4.3 Query by ID

4.3.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to find the upper limit at which the server can still serve requests normally.

Pressure Parameters
  • Duration: 5 minutes
  • Service-exception criterion: error rate greater than 0.00%
Query by ID for vertices
image
Conclusion:
  • For vertex ID-based queries, the server’s concurrency capacity reaches 14,500, with a throughput of 13,576 and an average latency of 11ms.
Query by ID for edges
image
Conclusion:
  • For edge ID-based queries, the server’s concurrent capacity is up to 12,000, with a throughput of 10,688 and an average latency of 63ms.

8.3 - HugeGraph-Loader Performance

Use Cases

When the amount of graph data to be batch-loaded (vertices plus edges) is at the billion scale or below, or the total data size is under a terabyte, the HugeGraph-Loader tool can be used to import graph data continuously and quickly.

Performance

The tests use the edge data of a website dataset.

RocksDB single-machine performance

  • When label index is turned off, 228k edges/s.
  • When label index is turned on, 153k edges/s.

Cassandra cluster performance

  • When label index is turned on by default, 63k edges/s.
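
As a rough illustration of how such an import is launched (the graph name, server address and file paths below are placeholders, and the exact flags may differ between loader versions; see the HugeGraph-Loader quickstart for the authoritative options):

```shell
# Hypothetical hugegraph-loader invocation; adjust paths, graph name and
# server address to your deployment.
#   -g    target graph name
#   -f    mapping (struct) file describing the input data
#   -s    schema definition file
#   -h/-p HugeGraph-Server host and port
sh bin/hugegraph-loader.sh -g hugegraph \
   -f example/file/struct.json -s example/file/schema.groovy \
   -h 127.0.0.1 -p 8080
```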

8.4 - HugeGraph BenchMark Performance

1 Test Environment

1.1 Hardware

CPU                                          | Memory | NIC       | Disk
48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G   | 10000Mbps | 750GB SSD

1.2 Software

1.2.1 Test Cases

The tests use graphdb-benchmark, a benchmark suite for graph databases. It mainly contains four categories of tests:

  • Massive Insertion: batch-insert vertices and edges, committing a fixed number of vertices or edges at a time

  • Single Insertion: insert records one at a time, committing each vertex or edge immediately

  • Query: the basic query operations of a graph database:

    • Find Neighbors: query the neighbors of every vertex
    • Find Adjacent Nodes: query the adjacent vertices of every edge
    • Find Shortest Path: query the shortest paths from the first vertex to 100 random vertices
  • Clustering: a community-detection algorithm based on the Louvain Method

1.2.2 Datasets

The tests use both synthetic and real-world data.

Datasets used in this test:

Name                    | Vertices  | Edges     | File Size
email-enron.txt         | 36,691    | 367,661   | 4MB
com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB
amazon0601.txt          | 403,393   | 3,387,388 | 47.9MB

1.3 Service Configuration

  • HugeGraph version: 0.4.4; RestServer, Gremlin Server and the backends all run on the same server
  • Cassandra version: cassandra-3.10; commit-log and data share the same SSD
  • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, in thrift+Cassandra mode

graphdb-benchmark is adapted to Titan version 0.5.4.

2 Test Results

2.1 Batch Insertion Performance

Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w)
Titan     | 9.516            | 88.123           | 111.586
RocksDB   | 2.345            | 14.076           | 16.636
Cassandra | 11.930           | 108.709          | 101.959
Memory    | 3.077            | 15.204           | 13.841

Notes

  • The numbers in "()" in the header are the data sizes, counted in edges (1w = 10,000)
  • The values in the table are batch-insertion times in seconds
  • For example, HugeGraph with the RocksDB backend takes 14.076s to insert the 3 million edges of amazon0601, about 210k (21w) edges/s
Conclusions
  • The RocksDB and Memory backends insert faster than Cassandra
  • With Cassandra as the backend, the insertion performance of HugeGraph and Titan is close
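
The edges-per-second figure quoted for RocksDB on amazon0601 follows directly from the table; a quick sketch of the arithmetic:

```shell
# amazon0601 batch insertion with the RocksDB backend:
# 3,000,000 edges in 14.076 s -> edges per second
edges=3000000
millis=14076                        # 14.076 s expressed in ms for integer math
rate=$(( edges * 1000 / millis ))
echo "$rate edges/s"                # roughly 210k (21w) edges/s
```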

2.2 Traversal Performance

2.2.1 Terms
  • FN (Find Neighbor): traverse all vertices; for each vertex look up its adjacent edges, and through each edge find the other vertex
  • FA (Find Adjacent): traverse all edges; for each edge get its source vertex and target vertex
2.2.2 FN Performance
Backend   | email-enron(3.6w) | amazon0601(40w) | com-youtube.ungraph(120w)
Titan     | 7.724             | 70.935          | 128.884
RocksDB   | 8.876             | 65.852          | 63.388
Cassandra | 13.125            | 126.959         | 102.580
Memory    | 22.309            | 207.411         | 165.609

Notes

  • The numbers in "()" in the header are the data sizes, counted in vertices
  • The values in the table are the traversal times in seconds
  • For example, HugeGraph with the RocksDB backend takes 65.852s in total to traverse all vertices of amazon0601, looking up the adjacent edges and the other vertex of each edge
2.2.3 FA Performance
Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w)
Titan     | 7.119            | 63.353           | 115.633
RocksDB   | 6.032            | 64.526           | 52.721
Cassandra | 9.410            | 102.766          | 94.197
Memory    | 12.340           | 195.444          | 140.89

Notes

  • The numbers in "()" in the header are the data sizes, counted in edges
  • The values in the table are the traversal times in seconds
  • For example, HugeGraph with the RocksDB backend takes 64.526s in total to traverse all edges of amazon0601 and query the two vertices of each edge
Conclusions
  • HugeGraph RocksDB > Titan thrift+Cassandra > HugeGraph Cassandra > HugeGraph Memory

2.3 Performance of Common Graph Analysis Methods in HugeGraph

Terms
  • FS (Find Shortest Path): find shortest paths
  • K-neighbor: all vertices reachable from a starting vertex within K hops, i.e. vertices reachable in 1, 2, 3… (K-1) or K hops
  • K-out: vertices reachable from a starting vertex in exactly K hops along out-edges
FS Performance
Backend   | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w)
Titan     | 11.333           | 0.313            | 376.06
RocksDB   | 44.391           | 2.221            | 268.792
Cassandra | 39.845           | 3.337            | 331.113
Memory    | 35.638           | 2.059            | 388.987

Notes

  • The numbers in "()" in the header are the data sizes, counted in edges
  • The values in the table are the times in seconds to find the shortest paths from the first vertex to 100 randomly chosen vertices
  • For example, HugeGraph with the RocksDB backend takes 2.059s in total to find the shortest paths from the first vertex to 100 random vertices
Conclusions
  • When the data size is small or vertices are sparsely connected, Titan’s shortest-path performance is better than HugeGraph’s
  • As the data size grows and vertex connectivity increases, HugeGraph’s shortest-path performance surpasses Titan’s
K-neighbor Performance
Vertex \ Depth | 1      | 2      | 3      | 4      | 5      | 6
v1 time        | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM
v111 time      | 0.027s | 0.034s | 0.115s | 1.36s  | OOM    |
v1111 time     | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM

Notes

  • The JVM heap of HugeGraph-Server is set to 32GB; OOM occurs when the result set grows too large
K-out Performance
Vertex \ Depth | 1      | 2      | 3      | 4       | 5         | 6
v1 time        | 0.054s | 0.057s | 0.109s | 0.526s  | 3.77s     | OOM
v1 count       | 10     | 133    | 2453   | 50,830  | 1,128,688 |
v111 time      | 0.032s | 0.042s | 0.136s | 1.25s   | 20.62s    | OOM
v111 count     | 10     | 211    | 4944   | 113,150 | 2,629,970 |
v1111 time     | 0.039s | 0.045s | 0.053s | 1.10s   | 2.92s     | OOM
v1111 count    | 10     | 140    | 2555   | 50,825  | 1,070,230 |

Notes

  • The JVM heap of HugeGraph-Server is set to 32GB; OOM occurs when the result set grows too large
Conclusions
  • In FS scenarios, HugeGraph outperforms Titan
  • In K-neighbor and K-out scenarios, HugeGraph returns results within seconds up to a depth of 5
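
The reason queries blow up around depth 5 or 6 is the exponential growth of the reachable set; using v1’s K-out counts from the table above:

```shell
# v1's K-out counts per depth (from the table); each extra hop multiplies
# the frontier by roughly 13-22x, so depth 6 exhausts the 32GB heap
prev=10
for c in 133 2453 50830 1128688; do
  echo "growth: x$(( c / prev ))"
  prev=$c
done
```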

2.4 Comprehensive Graph Performance Test - CW

Database        | Size 1000 | Size 5000 | Size 10000 | Size 20000
Titan           | 45.943    | 849.168   | 2737.117   | 9791.46
Memory(core)    | 41.077    | 1825.905  | *          | *
Cassandra(core) | 39.783    | 862.744   | 2423.136   | 6564.191
RocksDB(core)   | 33.383    | 199.894   | 763.869    | 1677.813

Notes

  • "Size" is counted in vertices
  • The values in the table are the times in seconds for community detection to complete; for example, with the RocksDB backend on the size-10000 dataset, HugeGraph takes 763.869s until the community assignment no longer changes
  • "*" means the run did not finish within 10000s
  • The CW test is a comprehensive CRUD evaluation
  • The last three rows are HugeGraph’s different backends; in this test HugeGraph, like Titan, operates on core directly rather than going through the client
Conclusions
  • With the Cassandra backend, HugeGraph performs slightly better than Titan, and the advantage grows with data size; at size 20000 it is about 30% faster than Titan
  • With the RocksDB backend, HugeGraph far outperforms both Titan and HugeGraph-on-Cassandra, being roughly 6x and 4x faster respectively
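
The percentages and multiples in these conclusions can be reproduced from the size-20000 column of the table (the exact ratios round to the "30%", "6x" and "4x" quoted above):

```shell
# Size-20000 CW completion times in seconds, taken from the table above
titan=9791.46; cassandra=6564.191; rocksdb=1677.813
# awk handles the floating-point division
saved=$(awk -v a="$titan" -v b="$cassandra" 'BEGIN { printf "%.0f", (a-b)/a*100 }')
vs_titan=$(awk -v a="$titan" -v b="$rocksdb" 'BEGIN { printf "%.1f", a/b }')
vs_cass=$(awk -v a="$cassandra" -v b="$rocksdb" 'BEGIN { printf "%.1f", a/b }')
echo "Cassandra saves ${saved}% vs Titan; RocksDB: ${vs_titan}x vs Titan, ${vs_cass}x vs Cassandra"
```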

9 - Contribution Guidelines

9.1 - How to Contribute to HugeGraph

Thanks for taking the time to contribute! As an open-source project, HugeGraph welcomes contributions from everyone, and we are grateful to all contributors.

The following is a contribution guide for HugeGraph:

image

1. Preparation

Optional: You can use GitHub Desktop to greatly simplify the commit and update process.

We can contribute by reporting issues, submitting code patches or any other feedback.

Before submitting the code, we need to do some preparation:

  1. Sign up or log in to GitHub: https://github.com

  2. Fork HugeGraph repo from GitHub: https://github.com/apache/incubator-hugegraph/fork

  3. Clone code from fork repo to local: https://github.com/${GITHUB_USER_NAME}/hugegraph

    # clone code from remote to local repo
     git clone https://github.com/${GITHUB_USER_NAME}/hugegraph
     
  4. Configure local HugeGraph repo

    cd hugegraph
     
     # set name and email to push code to github
     git config user.name "{full-name}" # like "Jermy Li"
     git config user.email "{email-address-of-github}" # like "jermy@apache.org"

2. Create an Issue on GitHub

If you encounter bugs or have any questions, please go to GitHub Issues to report them and feel free to create an issue.

3. Make changes of code locally

3.1 Create a new branch

Please don’t use master branch for development. We should create a new branch instead:

# checkout master branch
 git checkout master
 # pull the latest code from official hugegraph
 git pull hugegraph
 

Please remember to fill in the issue id, which was generated by GitHub after issue creation.
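
For example, a commit message referencing a hypothetical issue #123 could look like the following (demonstrated in a throwaway repo; only the message format matters):

```shell
# Illustrative only: append the GitHub issue id (here a hypothetical #123)
# to the commit message so GitHub links the commit to the issue
demo=$(mktemp -d)                    # throwaway repo just for the demo
cd "$demo" && git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit --allow-empty -q -m "fix: handle empty vertex label (#123)"
git log -1 --pretty=%s               # prints the message with the issue id
```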

3.4 Push commit to GitHub fork repo

Push the local commit to GitHub fork repo:

# push the local commit to fork repo
 git push origin bugfix-branch:bugfix-branch
 

Note that since GitHub requires submitting code through username + token (instead of using username + password directly), you need to create a GitHub token from https://github.com/settings/tokens: image

4. Create a Pull Request

Go to the web page of the GitHub fork repo; after pushing the new branch, GitHub will offer to create a Pull Request, just click the “Compare & pull request” button. Then edit the description of the proposed changes, which can simply be copied from the commit message.

Note: please make sure the email address you used to submit the code is bound to the GitHub account. For how to bind the email address, please refer to https://github.com/settings/emails: image

5. Code review

Maintainers will start the code review after all the automatic checks have passed:

  • Check: the Contributor License Agreement is signed
  • Check: the Travis CI build passes (automated test and deploy)

The commit will be accepted and merged if there is no problem after review.

Please click on “Details” to find the problem if any check does not pass.

If there are checks not passed or changes requested, then continue to modify the code and push again.

6. More changes after review

If we have not passed the review, don’t be discouraged. Usually a commit needs to be reviewed several times before being accepted! Please follow the review comments and make further changes.

After making the further changes, commit them to the local repo:

# commit all updated files in a new commit,
# please feel free to enter any appropriate commit message, note that
# we will squash all commits in the pull request as one commit when merging
git commit -a

git rebase -i master

And push it to GitHub fork repo again:

# force push the local commit to fork repo
git push -f origin bugfix-branch:bugfix-branch

GitHub will automatically update the Pull Request after we push it, just wait for code review.
2023-07-31T23:55:30+08:00/docs/clients/restful-api/auth/2023-07-31T23:55:30+08:00/docs/clients/restful-api/other/2023-07-31T23:55:30+08:00/docs/2022-12-30T19:57:48+08:00/blog/news/2022-03-21T18:55:33+08:00/blog/releases/2022-03-21T18:55:33+08:00/blog/2018/10/06/easy-documentation-with-docsy/2022-03-21T18:55:33+08:00/blog/2018/10/06/the-second-blog-post/2022-03-21T18:55:33+08:00/blog/2018/01/04/another-great-release/2022-03-21T18:55:33+08:00/docs/cla/2022-03-21T19:51:14+08:00/docs/performance/hugegraph-benchmark-0.4.4/2022-09-15T12:59:59+08:00/docs/summary/2023-07-31T23:55:30+08:00/blog/2022-03-21T18:55:33+08:00/categories//community/2022-03-21T18:55:33+08:00/2023-01-15T13:44:01+00:00/search/2022-03-21T18:55:33+08:00/tags/ \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml index ad44381f2..cd15f2aeb 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -1 +1 @@ -/en/sitemap.xml2023-07-31T23:55:30+08:00/cn/sitemap.xml2023-07-31T23:55:30+08:00 \ No newline at end of file +/en/sitemap.xml2023-09-09T20:50:32+08:00/cn/sitemap.xml2023-09-09T20:50:32+08:00 \ No newline at end of file