A way to count the total number of vertices/edges (In Cassandra) #322

Open
imbajin opened this issue Jan 7, 2019 · 3 comments
Labels
feature New feature

Comments

@imbajin
Member

imbajin commented Jan 7, 2019

Expected behavior

It should be easy and quick to estimate the total number of vertices or edges.

Actual behavior

Using a Gremlin traversal such as g.V().count() to get the result is clearly impractical.
See also #102, #37.

Specifications of environment

Backend Version: Cassandra 3.0x or Cassandra 3.1x

@imbajin
Member Author

imbajin commented Jan 7, 2019

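Proposed approach

For Cassandra there is a fairly good way to count the total number of vertices and edges: Cassandra itself exposes a per-table row count (Number of keys), which quickly yields an estimated but concrete figure and is easy to read over JMX.

  1. Each vertex is identified by a unique VertexID, so the actual total number of vertices must equal the number of rows in the vertex table.
  2. Given the configured n-replica strategy, for example the default of 3 replicas, the total number of vertices ≈ (sum of the vertex-table row counts across all nodes) / 3.
    The total number of out/in edges can be counted the same way. The concrete steps are as follows.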
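
The arithmetic in step 2 can be sketched as a small shell function. This is only an illustration: the function name estimate_total and the per-node numbers are made up, and it assumes every node reports its count for the same table.

```shell
# estimate_total RF COUNT...
# Sums the per-node row-count estimates and divides by the replication factor.
# (Hypothetical helper, not part of Cassandra or HugeGraph.)
estimate_total() {
    rf=$1
    shift
    sum=0
    for count in "$@"; do
        sum=$((sum + count))
    done
    # Integer division: the inputs are estimates, so the remainder is dropped.
    echo $((sum / rf))
}

# Made-up per-node estimates from a 3-node cluster with 3 replicas.
estimate_total 3 1000 1010 990   # prints 1000
```

Since each per-node value is itself an estimate, the result should be read as a rough figure rather than an exact count.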

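Usage

  1. Ad hoc command, suitable for Cassandra 3.0x, using the nodetool that ships with Cassandra: /path/to/bin/nodetool cfstats |grep cfName -A 15 |grep "Number of keys" |awk '{print $5}' This prints the value for one table (cf) on the local node. Run it on the other nodes with a batch-distribution tool and add up the results to get the total row count.
  2. The other, more general way is the JMX interface. You first need to start a JMX agent service (Cassandra does not open this port by default), then fetch the value directly with, for example: curl http://127.0.0.1:7777/jolokia/read/org.apache.cassandra.metrics:type=ColumnFamily,keyspace=hugegraph,scope=graph_vertices,name=EstimatedRowCount This is easy to integrate into a project.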
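
To use the JMX route from a script, the number has to be pulled out of the JSON that the HTTP endpoint returns. Below is a minimal sketch, assuming the response body contains a fragment of the form "value":{"Value":12345}. The function name jolokia_value and the sample body are hypothetical, so check the real response from your agent before relying on it.

```shell
# jolokia_value: reads a JSON response on stdin and prints the number stored
# under "Value". Assumes the body contains something like
# "value":{"Value":12345}; adjust the pattern if your agent answers differently.
jolokia_value() {
    sed -n 's/.*"Value"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Made-up response body, for illustration only.
sample='{"request":{"type":"read"},"value":{"Value":12345},"status":200}'
printf '%s\n' "$sample" | jolokia_value   # prints 12345

# Against a live agent (URL taken from the comment above):
# curl -s "http://127.0.0.1:7777/jolokia/read/org.apache.cassandra.metrics:type=ColumnFamily,keyspace=hugegraph,scope=graph_vertices,name=EstimatedRowCount" | jolokia_value
```

A proper JSON parser would be more robust than sed. The sed version is used here only to keep the example dependency-free.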

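That is the principle behind this approach and how to implement it. Its limitations are fairly obvious, but it really is a simple and fast way to count the total vertices and edges. For reference only.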

@javeme
Contributor

javeme commented Jan 9, 2019

@imbajin Thank you very much.

@javeme
Contributor

javeme commented Dec 8, 2020

Adding a note on the differences between Cassandra versions, for reference:

DSE 5.1 ~ 6.8: Number of partitions (estimate)

Apache Cassandra 3.0 ~ 2.2: Number of keys (estimate)
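
Because the label differs between versions, a script that parses the stats output may want to accept both spellings. The following is a rough sketch, assuming each table section starts with a line of the form "Table: <name>". The function name row_estimate, the table name other_table, and the sample text are made up, and the sample mixes both labels only to exercise both patterns.

```shell
# row_estimate TABLE: reads cfstats-style text on stdin and prints the
# estimate for TABLE, accepting either "Number of keys (estimate)" or
# "Number of partitions (estimate)". (Hypothetical helper.)
row_estimate() {
    awk -v table="$1" '
        # Track whether the current section belongs to the requested table.
        $1 == "Table:" { in_table = ($2 == table) }
        # The estimate is the last field on the matching line.
        in_table && /Number of (keys|partitions) \(estimate\):/ { print $NF; exit }
    '
}

# Abbreviated, made-up input; real output has many more lines per table.
sample='Keyspace: hugegraph
    Table: other_table
    Number of keys (estimate): 500
    Table: graph_vertices
    Number of partitions (estimate): 1234'

printf '%s\n' "$sample" | row_estimate graph_vertices   # prints 1234
```

Matching on the table header avoids depending on a fixed number of context lines, as grep -A 15 does.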


2 participants