Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): add a string of OLAP algorithms #1984

Merged
merged 34 commits into from
Nov 9, 2022
Merged

feat(core): add a string of OLAP algorithms #1984

merged 34 commits into from
Nov 9, 2022

Conversation

imbajin
Copy link
Member

@imbajin imbajin commented Oct 19, 2022

add OLAP algorithms like:

  • community: TriangleCount, ClusterCoefficient, LPA, Louvain, WeakConnectedComponent.
  • centrality: PageRank, DegreeCentrality, StressCentrality, BetweennessCentrality, ClosenessCentrality, EigenvectorCentrality.
  • similarity/path: FusiformSimilarity, K-Core, RingsDetect.
  • count: CountVertex, CountEdge.

@imbajin
Copy link
Member Author

imbajin commented Oct 19, 2022

Error: hugegraph-core/src/main/java/com/baidu/hugegraph/job/algorithm/AbstractAlgorithm.java:[61,48]
package jersey.repackaged.com.google.common.base does not exist


INSTANCE.register(new StressCentralityAlgorithmV2());
INSTANCE.register(new BetweennessCentralityAlgorithmV2());
INSTANCE.register(new ClosenessCentralityAlgorithmV2());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm list in commit message can refer to here

@imbajin imbajin added the feature New feature label Oct 21, 2022
if (label != null) {
query.eq(HugeKeys.LABEL, this.getVertexLabelId(label));
}
query.eq(HugeKeys.LABEL, this.getVertexLabelId(label));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep if (label != null) statement to avoid label=null condition

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the null condition has checked in line 378 already (start of the method)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -23,6 +23,7 @@
import java.util.Map;

import org.apache.commons.configuration.PropertiesConfiguration;
import org.apache.commons.configuration2.Configuration;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are still compile errors like package javax.inject does not exist

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see lines below

Comment on lines 98 to 99
return new StandardHugeGraph(new HugeConfig(config));
return new StandardHugeGraph(new HugeConfig((Configuration) config));
Copy link
Member Author

@imbajin imbajin Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here the Configuration dependency has changed
old: org.apache.commons.configuration --> new:"org.apache.commons.configuration2"

I try to update it

@codecov
Copy link

codecov bot commented Nov 1, 2022

Codecov Report

Merging #1984 (288a1a4) into master (8869c9e) will decrease coverage by 4.15%.
The diff coverage is 0.04%.

@@             Coverage Diff              @@
##             master    #1984      +/-   ##
============================================
- Coverage     70.51%   66.35%   -4.16%     
+ Complexity      978      976       -2     
============================================
  Files           453      482      +29     
  Lines         39024    41450    +2426     
  Branches       5556     5896     +340     
============================================
- Hits          27518    27506      -12     
- Misses         8806    11244    +2438     
  Partials       2700     2700              
Impacted Files Coverage Δ
...java/com/baidu/hugegraph/api/job/AlgorithmAPI.java 0.00% <0.00%> (ø)
...ain/java/com/baidu/hugegraph/job/AlgorithmJob.java 0.00% <0.00%> (ø)
...idu/hugegraph/job/algorithm/AbstractAlgorithm.java 0.00% <0.00%> (ø)
...m/baidu/hugegraph/job/algorithm/AlgorithmPool.java 0.00% <0.00%> (ø)
...om/baidu/hugegraph/job/algorithm/BfsTraverser.java 0.00% <0.00%> (ø)
...a/com/baidu/hugegraph/job/algorithm/Consumers.java 0.00% <0.00%> (ø)
...du/hugegraph/job/algorithm/CountEdgeAlgorithm.java 0.00% <0.00%> (ø)
.../hugegraph/job/algorithm/CountVertexAlgorithm.java 0.00% <0.00%> (ø)
...hugegraph/job/algorithm/SubgraphStatAlgorithm.java 0.00% <0.00%> (ø)
...raph/job/algorithm/cent/AbstractCentAlgorithm.java 0.00% <0.00%> (ø)
... and 30 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@javeme javeme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine now, can you merge the last 4 commits into one since we need to rebase-merge this PR.

@imbajin
Copy link
Member Author

imbajin commented Nov 1, 2022

seems fine now, can you merge the last 4 commits into one since we need to rebase-merge this PR.

fine,fix the security alert,it will block the ci,then I will rebase them into one commit

@imbajin imbajin force-pushed the olap-algo branch 2 times, most recently from 62f7290 to 30c685f Compare November 2, 2022 07:56
if (label != null) {
query.eq(HugeKeys.LABEL, this.getVertexLabelId(label));
}
query.eq(HugeKeys.LABEL, this.getVertexLabelId(label));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

"sample", -1L,
"top", -1L /* sorted */,
"workers", 0);
private static final Map<String, Object> PARAMS = ImmutableMap.of("depth", 10L,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the line is too long...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

88 chars seems fine?

And we should better not use a irregular align rule in future, it's not friendly enough (like align with Immutablemanually)

@@ -92,7 +92,7 @@ private Void runAndDone() {
this.run();
this.done();
} catch (Throwable e) {
// Only the first exception of one thread can be stored
// Only the first exception to one thread can be stored
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer to keep the origin comment

Copy link
Member Author

@imbajin imbajin Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

original has grammar mistakes (although it seems fine) 😢
image

private Collection<Pair<Community, MutableInt>> nbCommunities(
int pass,
List<Edge> edges) {
private Collection<Pair<Community, MutableInt>> nbCommunities(int pass, List<Edge> edges) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap line after int pass, ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact, It doesn't seem necessary, compare the 2 align, maybe one line is better (when it < 100 chars)

old:
image

new:
image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so as this case: (not modify it):

old:
image

new:
image

@javeme
Copy link
Contributor

javeme commented Nov 3, 2022

+2 LMTM.

NOTE please: this PR need to merge by "Rebase and merge", not "Squash and merge".

@imbajin
Copy link
Member Author

imbajin commented Nov 3, 2022

+2 LMTM.

NOTE please: this PR need to merge by "Rebase and merge", not "Squash and merge".

seems the function is forbidden, we need to enable it separately? (and shall we keep it open in other repos?)

rebase: false

@javeme
Copy link
Contributor

javeme commented Nov 4, 2022

+2 LMTM.
NOTE please: this PR need to merge by "Rebase and merge", not "Squash and merge".

seems the function is forbidden, we need to enable it separately? (and shall we keep it open in other repos?)

rebase: false

just enable it when necessary, and don't enable rebase in other repos.

@imbajin imbajin requested review from zyxxoo and coderzc November 7, 2022 11:55
@javeme
Copy link
Contributor

javeme commented Nov 7, 2022

seems need to rebase master locally or don't check out-of-date with the base branch

javeme and others added 12 commits November 7, 2022 22:47
* add olap algo api
* improve source filter
* fix louvain shaking with limit degree
* catch exception for lpa and louvain
* add 3 params for lpa: label,source_label,percision
* improve louvain node store
* remove vertices from class Community
* move showCommunity to AbstractCommAlgorithm
* add some parameters to AbstractAlgorithm
* improve louvain log
* improve clearPass and communities check
* split louvain cache
* fix degreeCentrality bug: degree is always < 500

Change-Id: I2341b981dab44f43ac50ae0f8fa5e51b7acc1b5a
* improve
* move c_label to lower layer and add appendRow(value)
* add community limit 100w for louvain
* improve louvain log
* fix louvain bug

Change-Id: I886ac3e7a3f0dfd49e66fdf544f97f6f7db615df
Change-Id: I5cc3be846ff4536ebfba1f9bf54a0adda7409036
also improve error message cause reason

Change-Id: I15ea8dd651e01ff678a32f19efd3584cd20ffc10
network10000 dataset test:
                   before  after
(depth=4 sample=1) 395s    25s
(depth=3 sample=2) 4300s   35s

same as the closeness_centrality

Change-Id: Ia0c557434bf25f9d13a0b1dc19f66024e08c89df
Change-Id: I20b72ea0da673359e2bd21888010290efca81441
* add modularity parameter for louvain
* fix: louvain lost isolated community from one to next pass

Change-Id: I6a7dadc80635429aa2898939aa337aae01bc8d12
* optimize louvain by multi threads
* implement louvain threads
* fix race condition
* implement merge community by multi threads
* remove debug info
* fix genId race condition
* compatible with serial and parallel computing
* support parallel lpa
* support parallel: Louvain,LPA,Rings,K-Core,Fusiform

Change-Id: I2425d1da58581ea7a61dce72a88355ae3d2dd610
Change-Id: I8eaaeccaa0b23048a9d0f597080186c069b9799b
* support rings count for rings-detect ap algo
* fix direction can't be null or Both for triangle and clusterCoeffcient
* fix source_clabel error for rings-detect, fusiform and kcore
* change limit meaning for fusiform similarity

Change-Id: I5a4ccbf46b47c13ea2eafe7f3e335dc6aea4a83c
Change-Id: I546682b19fb5a84a65dc2a3bd77d62b386722bfa
javeme and others added 20 commits November 7, 2022 22:47
Change-Id: If4faafc1a952186c17314c599f611f7ab6132b7b
change log:
1. add BOTH direction support for triangle_count/cluster_coeffcientith.
2. fix extra triangle count with multi edges between tow adjacent vertices.
3. set default value of direction to BOTH for degree_centrality and cluster_coeffcient .
4. add workers for triangle_count and cluster_coeffcient.
5. fix closeness: multi shortest paths cause results illogical.
6. rename rings_detect to rings, rename limit to each_limit which means limit number of rings of each source vertex, and don't do dedup if passed each_limit > 0.
7. unify top for 4 centrality algos: sorted results when top = -1, unsorted results when top = 0.
8. fusiform: rename top to top_similars (expected >= 0).
9. fusiform/rings: add limit param which means max number of results, and remove capacity param and hardcode to 100000000.

Change-Id: I9ddf8553e6d86b99adbff8b972890d69d623fa1a
Change-Id: I565602945a26c2a575baaa3f17d084b65399a009
call graph close instead of closeTx

Change-Id: I0e329280b067f34daec69c9b1b2b81a6cd3309bf
Change-Id: Iedbafa9a31ed3f5b3fb35ccb0c583ec0a0cfc6ac
* add parameter top to print the top result in job result
Change-Id: Ib24ede9c20bb2c23a3f06fe72c53be2342295fd4
Change-Id: I1b055130c75edce039a97822d9154e1214bc80c2
* improve betweeness by remove boundary vertex of path

Change-Id: I76924daf8d9da113ab7a1aeac536c6080eccb296

* add with_boundary parameter for betweeness

also fix lpa count comms with limit

Change-Id: Iaf675cd87a8dc0b5ef75476144bc8141f2dd4385
Change-Id: I01e402fc99669f53544279c752f81d886c6ce28f
* add betweenness BetweennessCentralityAlgorithmV2
Change-Id: I35bc20698bf93002a530b95688cd0359a9154fdf
Change-Id: Id4fc070268d8957ef2764bffbb3c762d9c034a0d
* add StressCentralityAlgorithmV2
* add BfsTraverser and ClosenessCentralityAlgorithmV2
Change-Id: Ibf0f415ee6bd9cf0ff8b598e2166cc70667c49b7
Change-Id: I843c595172d7b45b894d764bf859dec61a35c8b5
)

Change-Id: Ib2a9766e29708fa4fcbaec152f5c9e571c96a0a6
@javeme
Copy link
Contributor

javeme commented Nov 8, 2022

NOTE please: this PR need to merge by "Rebase and merge", not "Squash and merge".

@imbajin imbajin mentioned this pull request Nov 8, 2022
15 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants