
Can we have Node2Vec example #49

Closed
porscheme opened this issue May 24, 2022 · 13 comments · Fixed by #58
Labels
type/question Type: question about the product

Comments

@porscheme

As the subject says, can we have a Node2Vec example?

@wey-gu
Contributor

wey-gu commented May 25, 2022

Will look into this and come back with an example ;)

@wey-gu
Contributor

wey-gu commented May 29, 2022

Dear @porscheme

Today I got the bandwidth to run node2vec for you: https://gist.github.com/wey-gu/53e35bc2da571a919f4f0c248c5dd9fc as an example
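
Since node2vec questions come up often, here is a minimal plain-Python sketch of the algorithm's core idea: the second-order biased random walk controlled by the return parameter p and the in-out parameter q (the same knobs exposed in nebula-algorithm's node2vec config). The graph and parameter values are illustrative only, not taken from the gist.

```python
import random

def node2vec_walk(graph, start, walk_length, p=1.0, q=1.0, rng=random):
    """One biased random walk following Grover & Leskovec's node2vec rules.

    graph: dict mapping a node to a list of its neighbors (unweighted).
    p: return parameter -- larger p makes revisiting the previous node less likely.
    q: in-out parameter -- q > 1 keeps the walk local (BFS-like),
       q < 1 pushes it outward (DFS-like).
    """
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        nbrs = graph.get(cur, [])
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step is uniform
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:                   # distance 0 from prev: return step
                weights.append(1.0 / p)
            elif nxt in graph.get(prev, ()):  # distance 1 from prev: stay close
                weights.append(1.0)
            else:                             # distance 2 from prev: move outward
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk
```

In the full algorithm, numWalks such walks per vertex are collected and fed as "sentences" into a word2vec-style skip-gram model to produce the embeddings.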

@sunkararp

What version of Spark/Scala should I be using to run nebula-algorithm?

Can you provide a Spark 3.0.0 compatible version?

@wey-gu
Contributor

wey-gu commented Jun 16, 2022

> What version of Spark/Scala should I be using to run nebula-algorithm?
>
> Can you provide a Spark 3.0.0 compatible version?

For now, it's 2.4.x only, as documented. Could you possibly use 2.4.x first?

I noticed nebula-exchange supports Spark 3.0.0 via vesoft-inc/nebula-exchange#41, but the equivalent work is not yet planned in nebula-algorithm; I created an issue for it just now.

@sunkararp

sunkararp commented Jun 16, 2022

We normally use Spark 3.2.1 but downgraded to Spark 2.4.6 and Scala 2.11.12 for Nebula.
We are getting the error below:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport
        at java.base/java.lang.ClassLoader.defineClass1(Native Method)
        at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
        at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:174)
        at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:800)
        at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:698)
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:621)
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:579)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
         $NebulaDataFrameReader.loadEdgesToDF(package.scala:146)

@sunkararp

sunkararp commented Jun 16, 2022

It's blocking us from making any progress; can you expedite support for Spark 3?

@wey-gu
Contributor

wey-gu commented Jun 16, 2022

> It's blocking us from making any progress; can you expedite support for Spark 3?

@Nicole00 could you help point out why java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport is encountered with Spark 2.4.6 and Scala 2.11.12, please?

@wey-gu
Contributor

wey-gu commented Jun 16, 2022

@sunkararp Before @Nicole00 could help look into it, maybe you could refer to my nebula-up playground environment to see the differences?

https://github.com/wey-gu/nebula-up/

After running curl -fsSL nebula-up.siwei.io/all-in-one.sh | bash -s -- v3 spark on a machine with Docker, you will have a NebulaGraph cluster plus Spark 2.4.

Then ~/.nebula-up/nebula-algo-pagerank-example.sh will run PageRank in the Spark container. You could enter the Spark container with docker exec -it spark_master_1 bash to check how it differs from your 2.4.6 setup.

@sunkararp

sunkararp commented Jun 16, 2022

  • I was using Spark 2.4.6 and Scala 2.11.12
  • I'm having some challenges building the Scala code with sbt, as there wasn't sbt support for 2.11.12; this could be an issue, I'm not sure
  • It could be some version incompatibility; not sure

@sunkararp

sunkararp commented Jun 16, 2022

I'm finally able to run on Spark 2.4.6, Scala 2.11.12, and OpenJDK 64-Bit 1.8.0_252.

But I'm getting java.lang.NullPointerException.

Can you please look into this ASAP?

Below is my spark-submit

spark-submit --master "spark://10.155.48.35:7077" \
  --conf spark.driver.extraClassPath=/home/jovyan/* \
  --conf spark.executor.extraClassPath=/home/jovyan/* \
  --conf spark.executor.instances=3 \
  --conf spark.executor.memory=16G \
  --conf spark.driver.maxResultSize=10G \
  --conf spark.driver.host=10.155.50.21 \
  --class com.vesoft.nebula.algorithm.Main \
  --packages "com.vesoft:nebula-spark-connector:3.0.0,org.apache.spark:spark-core_2.11:2.4.4,org.apache.spark:spark-sql_2.11:2.4.4,com.github.scopt:scopt_2.11:3.7.1,com.typesafe:config:1.4.0,org.apache.spark:spark-mllib_2.11:2.4.4" \
  --deploy-mode client \
  nebula-algorithm-3.0.0.jar -p dev.algorithm.conf

below is my conf file

{
  spark: {
    app: {
        name: My Graph Algorithm 1.0
        partitionNum:100
    }
    master:local
  }

  data: {
    source: nebula
    sink: nebula
    hasWeight: false
  }

  nebula: {
    read: {
        graphAddress: "10.0.195.64:9669"
        metaAddress: "10.0.213.158:9559"
        space: StudentCentral
        user:root
        pswd:nebula        
        labels: ["STUDENT_HAS_CLASS_TCODE"]
    }

    write:{
        graphAddress: "10.0.195.64:9669"
        metaAddress: "10.0.213.158:9559"
        user:root
        pswd:nebula
        space:StudentCentral
        tag:Student
        type:update
    }
  }  


  algorithm: {
    executeAlgo: node2vec
    node2vec: {
        maxIter: 10,
        lr: 0.025,
        dataNumPartition: 10,
        modelNumPartition: 10,
        dim: 10,
        window: 3,
        walkLength: 1,
        numWalks: 3,
        p: 1.0,
        q: 1.0,
        directed: true,
        degree: 30,
        embSeparate: ",",
        modelPath: "hdfs://namenode:9000/model"
    }
  }
}

@sunkararp

  • Finally able to fix the java.lang.NullPointerException; the fix is explained here
  • It works fine for smaller datasets
  • The implementation wasn't using the Spark worker nodes; is that a known issue?
  • For large data, we are getting a java.lang.OutOfMemoryError: GC overhead limit exceeded exception
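
A back-of-envelope sketch of why the walk stage runs out of memory: node2vec has to materialize numWalks walks of walkLength steps for every vertex before training, so the walk corpus grows linearly in all three factors. The vertex count and bytes-per-id below are hypothetical, purely for illustration.

```python
def walk_corpus_bytes(num_vertices, num_walks, walk_length, bytes_per_id=8):
    """Rough lower bound on the raw size of the generated walk corpus."""
    return num_vertices * num_walks * walk_length * bytes_per_id

# Hypothetical 50M-vertex graph with the numWalks=3, walkLength=1 from the
# config above vs. more typical node2vec settings (numWalks=10, walkLength=80):
small = walk_corpus_bytes(50_000_000, 3, 1)    # 1.2e9 bytes, ~1.2 GB of ids
heavy = walk_corpus_bytes(50_000_000, 10, 80)  # 3.2e11 bytes, ~320 GB of ids
```

JVM object overhead and shuffle buffers sit on top of these raw numbers, which is consistent with "GC overhead limit exceeded" failures appearing only on large graphs.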

@wey-gu
Contributor

wey-gu commented Jun 20, 2022

Great to see your explorations and results 👍🏻; sorry I couldn't be of more help on them.

@sunkararp

  • This implementation works for small datasets

  • For large datasets, you need a huge amount of memory to process them. Also, it doesn't use Spark's distributed capabilities

  • Do you have any modifications for huge datasets? Any Pregel-based solutions?
