
LightGBM: NullPointerException in version 0.16 #514

Closed
sully90 opened this issue Mar 13, 2019 · 16 comments

@sully90

sully90 commented Mar 13, 2019

Hi all,

I'm getting the same issue as #405, but with the latest release. We're running Spark on a CentOS 7 cluster and ran into the "GLIBCXX_3.4.20 not found" error (see microsoft/LightGBM#1945), so we had to build the library manually with SWIG and set our Spark driver library path accordingly. While this lets us use LightGBM with mmlspark locally on our Spark cluster, we get the following exception when using two or more nodes:

[Stage 3:>                                                        (0 + 12) / 16]2019-03-08 07:35:23 WARN  TaskSetManager:66 - Lost task 3.0 in stage 3.0 (TID 243, 178.63.65.13, executor 1): java.lang.NullPointerException
at com.microsoft.ml.spark.TrainUtils$.trainLightGBM(TrainUtils.scala:218)
at com.microsoft.ml.spark.LightGBMClassifier$$anonfun$3.apply(LightGBMClassifier.scala:83)
at com.microsoft.ml.spark.LightGBMClassifier$$anonfun$3.apply(LightGBMClassifier.scala:83)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5.apply(objects.scala:188)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5.apply(objects.scala:185)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

The Spark worker logs show the same errors as in the previous issue:

2019-03-08 07:35:22 INFO  LightGBMClassifier:192 - LightGBM worker got nodes for network init: null
2019-03-08 07:35:22 INFO  LightGBMClassifier:192 - LightGBM worker got nodes for network init: null
2019-03-08 07:35:22 INFO  LightGBMClassifier:192 - LightGBM worker got nodes for network init: null
2019-03-08 07:35:22 INFO  LightGBMClassifier:192 - LightGBM worker got nodes for network init: null
2019-03-08 07:35:22 INFO  LightGBMClassifier:192 - LightGBM worker got nodes for network init: null
2019-03-08 07:35:22 INFO  LightGBMClassifier:192 - LightGBM worker got nodes for network init: null

We've checked our networking settings and everything seems to be fine, so we're not sure why we're having issues.
Any help would be greatly appreciated!
Many thanks,
David

@imatiach-msft
Contributor

@sully90 really sorry about the trouble you are having. Would you be able to provide more logs from the driver and workers? Also, for the "GLIBCXX_3.4.20 not found" error that you are trying to work around, what would a proper fix look like? I could also give you a debug build with some extra debug info printed out, if you can send me the jar you built.

@sully90
Author

sully90 commented Mar 14, 2019

Hi @imatiach-msft, I've created a gist with the stdout logs here. As for the GLIBCXX_3.4.20 issue, I believe the problem is that CentOS 7 doesn't support the version of libstdc++ that was used to compile LightGBM in the build available via the Spark packages, even though LightGBM is meant to guarantee GLIBCXX <= 3.4.19 as of version 2.2.2 (see microsoft/LightGBM#1858). I haven't tried building an MMLSpark JAR, but have tried the following:

  1. Build LightGBM locally:
git clone --recursive -b v2.2.2 https://github.com/Microsoft/LightGBM
cd LightGBM
export JAVA_HOME=/usr/java/latest
cmake -DUSE_SWIG=ON .
make -j8
  2. Update /opt/spark/spark-2.4.0-bin-hadoop2.7/conf/spark-env.sh on each node and add lib_lightgbm.so and lib_lightgbm_swig.so to $LD_LIBRARY_PATH (see the load check below)
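
For reference, a quick way to check that the locally built library actually loads against the system libstdc++, outside of Spark, is a plain dlopen (a sketch; the .so path is an assumption for your layout):

# Sketch: reproduce the loader error outside Spark via a plain dlopen.
# The path below is an assumption; point it at your locally built library.
import ctypes

try:
    ctypes.CDLL("/opt/LightGBM/lib_lightgbm.so")
    print("lib_lightgbm.so loaded OK")
except OSError as e:
    # On a stock CentOS 7 libstdc++ this is where the
    # "GLIBCXX_3.4.20 not found" message would surface.
    print("load failed:", e)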

Then, when I build a model locally (setting the Spark master to local[*]) and include mmlspark via --packages Azure:mmlspark:0.16, it works fine, as in the sketch below.
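
The working local run looks roughly like this (a sketch; the toy data and column names are placeholders, and it assumes the session was launched with the package above):

# Sketch of the working local[*] run; launch with:
#   pyspark --master "local[*]" --packages Azure:mmlspark:0.16
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from mmlspark import LightGBMClassifier  # import path as of mmlspark 0.16

spark = SparkSession.builder.getOrCreate()

# Toy data; the feature/label columns here are placeholders.
df = spark.createDataFrame(
    [(1.0, 2.0, 0), (2.0, 1.0, 1), (3.0, 4.0, 0), (4.0, 3.0, 1)],
    ["f1", "f2", "label"])
assembled = VectorAssembler(inputCols=["f1", "f2"],
                            outputCol="features").transform(df)
model = LightGBMClassifier(labelCol="label",
                           featuresCol="features").fit(assembled)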

Thanks a lot for the help!

@a3w4e5r

a3w4e5r commented Mar 15, 2019

Hi, for the GLIBCXX_3.4.20 issue, I set a newer release of libstdc++.so.6 via spark.yarn.appMasterEnv/executorEnv, and it works! (master: yarn-cluster)

$SPARK_HOME/bin/spark-submit \
--conf "spark.yarn.appMasterEnv.LD_PRELOAD=libstdc++.so.6" \
--conf "spark.yarn.executorEnv.LD_PRELOAD=libstdc++.so.6" \
--files $WORK_PATH/libstdc++.so.6 \
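
If it helps anyone configuring this programmatically, the same workaround expressed through a SparkSession builder might look like this (a sketch; the path to the shipped libstdc++.so.6 is an assumption):

# Sketch: the LD_PRELOAD workaround set via builder configs instead of
# the spark-submit flags above. The local library path is an assumption.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("yarn")
         .config("spark.yarn.appMasterEnv.LD_PRELOAD", "libstdc++.so.6")
         .config("spark.yarn.executorEnv.LD_PRELOAD", "libstdc++.so.6")
         # equivalent of --files: ships the library to every container
         .config("spark.files", "/opt/libs/libstdc++.so.6")
         .getOrCreate())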

@imatiach-msft
Contributor

@sully90 does the suggestion from @a3w4e5r work for you? Do you still see the network error even with the newer version of libstdc++.so.6?

@imatiach-msft
Contributor

Also, @sully90 would you be able to send the logs for some of the workers that failed? I looked through the driver log you sent but didn't see anything particularly interesting. If we can debug together over skype/teams/hangouts/phone that might be faster too.

@edsonaoki

edsonaoki commented Apr 4, 2019

@a3w4e5r which release of libstdc++.so.6 did you use that works with CentOS 7? I have only used a version for CentOS 6, which I found at https://centos.pkgs.org/6/nux-dextop-x86_64/chrome-deps-stable-3.11-1.x86_64.rpm.html, and which results in a connection error when I try to use the model fit function of LightGBM:

java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at java.net.Socket.<init>(Socket.java:434)
at java.net.Socket.<init>(Socket.java:211)
at com.microsoft.ml.spark.TrainUtils$.getNodes(TrainUtils.scala:178)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:211)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:205)
at com.microsoft.ml.spark.StreamUtilities$.using(StreamUtilities.scala:29)
at com.microsoft.ml.spark.TrainUtils$.trainLightGBM(TrainUtils.scala:204)
at com.microsoft.ml.spark.LightGBMRegressor$$anonfun$3.apply(LightGBMRegressor.scala:90)
at com.microsoft.ml.spark.LightGBMRegressor$$anonfun$3.apply(LightGBMRegressor.scala:90)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$6.apply(objects.scala:196)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$6.apply(objects.scala:193)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
19/04/04 10:34:30 ERROR executor.Executor: Exception in task 0.3 in stage 3.0 (TID 8)
java.net.ConnectException: Connection refused (Connection refused)
[same stack trace repeats]

@a3w4e5r

a3w4e5r commented Apr 4, 2019

@edsonaoki I got libstdc++.so.6 from gcc-7.1.0; it works with RedHat 7.

@edsonaoki

@a3w4e5r thanks! Unfortunately, I don't have shell access to the Yarn executor nodes, which run Red Hat Enterprise Linux 7, so I can't compile gcc there. Is there some way to get a precompiled libstdc++.so.6 for Red Hat 7/CentOS 7?

@a3w4e5r

a3w4e5r commented Apr 8, 2019

@edsonaoki Sorry for the late reply. I can send you mine if you still need it; leave your email.

@edsonaoki

edsonaoki commented Apr 10, 2019

Hi @a3w4e5r, thanks a lot! Can you send it to ?

@edsonaoki

Hi @imatiach-msft and all, I used the version of libstdc++.so.6 for CentOS 7 that @a3w4e5r compiled and kindly sent to me. As before, I don't get the "GLIBCXX_3.4.20 not found" error, but I do get the connection error when I try to use the model fit function of LightGBM. Here are some details from the driver/executor logs:

Driver logs:

2019-04-15 08:59:50 INFO LightGBMRegressor:109 - driver expecting 1 connections...
2019-04-15 08:59:50 INFO LightGBMRegressor:111 - driver accepting a new connection...
2019-04-15 08:59:50 INFO LightGBMRegressor:134 - driver waiting for connections on host: 10.70.22.55 and port: 37521
2019-04-15 08:59:50 INFO LightGBMRegressor:86 - LightGBMRegressor parameters: alpha=0.2 tweedie_variance_power=1.5 is_pre_partition=True boosting_type=gbdt tree_learner=data_parallel num_iterations=100 learning_rate=0.3 num_leaves=31 max_bin=255 bagging_fraction=1.0 bagging_freq=0 bagging_seed=3 early_stopping_round=0 feature_fraction=1.0 max_depth=-1 min_sum_hessian_in_leaf=0.001 num_machines=1 objective=quantile verbosity=1 boost_from_average=true
2019-04-15 08:59:50 INFO SparkContext:54 - Starting job: reduce at LightGBMRegressor.scala:92
2019-04-15 08:59:50 INFO DAGScheduler:54 - Got job 3 (reduce at LightGBMRegressor.scala:92) with 1 output partitions
2019-04-15 08:59:50 INFO DAGScheduler:54 - Final stage: ResultStage 3 (reduce at LightGBMRegressor.scala:92)
2019-04-15 08:59:50 INFO DAGScheduler:54 - Parents of final stage: List()
2019-04-15 08:59:50 INFO DAGScheduler:54 - Missing parents: List()
2019-04-15 08:59:50 INFO DAGScheduler:54 - Submitting ResultStage 3 (MapPartitionsRDD[28] at reduce at LightGBMRegressor.scala:92), which has no missing parents
2019-04-15 08:59:50 INFO MemoryStore:54 - Block broadcast_5 stored as values in memory (estimated size 27.2 KB, free 633.4 MB)
2019-04-15 08:59:50 INFO MemoryStore:54 - Block broadcast_5_piece0 stored as bytes in memory (estimated size 11.9 KB, free 633.4 MB)
2019-04-15 08:59:50 INFO BlockManagerInfo:54 - Added broadcast_5_piece0 in memory on 10.70.22.55:24623 (size: 11.9 KB, free: 633.8 MB)
2019-04-15 08:59:50 INFO SparkContext:54 - Created broadcast 5 from broadcast at DAGScheduler.scala:1006
2019-04-15 08:59:50 INFO DAGScheduler:54 - Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[28] at reduce at LightGBMRegressor.scala:92) (first 15 tasks are for partitions Vector(0))
2019-04-15 08:59:50 INFO YarnScheduler:54 - Adding task set 3.0 with 1 tasks
2019-04-15 08:59:50 INFO TaskSetManager:54 - Starting task 0.0 in stage 3.0 (TID 5, x01gadaapp128a.vsi.sgp.dbs.com, executor 2, partition 0, NODE_LOCAL, 5563 bytes)
2019-04-15 08:59:50 INFO BlockManagerInfo:54 - Added broadcast_5_piece0 in memory on x01gadaapp128a.vsi.sgp.dbs.com:40577 (size: 11.9 KB, free: 366.2 MB)
2019-04-15 08:59:50 INFO BlockManagerInfo:54 - Added rdd_24_0 in memory on x01gadaapp128a.vsi.sgp.dbs.com:40577 (size: 66.7 KB, free: 366.2 MB)
2019-04-15 08:59:52 WARN TaskSetManager:66 - Lost task 0.0 in stage 3.0 (TID 5, x01gadaapp128a.vsi.sgp.dbs.com, executor 2): java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at java.net.Socket.<init>(Socket.java:434)
at java.net.Socket.<init>(Socket.java:211)
at com.microsoft.ml.spark.TrainUtils$.getNodes(TrainUtils.scala:178)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:211)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:205)
at com.microsoft.ml.spark.StreamUtilities$.using(StreamUtilities.scala:29)
at com.microsoft.ml.spark.TrainUtils$.trainLightGBM(TrainUtils.scala:204)
at com.microsoft.ml.spark.LightGBMRegressor$$anonfun$3.apply(LightGBMRegressor.scala:90)
at com.microsoft.ml.spark.LightGBMRegressor$$anonfun$3.apply(LightGBMRegressor.scala:90)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$6.apply(objects.scala:196)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$6.apply(objects.scala:193)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Executor logs:

19/04/15 16:59:50 INFO LightGBMRegressor: Successfully bound to port 12432
19/04/15 16:59:52 INFO LightGBMRegressor: LightGBM worker connecting to host: 10.70.22.55 and port: 37521
19/04/15 16:59:52 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 5)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at java.net.Socket.<init>(Socket.java:434)
at java.net.Socket.<init>(Socket.java:211)
at com.microsoft.ml.spark.TrainUtils$.getNodes(TrainUtils.scala:178)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:211)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:205)
at com.microsoft.ml.spark.StreamUtilities$.using(StreamUtilities.scala:29)
at com.microsoft.ml.spark.TrainUtils$.trainLightGBM(TrainUtils.scala:204)
at com.microsoft.ml.spark.LightGBMRegressor$$anonfun$3.apply(LightGBMRegressor.scala:90)
at com.microsoft.ml.spark.LightGBMRegressor$$anonfun$3.apply(LightGBMRegressor.scala:90)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$6.apply(objects.scala:196)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$6.apply(objects.scala:193)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

It's a bit strange that, based on the logs, the driver successfully receives a connection from the executors, performs some tasks, and sends a response, yet a connection error with no apparent cause appears later.

For reference, we are running the Spark cluster on Apache Yarn with RHEL 7 (CentOS 7) in the cluster environment.
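
A quick way to rule out a plain firewall problem is to probe the driver's listening port from an executor host (a sketch; the host and port are taken from the driver log above and change on every run):

# Sketch: probe the driver's LightGBM network-init port from an executor
# host. Host/port come from the driver log above and differ per run.
import socket

try:
    with socket.create_connection(("10.70.22.55", 37521), timeout=5):
        print("driver port reachable")
except OSError as e:
    print("cannot reach driver:", e)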

@edsonaoki

Hi everyone. I realised that LightGBM actually works normally with Yarn in cluster mode, just not with Yarn in client mode, where it gives the errors above. Perhaps this is because, in client mode, the driver runs on an Ubuntu machine and hence has a different glibc from the executors (running on RHEL 7)?

@imatiach-msft
Contributor

Closing, as we now use the official Linux .so files produced by the Microsoft/LightGBM build, which uses an Ubuntu 14.04 Docker image that does not have the glibc issue.
This was fixed with the PR:
#526
which updates the LightGBM version to:
"com.microsoft.ml.lightgbm" % "lightgbmlib" % "2.2.350"
It should be available in the next release. For now, you can use the latest builds from master, e.g. the build for that PR was:

--packages com.microsoft.ml.spark:mmlspark_2.11:0.17.dev1+1.g5e0b2a0
--repositories https://mmlspark.azureedge.net/maven
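
For a PySpark session, the equivalent configuration would look roughly like this (a sketch using the coordinates and repository from this comment):

# Sketch: pull the master build referenced above into a PySpark session.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.jars.packages",
                 "com.microsoft.ml.spark:mmlspark_2.11:0.17.dev1+1.g5e0b2a0")
         .config("spark.jars.repositories",
                 "https://mmlspark.azureedge.net/maven")
         .getOrCreate())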

@edsonaoki

edsonaoki commented May 6, 2019

@imatiach-msft that's great news! Since I can't compile the package on my own (I only have internet connectivity in a Windows environment), can you make the Python wheel (.whl) file available somewhere? The wheel is unfortunately necessary to use MMLSpark in the Cloudera Workbench in client mode.

@imatiach-msft
Contributor

Adding @mhamilton723 - I'm not sure the Python wheel is enough; you need to specify the underlying Scala code as part of the Spark package. The Python wheel is built at that Maven URL, though I don't have the permissions to look at the blob; I think Mark may have them. Hmm, I feel like there must be a better way for you to add a Spark package in Cloudera Workbench, I just don't know how Cloudera Workbench works. I know that in Databricks (on Azure or AWS) you can add Spark packages using the info above directly, and both the Scala and PySpark APIs are added.

@edsonaoki

Hi @imatiach-msft, to clarify: I can create an internal Maven repository such that, in the Cloudera Workbench, I can add MMLSpark as a Spark package pointing to this internal repository. It works normally when I submit the job using spark-submit.

However, when I use the Cloudera Workbench interactive mode, the Python libraries aren't loaded automatically when adding the Spark package, i.e. "import mmlspark" generates an error. The error goes away when I manually install the Python wheel file, so the wheel file would suffice.
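
For reference, installing the wheel by hand from inside the interactive session could look like this (a sketch; the wheel filename is hypothetical and depends on the build you download):

# Sketch: manual wheel install for interactive sessions where the Spark
# package does not expose the Python side. The filename is hypothetical.
import subprocess
import sys

subprocess.check_call([sys.executable, "-m", "pip", "install",
                       "mmlspark-0.17.dev1-py2.py3-none-any.whl"])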
