LightGBM: NullPointerException in version 0.16 #514
@sully90 really sorry about the trouble you are having. Would you be able to provide more logs from the driver and workers? Also, for the issue you are trying to work around ("ran into the GLIBCXX_3.4.20 not found error"), what would be a proper fix? I could also give you a debug build with some extra debug info printed out, if you can send me the jar you built. |
Hi @imatiach-msft, I've created a gist with the stdout logs here. As for the GLIBCXX_3.4.20 issue, I believe the problem is that CentOS 7 doesn't support the version of libstdc++ that was used to compile LightGBM in the build available via the Spark packages, even though LightGBM are meant to guarantee
Then when I try and build a model locally (setting spark-master to Thanks a lot for the help! |
Hi, for the GLIBCXX_3.4.20 issue, I set a new release of
|
Also, @sully90 would you be able to send the logs for some of the workers that failed? I looked through the driver log you sent but didn't see anything particularly interesting. If we can debug together over Skype/Teams/Hangouts/phone, that might be faster too. |
@a3w4e5r which release of libstdc++.so.6 did you use that works with CentOS 7? I have only used a version for CentOS 6, which I found at https://centos.pkgs.org/6/nux-dextop-x86_64/chrome-deps-stable-3.11-1.x86_64.rpm.html, and which results in a connection error when I try to use the model fit function of LightGBM:
|
@edsonaoki I got libstdc++.so.6 from gcc-7.1.0; it works with Red Hat 7 |
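For anyone comparing candidate libstdc++.so.6 files, a minimal Python sketch along these lines may help check whether a given file carries the `GLIBCXX_3.4.20` version tag before deploying it. It only scans the raw bytes for version strings, much like running `strings` and filtering for `GLIBCXX`, so treat the result as a heuristic. The function names are hypothetical and not part of LightGBM or MMLSpark.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Version tags such as GLIBCXX_3.4.20 appear as plain ASCII inside the library.
_GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(data: bytes) -> List[Tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX versions found in raw bytes."""
    found = {
        tuple(int(part) for part in match.group(1).decode().split("."))
        for match in _GLIBCXX_TAG.finditer(data)
    }
    return sorted(found)


def provides(data: bytes, required: str = "3.4.20") -> bool:
    """True if the exact version tag (e.g. 3.4.20) occurs in the bytes."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(data)


def library_provides(path: str, required: str = "3.4.20") -> bool:
    """Same check, reading the bytes of a library file on disk."""
    return provides(Path(path).read_bytes(), required)
```

Calling `library_provides` with the path to a candidate file and the default `required="3.4.20"` covers the version named in the error message; the path to use depends on your system and is not given in this thread.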
@a3w4e5r thanks! Unfortunately I don't have shell access to the YARN executor nodes, which use Red Hat Enterprise Linux 7, and I can't compile gcc there. Is there some way to get a precompiled libstdc++.so.6 for Red Hat 7/CentOS 7? |
@edsonaoki Sorry for the late reply. I can send you mine if you still need it; just leave your email. |
Hi @a3w4e5r thanks a lot! Can you send to ? |
Hi @imatiach-msft and all, I used the version of libstdc++.so.6 for CentOS 7 that @a3w4e5r compiled and kindly sent to me. As last time, I don't get the "GLIBCXX_3.4.20 not found" error, but I get the connection error when I try to use the model fit function of LightGBM. Here are some details from the driver and executor logs: Driver logs:
Executor logs:
It's a bit strange that based on the logs, the driver successfully receives a connection from the executors, performs some tasks, and sends a response, but there is a "Connection error" with no apparent cause later. For reference, we are running the Spark cluster on Apache Yarn with RHEL 7 (CentOS 7) in the cluster environment. |
Hi everyone. I realised that LightGBM actually works normally with YARN in cluster mode, just not in client mode, where it gives the errors above. Perhaps this is because in client mode the driver runs on an Ubuntu machine, and hence with a different GLIBC from the executors (running on RHEL 7)? |
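To make the client versus cluster distinction concrete, here is a small Python sketch that assembles a `spark-submit` argument list for YARN in either deploy mode. The helper name, the application file, and the package coordinates are placeholders, not values from this thread. `--master`, `--deploy-mode`, `--packages`, and `spark.executor.extraLibraryPath` are standard Spark options, but whether setting a library path helps in a given cluster is an assumption to verify.

```python
from typing import List, Optional


def spark_submit_args(
    app: str,
    packages: str,
    deploy_mode: str = "cluster",
    executor_library_path: Optional[str] = None,
) -> List[str]:
    """Build a spark-submit command line for YARN (hypothetical helper)."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    args = [
        "spark-submit",
        "--master", "yarn",
        # In cluster mode the driver runs inside the cluster, on the same
        # OS image as the executors, rather than on the submitting machine.
        "--deploy-mode", deploy_mode,
        "--packages", packages,
    ]
    if executor_library_path:
        # Optional: extra native library search path for executors.
        args += ["--conf", f"spark.executor.extraLibraryPath={executor_library_path}"]
    args.append(app)
    return args
```

The resulting list can be passed to `subprocess.run` on a machine where `spark-submit` is on the PATH.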
Closing, as we now use the official Linux .so files produced by the Microsoft/LightGBM build, which uses an Ubuntu 14.04 Docker image that does not have the glibc issue. --packages |
@imatiach-msft that's great news! Since I can't compile the package on my own (I only have internet connectivity in a Windows environment), can you make the Python wheel (.whl) file available somewhere? Unfortunately, it's necessary to use MMLSpark in the Cloudera Workbench in client mode. |
adding @mhamilton723 - I'm not sure if the python wheel is enough, you need to specify the underlying scala code as part of the spark package. The python wheel is built at that maven url though, but I don't have the permissions to look at the blob, I think Mark may have them. Hmm, I feel like there must exist a better way for you to add a spark package in cloudera workbench, I just don't know anything about how cloudera workbench works. I know in databricks (on azure or aws) you can just add spark packages using the info above directly, and both the scala and pyspark APIs are added. |
Hi @imatiach-msft, to clarify, I can create an internal Maven repository, so that in the Cloudera Workbench I can add MMLSpark as a Spark package pointing to this internal repository. It works normally when I submit the job using spark-submit. However, when I use the Cloudera Workbench interactive mode, the Python libraries aren't loaded automatically when adding the Spark package, i.e. using "import mmlspark" will generate an error. This error is solved when I manually install the Python wheel file. Therefore, the wheel file would suffice. |
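A quick way to tell whether the Python side of a package is visible in an interactive session, without triggering the import error itself, is to ask the import system for the module spec. This is a minimal sketch using only the standard library; the helper name is hypothetical, and `mmlspark` is the module name quoted above.

```python
import importlib.util


def module_available(name: str) -> bool:
    """True if the import system can locate the module, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent package, or for
        # modules whose spec is not set.
        return False
```

If `module_available("mmlspark")` returns `False` in the interactive session, the Python bindings were not placed on the path by the Spark package, which matches the behaviour described above.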
Hi all,
I'm getting the same issue as #405, but with the latest release. We're running Spark on a CentOS 7 cluster, and as such ran into the GLIBCXX_3.4.20 not found error (see microsoft/LightGBM#1945), so we had to build the library manually with SWIG and set our Spark driver path accordingly. While this lets us use LightGBM locally on our Spark cluster with mmlspark, we get the following exception when using two or more nodes:
And spark worker logs show the same errors as the previous issue:
We've checked our networking settings and everything seems to be fine, so we're not sure why we're having issues.
Any help would be greatly appreciated!
Many thanks,
David