-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] remove HDFS support? #6436
Labels
Comments
@jameslamb Thank you for referring. For Macports: Since we do not enable this option presently, removing it won’t break anything. So we are good. |
I'm all for this change, simplification is always good and if we neither have tests nor indication that nobody has used this (extensively), that's even more of an argument 👍 |
Thanks for the heads up. will remove the arg from nixpkgs when updating for 4.4.0 |
Merged
jameslamb
added a commit
that referenced
this issue
Jul 10, 2024
jameslamb
added a commit
that referenced
this issue
Jul 12, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Proposal
I'm requesting comment on the following proposal:
USE_HDFS
And doing all of these only after issuing build-time and run-time deprecation warnings for 1 release.
Summary
6+ years ago, #1243 was merged, adding "experimental" support for Hadoop Filesystem ("HDFS") to LightGBM.
Specifically, that meant adding the ability to provide an HDFS URI to Dataset creation methods that accept files, and have LightGBM read the file from HDFS.
Since then:
I don't believe SynapseML depends on the HDFS build of LightGBM (GitHub search). Hopefully @imatiach-msft or @mhamilton723 can confirm.
Searching the rest of GitHub (GitHub search), I just see other forks of LightGBM and two recipes from repackages:
Motivation
Simplifies the project's public interface (build system options, installation docs, etc.).
Simplifies file-reading C++ code, making it easier to understand and refactor in the future.
Here in 2024, users have other and better options for using LightGBM distributed training on data in HDFS, like:
lightgbm.dask
(Dask docs on Hadoop)SynapseML
(https://github.com/microsoft/SynapseML)The text was updated successfully, but these errors were encountered: