Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Whether xgboost_requires needs to be changed #797

Open
t-karatsu opened this issue Feb 17, 2021 · 11 comments
Open

Whether xgboost_requires needs to be changed #797

t-karatsu opened this issue Feb 17, 2021 · 11 comments

Comments

@t-karatsu
Copy link

I'm sorry if I made a mistake in the place to post.

https://github.com/dask/dask-ml/blob/main/setup.py
Currently dask-ml depends on the following OSS when building with xgboost function:

xgboost_requires = ["dask-xgboost", "xgboost"]

https://github.com/dmlc/xgboost/blob/master/python-package/setup.py
However currently xgboost owns the dask option.
Does xgboost_requires need to be modified?

  • xgboost+dask only
    or
  • dask-xgboost and xgboost+dask
@adamjstewart
Copy link

To clarify, according to dask/dask-xgboost#80, dask-xgboost is now deprecated and packages should now depend on xgboost+dask instead.

@TomAugspurger
Copy link
Member

Yep, @adamjstewart is correct. (it'd just be xgboost in the dask-ml[xgboost]` extra, since dask is already in the main requirements).

@t-karatsu
Copy link
Author

t-karatsu commented Feb 19, 2021

@TomAugspurger
Thanks for your reply.
I tried to correct some sources and build, but one of the xgboost modules was not found when building docs. Is there a problem with my fix?

  • Draft of correction
diff --git a/dask_ml/xgboost.py b/dask_ml/xgboost.py
index 86e841db..70551df7 100644
--- a/dask_ml/xgboost.py
+++ b/dask_ml/xgboost.py
@@ -4,4 +4,4 @@ This may be used for training an XGBoost model on a cluster. XGBoost
 will be setup in distributed mode alongside your existing
 ``dask.distributed`` cluster.
 """
-from dask_xgboost import *  # noqa
+from xgboost import *  # noqa
diff --git a/setup.py b/setup.py
index 857f6911..fb280973 100644
--- a/setup.py
+++ b/setup.py
@@ -35,7 +35,7 @@ test_requires = [
     "pytest-mock",
 ]
 dev_requires = doc_requires + test_requires
-xgboost_requires = ["dask-xgboost", "xgboost"]
+xgboost_requires = ["xgboost[dask]"]
 complete_requires = xgboost_requires

 extras_require = {
  • Occurred warning

Only the dask_ml.xgboost.predict module was not found.

==> [2021-02-19-18:10:56.073922] 'make' '-j16' 'html'
Running Sphinx v3.2.0
making output directory... done
WARNING: html_static_path entry '_static' does not exist
loading intersphinx inventory from https://docs.python.org/3.6/objects.inv...
loading intersphinx inventory from http://scikit-learn.org/stable/objects.inv...
loading intersphinx inventory from https://docs.dask.org/en/latest/objects.inv...
loading intersphinx inventory from https://distributed.dask.org/en/latest/objects.inv...
loading intersphinx inventory from http://dask-glm.readthedocs.io/en/latest/objects.inv...
intersphinx inventory has moved: http://scikit-learn.org/stable/objects.inv -> https://scikit-learn.org/stable/objects.inv
intersphinx inventory has moved: http://dask-glm.readthedocs.io/en/latest/objects.inv -> https://dask-glm.readthedocs.io/
en/latest/objects.inv
[autosummary] generating autosummary for: changelog.rst, clustering.rst, compose.rst, contributing.rst, cross_validation.rst, examples.rst, glm.rst, history.rst, hyper-parameter-search.rst, incremental.rst, ..., keras.rst, meta-estimators.rst, modules/api.rst, modules/generted/dask_ml.compose.ColumnTransformer.rst, modules/generted/dask_ml.compose.make_column_transformer.rst, naive-bayes.rst, preprocessing.rst, pytorch.rst, roadmap.rst, xgboost.rst
WARNING: [autosummary] failed to import 'dask_ml.xgboost.predict': no module named dask_ml.xgboost.predict

This module was not loaded.

reading sources... [ 86%] modules/generated/dask_ml.wrappers.ParallelPostFit
reading sources... [ 87%] modules/generated/dask_ml.xgboost.XGBClassifier
reading sources... [ 89%] modules/generated/dask_ml.xgboost.XGBRegressor
reading sources... [ 90%] modules/generated/dask_ml.xgboost.train
reading sources... [ 91%] modules/generted/dask_ml.compose.ColumnTransformer

@t-karatsu
Copy link
Author

@TomAugspurger
Could you check this?
Is it still difficult to be consistent to change dependencies now?

@TomAugspurger
Copy link
Member

I can take a look next week.

@t-karatsu
Copy link
Author

Hello, @TomAugspurger. What’s the situation of this issue after that?

@TomAugspurger
Copy link
Member

I think we can just remove the xgboost stuff in the setup.py and update the docs at https://ml.dask.org/xgboost.html to use it.

People should just use xgboost.dask, and there's really no reason to add it under the dask_ml.xgboost namespace.

@jakirkham
Copy link
Member

cc @hcho3 @trivialfis @JohnZed (for vis)

@t-karatsu
Copy link
Author

t-karatsu commented Mar 26, 2021

What you want to say is

  • Delete dask-ml / xgboost.py
  • To use the xgboost function, call it directly from xgboost.dask

don't you? I understood, thanks.

@TomAugspurger
Copy link
Member

To use the xgboost function, call it directly from xgboost.dask

Correct.

Delete dask-ml / xgboost.py

We'll want to add a deprecation warning to that model that emits when people import it.

@mmccarty
Copy link
Member

Looks like this is being worked on in #844

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants