[R-package] Rewrite R demos, replace with vignettes #1944

Closed
Laurae2 opened this issue Jan 13, 2019 · 9 comments

Comments

@Laurae2
Contributor

Laurae2 commented Jan 13, 2019

All of the R package demos should be rewritten to be short, concise, and focused on specific topics. The current demos confuse end users who are trying to pick up the library (see some of the recent issues, especially the multiclass ones).

I'll rewrite most of them over the next few months.

@StrikerRUS
Collaborator

Closed in favor of tracking this in #2302. We decided to keep all feature requests in one place.

Contributions for this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing it.

@jameslamb
Collaborator

I want to add (for whoever comes back to this issue) that we should consider factoring ggplot2 out of the demos when they are rewritten. Right now the demos are the only reason ggplot2 is still a Suggests dependency of the R package (see #2543 for more info), and I don't think it gives us much benefit beyond what can be done with the base plotting options available in R.

@jameslamb
Collaborator

I think that when we come to this, we should change from the current demo/ setup to vignettes in R Markdown. That will provide a better experience for people reading the documentation on the docs site, because we can mix rich text (in Markdown) with the code. Right now the demos are all code and comments, and you can only find them if you know how to use the demo() function in an R session.
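To make the discoverability point concrete, here is a minimal sketch of how demos and vignettes are each found from an R session. It assumes the {lightgbm} package is installed and uses only base R's `demo()`, `vignette()`, and `browseVignettes()`. What gets listed (if anything) depends on the installed version.

```r
# List the demos shipped with an installed package.
# This is how the current demo/ scripts are discovered.
demos <- demo(package = "lightgbm")
print(demos)

# List the vignettes shipped with the same package.
vigs <- vignette(package = "lightgbm")
print(vigs)

# In an interactive session, open an HTML index of the vignettes,
# where prose and code are rendered together.
if (interactive()) {
    browseVignettes("lightgbm")
}
```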

jameslamb mentioned this issue on May 12, 2020
jameslamb changed the title from "[R-package] Rewrite R demos" to "[R-package] Rewrite R demos, replace with vignettes" on May 16, 2020

@jameslamb
Collaborator

I just updated the title of this issue per this comment: #1944 (comment)

@mayer79
Contributor

mayer79 commented Feb 7, 2021

@jameslamb Are there still plans to replace the demos with simple vignettes? If yes, I could give it a try.

@jameslamb
Collaborator

Yes, definitely! I have been focused on the new Dask module in the Python package recently, so I haven't been giving as much attention to the R side. I'd love it if you submitted a proposal for this.

I don't think you should try to move all demos at once. I think it would be good to have a first pull request that just adds an introductory "Getting Started with LightGBM" vignette, to replace https://github.com/microsoft/LightGBM/blob/master/R-package/demo/basic_walkthrough.R.

Thank you so much for offering to help 😀

jameslamb reopened this on Feb 7, 2021

@mayer79
Contributor

mayer79 commented Feb 7, 2021

Sure thing!

@jameslamb
Collaborator

Pulling this discussion over from #4775 (comment), I'd like to propose the following list of vignettes to start with (the names of relevant demos from https://github.com/microsoft/LightGBM/tree/f7e39388d0327e687854f22d7e06aec302770994/R-package/demo are in parentheses).

@mayer79 what do you think about this list? Any topics you think are missing?

  • add one example to https://github.com/microsoft/LightGBM/blob/f7e39388d0327e687854f22d7e06aec302770994/R-package/vignettes/basic_walkthrough.Rmd for each main machine learning task
    • learning to rank
    • binary classification
    • multiclass classification
    • regression
    • quantile regression
  • "loading training data"
    • explanation of the Dataset object and all the supported formats for input data (incl. loading from file)
    • saving Datasets
    • slicing Datasets, creating validation sets
    • discussion of key parameters like max_bin
  • "customizing the training process"
    • weighted training (weight_param.R)
    • custom objective functions (multiclass_custom_object.R)
    • early stopping (early_stopping.R)
    • custom evaluation metrics
    • boosting from an init_score (boost_from_prediction.R)
    • training continuation using init_model
  • "tuning parameters"
    • how to use lgb.cv() (cross_validation.R)
  • "saving and loading models"
    • saveRDS() / readRDS()
    • saving models as text
    • saving models as JSON
    • training in R and loading in Python (and the opposite)
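As a rough illustration of the scale such vignette examples might have, here is a minimal sketch touching three of the topics above: creating a validation set, early stopping, and saving a model as text. It assumes a recent {lightgbm} release and uses the `agaricus` data bundled with the package. The parameter values are arbitrary choices that keep the run short, not recommendations.

```r
library(lightgbm)

# Small example data shipped with the package (sparse matrix + 0/1 label).
data(agaricus.train, package = "lightgbm")
data(agaricus.test, package = "lightgbm")

# Build the training Dataset, then a validation Dataset that reuses
# the training set's binning.
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)
dvalid <- lgb.Dataset.create.valid(
    dtrain,
    agaricus.test$data,
    label = agaricus.test$label
)

# Illustrative parameters only; small values keep the example fast.
params <- list(
    objective = "binary",
    metric = "binary_logloss",
    num_leaves = 7L,
    learning_rate = 0.1,
    num_threads = 1L
)

# Train with early stopping: stop if the validation metric has not
# improved for 3 consecutive rounds.
bst <- lgb.train(
    params = params,
    data = dtrain,
    nrounds = 20L,
    valids = list(valid = dvalid),
    early_stopping_rounds = 3L,
    verbose = -1L
)

# Save the model as a text file, load it back, and predict with the copy.
model_file <- tempfile(fileext = ".txt")
lgb.save(bst, model_file)
bst_loaded <- lgb.load(model_file)

preds <- predict(bst_loaded, agaricus.test$data)
summary(preds)
```

Using a small bundled dataset, few rounds, and a single thread is one way to keep vignette build times low.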

Also linking #4859. I think {pkgdown} offering the ability to do redirects on vignette links will be useful, since we might realize we want a different structure after working through this.

I think we should try to avoid documenting too much information that overlaps with other docs in the project. For example, the R-package README (https://lightgbm.readthedocs.io/en/latest/R/index.html) already has significant details about different installation methods, so vignettes don't need to cover that.

Also linking this important discussion with @StrikerRUS on being cautious with vignette code that takes too long to run:

#4775 (comment) and #4775 (comment)
