-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[R-package] [docs] add intro vignette #3946
Conversation
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for taking the time to set this up!! I left a first round of comments for your consideration. Later tonight I'll try pulling this and knitting the vignettes myself, and I also try pkgdown::build_site()
to see what that output looks like.
This comment has been minimized.
This comment has been minimized.
Co-authored-by: James Lamb <[email protected]>
… into vignette-basics
Co-authored-by: James Lamb <[email protected]>
Co-authored-by: James Lamb <[email protected]>
… into vignette-basics
Co-authored-by: James Lamb <[email protected]>
Co-authored-by: James Lamb <[email protected]>
Co-authored-by: James Lamb <[email protected]>
Co-authored-by: James Lamb <[email protected]>
Co-authored-by: James Lamb <[email protected]>
… into vignette-basics
Co-authored-by: James Lamb <[email protected]>
Co-authored-by: James Lamb <[email protected]>
… into vignette-basics
Co-authored-by: James Lamb <[email protected]>
Sorry, I know I just marked this "Ready for Review" but I want to move it back to Draft while I investigate one other thing. I think that I might be able to make this work with CMake-based builds, which would simplify this PR a lot. |
/gha run r-valgrind Workflow R valgrind tests has been triggered! 🚀 Status: failure ❌. |
/gha run r-solaris Workflow Solaris CRAN check has been triggered! 🚀 solaris-x86-patched: https://builder.r-hub.io/status/lightgbm_3.3.1.99.tar.gz-7206a1d9c07147dbb811520a9b1e43a3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think this is ready for review! Since so much of the code on this branch is now mine, I don't think my approval should count toward a merge.
@StrikerRUS @Laurae2 when you have time, could you please take a look?
@mayer79 I'd also welcome any suggestions you have on the changes I've pushed.
There has been a lot of discussion and comments on this PR as we worked through it, so I'll try to summarize the key points here.
There is also a section in "Writing R Extensions" dedicated to building vignettes, which could be helpful for reviewing: https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Writing-package-vignettes.
1. Building {lightgbm}
now requires installing it
details (click me)
The package has to now be installed during R CMD build
in order to build vignettes.
From "Writing R Extensions":
R CMD build will automatically create the (PDF or HTML versions of the) vignettes in inst/doc for distribution with the package sources. By including the vignette outputs in the package sources it is not necessary that these can be re-built at install time, i.e., the package author can use private R packages, screen snapshots and LaTeX extensions which are only available on their machine.
That means that code like this:
sh build-cran-package.sh
R CMD INSTALL lightgbm_*.tar.gz
will now end up compiling LightGBM twice.
This makes our CI jobs take a bit longer, but the effect on users should be minimal. Vignettes won't be re-built when anyone runs install.packages()
to install from CRAN. This PR introduces the option --no-build-vignettes
to build-cran-package.sh
and build_r.R
to allow people building from source to opt out of rebuilding the vignettes.
It also means that all of {lightgbm}
's dependencies have to be installed before running R CMD build
.
2. {lightgbm}
now depends on {knitr}
and {rmarkdown}
details (click me)
I'm proposing adding these as Suggests
dependencies. They're not necessary to install the package so users shouldn't be impacted. You can see from the list of "Reverse Suggests" at https://cran.r-project.org/web/packages/knitr/index.html and https://cran.r-project.org/web/packages/rmarkdown/index.html that this is a common pattern.
3. build-cran-package.sh
now needs to carefully remove object files from the package tarball
details (click me)
See this comment from early in the development on this PR: #3946 (comment).
The installation of the package during R CMD build
leaves behind .o
files, and these cannot be distributed with the package. Adding rules to .Rbuildignore
did not prevent this issue. Neither did adding a clean:
target to src/Makevars.in
.
I believe the issue is that R's own code automatically cleans up .o
files during this stage, but only if they're in the current working directory at build time.
Notice that the objects listed in #3946 (comment) are only those nested under src/
... lightgbm_R.o
and c_api.o
are remmoved correctly.
R CMD build
calls an internal function cleanup_pkg()
here:
But that function's logic will only clean up *.o
/ *.so
files one level below the current working directory!
I'm planning to open a bug report with R, but even if that goes well it'll be a long time (like on the order of years) until all versions of R that LightGBM supports have a fix.
So for now, I'm proposing adding code in build-cran-package.sh
that untars the package, removes those .o
files, and re-tars it. That code could be removed in the future if #3249 is implemented and the R package uses a unity build instead (as @hcho3 suggested in #3188 (comment)).
Thanks very much!
dismissing request-changes review to unblock eventual merging
r-knitr=1.35=r41hc72bb7e_0 \ | ||
r-matrix=1.3_4=r41he454529_0 \ | ||
r-pkgdown=1.6.1=r41hc72bb7e_0 \ | ||
r-rmarkdown=2.11=r41hc72bb7e_0 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't check these changes at RTD because this PR is made from a fork. I suggest re-target this PR's branch to new let's say dev
branch and merge it instantly. Then I'll be able to activate RTD test builds for that branch. After all checks passed we merge dev
into master
via new PR. WDYT about this plan?
Also, I guess we need some changes in our pkgdown
config file to make vignettes publicly available and rendered at our docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest re-target this PR's branch to new let's say dev branch and merge it instantly
Yep I agree! Good idea. I'll make that dev
branch right now and re-target this.
I guess we need some changes in our pkgdown config file
Ah you're right! According to https://pkgdown.r-lib.org/articles/pkgdown.html#articles, {pkgdown}
always automatically builds the HTML for vignettes, but I think since {lightgbm}
's config explicitly sets the navigation, we need to explicitly add a navbar item for it.
Added in ddd9a62.
Tested locally with
sh build-cran-package.sh
R CMD INSTALL --with-keep.source lightgbm_3.3.1.99.tar.gz
cd R-package
Rscript -e "pkgdown::build_site( \
lazy = FALSE \
, install = FALSE \
, devel = FALSE \
, examples = TRUE \
, run_dont_run = TRUE \
, seed = 42L \
, preview = FALSE \
, new_process = TRUE \
)
"
open docs/index.html
Vignettes end up on an "Articles" page like this:
And then clicking through the link there, I see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pointed this at a new LightGBM branch, dev-vignettes
. Decided not to literally use "dev
", just because that has a special meaning in some git workflows, and I didn't want anyone coming to this repo to misinterpret and start building from that branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved this over to #4775
* [R-package] [docs] add intro vignette (#3946) * add 10 test vignettes * Revert "add 10 test vignettes" This reverts commit 40fb2e2. * Apply suggestions from code review Co-authored-by: Nikita Titov <[email protected]> Co-authored-by: Michael Mayer <[email protected]> Co-authored-by: Nikita Titov <[email protected]>
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
First step towards #1944.
This initial vignette shows the very basic workflow of LightGBM. As such, it would not replace the demo with the corresponding name as that seems rather a guide how to deal with sparse data.
The html looks as in this zip here:
basic_walkthrough.zip
Some notes on R vignettes in general