knit_meta_id grows out of control when kableExtra is loaded #1538

Closed
kevinushey opened this issue Feb 28, 2019 · 10 comments
Labels
bug an unexpected problem or unintended behavior

@kevinushey
Contributor

Not sure if this is a bug in kableExtra or rmarkdown, but reporting here for posterity.

library(kableExtra)
str(attr(knitr:::.knitEnv$meta, "knit_meta_id"))
#>  chr [1:14] "" "" "" "" "" "" "" "" "" "" "" "" "" ""
file <- tempfile(fileext = ".Rmd")
file.create(file)
#> [1] TRUE
rmarkdown::render(file, output_format = "html_notebook")
#> /usr/local/bin/pandoc +RTS -K512m -RTS file14f2f24ec1abe.utf8.md --to html4 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash+smart --output file14f2f24ec1abe.html --email-obfuscation none --self-contained --standalone --section-divs --template /Users/kevin/Library/R/3.5/library/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /var/folders/qt/txb728ms5012wp_4yn369wth0000gn/T//Rtmp231SWc/rmarkdown-str14f2f8cfc2e2.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --metadata pagetitle=file14f2f24ec1abe.utf8.md --variable code_folding=show --variable source_embed=file14f2f24ec1abe.Rmd --include-after-body /var/folders/qt/txb728ms5012wp_4yn369wth0000gn/T//Rtmp231SWc/file14f2f387f3.html --variable code_menu=1 --variable kable-scroll=1
#> 
#> Output created: file14f2f24ec1abe.nb.html
str(attr(knitr:::.knitEnv$meta, "knit_meta_id"))
#>  chr [1:196] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ...
rmarkdown::render(file, output_format = "html_notebook")
#> /usr/local/bin/pandoc +RTS -K512m -RTS file14f2f24ec1abe.utf8.md --to html4 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash+smart --output file14f2f24ec1abe.html --email-obfuscation none --self-contained --standalone --section-divs --template /Users/kevin/Library/R/3.5/library/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /var/folders/qt/txb728ms5012wp_4yn369wth0000gn/T//Rtmp231SWc/rmarkdown-str14f2f67d26286.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --metadata pagetitle=file14f2f24ec1abe.utf8.md --variable code_folding=show --variable source_embed=file14f2f24ec1abe.Rmd --include-after-body /var/folders/qt/txb728ms5012wp_4yn369wth0000gn/T//Rtmp231SWc/file14f2f254280aa.html --variable code_menu=1 --variable kable-scroll=1
#> 
#> Output created: file14f2f24ec1abe.nb.html
str(attr(knitr:::.knitEnv$meta, "knit_meta_id"))
#>  chr [1:2744] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ...

Created on 2019-02-27 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                                      
#>  version  R version 3.5.2 Patched (2019-01-14 r75994)
#>  os       macOS Mojave 10.14.3                       
#>  system   x86_64, darwin15.6.0                       
#>  ui       X11                                        
#>  language (EN)                                       
#>  collate  en_US.UTF-8                                
#>  ctype    en_US.UTF-8                                
#>  tz       America/Los_Angeles                        
#>  date     2019-02-27                                 
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                        
#>  assertthat    0.2.0      2017-04-11 [1] CRAN (R 3.5.0)                
#>  backports     1.1.3      2018-12-14 [1] CRAN (R 3.5.0)                
#>  base64enc     0.1-3      2015-07-28 [1] CRAN (R 3.5.0)                
#>  callr         3.1.1      2018-12-21 [1] CRAN (R 3.5.0)                
#>  cli           1.0.1      2018-09-25 [1] CRAN (R 3.5.0)                
#>  colorspace    1.4-0      2019-01-13 [1] standard (@1.4-0)             
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.0)                
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.5.0)                
#>  devtools      2.0.1      2018-10-26 [1] CRAN (R 3.5.2)                
#>  digest        0.6.18     2018-10-10 [1] CRAN (R 3.5.0)                
#>  evaluate      0.13       2019-02-12 [1] CRAN (R 3.5.2)                
#>  fs            1.2.6      2018-08-23 [1] CRAN (R 3.5.0)                
#>  glue          1.3.0      2018-07-17 [1] CRAN (R 3.5.0)                
#>  highr         0.7        2018-06-09 [1] standard (@0.7)               
#>  hms           0.4.2      2018-03-10 [1] CRAN (R 3.5.0)                
#>  htmltools     0.3.6      2017-04-28 [1] standard (@0.3.6)             
#>  httr          1.4.0      2018-12-11 [1] CRAN (R 3.5.0)                
#>  kableExtra  * 1.0.1      2019-01-22 [1] CRAN (R 3.5.2)                
#>  knitr         1.21       2018-12-10 [1] CRAN (R 3.5.2)                
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.0)                
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.5.0)                
#>  munsell       0.5.0      2018-06-12 [1] standard (@0.5.0)             
#>  pillar        1.3.1      2018-12-15 [1] CRAN (R 3.5.0)                
#>  pkgbuild      1.0.2      2018-10-16 [1] CRAN (R 3.5.0)                
#>  pkgconfig     2.0.2      2018-08-16 [1] CRAN (R 3.5.0)                
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.5.0)                
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.5.0)                
#>  processx      3.2.1      2018-12-05 [1] CRAN (R 3.5.0)                
#>  ps            1.3.0      2018-12-21 [1] CRAN (R 3.5.0)                
#>  R6            2.4.0      2019-02-14 [1] CRAN (R 3.5.2)                
#>  Rcpp          1.0.0.1    2019-02-12 [1] local                         
#>  readr         1.3.1      2018-12-21 [1] standard (@1.3.1)             
#>  remotes       2.0.2.9000 2019-01-22 [1] Github (r-lib/remotes@8397195)
#>  rlang         0.3.1      2019-01-08 [1] CRAN (R 3.5.2)                
#>  rmarkdown     1.11       2018-12-08 [1] standard (@1.11)              
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.5.0)                
#>  rstudioapi    0.9.0      2019-01-22 [1] local                         
#>  rvest         0.3.2      2016-06-17 [1] standard (@0.3.2)             
#>  scales        1.0.0      2018-08-09 [1] standard (@1.0.0)             
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.0)                
#>  stringi       1.3.1      2019-02-13 [1] CRAN (R 3.5.2)                
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 3.5.2)                
#>  testthat      2.0.1      2018-10-13 [1] standard (@2.0.1)             
#>  tibble        2.0.1      2019-01-12 [1] CRAN (R 3.5.2)                
#>  usethis       1.4.0      2018-08-14 [1] CRAN (R 3.5.0)                
#>  viridisLite   0.3.0      2018-02-01 [1] standard (@0.3.0)             
#>  webshot       0.5.1      2018-09-28 [1] CRAN (R 3.5.0)                
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.5.0)                
#>  xfun          0.5        2019-02-20 [1] CRAN (R 3.5.2)                
#>  xml2          1.2.0      2018-01-24 [1] CRAN (R 3.5.0)                
#>  yaml          2.2.0      2018-07-25 [1] standard (@2.2.0)             
#> 
#> [1] /Users/kevin/Library/R/3.5/library
#> [2] /Library/Frameworks/R.framework/Versions/3.5/Resources/library

Note that the knit_meta_id attribute grows in size each time render() is called. If render() is called enough times, the session will run out of memory. See rstudio/rstudio#2340 for a case where this happens in the wild -- e.g. in R Notebooks, we call render() behind the scenes multiple times in a single session.

The main things to note about this bug:

  1. Loading the kableExtra package modifies the .knitEnv$meta object;
  2. Calling rmarkdown::render() causes that object to explode in size, presumably from here:

    rmarkdown/R/render.R

    Lines 416 to 423 in bbd0786

    # reset knit_meta (and ensure it's always reset before exiting render)
    old_knit_meta <- knit_meta_reset()
    on.exit({
      knit_meta_reset()
      if (length(old_knit_meta)) {
        knitr::knit_meta_add(old_knit_meta, attr(old_knit_meta, 'knit_meta_id'))
      }
    }, add = TRUE)

I'm not sure if the issue here is that kableExtra is doing something it shouldn't, or if rmarkdown is not correctly saving / restoring the knit meta object.

@CrashVector

Yes!! I've been fighting this for the last few days trying to figure out what is going on! I have a server with 192 GB of RAM, and after about the third render to PDF using kableExtra, my rsession grinds to a halt, using ALL available RAM. I have to bring the server down in order to clear the session. While troubleshooting, I noticed that the more kable tables I have in my template, the quicker this happens. Does that make sense in terms of the underlying bug?

@nikkoc

nikkoc commented Feb 28, 2019

What's interesting is that I can recreate this bug using the code above, but as soon as I call knitr::knit_meta(), the problem goes away for each additional render and I have to restart in order to start getting the bug again.

It looks like kableExtra adds to knitr:::.knitEnv$meta every time kableExtra::usepackage_latex is called, which appears to happen by default on load. This seems to be why we get a vector of 14 empty strings for attr(knitr:::.knitEnv$meta, "knit_meta_id").

But having an initial value shouldn't cause the value to get LARGER with each additional render. Given the code in rmarkdown::render at lines 416:423 referenced above, it should just stay as a vector of 14 empty strings. Except it doesn't, so I think this is a bug in rmarkdown::render().

It seems like a quick workaround is to just call knitr::knit_meta() after loading kableExtra (or to be safe, before calling rmarkdown::render()) to make sure that the knit_meta is clear before calling the render function.

@CrashVector

CrashVector commented Feb 28, 2019

It's overkill, but I've added invisible(knitr::knit_meta(clean = TRUE)) before I call render in my main script, knitr::knit_meta() after loading kableExtra in my markdown, and invisible(gc()) after the render call in my main script.

BUT, I no longer have the runaway memory issue! My server takes 35 minutes to boot (a problem for another day), so I am so happy to have a workaround for this!
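For anyone who wants the workaround from the comments above in one place, here is a minimal sketch. The helper name render_clean is hypothetical (it is not part of knitr or rmarkdown); it only combines the knitr::knit_meta() and gc() calls already mentioned in this thread, and assumes both packages are installed.

```r
# Hypothetical convenience wrapper; not part of knitr or rmarkdown.
render_clean <- function(input, ...) {
  # Clear metadata accumulated earlier in the session (e.g. by loading
  # kableExtra) so render() has nothing to restore and multiply.
  invisible(knitr::knit_meta(clean = TRUE))
  # Trigger garbage collection when rendering finishes, even on error.
  on.exit(invisible(gc()), add = TRUE)
  rmarkdown::render(input, ...)
}
```

It would be called like rmarkdown::render() itself, e.g. render_clean("report.Rmd", output_format = "pdf_document"), where report.Rmd is a placeholder file name. With a version of knitr that includes the fix, the wrapper should no longer be needed.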

@nikkoc

nikkoc commented Feb 28, 2019

It looks like the length of attr(knitr:::.knitEnv$meta, "knit_meta_id") is being multiplied by the length of old_knit_meta inside knitr::knit_meta_add(). The knit_meta_add function is as follows:

knit_meta_add = function(meta, label = '') {
  if (length(meta)) {
    meta_id = attr(.knitEnv$meta, 'knit_meta_id')
    .knitEnv$meta = c(.knitEnv$meta, meta)
    attr(.knitEnv$meta, 'knit_meta_id') = c(meta_id, rep(label, length(meta)))
  }
  .knitEnv$meta
}

So we can see that it reps the label param for the length of the meta param as though it assumes the length of label is 1. However, render is passing the entire vector, so we repeat the entire vector for the length of the meta param. If we load kableExtra, that immediately makes the meta length 14 and the label length 14, so after the first iteration, the knit_meta_id attribute is now 14*14=196 strings long. Assuming the length of meta hasn't changed, the second iteration would now be 196*14=2,744 strings long. And so on and so forth.
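The multiplication described above can be reproduced without knitr or kableExtra. The following is a stand-alone simulation, not the packages' actual code: env, meta_add, and meta_reset are hypothetical stand-ins for knitr's internal .knitEnv, knit_meta_add(), and the reset step in render(), and the 14 list entries stand in for the metadata kableExtra registers on load.

```r
# `env` stands in for knitr's internal .knitEnv (hypothetical name).
env <- new.env()
env$meta <- NULL

# Same logic as the knit_meta_add() quoted above, acting on `env`.
meta_add <- function(meta, label = '') {
  if (length(meta)) {
    meta_id <- attr(env$meta, 'knit_meta_id')
    env$meta <- c(env$meta, meta)
    attr(env$meta, 'knit_meta_id') <- c(meta_id, rep(label, length(meta)))
  }
  env$meta
}

# Simplified stand-in for the reset: return the old metadata and clear it.
meta_reset <- function() {
  old <- env$meta
  env$meta <- NULL
  old
}

# Simulate the state after loading kableExtra: 14 entries, 14 empty ids.
meta_add(as.list(seq_len(14)))
id_lengths <- length(attr(env$meta, 'knit_meta_id'))

# Simulate two render() calls: reset on entry, then restore on exit,
# passing the whole id vector as `label`.
for (i in 1:2) {
  old <- meta_reset()
  meta_reset()
  if (length(old)) meta_add(old, attr(old, 'knit_meta_id'))
  id_lengths <- c(id_lengths, length(attr(env$meta, 'knit_meta_id')))
}

print(id_lengths)
#> [1]   14  196 2744
```

The metadata list itself stays at 14 entries; only the id attribute grows, by a factor of 14 per render.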

It looks like this chunk of code in rmarkdown::render() was added in PR #1153. Since I'm not familiar with how knitr metadata works, I don't know if the solution would be to change knitr::knit_meta_add() to somehow accept a vector or to change rmarkdown::render() to only pass through a vector of length 1. I'm leaning towards the former, but really I have no clue.

@yihui yihui added this to the v1.12 milestone Mar 4, 2019
@yihui yihui added the bug an unexpected problem or unintended behavior label Mar 4, 2019
@yihui
Member

yihui commented Mar 4, 2019

@nikkoc Thanks for investigating the issue and pointing out the root cause! You were absolutely correct, and I have fixed it in knitr. You may install the development version via

remotes::install_github('yihui/knitr')
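The knitr commit that references this issue is titled "should rep_len() instead of rep() on labels". The difference between the two calls can be sketched in isolation; this is an illustration of the two base R functions, not the verbatim knitr change, and the variable names are made up for the example.

```r
label <- rep('', 14)  # the id vector that render() passes back as `label`
n <- 14               # length(meta): number of metadata entries restored

# rep() repeats the whole vector n times, so the ids multiply.
length(rep(label, n))
#> [1] 196

# rep_len() recycles `label` to exactly n elements, so the ids stay in
# step with the metadata entries.
length(rep_len(label, n))
#> [1] 14

# A length-1 label is still recycled as before.
rep_len('chunk-1', 3)
#> [1] "chunk-1" "chunk-1" "chunk-1"
```

With rep_len(), restoring 14 metadata entries always yields 14 ids, whether the label has length 1 or 14.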

clrpackages pushed a commit to clearlinux-pkgs/R-knitr that referenced this issue Mar 11, 2019
…tly provided and not UTF-8

Emily Riederer (1):
      Implement sass / scss engines (#1666)

Han Oostdijk (1):
      engine cat honors options echo and eval (#1618)

Hiroaki Yutani (1):
      Make kable(format = "pandoc") fall back to markdown format when data has 0 row (#1678)

Inferrator (1):
      kable -- added a `label` argument that allows specifying a LaTeX refe… (#1655)

Sebastian Meyer (1):
      make stitch template work without knitr being attached (#1668)

Walter Zhang (1):
      Added complex data type to `is_numeric` (#1663)

Yihui Xie (59):
      remove file_ext again; hoping spelling will be updated before the next knitr release...
      use xfun::read/write_utf8()
      warn against non-UTF8 encodings!
      force UTF-8 in pandoc()
      import xfun::read/write_utf8()
      drop the encoding argument in Sweave2knitr()
      remove the encoding argument in knit2pandoc() and knit2pdf()
      only support UTF-8 knit2html()
      import xfun::file_string()
      remove the `encoding` argument in knit2wp()
      fix #1644: quote the output path in pandoc()
      no CRAN incoming check
      pandoc may not be installed on certain CRAN platforms; of course, it is the glorious Solaris!
      also try rmarkdown::pandoc_exec() to find pandoc, and use system2() instead of system()
      lost pandoc_cfg() in 9bfec7d11fa49ecfe4525abe2c2125ddeba77a57
      warn only against the encoding value that is explicitly provided and not UTF-8
      when there are multiple figures in a code chunk, their labels should be different; this should fix https://stackoverflow.com/q/53880195/559676
      don't attach the package; use the option `echo` as the example instead of `fig.path`
      a news item for #1655
      close rstudio/rmarkdown#1513: do not warn on encoding if the native encoding is UTF-8
      don't show code coverage status since it looks alarming but I don't really care that much
      state the source of this engine (in case we want to ping the original author in the future)
      write out the code in UTF-8
      don't overprotect users here: if they set the `package` option at all, we assume they have read the documentation
      merge if-else
      just signal the error if style is plain wrong, regardless of the chunk option error = TRUE or FALSE
      use getFromNamespace() instead of get() because get() has a bad default inherits = TRUE, which often causes more trouble than convenience
      mostly cosmetic changes
      tweak the news and bump version
      fix #1675: make fig.show='hide' work for include_graphics()
      close #1676: add chunk options `class.error`, `class.warning`, and `class.message` to customize the CSS classes for errors, warnings, and messages, respectively
      use tinytex::latexmk() instead of tools::texi2dvi() to compile tikz graphics
      fix rstudio/rmarkdown#1538: should rep_len() instead of rep() on labels
      actually check if the input file is encoded in UTF-8 or not: rstudio/rmarkdown#1513 (comment)
      pdflatex cannot compile a .tex file if the path is .\foo.tex: https://stackoverflow.com/q/54839403/559676 use basename() to remove the initial .\
      unused variables (thanks, RStudio)
      try to install dvisvgm automatically
      always stop() if conversion from dvi to svg fails
      use the magick package instead of ImageMagick to convert pdf figure generated by the tikz engine to other formats
      factor out tempfile(tmpdir = '.') as wd_tempfile()
      it seems image_read() cannot read PDF images at least on Windows; so use image_read_pdf() instead, and this will require the pdftools package
      install libmagick++-dev
      install libpoppler-cpp-dev for the R package pdftools
      a news item for #1618
      close #1656: clarify on the documentation of opts_current
      fix #1649: when external = FALSE, use hook_plot_tex() instead of the Markdown plot hook hook_plot_md_base()
      close #1648: support the chunk option out.extra for include_url()
      cosmetic
      don't remove empty lines in purl()
      add the `encoding` argument back to knit2pandoc() for backward compatibility (this is primarily for the argparse package, which has an Rrst vignette on CRAN compiled through knit2pandoc())
      instead of preserving all empty lines like 46826deffbe30815eab2c159dcc17966fca90967, only preserve those not in the beginning or end
      fix a test on Windows: if the locale is not based on UTF-8, this expression should just return TRUE to avoid an error from assert()
      a more robust implementation of is_abs_path()
      use the assert('', {}) syntax
      there isn't really a good reason to install all soft dependencies
      point to the blog post https://yihui.name/en/2018/11/biggest-regret-knitr/
      Revert "a more robust implementation of is_abs_path()"
      vectorize is_abs_path()
      CRAN release v1.22
@haozhu233
Contributor

I also improved this on the kableExtra side in the new version, so those LaTeX packages are only loaded when knitr::is_latex_output() == TRUE.

https://github.com/haozhu233/kableExtra/blob/e69a890301f8c13047f4bc3453f944de7211be5f/R/zzz.R#L2-L20
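As a rough illustration of that kind of guard (kableExtra's real code is at the link above; this is not a copy of it), the sketch below registers LaTeX dependencies only when the output is LaTeX. The helper register_latex_deps and its latex_output argument are hypothetical; knitr::is_latex_output(), knitr::knit_meta_add(), and rmarkdown::latex_dependency() are existing functions.

```r
# Hypothetical helper; not kableExtra's actual implementation.
register_latex_deps <- function(pkgs, latex_output = knitr::is_latex_output()) {
  # Skip registration entirely for non-LaTeX output (e.g. HTML notebooks),
  # so nothing is added to knitr's metadata.
  if (!isTRUE(latex_output)) return(invisible(character(0)))
  for (pkg in pkgs) {
    knitr::knit_meta_add(list(rmarkdown::latex_dependency(pkg)))
  }
  invisible(pkgs)
}

# With LaTeX output switched off, the call is a no-op.
skipped <- register_latex_deps(c("booktabs", "longtable"), latex_output = FALSE)
length(skipped)
#> [1] 0
```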

@Kreitzberg

I believe I am still having this issue. I was able to knit an RStudio notebook until I added kable tables with kableExtra, and now I'm running out of memory. I'm not sure how I should go about making sure that this is the issue. I installed the dev version of knitr and it's still happening.

Any help would be awesome.

@kevinushey
Contributor Author

@Kreitzberg you are most likely seeing a separate issue -- I cannot reproduce the original problem I reported with the development version of knitr.

If you can, I would recommend filing a new issue with a reproducible example. (If you cannot create a reproducible example, it is unlikely that your problem will be fixed.)

@Kreitzberg

Never mind, I realized I forgot to use head() and was printing large tables! Thanks for your help.

@github-actions

github-actions bot commented Nov 3, 2020

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 3, 2020