remove encoding parameter in render() to avoid redundant and annoying warning #1513

Closed
dongzhuoer opened this issue Jan 12, 2019 · 12 comments

@dongzhuoer

dongzhuoer commented Jan 12, 2019

Currently, knitr::knit('input_file', encoding = 'native.enc') gives this warning:

The encoding ("native.enc") is not UTF-8. We will only support UTF-8 in the future. Please re-save your file "demo.Rmd" with the UTF-8 encoding.

Although knitr::knit() checks

if (!missing(encoding) && encoding != 'UTF-8') warning(
  'The encoding ("', encoding, '") is not UTF-8. We will only support UTF-8 in',
  ' the future. Please re-save your file "', input, '" with the UTF-8 encoding.'
)

to ensure that knitr::knit('input_file') won't issue the warning, rmarkdown::render('input_file') would call something like

knitr::knit('input_file', encoding = getOption('encoding'))

In this case, encoding is not missing, so the annoying warning appears even though the user never set an encoding and has no idea what triggered it (my input_file is DEFINITELY saved in UTF-8).
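
The mechanism described above can be reproduced in base R alone. This is a minimal sketch, not the actual knitr or rmarkdown code: knit_like(), render_like(), and warns() are hypothetical stand-ins, and the warning text is shortened.

```r
# Stand-in for knitr::knit(): warns only when `encoding` is supplied
# explicitly and is not "UTF-8" (same shape as the check quoted above).
knit_like <- function(input, encoding = getOption("encoding")) {
  if (!missing(encoding) && encoding != "UTF-8") {
    warning('The encoding ("', encoding, '") is not UTF-8.')
  }
  invisible(input)
}

# Stand-in for rmarkdown::render(): passes the option value on explicitly,
# so missing(encoding) is FALSE inside knit_like().
render_like <- function(input) {
  knit_like(input, encoding = getOption("encoding"))
}

# TRUE if evaluating `expr` raised a warning, FALSE otherwise.
warns <- function(expr) {
  tryCatch({ expr; FALSE }, warning = function(w) TRUE)
}

# Default argument used: missing(encoding) is TRUE, so no warning.
warns(knit_like("demo.Rmd"))

# Value forwarded by the wrapper: warns whenever the option is not "UTF-8",
# e.g. with R's default of "native.enc".
warns(render_like("demo.Rmd"))
```

The file is never read in either call, so the warning depends only on how the argument was passed, not on how the file is actually encoded.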

@dongzhuoer changed the title from "encoding" to "remove encoding parameter in render() to avoid redundant and annoying warning" Jan 12, 2019
@yihui
Member

yihui commented Jan 13, 2019

If your input file is saved in UTF-8, you should call rmarkdown::render("input.Rmd", encoding = "UTF-8"). Using the default encoding value in render() could be wrong in this case (especially if you are on Windows).

@dongzhuoer
Author

I use Ubuntu and just call rmarkdown::render("input.Rmd"), and everything works well. I tried adding options(encoding = "UTF-8") to ~/.Rprofile, but that results in an error when installing R packages.

By the way, if a user presses the Knit button in RStudio, there is no way (at least no easy way) to add encoding = "UTF-8".

@dongzhuoer
Author

Anyway, I still think the warning is annoying and not instructive.

What about soft-deprecating the encoding parameter of rmarkdown::render() and requiring that the input file be encoded in UTF-8? Internally, rmarkdown::render() can call

knitr::knit('input_file', encoding = "UTF-8")

@dongzhuoer
Author

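Another workaround is

# Sketch of a patched rmarkdown::render()
render <- function(input, ..., encoding = getOption("encoding")) {
  # ... (rest of render() elided)
  if (missing(encoding)) {
    # encoding not supplied by the caller: let knit() fall back to its own default
    knitr::knit(input, ...)
  } else {
    # encoding supplied explicitly: pass it through
    knitr::knit(input, ..., encoding = encoding)
  }
}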

@fernandomayer

Any thoughts on this?

I use Arch Linux and always use the devtools::install_github() versions of knitr and rmarkdown. My files are all UTF-8, and I never saw that message until a few days ago...

@yihui yihui added this to the v1.12 milestone Jan 24, 2019
@yihui
Member

yihui commented Jan 24, 2019

If you reinstall knitr from GitHub, the warning should be gone. Thanks!

@fernandomayer

Great, thanks!

@dpprdan
Contributor

dpprdan commented Mar 1, 2019

@yihui: On Windows, CP1252 encoding, I still get this message when I run render("test.Rmd") with the current github versions of both rmarkdown and knitr. test.Rmd is UTF-8 encoded.

If I understand correctly,

knit = function([...]encoding = getOption('encoding')) {
[...]
    if (!missing(encoding) && !is_utf8_enc(encoding)) warning(
      'The encoding ("', encoding, '") is not UTF-8. We will only support UTF-8 in',
      ' the future. Please re-save your file "', input, '" with the UTF-8 encoding.'
      )
   } [...]

will always throw this warning on Windows by default, because the getOption('encoding') default is:

getOption('encoding')
#> [1] "native.enc"

Also I think that Please re-save your file "test.Rmd" with the UTF-8 encoding. is misleading (and in my case above it is wrong), because the actual encoding of the file is not checked (again IIUC). I guess what is really meant is "Specify encoding = "UTF-8" and make sure your file is UTF-8 encoded."?

I guess this is all in preparation to drop the encoding argument in an upcoming v2 and we are talking migration paths/how to support legacy code here? Would it be possible to have knit() check whether the input is indeed UTF-8 encoded (reliably and with little overhead)?
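
One way such a check could look, as a rough sketch in base R rather than anything knitr is known to do: file_is_utf8() is a hypothetical helper that reads the raw bytes and validates them with validUTF8() (available in base R since 3.3.0). It assumes a plain text file with no embedded NUL bytes, since rawToChar() would fail on those.

```r
# Hypothetical helper (not part of knitr): TRUE if the bytes of `path`
# form a valid UTF-8 sequence. Pure ASCII files also count as valid.
file_is_utf8 <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  validUTF8(rawToChar(bytes))
}

# Two small test files containing "café" followed by a newline.
utf8_file   <- tempfile(fileext = ".Rmd")
latin1_file <- tempfile(fileext = ".Rmd")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xc3, 0xa9, 0x0a)), utf8_file)  # UTF-8 bytes
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), latin1_file)      # Latin-1 bytes

file_is_utf8(utf8_file)    # valid UTF-8
file_is_utf8(latin1_file)  # 0xe9 without continuation bytes is not valid UTF-8
```

Reading the bytes directly avoids any re-encoding by the connection, so the result does not depend on getOption("encoding") or the session locale.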

@dpprdan
Contributor

dpprdan commented Mar 1, 2019

will always throw this warning on Windows by default

I was partly wrong about that. The default, knit(f, encoding = getOption('encoding')), does not throw a warning (for reasons I don't yet understand), but evaluating getOption('encoding') beforehand and passing that on to the encoding argument does.

library(knitr)
library(rmarkdown)
(f = system.file("examples", "knitr-minimal.Rmd", package = "knitr"))
#> [1] "C:/Users/daniel/Documents/.R/win-library/knitr/examples/knitr-minimal.Rmd"
knit(f)
#> processing file: C:/Users/daniel/Documents/.R/win-library/knitr/examples/knitr-minimal.Rmd
#> output file: knitr-minimal.md
#> [1] "knitr-minimal.md"
(enc <- getOption("encoding"))
#> [1] "native.enc"
knit(f, encoding = enc)
#> Warning in knit(f, encoding = enc): The encoding ("native.enc") is not
#> UTF-8. We will only support UTF-8 in the future. Please re-save your file
#> "C:/Users/daniel/Documents/.R/win-library/knitr/examples/knitr-minimal.Rmd"
#> with the UTF-8 encoding.
#> processing file: C:/Users/daniel/Documents/.R/win-library/knitr/examples/knitr-minimal.Rmd
#> output file: knitr-minimal.md
#> [1] "knitr-minimal.md"

So the default knitr::knit(f, encoding = getOption('encoding')) won’t throw a warning on Windows, but the default rmarkdown::render(f, encoding = getOption('encoding')) will.

render(f)
#> Warning in knitr::knit(knit_input, knit_output, envir = envir, quiet =
#> quiet, : The encoding ("native.enc") is not UTF-8. We will only support
#> UTF-8 in the future. Please re-save your file "knitr-minimal.Rmd" with the
#> UTF-8 encoding.
#> processing file: knitr-minimal.Rmd
#> output file: knitr-minimal.knit.md
#> "C:/PROGRA~1/Pandoc/pandoc" +RTS -K512m -RTS knitr-minimal.utf8.md --to html4 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash+smart --output knitr-minimal.html --email-obfuscation none --self-contained --standalone --section-divs --template "C:\Users\daniel\Documents\.R\win-library\rmarkdown\rmd\h\default.html" --no-highlight --variable highlightjs=1 --variable "theme:bootstrap" --include-in-header "C:\Temp\R\Rtmpsr9a64\rmarkdown-stra74479860aa.html" --mathjax --variable "mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" --metadata pagetitle=knitr-minimal.utf8.md
#> 
#> Output created: knitr-minimal.html
Session info
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.5.2 (2018-12-20)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language en                          
#>  collate  German_Germany.1252         
#>  ctype    German_Germany.1252         
#>  tz       Europe/Berlin               
#>  date     2019-03-01                  
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version date       lib source                            
#>  assertthat    0.2.0   2017-04-11 [1] CRAN (R 3.5.1)                    
#>  backports     1.1.3   2018-12-14 [1] CRAN (R 3.5.1)                    
#>  callr         3.1.1   2018-12-21 [1] CRAN (R 3.5.2)                    
#>  cli           1.0.1   2018-09-25 [1] CRAN (R 3.5.1)                    
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.5.1)                    
#>  curl          3.3     2019-01-10 [1] CRAN (R 3.5.2)                    
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.5.1)                    
#>  devtools      2.0.1   2018-10-26 [1] CRAN (R 3.5.1)                    
#>  digest        0.6.18  2018-10-10 [1] CRAN (R 3.5.1)                    
#>  evaluate      0.13    2019-02-12 [1] CRAN (R 3.5.2)                    
#>  fs            1.2.6   2018-08-23 [1] CRAN (R 3.5.1)                    
#>  glue          1.3.0   2018-07-17 [1] CRAN (R 3.5.1)                    
#>  highr         0.7     2018-06-09 [1] CRAN (R 3.5.1)                    
#>  htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.5.1)                    
#>  httr          1.4.0   2018-12-11 [1] CRAN (R 3.5.1)                    
#>  knitr       * 1.21.10 2019-03-01 [1] Github (yihui/knitr@c2aae98)      
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.5.1)                    
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.5.1)                    
#>  mime          0.6     2018-10-05 [1] CRAN (R 3.5.1)                    
#>  pkgbuild      1.0.2   2018-10-16 [1] CRAN (R 3.5.1)                    
#>  pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.5.1)                    
#>  prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.5.1)                    
#>  processx      3.2.1   2018-12-05 [1] CRAN (R 3.5.1)                    
#>  ps            1.3.0   2018-12-21 [1] CRAN (R 3.5.2)                    
#>  R6            2.4.0   2019-02-14 [1] CRAN (R 3.5.2)                    
#>  Rcpp          1.0.0   2018-11-07 [1] CRAN (R 3.5.1)                    
#>  remotes       2.0.2   2018-10-30 [1] CRAN (R 3.5.1)                    
#>  rlang         0.3.1   2019-01-08 [1] CRAN (R 3.5.2)                    
#>  rmarkdown   * 1.11.9  2019-03-01 [1] Github (rstudio/rmarkdown@3901a9d)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.5.1)                    
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.5.1)                    
#>  stringi       1.3.1   2019-02-13 [1] CRAN (R 3.5.2)                    
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 3.5.2)                    
#>  testthat      2.0.1   2018-10-13 [1] CRAN (R 3.5.1)                    
#>  usethis       1.4.0   2018-08-14 [1] CRAN (R 3.5.1)                    
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.5.1)                    
#>  xfun          0.5     2019-02-20 [1] CRAN (R 3.5.2)                    
#>  xml2          1.2.0   2018-01-24 [1] CRAN (R 3.5.1)                    
#>  yaml          2.2.0   2018-07-25 [1] CRAN (R 3.5.1)                    
#> 
#> [1] C:/Users/daniel/Documents/.R/win-library
#> [2] C:/Program Files/R/R-3.5.2/library

@yihui
Member

yihui commented Mar 1, 2019

@dpprdan Yes, knitr should check if the encoding of the input file is UTF-8 or not. The warning message is indeed misleading here. I'll look into it. Thanks for the report!

@yihui
Member

yihui commented Mar 4, 2019

@dpprdan Done in knitr.

clrpackages pushed a commit to clearlinux-pkgs/R-knitr that referenced this issue Mar 11, 2019
…tly provided and not UTF-8

Emily Riederer (1):
      Implement sass / scss engines (#1666)

Han Oostdijk (1):
      engine cat honors options echo and eval (#1618)

Hiroaki Yutani (1):
      Make kable(format = "pandoc") fall back to markdown format when data has 0 row (#1678)

Inferrator (1):
      kable -- added a `label` argument that allows specifying a LaTeX refe… (#1655)

Sebastian Meyer (1):
      make stitch template work without knitr being attached (#1668)

Walter Zhang (1):
      Added complex data type to `is_numeric` (#1663)

Yihui Xie (59):
      remove file_ext again; hoping spelling will be updated before the next knitr release...
      use xfun::read/write_utf8()
      warn against non-UTF8 encodings!
      force UTF-8 in pandoc()
      import xfun::read/write_utf8()
      drop the encoding argument in Sweave2knitr()
      remove the encoding argument in knit2pandoc() and knit2pdf()
      only support UTF-8 knit2html()
      import xfun::file_string()
      remove the `encoding` argument in knit2wp()
      fix #1644: quote the output path in pandoc()
      no CRAN incoming check
      pandoc may not be installed on certain CRAN platforms; of course, it is the glorious Solaris!
      also try rmarkdown::pandoc_exec() to find pandoc, and use system2() instead of system()
      lost pandoc_cfg() in 9bfec7d11fa49ecfe4525abe2c2125ddeba77a57
      warn only against the encoding value that is explicitly provided and not UTF-8
      when there are multiple figures in a code chunk, their labels should be different; this should fix https://stackoverflow.com/q/53880195/559676
      don't attach the package; use the option `echo` as the example instead of `fig.path`
      a news item for #1655
      close rstudio/rmarkdown#1513: do not warn on encoding if the native encoding is UTF-8
      don't show code coverage status since it looks alarming but I don't really care that much
      state the source of this engine (in case we want to ping the original author in the future)
      write out the code in UTF-8
      don't overprotect users here: if they set the `package` option at all, we assume they have read the documentation
      merge if-else
      just signal the error if style is plain wrong, regardless of the chunk option error = TRUE or FALSE
      use getFromNamespace() instead of get() because get() has a bad default inherits = TRUE, which often causes more trouble than convenience
      mostly cosmetic changes
      tweak the news and bump version
      fix #1675: make fig.show='hide' work for include_graphics()
      close #1676: add chunk options `class.error`, `class.warning`, and `class.message` to customize the CSS classes for errors, warnings, and messages, respectively
      use tinytex::latexmk() instead of tools::texi2dvi() to compile tikz graphics
      fix rstudio/rmarkdown#1538: should rep_len() instead of rep() on labels
      actually check if the input file is encoded in UTF-8 or not: rstudio/rmarkdown#1513 (comment)
      pdflatex cannot compile a .tex file if the path is .\foo.tex: https://stackoverflow.com/q/54839403/559676 use basename() to remove the initial .\
      unused variables (thanks, RStudio)
      try to install dvisvgm automatically
      always stop() if conversion from dvi to svg fails
      use the magick package instead of ImageMagick to convert pdf figure generated by the tikz engine to other formats
      factor out tempfile(tmpdir = '.') as wd_tempfile()
      it seems image_read() cannot read PDF images at least on Windows; so use image_read_pdf() instead, and this will require the pdftools package
      install libmagick++-dev
      install libpoppler-cpp-dev for the R package pdftools
      a news item for #1618
      close #1656: clarify on the documentation of opts_current
      fix #1649: when external = FALSE, use hook_plot_tex() instead of the Markdown plot hook hook_plot_md_base()
      close #1648: support the chunk option out.extra for include_url()
      cosmetic
      don't remove empty lines in purl()
      add the `encoding` argument back to knit2pandoc() for backward compatibility (this is primarily for the argparse package, which has an Rrst vignette on CRAN compiled through knit2pandoc())
      instead of preserving all empty lines like 46826deffbe30815eab2c159dcc17966fca90967, only preserve those not in the beginning or end
      fix a test on Windows: if the locale is not based on UTF-8, this expression should just return TRUE to avoid an error from assert()
      a more robust implementation of is_abs_path()
      use the assert('', {}) syntax
      there isn't really a good reason to install all soft dependencies
      point to the blog post https://yihui.name/en/2018/11/biggest-regret-knitr/
      Revert "a more robust implementation of is_abs_path()"
      vectorize is_abs_path()
      CRAN release v1.22
@github-actions

github-actions bot commented Nov 3, 2020

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 3, 2020