use writeLines for fread(text=.) #4805
Cleaner & equivalent behavior
Codecov Report
@@ Coverage Diff @@
## master #4805 +/- ##
=======================================
Coverage 99.47% 99.47%
=======================================
Files 73 73
Lines 14557 14557
=======================================
Hits 14481 14481
Misses 76 76
@@ -35,7 +35,7 @@ yaml=FALSE, autostart=NA, tmpdir=tempdir(), tz="")
     if (!is.character(text)) stop("'text=' is type ", typeof(text), " but must be character.")
     if (!length(text)) return(data.table())
     if (length(text) > 1L) {
-      cat(text, file=(tmpFile<-tempfile(tmpdir=tmpdir)), sep="\n")  # avoid paste0() which could create a new very long single string in R's memory
+      writeLines(text, tmpFile<-tempfile(tmpdir=tmpdir))  # avoid paste0() which could create a new very long single string in R's memory
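As a quick sanity check of the "equivalent behavior" claim, here is a minimal sketch in base R that writes the same character vector with both calls and compares what is read back. The sample vector and file names are made up for illustration, and the comparison is on the lines read back, not on the raw bytes of the two files.

```r
# Hypothetical multi-line input, similar to what fread(text=) receives
text <- c("a,b", "1,2", "3,4")

f_cat <- tempfile()
f_wl  <- tempfile()

# Old approach: one element per line via cat()
cat(text, file = f_cat, sep = "\n")

# New approach: writeLines() writes one element per line by default
writeLines(text, f_wl)

# warn = FALSE avoids a warning if a file lacks a final newline
lines_cat <- readLines(f_cat, warn = FALSE)
lines_wl  <- readLines(f_wl, warn = FALSE)

same_lines <- identical(lines_cat, lines_wl)
print(same_lines)

unlink(c(f_cat, f_wl))
```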
We could also use useBytes=TRUE here to facilitate writing UTF-8 strings on Windows.
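A minimal sketch of what that could look like, assuming the input is already UTF-8. The sample strings are hypothetical, and this only illustrates the useBytes argument of writeLines; it is not the code that ended up in fread.

```r
# Hypothetical input with a non-ASCII character, marked as UTF-8
text <- c("name", enc2utf8("caf\u00e9"))

tmpFile <- tempfile()

# useBytes = TRUE writes the strings byte-for-byte instead of
# re-encoding them to the native encoding of the session
writeLines(text, tmpFile, useBytes = TRUE)

# Read back and inspect the bytes of the second line
back <- readLines(tmpFile, encoding = "UTF-8")
bytes <- charToRaw(back[2])
print(bytes)

unlink(tmpFile)
```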
@MichaelChirico This isn't something I've tested, but maybe fwrite(as.data.table(text), tmpFile<-tempfile(tmpdir = tmpdir), col.names = FALSE) is also worth exploring. I'm basing this assumption on the speed that I observed here. Or maybe list(text) instead of as.data.table(text), since fwrite should work with that too?
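A rough sketch of the list(text) variant, assuming data.table is installed. Note that quote = FALSE is an assumption added here and not part of the suggestion above: with fwrite's default quote = "auto", lines that contain the separator would be wrapped in quotes and the file would no longer match the input.

```r
library(data.table)

# Hypothetical multi-line input
text <- c("a,b", "1,2", "3,4")

tmpFile <- tempfile()

# Write the vector as a single unnamed column, one element per line.
# quote = FALSE (an assumption, see above) keeps the commas unquoted.
fwrite(list(text), tmpFile, col.names = FALSE, quote = FALSE)

# The file should contain the input lines verbatim
written <- readLines(tmpFile)
print(written)

# And fread() can parse it as a regular CSV
DT <- fread(tmpFile)
print(DT)

unlink(tmpFile)
```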
In general I think it may warrant some benchmarking. Actually, I recently ran into an issue where cat() is very slow for long input. In my case I was cat-ing a whole file character by character (e.g. cat(strsplit(readChar(f, file.size(f)), NULL)[[1]], sep = '', file = f) is quite slow).
A benchmark of approaches (h/t @mrdwab for suggesting
output:
So Also the need to supply