
[R] Progress bar for read_feather for R and a verbose version #43404

Open
ajinkya-k opened this issue Jul 24, 2024 · 6 comments

Comments

@ajinkya-k

Describe the enhancement requested

I would like to request that a progress bar be shown when using the read_feather function in R, especially for large files, so that the user can see whether the file is actually being read and how far along the read is. data.table::fread does this with a simple progress bar enabled via its showProgress argument. My use case is reading a large file into R from a network drive with read_feather: on some runs there is no indication that R is making any progress on loading the file, while on others it loads in ~300 seconds. fread also has a verbose option that dumps much more output and would be well worth implementing too, but a progress bar at minimum would be great!
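For reference, a minimal sketch of the difference as it stands today (the file paths are hypothetical): data.table::fread exposes a showProgress argument, while arrow::read_feather gives no feedback until the read completes.

```r
library(data.table)
library(arrow)

# fread prints a progress bar while parsing a large CSV
dt <- fread("//networkdrive/share/big_file.csv", showProgress = TRUE)

# read_feather currently gives no indication of progress for a large file
df <- read_feather("//networkdrive/share/big_file.feather")
```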

Component(s)

R

@thisisnic thisisnic changed the title Progress bar for read_feather for R and a verbose version [R] Progress bar for read_feather for R and a verbose version Jul 27, 2024
@thisisnic
Member

I think this is a great idea, @ajinkya-k, though I'm not sure how feasible it is: this has been discussed in relation to another piece of functionality, and we concluded it would be tricky because it would require non-trivial updates to the Arrow C++ library.

Out of interest, once you've loaded the file, are you performing further dplyr manipulations? You might get better performance by calling open_dataset() on the file (so it isn't pulled into your R session), running whatever manipulations you need, and then calling collect() to pull only the relevant bits into memory.
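A minimal sketch of that pattern, assuming a hypothetical path and column names:

```r
library(arrow)
library(dplyr)

# Open the Feather file lazily; nothing is read into memory at this point
ds <- open_dataset("//networkdrive/share/big_file.feather", format = "feather")

# dplyr verbs are pushed down to Arrow; only collect() materialises the result in R
result <- ds |>
  filter(year == 2023) |>   # hypothetical filter
  select(id, value) |>      # hypothetical columns
  collect()
```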

@ajinkya-k
Author

Thanks for the update, @thisisnic. I do a join and a few filters that drop less than 1% of the rows and then collect, but it's still a huge dataset after that, which I plug into a Bayesian model. The Bayesian model does work; the issue is that, due to DUA constraints, I have to keep the file on a network drive and pull it from there, so it's hard to tell whether the file is even being loaded. A progress bar would help me figure out whether the read is progressing at all, or whether network throttling means the process is hung.

@thisisnic
Member

Ah, that makes sense. It doesn't sound like there's much else to suggest in terms of temporary workarounds, then!

@ajinkya-k
Author

@thisisnic I ran the code a few more times, and it turns out that read_feather was indeed working, but it was very slow compared to loading the exact same data stored as a CSV using fread. Is there a known issue with network drives on Windows?

@thisisnic
Member

I believe I've seen issues like this when reading across a network drive on Windows, though I'm not sure. It could be worth comparing with a local file to test.
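One rough way to check, assuming a copy of the same Feather file could be placed on local disk (both paths are hypothetical):

```r
library(arrow)

network_path <- "//networkdrive/share/big_file.feather"
local_path   <- "C:/temp/big_file.feather"

# Compare wall-clock read times to see whether the network drive is the bottleneck
system.time(read_feather(network_path))
system.time(read_feather(local_path))
```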

@ajinkya-k
Author

Yeah, unfortunately I can't make a copy of the data on my local machine due to DUA constraints. I might try an open-source dataset to test this, though. Any suggestions for a dataset?
