Speed up writeH5AD #129
Comments
Hi @stemangiola. Despite what some AI bot thinks, I'm not sure how easy this would be to change. {zellkonverter} does most of the object writing by passing things to Python and getting Python's anndata to do it. The only exception is for [...]. As an alternative, I'm also involved with {anndataR}, where we are trying to make a native R implementation of anndata. This kind of thing would be great to have there, if you want to contribute something.
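For reference, the function under discussion is {zellkonverter}'s `writeH5AD()`. A minimal sketch of the workflow, using a toy `SingleCellExperiment` standing in for a large HCA-scale object:

```r
library(SingleCellExperiment)
library(zellkonverter)

# Toy SCE standing in for a large object.
counts <- matrix(rpois(200, lambda = 5), nrow = 20)
sce <- SingleCellExperiment(assays = list(counts = counts))

# writeH5AD() converts the SCE to a Python AnnData object (via basilisk)
# and lets Python's anndata do the actual HDF5 writing, which is why
# there is little room for speed-ups on the R side.
writeH5AD(sce, file = tempfile(fileext = ".h5ad"))
```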
I've long forgotten what I did, but I doubt parallelization is going to help much here. HDF5 writes are single-threaded due to their SWMR model. If your input assays are [...]. If you need fast writes, you'd be better off with something like TileDB. But even so... for around a million cells, the analysis is going to take at least an hour anyway, so an extra 5-10 minutes saving to disk doesn't seem too bad.
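If the bottleneck turns out to be gzip compression rather than the HDF5 write itself, one lever available in R is to write the assay uncompressed (or lightly compressed) with {HDF5Array}. A hedged sketch, assuming the current `writeHDF5Array()` arguments:

```r
library(HDF5Array)

mat <- matrix(runif(1e4), nrow = 100)

# level = 0 disables gzip compression entirely; higher levels trade
# CPU time for smaller files. chunkdim sets the HDF5 chunk geometry,
# which also affects write (and later read) speed.
writeHDF5Array(mat,
               filepath = tempfile(fileext = ".h5"),
               name     = "assay",
               chunkdim = c(100, 10),
               level    = 0)
```

This is a trade-off, not a free win: uncompressed files can be substantially larger on disk.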
Thanks @LTLA, for [...]. Is TileDB supported well for both R and Python? @lazappi, {anndataR} seems interesting!
I'm not aware of any [...]. TileDB has official support for both R and Python. I've mostly used the R client and I've never had any major problems.
I handle a lot of data, HCA-scale. I would like a way to speed up the saving of large SCE objects. This could be achieved through low-level optimisation if possible, parallelization, or possibly tuning of the block size (such as for HDF5Array). Thanks!
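On the block-size idea: block-processed operations in the DelayedArray/HDF5Array stack honour a global automatic block size, which can be raised from its default. A sketch, assuming the {DelayedArray} getter/setter pair:

```r
library(DelayedArray)

# Default automatic block size is 1e8 bytes (100 MB).
getAutoBlockSize()

# Larger blocks mean fewer, bigger I/O operations per pass,
# at the cost of higher peak memory usage.
setAutoBlockSize(5e8)
```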