Advanced Topic: C vs Fortran Order
The Precomputed standard specifies that "raw" encoded chunks be stored in Fortran (column-major) order.
"The subvolume data for the chunk is stored directly in
little-endian binary format in [x, y, z, channel] Fortran
order (i.e. consecutive x values are contiguous) without
any header."
- Precomputed Specification, June 2, 2019
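As a rough illustration of the quoted layout, the sketch below builds a small Fortran order array with NumPy and checks that consecutive x values sit next to each other in memory. The chunk shape and the little-endian uint16 dtype are made up for the example and do not come from the specification.

```python
import numpy as np

# Hypothetical chunk: 4 x 3 x 2 voxels, 1 channel, little-endian uint16.
shape = (4, 3, 2, 1)  # (x, y, z, channel)
dtype = np.dtype("<u2")

# Fill a Fortran order array so that value == position in memory.
chunk = np.arange(np.prod(shape), dtype=dtype).reshape(shape, order="F")

# The x axis has the smallest stride: one element (2 bytes) per step.
x_stride_bytes = chunk.strides[0]

# Serialize in Fortran order with no header, then decode it back.
raw = chunk.tobytes(order="F")
decoded = np.frombuffer(raw, dtype=dtype).reshape(shape, order="F")
```

Stepping along x moves one element through the buffer, while stepping along y skips a full row of four x values.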
Initially, we translated downloaded images into C order to match most people's expectations. However, this translation costs time and sometimes memory, and NumPy makes most operations in Python transparent to the underlying representation. The difference usually becomes important only with certain C extensions, whose users are hopefully knowledgeable about these issues.
Therefore, rather than forcing all users to pay the cost of transforming Fortran order data into C order data, I decided to defer the C vs. Fortran decision to the end user, who has better information about which tradeoffs are acceptable for their application. Here's how to switch to C order if you need it:
This is the typical way to do it, but it incurs 2x memory overhead because the C order copy and the original Fortran order array coexist while the copy is made.
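import numpy as np
img = vol[...]  # vol is a CloudVolume instance; the result is in Fortran order
img = np.ascontiguousarray(img)  # copies the data into C order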
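The memory cost can be seen without a real volume. The sketch below uses a small Fortran order array as a stand-in for the result of vol[...] (the shape and dtype are arbitrary assumptions) and confirms that np.ascontiguousarray allocates a second buffer of the same size.

```python
import numpy as np

# Stand-in for a downloaded image: a Fortran order array of arbitrary shape.
shape = (64, 64, 32, 1)
img = np.arange(np.prod(shape), dtype=np.uint32).reshape(shape, order="F")

# Converting to C order copies the data into a new buffer.
c_img = np.ascontiguousarray(img)

made_copy = not np.shares_memory(img, c_img)
peak_bytes = img.nbytes + c_img.nbytes  # both buffers are alive at this point
```

Indexing is unchanged by the conversion (c_img[x, y, z, c] equals img[x, y, z, c]); only the physical layout differs.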
You can also use np.transpose without np.ascontiguousarray to avoid the memory overhead. However, it ordinarily changes only the internal striding, not the physical memory layout. Usually this is okay, but some consumers, such as C extensions, expect a particular memory layout.
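A minimal sketch of the striding-only behavior, again using a small made-up array in place of a downloaded image. Note that a full transpose reverses the axis order, so the view is indexed as [channel, z, y, x].

```python
import numpy as np

# Stand-in for a downloaded image in Fortran order, indexed (x, y, z, channel).
img = np.arange(24, dtype=np.uint8).reshape((4, 3, 2, 1), order="F")

# Reversing the axes yields a view over the same buffer; no data is copied.
view = np.transpose(img)

same_buffer = np.shares_memory(img, view)
```

The view reports itself as C contiguous because the untouched Fortran order buffer, read with reversed axes, is exactly a C order array of shape (channel, z, y, x).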
We were getting three-channel float32 C order data from a GPU and needed to transform it to Fortran order for upload. The 2x memory pressure this incurred would kill the process in some configurations. Therefore, I added an in-place transposition function to the Cython library fastremap.
$ pip install fastremap
import fastremap
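img = GPU(...)  # placeholder for whatever produces the C order array
img = fastremap.asfortranarray(img)  # in-place transposition to Fortran order
vol[...] = img  # upload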
Notably, if your data are 2D or 3D and square (cubic), the in-place transposition is also faster than the numpy function.