
Advanced Topic: C vs Fortran Order

William Silversmith edited this page Jun 3, 2019 · 7 revisions

The Precomputed standard specifies that "raw" encoded chunks be stored in Fortran (column-major) order.

"The subvolume data for the chunk is stored directly in 
little-endian binary format in [x, y, z, channel] Fortran 
order (i.e. consecutive x values are contiguous) without 
any header."

- Precomputed Specification, June 2, 2019
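To make "consecutive x values are contiguous" concrete, here is a minimal Numpy sketch comparing the strides of the two layouts. The shape and dtype are hypothetical, chosen only for illustration and not taken from any real dataset.

```python
import numpy as np

# Hypothetical chunk shape (x, y, z, channel); illustrative only.
shape = (4, 3, 2, 2)

f_img = np.zeros(shape, dtype=np.uint8, order="F")  # Fortran (column-major)
c_img = np.zeros(shape, dtype=np.uint8, order="C")  # C (row-major)

# Strides give the number of bytes to step to reach the next element on each axis.
print(f_img.strides)  # (1, 4, 12, 24): x is the fastest-varying axis
print(c_img.strides)  # (12, 4, 2, 1): channel is the fastest-varying axis
```

In Fortran order, stepping one element along x moves one element in memory, which is the layout the specification describes for raw chunks.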

Initially, we translated downloaded images into C order to match most people's expectations. However, this translation costs time and sometimes memory, and Numpy makes most operations in Python transparent to the underlying representation. The difference usually becomes important only with certain C extensions, whose users are hopefully knowledgeable about these issues.

Therefore, rather than forcing all users to pay the cost of transforming Fortran order data into C order data, I decided to defer the C vs. Fortran decision to the end user, who has better information about which tradeoffs are acceptable for their application. Here's how to switch to C order if you need it:

Using Numpy

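This is the typical way to do it; however, it will incur 2x memory overhead because the C order copy exists alongside the original Fortran order array until the original is released.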

import numpy as np
img = vol[...]  # downloaded in Fortran order
img = np.ascontiguousarray(img)  # copies into a new C order buffer

You can also call np.transpose without np.ascontiguousarray to avoid the memory overhead. However, it returns a view with the axis order reversed, and it ordinarily changes only the internal striding, not the physical memory layout. Usually this is okay, but some consumers, e.g. C extensions, expect a particular memory layout.
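The difference between the two Numpy approaches can be checked directly. The sketch below uses a small synthetic array as a stand-in for a downloaded image (an assumption; a real one would come from vol[...]) and inspects shape, contiguity, and whether memory is shared.

```python
import numpy as np

# Synthetic stand-in for a downloaded Fortran order image (hypothetical data).
img = np.asfortranarray(np.arange(24, dtype=np.uint8).reshape(2, 3, 4))

copied = np.ascontiguousarray(img)  # same shape, new C order buffer
view = np.transpose(img)            # reversed axes, same underlying buffer

print(copied.shape, copied.flags["C_CONTIGUOUS"], np.shares_memory(copied, img))
# (2, 3, 4) True False
print(view.shape, view.flags["C_CONTIGUOUS"], np.shares_memory(view, img))
# (4, 3, 2) True True

# The view indexes the same data with the axes reversed.
print(view[3, 2, 1] == img[1, 2, 3])  # True
```

The transposed view is C contiguous at no extra memory cost, but code that consumes it has to account for the reversed axis order.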

Using fastremap

We were getting three-channel float32 C order data from a GPU and needed to transform it to Fortran order for upload. The 2x memory pressure incurred by a copy would kill the process in some configurations. Therefore, I added an in-place transposition function to the Cython library fastremap.

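$ pip install fastremap

import fastremap

img = GPU(...)  # placeholder for whatever produces the C order array
img = fastremap.asfortranarray(img)  # in-place transposition to Fortran order
vol[...] = img  # upload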

Notably, if your data are 2D or 3D and square (cubic), the in-place transposition is also faster than the numpy function.