How to use ArrayBuilder most efficiently? #307
Replies: 1 comment
-
ArrayBuilder code is completely contained within the src/libawkward/builder directory (interface in include/awkward/builder). Each ArrayBuilder in Python (or Numba) is an instance of ArrayBuilder in C++, which is an interface class that holds a tree of Builder nodes. The first Builder is an UnknownBuilder which represents data of unknown type. If you only call ArrayBuilder's The reason for all this generality is to make Each Builder node in this tree can have zero or more GrowableBuffers associated with it. A GrowableBuffer grows in a similar way to When you perform a But this policy has some bad consequences. For instance, if you wanted to use an ArrayBuilder in a way that clears the existing data and starts appending from zero, such an action would have to replace, not reuse, the This choice of policy favors frequent
The second policy would be good for frequent clearing and reuse, which I didn't know was a common use, but it appears to be in your case. The third policy might be better for memory management, since a field full of equal-size allocations that get deleted when each ArrayBuilder is done would be less fragmented: new ArrayBuilders can fill in those equal-size slots. Thus, the third policy might be best for situations with many ArrayBuilders making arrays, which I think might be common (It certainly is in our unit tests!) Independent of the problem of finding the best GrowableBuffer policy, there's another performance bottleneck in the way that ArrayBuilder has to discover the type of its data while accumulating. For months, we've been thinking about adding a TypedArrayBuilder that would fix this problem by requiring the type up-front. It could then build the one and only tree for that type, and most commands, such as Although the TypedArrayBuilder knows the type before filling, it doesn't know the type at compile-time, since we have to distribute a wheel for Python that can't instantiate the infinitely many types, so we need something that "almost compiles" at runtime. @ianna is implementing this in C++ using AwkwardForth (arxiv:2102.13516), which is also intended for communication between Uproot and Awkward Array: it's a low-level language to generate code for (i.e. not writing it by hand) that is optimized for filling buffers quickly. Instructions in AwkwardForth are not quite as fast as compiled C++ (they're about 5 ns per instruction with no ability to do compiler-optimizations to reduce the number of instructions), but some AwkwardForth programs are faster than fetching data from memory, and certainly from disk. Most importantly, they are written at runtime. @ianna's implementation of TypedArrayBuilder has fixed C++ code for the This talk about generating code at runtime begs the question, "Why not use a JIT-compiler?" so here is where I will talk about Numba. AwkwardForth is an alternative to a full JIT-compiler without dependencies—its purpose is to ensure that Awkward Array doesn't strictly require LLVM for all users. But if you have LLVM because you're using JAX or Numba, that's a different story. The non-typed ArrayBuilder is implemented in Numba by having Numba call ArrayBuilder's C++ through function pointers. (The Numba lowering is in src/awkward/_connect/_numba/builder, ArrayBuilder's function pointer interface is here.) From Numba's perspective, the ArrayBuilder is a single opaque type, since the types of Numba objects can't change at runtime the way that ArrayBuilder does. From a performance perspective, this means that we gain nothing from Numba's ability to JIT-compile to a specific type, and LLVM can't optimize code that is called through external function pointers. The TypedArrayBuilder concept could help here, but not the C++ implementation linked through function pointers. To really gain a performance advantage from a TypedArrayBuilder in Numba, we would have to use the given type to generate specialized Numba code, which unfortunately means a complete reimplementation. We're going forward with TypedArrayBuilder in C++ now because (a) it has non-Numba applications and (b) it sets the interface for how a TypedArrayBuilder should work, so that implementing a Numba version would be adhering to an already-developed standard. (This "design work" is relevant because the TypedArrayBuilder interface is not exactly the same as the ArrayBuilder interface—some commands, like As a very first step, we should think about Numba-native GrowableBuffers, which would be useful even without the (Typed)ArrayBuilder. You've brought that up before, @HDembinski. As described above, there are at least three ways to do the buffer-growing: (1) replace with exponentially larger buffers and share with I don't see how a "changing size array" concept fits into JAX's view of JIT-compilation, but if you have any ideas there, let me know! Most of what I've talked about above presupposes that we change how ArrayBuilder (or at least GrowableBuffer) works to improve performance. In its current state, you have to for a, b, c in tree.iterate(["a", "b", "c"], how=tuple):
builder = ArrayBuilder()
process(builder, a, b, c)
hist.fill(builder.snapshot()) and can't builder = ArrayBuilder()
for a, b, c in tree.iterate(["a", "b", "c"], how=tuple):
process(builder, a, b, c)
hist.fill(builder.snapshot()) because the latter would fill histograms with more and more data (refilling it with what was filled on previous iterations). There's no Also implicit in this example: it looks like your I think it should be possible to implement a GrowableBuffer in Numba, even without extending Numba. Each buffer would be a NumPy array in Numba. I don't know if Numba's type system allows you to replace a variable representing an array, but it's possible to make typed lists, and you can replace an element of a list or append to the list. (By replacing a length-1 list, you could implement policy (1) or (2); by appending, you could implement policy (3).) I searched for an example of this, and this StackOverflow answer shows how to do it in a Looking even further into the future, your problem would be more easily solved if histogram objects were recognized by Numba (by writing a Numba extension for it). @henryiii and I have talked about that—that would be great. It would be a matter of replacing the boost-histogram with a Numba model consisting of NumPy arrays viewing the histogram's counts and then implementing the |
Beta Was this translation helpful? Give feedback.
-
I am wondering how smart ArrayBuilder is implemented. Consider the following typical analysis code, where we read a chunk of events, process them with numba and the fill the result in some (boost-)histograms. ArrayBuilder is needed for this, because in the analysis the size of the output array is not predictable.
This should work, but it is not efficient, since ArrayBuilders are created and destroyed in each iteration of the outer loop and allocating memory from the OS is an expensive operation. ArrayBuilder should reuse its accumulated memory similarly to a std::vector on
clear
for the next iteration.I read in the docs that it is safe to reuse the builder after calling snapshot. So I was wondering if the following code is smart enough to not accumulate an ever increasing amount of memory:
I read in the docs that the array returned by builder.snapshot is a memory view. So at least in theory this array once destroyed could call back into its parent to signal the viewed memory is now free again to be used for building the next array.
So does it work this way or could it be implemented to work in this way?
Beta Was this translation helpful? Give feedback.
All reactions