Problems in the IO code #393
Comments
Let me try to add some information on the original intent. I have to point out that not all of this is ideally implemented yet, and there is still some historical baggage that we haven't gotten around to fixing, because schema evolution would have touched similar code. I am working on getting closer to the ideas described below. There are two types of collection buffers, which mainly differ in the ownership of the data they transport:
The main idea is that, since collections own their data, we simply need a convenient way to handle all the data necessary for writing through one access point (instead of three, as in the past). When reading, on the other hand, there is no collection yet to own the data, so we read all data into buffers and then make the collections constructible from these buffers (effectively "stealing" the data in the process). Another important point is that while relation (
The podio/python/templates/Collection.cc.jinja2 Lines 161 to 164 in c7328d6
In the
Some more detailed comments
This is almost certainly a bug. The
At least the intent of this was definitely to return a podio/python/templates/SIOBlock.cc.jinja2 Lines 62 to 65 in c7328d6
Also, one of these casts seems to be necessary when reading via ROOT, as there is something going on with a Line 133 in c7328d6
(This recast essentially just calls asVector with the correct type, using a std::function to type-erase the whole thing.)
See also: podio/include/podio/CollectionBuffers.h Lines 84 to 103 in c7328d6
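To make the two ideas above more concrete (a type-erased buffer whose typed vector is recovered through an asVector-style cast hidden behind a std::function, and a consumer that "steals" the data from the buffer), here is a minimal sketch. It is not the podio implementation: ReadBufferSketch, makeBuffer and stealData are hypothetical names invented for illustration, and only the asVector name is borrowed from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a read buffer; NOT podio's actual class.
struct ReadBufferSketch {
  void* data{nullptr}; // type-erased pointer to a heap-allocated std::vector<T>

  // Set by code that knows the real element type T; callers never see T.
  std::function<std::size_t(const ReadBufferSketch&)> size;
  std::function<void(ReadBufferSketch&)> destroy;

  // Recover the typed vector. The caller must supply the correct T.
  template <typename T>
  static std::vector<T>* asVector(void* raw) {
    return static_cast<std::vector<T>*>(raw);
  }
};

// Create a buffer that owns a copy of the values and remembers,
// via the two std::function members, how to treat it as vector<T>.
template <typename T>
ReadBufferSketch makeBuffer(std::vector<T> values) {
  ReadBufferSketch buf;
  buf.data = new std::vector<T>(std::move(values));
  buf.size = [](const ReadBufferSketch& b) -> std::size_t {
    return ReadBufferSketch::asVector<T>(b.data)->size();
  };
  buf.destroy = [](ReadBufferSketch& b) {
    delete ReadBufferSketch::asVector<T>(b.data);
    b.data = nullptr;
  };
  return buf;
}

// "Steal" the elements: move them out of the buffer, then release
// the now-empty vector so the buffer no longer owns anything.
template <typename T>
std::vector<T> stealData(ReadBufferSketch& buf) {
  assert(buf.data != nullptr);
  std::vector<T> owned = std::move(*ReadBufferSketch::asVector<T>(buf.data));
  buf.destroy(buf);
  return owned;
}
```

With this sketch, `makeBuffer<int>({1, 2, 3})` yields a buffer whose `size` callback can be used without knowing the element type, and `stealData<int>(buf)` hands the elements to a new owner while leaving `buf.data` null.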
@jmcarcell - thanks for looking into all that!
When implementing support for RNTuple I found a few issues with some parts of the IO code that will have to be modified or worked around for RNTuple to work.
unique_ptr. This place is: podio/python/templates/CollectionData.cc.jinja2, Line 94 in c7328d6
vector.data() instead of a pointer to the vector itself. This also relies on the fact that implementations of std::vector store the pointer to the actual array in the first size_ptr bytes (if this weren't the case, the cast wouldn't even point to the data array): podio/include/podio/CollectionBuffers.h, Line 40 in c7328d6
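As an illustration of the distinction drawn here, the following sketch contrasts the address of a vector object with the address of its elements, and probes whether the first pointer-sized bytes of the object happen to hold the data pointer. The helper names (objectAddress, elementAddress, firstWordIsDataPointer, addressesDiffer) are hypothetical and not part of podio. Whether the probe returns true is an implementation detail of the standard library in use, so nothing here should be read as a guarantee.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Address of the std::vector object itself (its bookkeeping members).
inline const void* objectAddress(const std::vector<int>& vec) {
  return static_cast<const void*>(&vec);
}

// Address of the first element, i.e. what vector.data() returns.
inline const void* elementAddress(const std::vector<int>& vec) {
  return static_cast<const void*>(vec.data());
}

// Probe: do the first pointer-sized bytes of the vector object hold
// the same address as data()? The C++ standard does not promise this;
// it is a layout detail of a particular standard library.
inline bool firstWordIsDataPointer(const std::vector<int>& vec) {
  static_assert(sizeof(std::vector<int>) >= sizeof(const int*),
                "vector object smaller than a pointer");
  const int* firstWord = nullptr;
  std::memcpy(&firstWord, static_cast<const void*>(&vec), sizeof(firstWord));
  return firstWord == vec.data();
}

// Small usage helper: for a non-empty vector the two addresses differ,
// so code expecting one must not be handed the other.
inline bool addressesDiffer() {
  std::vector<int> values{1, 2, 3};
  assert(*static_cast<const int*>(elementAddress(values)) == 1);
  return objectAddress(values) != elementAddress(values);
}
```

Code that is handed a pointer to the vector object but treats it as a pointer to the elements only works if it dereferences once more and the layout matches what firstWordIsDataPointer checks; passing vector.data() directly avoids that dependency.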
How all of this works together is a mystery to me, as the TTree functions seem to be happy but the RNTuple ones aren't. For the first point, I think we also probably rely on the fact that unique_ptr can have size zero. The first point can easily be worked around by saving another pointer to the actual data and using that one. Fixing any of those points will cause all the writers and readers to stop working: they either crash or (the writers) don't crash but don't write the data correctly.