Add a reader and writer interface #522
I haven't yet looked into all the details, but one general comment that I already have now is: how hard would it be to make an I/O-independent reader interface? The hardest part is most likely the different return types of readNextEntry/readEntry.
I would then make the user facing interface look something like this:
struct Reader {
podio::Frame readNextFrame(const std::string& category);
};
which then internally just calls the readNextEntry from the specific reader and constructs the Frame from the returned FrameData. Potentially the return type should be std::optional<podio::Frame> to make it easier to check whether we have run out of entries, because readNextEntry will return a nullptr in that case.
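A minimal sketch of the std::optional idea from the comment above. FrameData, Frame and MockBackendReader are hypothetical stand-ins rather than the real podio types; the only behaviour assumed from the discussion is that readNextEntry returns a nullptr once the entries are exhausted.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the backend-specific frame data.
struct FrameData {
  int payload;
};

// Hypothetical stand-in for podio::Frame: takes ownership of the frame data.
struct Frame {
  explicit Frame(std::unique_ptr<FrameData> data) : m_data(std::move(data)) {}
  int payload() const { return m_data->payload; }

private:
  std::unique_ptr<FrameData> m_data;
};

// Hypothetical backend reader: returns nullptr once all entries have been read.
class MockBackendReader {
public:
  explicit MockBackendReader(int nEntries) : m_nEntries(nEntries) {}

  std::unique_ptr<FrameData> readNextEntry(const std::string& /*category*/) {
    if (m_current >= m_nEntries) {
      return nullptr;
    }
    return std::make_unique<FrameData>(FrameData{m_current++});
  }

private:
  int m_nEntries;
  int m_current{0};
};

// User-facing wrapper: turns the nullptr sentinel into an empty optional.
template <typename BackendT>
class Reader {
public:
  explicit Reader(BackendT backend) : m_backend(std::move(backend)) {}

  std::optional<Frame> readNextFrame(const std::string& category) {
    auto data = m_backend.readNextEntry(category);
    if (!data) {
      return std::nullopt;
    }
    return Frame(std::move(data));
  }

private:
  BackendT m_backend;
};

// Reads frames until the reader reports that it has run out of entries.
inline int countFrames(int nEntries) {
  Reader<MockBackendReader> reader{MockBackendReader(nEntries)};
  int n = 0;
  while (auto frame = reader.readNextFrame("events")) {
    assert(frame->payload() == n);
    ++n;
  }
  return n;
}
```

With an optional return type the loop condition doubles as the end-of-file check, so callers do not need to know the number of entries in advance.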
I pushed although it's unfinished (all tests pass when building with rntuple and without SIO), but the details can be done later. So now the readers return an
class Reader {
public:
  Reader(const std::string& filename, const std::string& fileType = "TTree");
  podio::Frame readNextFrame(const std::string& name);
  podio::Frame readFrame(const std::string& name, size_t index);
  size_t getEntries(const std::string& name);
};
So to use it:
auto reader = Reader("file.root");
for (size_t i = 0; i < reader.getEntries("events"); i++)
  auto frame = reader.readNextFrame("events");
which should cover most cases together with the other method to read a specific Frame, and then one can always use a specific reader for more flexibility.
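One way such a backend-independent Reader could be built is type erasure, sketched below under stated assumptions. This is not the implementation from the PR: Frame, MockTTreeReader and MockSioReader are hypothetical stand-ins, and a single read position is kept for brevity where a real reader would presumably track one per category.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for podio::Frame; it only remembers its entry index.
struct Frame {
  std::size_t index;
};

// Hypothetical backends with the same member functions but unrelated types.
struct MockTTreeReader {
  std::size_t entries{3};
  std::size_t getEntries(const std::string&) const { return entries; }
  Frame readEntry(const std::string&, std::size_t i) { return Frame{i}; }
};

struct MockSioReader {
  std::size_t entries{2};
  std::size_t getEntries(const std::string&) const { return entries; }
  Frame readEntry(const std::string&, std::size_t i) { return Frame{i}; }
};

// Type-erased reader: callers only see Frame, never the backend type.
class Reader {
  struct Concept {
    virtual ~Concept() = default;
    virtual std::size_t getEntries(const std::string& name) const = 0;
    virtual Frame readFrame(const std::string& name, std::size_t index) = 0;
  };

  template <typename T>
  struct Model final : Concept {
    explicit Model(T r) : impl(std::move(r)) {}
    std::size_t getEntries(const std::string& name) const override { return impl.getEntries(name); }
    Frame readFrame(const std::string& name, std::size_t index) override { return impl.readEntry(name, index); }
    T impl;
  };

  std::unique_ptr<Concept> m_self;
  std::size_t m_next{0}; // simplification: one position for all categories

public:
  template <typename T>
  explicit Reader(T backend) : m_self(std::make_unique<Model<T>>(std::move(backend))) {}

  std::size_t getEntries(const std::string& name) const { return m_self->getEntries(name); }
  Frame readFrame(const std::string& name, std::size_t index) { return m_self->readFrame(name, index); }
  Frame readNextFrame(const std::string& name) { return readFrame(name, m_next++); }
};

// The usage loop from the comment, written against the erased interface.
inline std::size_t readAll(Reader& reader, const std::string& name) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < reader.getEntries(name); ++i) {
    const auto frame = reader.readNextFrame(name);
    assert(frame.index == i);
    ++n;
  }
  return n;
}
```

The loop in readAll compiles once and works for either backend, which is the property the interface is after; the cost is one virtual call per read.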
Can you move the RNTuple-related renaming commit to a separate PR? It would make this a bit more "atomic" and would yield a slightly cleaner diff for the review of the interface-related changes.
I have a few comments for the interface already below as well. I think it looks good in general, just a few details that could be improved.
I am not sure if we want to replicate the reader / writer interfaces on the Python side exactly. In the end the whole Mixin business is there to abstract that away and we have duck typing in Python, so we probably should just aim at providing (renaming?) the makeReader or makeWriter functionality in Python to do the right thing.
Another advantage of that would be that the FrameCategoryIterator will always get some FrameDataT and will not have to differentiate between dealing with the reader interface or with a concrete reader.
If we replicate the reader / writer interfaces they have to go into some generic module that is independent of the backend (i.e. it would have to be moved out of root_io where it currently is).
The Python changes have been removed; they are not necessary, for example, for the readers, since those detect the type automatically, but I guess it would be nice to have something more similar in C++ and Python. Needs #568; Clang 12 doesn't like to return a
I only have some comments on test organization and a few nitpicks here.
tests/root_io/read_interface.cpp
#if PODIO_ENABLE_SIO
  auto readerSIO = podio::makeReader("example_frame_sio_interface.sio");
  if (read_frames(readerSIO)) {
    return 1;
  }
#endif
This should go into the sio_io tests directory, also for the writing. That might require shuffling some of the code in the .cpp files into headers and putting them into the top-level tests directory.
I added some more general includes in the top tests directory and then the actual tests in tests/root_io and tests/sio_io. I've added a new library podioIO since I don't think there is a good place for the reader and writer (but then one will need to link to this one to use the interfaces). I think this should eventually replace the current instances since the interface should cover most of the use cases and you can also change backend very easily.
About the new library / target: I think this should be the new default target to link against. It brings along all the correct dependencies in any case (it links publicly against podioRootIO and potentially also podioSioIO). So users should just use podioIO in their setups. We will have to document that somewhere. However, I don't think we have a fitting section yet to put it into. I only ever put the CMake macros onto slides so far, it seems.
Some test names changed, so the sanitizer workflows started to fall over them because of ROOT internals. I have split them into purely TTree-based and purely RNTuple-based tests so that at least the latter should be doable in sanitizer builds.
src/Writer.cc
    throw std::runtime_error("ROOT RNTuple writer not available. Please recompile with ROOT RNTuple support.");
#endif
  }
  if ((type == "default" && endsWith(filename, ".sio")) || lower(type) == "sio") {
I am not entirely sure whether we should check the type here as well. Currently it would be possible to write example.root as an sio file, which would then break when trying to read it back, probably with some rather incomprehensible error.
I think it would be less error-prone if we only check the type for the .root file ending and ignore it entirely for sio. Otherwise we would at least also have to foresee the possibility to customize the type in the reader interface.
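The rule proposed above (consult the type only for the .root ending, ignore it for .sio) could look roughly like the following sketch. The Backend enum and resolveBackend are hypothetical names, the bodies of endsWith and lower are assumed since only their names appear in the diff, and mapping "default" to TTree is likewise an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical enumeration of the available backends.
enum class Backend { TTree, RNTuple, SIO };

// Assumed implementation of the endsWith helper named in the diff.
inline bool endsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Assumed implementation of the lower helper named in the diff.
inline std::string lower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// The file ending decides the format; the type only selects between the
// two ROOT-based backends and is ignored entirely for .sio files.
inline Backend resolveBackend(const std::string& filename, const std::string& type = "default") {
  if (endsWith(filename, ".sio")) {
    return Backend::SIO;
  }
  if (endsWith(filename, ".root")) {
    const auto t = lower(type);
    if (t == "default" || t == "ttree") {
      return Backend::TTree; // assumption: TTree is the default for .root
    }
    if (t == "rntuple") {
      return Backend::RNTuple;
    }
    throw std::runtime_error("Unknown type '" + type + "' for a .root file");
  }
  throw std::runtime_error("Unsupported file ending: " + filename);
}
```

With this ordering, asking for an sio type on example.root is rejected up front instead of producing a file that later fails to read back.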
Done
I don't understand why the
Reading a frame with SIO is also not free from leaks and this is why it's commented out in the sanitizer tests, right?
The
so it doesn't have anything to do with the interface. I'll just add it to the list of excluded
Ah, you are right, I missed that they are also excluded from the sanitizer tests. I thought they were running. So this is indeed not something new with the interface, but a general leak. I have opened an issue to keep track of that.
BEGINRELEASENOTES
podioIO with the interface readers and writers
ENDRELEASENOTES