Add an RDataSource for podio files and collections #593
Merged
16 commits
0caecdf  Moving RDataSource closer to Podio/EDM4hep (kjvbrt)
8bb3c08  Using /// and @ for the doxygen docs (kjvbrt)
684b2fb  Moving doc strings to header file (kjvbrt)
f63db32  Clang format (kjvbrt)
de92a7b  Adding podio::ROOTDataSource class to the rootmap (kjvbrt)
6d5cfba  Minor adjustments (kjvbrt)
12858be  Another set of small adjustments (kjvbrt)
34d1257  Separating datasource into standalone library (kjvbrt)
a3e7bda  Adding ON flag to all tests (kjvbrt)
90a940b  Formatting (kjvbrt)
4f18cf4  The headers should install now (kjvbrt)
943a690  Removing exporting of compile commands (kjvbrt)
e6b73ad  Other suggested adjustment (kjvbrt)
1e1677c  Installing also utilities directory (kjvbrt)
b059d23  Cleanup setup code slightly to avoid unnecessary copies (tmadlener)
e54dce9  Adding missing podioDataSourceDict target (kjvbrt)
@@ -0,0 +1,160 @@
#ifndef PODIO_DATASOURCE_H
#define PODIO_DATASOURCE_H

// Podio
#include <podio/CollectionBase.h>
#include <podio/Frame.h>
#include <podio/Reader.h>

// ROOT
#include <ROOT/RDataFrame.hxx>
#include <ROOT/RDataSource.hxx>

// STL
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>

namespace podio {
class DataSource : public ROOT::RDF::RDataSource {
public:
  ///
  /// @brief Construct the podio::DataSource from the provided file.
  ///
  explicit DataSource(const std::string& filePath, int nEvents = -1);

  ///
  /// @brief Construct the podio::DataSource from the provided file list.
  ///
  explicit DataSource(const std::vector<std::string>& filePathList, int nEvents = -1);

  ///
  /// @brief Inform the podio::DataSource of the desired level of parallelism.
  ///
  void SetNSlots(unsigned int nSlots) override;

  ///
  /// @brief Inform podio::DataSource that an event-loop is about to start.
  ///
  void Initialize() override;

  ///
  /// @brief Retrieve from podio::DataSource a set of ranges of entries that
  ///        can be processed concurrently.
  ///
  std::vector<std::pair<ULong64_t, ULong64_t>> GetEntryRanges() override;

  ///
  /// @brief Inform podio::DataSource that a certain thread is about to start
  ///        working on a certain range of entries.
  ///
  void InitSlot(unsigned int slot, ULong64_t firstEntry) override;

  ///
  /// @brief Inform podio::DataSource that a certain thread is about to start
  ///        working on a certain entry.
  ///
  bool SetEntry(unsigned int slot, ULong64_t entry) override;

  ///
  /// @brief Inform podio::DataSource that a certain thread finished working
  ///        on a certain range of entries.
  ///
  void FinalizeSlot(unsigned int slot) override;

  ///
  /// @brief Inform podio::DataSource that an event-loop finished.
  ///
  void Finalize() override;

  ///
  /// @brief Returns a reference to the collection of the dataset's column
  ///        names.
  ///
  const std::vector<std::string>& GetColumnNames() const override;

  ///
  /// @brief Checks if the dataset has a certain column.
  ///
  bool HasColumn(std::string_view columnName) const override;

  ///
  /// @brief Type of a column as a string. Required for JITting.
  ///
  std::string GetTypeName(std::string_view columnName) const override;

protected:
  ///
  /// @brief Type-erased vector of pointers to pointers to column
  ///        values --- one per slot.
  ///
  std::vector<void*> GetColumnReadersImpl(std::string_view name, const std::type_info& typeInfo) override;

  std::string AsString() override {
    return "Podio data source";
  }

private:
  /// Number of slots/threads
  unsigned int m_nSlots = 1;

  /// Input file paths
  std::vector<std::string> m_filePathList = {};

  /// Total number of events
  ULong64_t m_nEvents = 0;

  /// Ranges of events available to be processed
  std::vector<std::pair<ULong64_t, ULong64_t>> m_rangesAvailable = {};

  /// All ranges of events ever created
  std::vector<std::pair<ULong64_t, ULong64_t>> m_rangesAll = {};

  /// Column names
  std::vector<std::string> m_columnNames{};

  /// Column types
  std::vector<std::string> m_columnTypes = {};

  /// Collections, m_Collections[columnIndex][slotIndex]
  std::vector<std::vector<const podio::CollectionBase*>> m_Collections = {};

  /// Active collections
  std::vector<unsigned int> m_activeCollections = {};

  /// Root podio readers
  std::vector<std::unique_ptr<podio::Reader>> m_podioReaders = {};

  /// Podio frames
  std::vector<std::unique_ptr<podio::Frame>> m_frames = {};

  ///
  /// @brief Setup input for the podio::DataSource.
  ///
  /// @param[in] nEvents Number of events.
  /// @return void.
  ///
  void SetupInput(int nEvents);
};

///
/// @brief Create RDataFrame from multiple Podio files.
///
/// @param[in] filePathList List of file paths from which the RDataFrame
///                         will be created.
/// @return RDataFrame created from input file list.
///
ROOT::RDataFrame CreateDataFrame(const std::vector<std::string>& filePathList);

///
/// @brief Create RDataFrame from a Podio file.
///
/// @param[in] filePath File path from which the RDataFrame will be created.
/// @return RDataFrame created from the input file.
///
ROOT::RDataFrame CreateDataFrame(const std::string& filePath);
} // namespace podio

#endif /* PODIO_DATASOURCE_H */
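The header only declares `GetEntryRanges()`; its implementation is not part of this diff. As a rough illustration of what "ranges of entries that can be processed concurrently" could look like, the stand-alone sketch below splits a number of events into contiguous half-open ranges, at most one per slot. The helper name `splitIntoRanges` and the even-split policy are assumptions made for this example and are not taken from podio, and `std::uint64_t` stands in for ROOT's `ULong64_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not podio or ROOT API): split [0, nEvents) into at most
// nSlots contiguous half-open ranges [first, last) of near-equal size.
std::vector<std::pair<std::uint64_t, std::uint64_t>> splitIntoRanges(std::uint64_t nEvents,
                                                                     unsigned int nSlots) {
  std::vector<std::pair<std::uint64_t, std::uint64_t>> ranges;
  if (nEvents == 0 || nSlots == 0) {
    return ranges; // nothing to process
  }

  // Never create more ranges than there are events.
  const std::uint64_t nRanges = std::min<std::uint64_t>(nSlots, nEvents);
  const std::uint64_t base = nEvents / nRanges;
  const std::uint64_t remainder = nEvents % nRanges;

  std::uint64_t first = 0;
  for (std::uint64_t i = 0; i < nRanges; ++i) {
    // The first `remainder` ranges take one extra event each.
    const std::uint64_t size = base + (i < remainder ? 1 : 0);
    ranges.emplace_back(first, first + size);
    first += size;
  }

  // Postcondition: the ranges cover every event exactly once.
  assert(first == nEvents);
  return ranges;
}
```

For example, `splitIntoRanges(10, 4)` yields the ranges [0, 3), [3, 6), [6, 8) and [8, 10).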
Is there a minimum ROOT version for RDataSource that we could or should check here as well?
RDataSource has been in ROOT for a long time (it has not been experimental since ROOT 6.14). I tested this in the nightlies stack (ROOT 6.32).
Alright, then I think we can also skip the check here. Thanks for checking.
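The thread settles on skipping the check. Purely for illustration, had a guard been wanted, one option would be a compile-time comparison of packed version codes. The sketch below is self-contained and includes no ROOT header: `encodeVersion` is a hypothetical helper that mirrors the packing of ROOT's `ROOT_VERSION(a,b,c)` macro, and `kBuildVersion` is a stand-in for `ROOT_VERSION_CODE`. The 6.14 and 6.32 values are the ones quoted in the comments above.

```cpp
// Hypothetical helper: packs a version the same way ROOT_VERSION(a, b, c) does.
constexpr int encodeVersion(int major, int minor, int patch) {
  return (major << 16) + (minor << 8) + patch;
}

// Stand-in for ROOT_VERSION_CODE; 6.32.0 is the nightlies version mentioned in the thread.
constexpr int kBuildVersion = encodeVersion(6, 32, 0);

// According to the thread, RDataSource has not been experimental since ROOT 6.14.
constexpr int kMinimumVersion = encodeVersion(6, 14, 0);

// Fails the build with a readable message if the stand-in version is too old.
static_assert(kBuildVersion >= kMinimumVersion, "RDataSource requires ROOT 6.14 or newer");
```

In a real build the same comparison would presumably be written against ROOT's own `RVersion.h` macros, for instance `ROOT_VERSION_CODE >= ROOT_VERSION(6, 14, 0)`, rather than against these stand-ins.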