Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GenericParameters should be stored the same way for TTree and RNTuple backends #590

Closed
tmadlener opened this issue Apr 23, 2024 · 6 comments · Fixed by #615 or #625
Closed

GenericParameters should be stored the same way for TTree and RNTuple backends #590

tmadlener opened this issue Apr 23, 2024 · 6 comments · Fixed by #615 or #625
Assignees

Comments

@tmadlener
Copy link
Collaborator

Currently the ROOTWriter and ROOTReader (TTree) based store the GenericParameters of a Frame directly via a custom dictionary, e.g.:

podio/src/ROOTReader.cc

Lines 42 to 46 in 02a4b9d

GenericParameters params;
auto* emd = &params;
branch->SetAddress(&emd);
branch->GetEntry(localEntry);
return params;

On the other hand the RNTuple backend splits these up into pairs of vectors for the keys and the values:

podio/src/RNTupleWriter.cc

Lines 249 to 257 in 02a4b9d

model->AddField(RFieldBase::Create(root_utils::intKeyName, "std::vector<std::string>>").Unwrap());
model->AddField(RFieldBase::Create(root_utils::floatKeyName, "std::vector<std::string>>").Unwrap());
model->AddField(RFieldBase::Create(root_utils::doubleKeyName, "std::vector<std::string>>").Unwrap());
model->AddField(RFieldBase::Create(root_utils::stringKeyName, "std::vector<std::string>>").Unwrap());
model->AddField(RFieldBase::Create(root_utils::intValueName, "std::vector<std::vector<int>>").Unwrap());
model->AddField(RFieldBase::Create(root_utils::floatValueName, "std::vector<std::vector<float>>").Unwrap());
model->AddField(RFieldBase::Create(root_utils::doubleValueName, "std::vector<std::vector<double>>").Unwrap());
model->AddField(RFieldBase::Create(root_utils::stringValueName, "std::vector<std::vector<std::string>>").Unwrap());

These should both use the same mechanism to avoid confusion. Given that RNTuple doesn't support writing map types and that pairs of vectors are easier to interpret without knowing about GenericParameters, we decided to switch the TTree based backend to that as well. See EDM4hep meeting on Apr 23, 2024

@jmcarcell
Copy link
Member

At the time of writing the RNTupleWriter.cc there wasn't support for std::map but now there is support for it: root-project/root#13904
So the cleaner choice is to move the RNTupleWriter to use std::map. However, someone complained today that uproot may not support std::map but I don't know if we want to have our files be compatible with uproot...

@tmadlener
Copy link
Collaborator Author

I would prefer the solution that makes the files more easy to read without root / with uproot actually. I also think that would probably play a bit nicer with RDataFrame (without having ever tested this).

@Zehvogel
Copy link
Contributor

Zehvogel commented Jun 4, 2024

I also think that would probably play a bit nicer with RDataFrame (without having ever tested this).

For RDataFrame the easiest way to access GenericParameters is to include the header and use the real type.

ROOT.gInterpreter.Declare("#include <podio/GenericParameters.h>")
df = df.Define("xsec_fb", "PARAMETERS.getValue<float>(\"crossSection\")")

I.e. as long as GenericParameters makes it into the file as GenericParameters and not something like GenericParametersData I am happy :)

@tmadlener tmadlener self-assigned this Jun 4, 2024
@tmadlener
Copy link
Collaborator Author

I am not sure we can really preserve the currently possible behavior for reading the parameters as a GenericParameter object as is possible now using TTrees. However, we might be able to simply offer similar functionality via the RDataSource in #593.

My currently favored options at the moment would be to either

  • Move the TTree implementation over to use the same approach as RNTuple, i.e. pairs of vectors containing the keys and values
  • or make both the TTree and RNtuple based implementations write maps

@Zehvogel
Copy link
Contributor

Zehvogel commented Jun 5, 2024

However, we might be able to simply offer similar functionality via the RDataSource in #593.

Yes please! But then it would be really good to have a clear time line for the RDataSource before we break stuff that works at the moment...

@andresailer
Copy link
Member

Todo: Add something to make the use of GenericParameters easier in FCCAnalysis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment