-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal to remove ext files from dev branch #852
Comments
While the unpackers are experiment specific, I do not think that the resulting files describing the mapped data are necessarily so. Two typical cases of changes might appear:
It would be great if every ext h101 file committed to r3broot would include the unpacker (name, git hash), date, path and exact invocation used to create that file. If we were to auto-generate the h101 headers during compilation, this would imply that each compile can only work with a single experiment. I would favor the current approach where one can analyze different experiments with the same compile. (I would however favor CI tests with experiment unpackers so we can ensure that nobody breaks the compatibility of the h101 files in git with any unpacker.) On a side note, I think the construction of the overall h101 for the ucesb reader in the C macros, complete with offsetof's and everything is something which should go. The overall struct consisting of N detector specific structs does not seem to be required anywhere. Instead, each reader class could tell the R3BUcesbReader (or source or whatever) the size of its h101 struct at runtime, and the R3BUcesbSomething could use the arithmetic technique of addition to figure out how large the buffer should be in total, and provide each reader with a pointer to their region in the buffer. For bonus points, this should be accomplished by a variadic method instead of passing void pointers. Of course, it could also be argued that there is no reason why the readers one wants to instantiate should be specified in the macro at all, as they are implied by the mapping. If the unpacker of an experiment provides data of some detector Foo, just instantiate R3BFooReader. (In that case, there should be some way to override this. When analyzing calibration runs of Califa with the s522 unpacker, one probably would not want to instantiate the readers for all of the other detectors. Of course, one fix would be to use the califa_only unpacker for that (if it were not for the misguided decision to encode the presumed gain settings of the califa preamplifiers in the channel mapping).) Another improvement would be to allow multiple instances of one reader which read data of the same type, but with different name prefixes. For example, if both the PSPs and CALIFA use FEBEX with the TUM firmware, we can already have have the same code handling the unpacking. If we break the 1:1 mapping between ext h101 structs (data types) and the name string prefix (instances), we could also use two instnces of the same R3BTumFebexReader(std::string srcPrefix) for reading both of into two TClonesArrays of identical types but different names. |
This seems related to the cleanup that is needed to address the issues of incomplete data mappings of #712 . Keeping generated files away from version control systems is a good idea. (Edit: to avoid misunderstandings: This means to either generate them on-the-fly, or avoid them at all. Keeping files outside version control is not an improvement.) One thing that should for sure be changed is to stop using the
should then show nothing. That would be a step on the way. Hmm, perhaps they are not used, just copy-pasted? Then much easier to get rid of them. Then the comment of Dec 15 could be re-read. |
I should perhaps clarify a bit: The structures declared/used in r3broot I imagine to be hand-crafted/cleaned versions of what may possibly have come from ucesb as a suggestion for what structure members are available for a detector (i.e. a generated header). Likewise, the From the writing end, the ext_data interface gets a list of everything which is available from a certain unpacker. It does not need any generated header file for that. It always comes with the data! There is thus no need to generate any headers again and again, e.g. during compilation. There is also no need to dump them for full experiments, since the bits and pieces are known by the substructures in r3broot. |
UCESB has been fixed to not include the deprecated |
The |
Hi all, As Haakan mentioned, 'Keeping generated files away from version control systems is a good idea.' That is my suggestion. Most of our Readers are written to be versatile in regards to detector and channel numbers. They extract this information from the mapping information in the ext_h101_{det name}.h file. The issue is that for 'frozen' detectors, the mapping in the unpacker is consistent. You can copy-paste the ext files from a previous experiment. However, the number of truly 'frozen' detectors is very small. In addition, the mapping naming convention and channel mapping is commonly not 1-to-1 from experiment to experiment. For example, the mapping and naming conventions of the tofd has slightly deviated for ever experiment. Most students who do the calibrations do not know the ext file are generated files. They just use the tofd_h101_tofd.hh files in the Git repository. This is also not clearly specified anywhere in the code base - not basic-user friendly. If (for example) you use the ext_h101_tofd.hh file from the previous experiment, the mapping and naming convention is not the same and the code should fail. This does not always happen (especially if it's a mapping problem). I do NOT want the ext files to be generated during the compile. That's overkill. Rather, the ext files should just be removed from version control and copied over for each experiment. Alternatively, as Philipp mentioned, we could just add the experiment to the end of the ext file name. I think that might be the easiest - especially since the current naming convention is all over the place and the ext file repository needs cleaning up. |
From my understanding, the typical ext_h101 file defines a struct with some member variables and a macro to map these to the corresponding data fields in the stream by name. (Onion data structures are just byte-compatible to the flat structures and are thus initialized with the same macro, it seems.) As the mapping between struct member names and strings in the macro is 1:1, I would assume that a mapping file which has different strings for mapping fields from the stream will also have different names in the struct (i.e. will not compile with the reader code). I had a look at the ext_h101_tofd.h for S522 and S454 and the onion struct does not look very different (apart from array sizes differing, which seems fine (if you have the largest size in git)). The actual mapping being different is a separate problem. I do not have any good solution for mapping, generally. Doing the mapping in R3BRoot will involve tons of illegible parameter containers. Mapping in the unpacker (which is what we do) avoids this, but it turns out that there are mapping adjacent problems (like "what is the febex path of that califa channel?" or "what is the trigger timing I have to use with that TofD channel?" which do not lend themselves to be encoded with a permutation. \footnote{Not impossible, though: califa encodes the gain setting as part of a channel id. Still, I very much disrecommend using the most significant byte of the TofD channel index to store the index of the trigger timing channel to use with that.} In practice, this means that the mapping is all over the place: the main mapping happens in the unpacker, and either the complete mapping or ancillary parts of it are also accessed from R3BRoot. |
|
```
EXT_STR_ITEM_INFO_LIM(ok,si,offset,struct_t,printerr,\
CALIFA_ENE, UINT32,\
"CALIFA_ENE",4864); \
```
from my understanding, this maps the h101 field identified by the string "CALIFA_ENE" to the part of the struct member called CALIFA_ENE, correct?
In principle, one could also smuggle in a variable prefix in the string so that
```
struct h101_califa_munich
{
uint32_t ENE;
...
};
```
can get mapped to "CALIFA_ENE" or "PSP_ENE" at runtime.
This would allow using a generic R3BMunichFebexReader to handle both cases.
(trying to post this by phone per mail, apologies if it goes horribly wrong).
|
@klenze yes. And the name 'smuggling' too! I.e. the string name is what is matched, the 'bare' name is what is used to find the actual offset in the local (=destination) structure. |
Could it be a better solution that we have a separate git repository for all ext header files and include this separate repository as a git submodule under this repository? We could tag the version of the ext header repository according to experiment name before each beam time. If someone needs to analysis old data, it's also very flexible to change the version of the ext header repository back to a specific experiment version while keeping using the latest version of R3BRoot. |
A candiate place to store them would be in the experimental upexps dir for each experiement as they are generated from here e.g |
Dear @YanzhaoW @lr11345, what you are proposing is duplication of code. That is opposite to what I suggested and something which I will not deal with. In fact, the duplications in upexps is something which I for a very long time have pointed out as not being good, and asked to be cleaned up. |
Dear @inkdot7 I agree. But I'm not sure it's possible to always keep this backward compatibility for all detectors. For NeuLAND, sure, an ext header file with trigger for 13 dps would also work for 12 dps without trigger (I will test this) iff we never change the var names or var types. But I don't know whether this could also be applied to all other detectors. If yes, we should just use the latest version of ext header for old experiments as well. If no, then maybe a git submodule may be an option. |
If there is a need (or simply better names come up) then change all experiment analysis / unpackers to use those names. That will of course be much easier with less copy-paste in upexps.
Most of them at least. Begin there, and hopefully the joy spreads (i.e. amount of code in git shrinks.) |
I absolutely would not recommend subjecting our users to git submodules. @inkdot7: A practical problem @ajedele has encountered is that for some experiments, ROLU ran in a separate TDC (which has its own trigger timing channels which can and must be mapped) and for other experiments, ROLU ran in the same TDC module as LOS, so the trigger signals would have to be mapped to both LOS and ROLU, which I vaguely recall not working? Also useful would be the ability to declare a signal which gets mapped to nothing (or a zero element signal array) to make a consumer happy. Of course I fully expect you to tell us to actually take ownership of our h101 in the consumer and just use EXT_DATA_ITEM_FLAGS_OPTIONAL. :-) Looking at the PSP h101 file, It seems to me that the number of preprocessor lines generated scales with the number of detectors. A decent language would prove a way to generate all these by just feeding the onion version of the psp struct into an appropriate macro, but the metaprogramming facilities of C++ (templates, preprocessor) are not suitable for that. |
Yes. :-) To not duplicate code, it would be useful to track which detectors are measured on a global clock, and which are measured 'just' relative to a master start signal. (Thanks for pointing this out, as it explained what is meant by trigger time channels.)
Sounds very plausible. Is unfortunate.
Yes, the proposal is to do that semi-manually, i.e. to extract the core out of the ext_files and then do some loop around that to set up multiple detectors. |
Why?
Do you mean compile-time reflection? |
I think the main use case for git submodules might be instances of Conway's law. If you have a handful of different teams working on with a frequently changing lightweight interface (such as a protocol or file format), then it might make sense to put that interface into a git submodule in all of their projects. (I dunno, I have never designed such a system.) Maintaining one h101 branch per experiment would not be a good use of git submodules (where a commit in the parent repository always contains the has of a single commit in the child repository). Additionally, most of the arguments against the original proposal to auto-create the h101 files also apply:
Anecdotally, the CALIFA DAQ originally used submodules within submodules. That was already a hassle even though I had push permissions, and I flattened everything into a single repository. (We have an independent repo for per machine configurations, but that is in a .gitignored subdirectory.)
|
The external files are generated for each experiment in the unpacker. The external files are used by the reader to map the channels correctly.
The external files are experiment specific and should be generated in the experiment-specific unpacker and should be copied over for each experiment. Otherwise, this leads to potential incorrect mapping or channel numbers/ADC ranges/trigger information. Our naming and mapping changes often enough that this will be an issue.
Since this is in the 1st step of the analysis, most new users will be unaware of the potentially incorrect external files.
The text was updated successfully, but these errors were encountered: