RecoMCTruthLinker crashes #125

Open
ggrenier opened this issue Dec 20, 2023 · 3 comments

Comments

@ggrenier
Contributor

Running on a machine with OS version : "CentOS Linux release 7.9.2009 (Core)"

Using cvmfs build : source /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/v02-03-02/init_ilcsoft.sh

Using ILDConfig version v02-03-02.

Running Marlin in directory ILDConfig-02-03-02/StandardConfig/production/
with the command :
Marlin MarlinStdReco.xml --constant.lcgeo_DIR=${lcgeo_DIR} --constant.DetectorModel=ILD_l2_v02 --constant.CMSEnergy=250 --global.LCIOInputFiles=/scratch/ddsim_E1-calib.Puds91.Gsgreen.e0.p0.I110048.01.slcio --global.MaxRecordNumber=10

Input file has been copied from the dirac grid. Location on dirac grid : /ilc/user/g/ggrenier/prod/v02-02-03/uds/sim/ILD_l2_v02/ddsim_E1-calib.Puds91.Gsgreen.e0.p0.I110048.01.slcio

The result is a crash in RecoMCTruthLinker. The program output ends with:

 [ MESSAGE0 "MyRecoMCTruthLinker"]  processEvent 0  - 0

 *** Break *** segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f87330ae60c in waitpid () from /usr/lib64/libc.so.6
#1  0x00007f873302bf62 in do_system () from /usr/lib64/libc.so.6
#2  0x00007f8731b6fe3b in Exec (shellcmd=<optimized out>, this=0x6469f0) at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/root/6.28.04/core/unix/src/TUnixSystem.cxx:2104
#3  TUnixSystem::StackTrace() () at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/root/6.28.04/core/unix/src/TUnixSystem.cxx:2395
#4  0x00007f8731b6d575 in TUnixSystem::DispatchSignals (this=0x6469f0, sig=kSigSegmentationViolation) at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/root/6.28.04/core/unix/src/TUnixSystem.cxx:3615
#5  <signal handler called>
#6  RecoMCTruthLinker::clusterLinker(EVENT::LCEvent*, EVENT::LCCollection*, EVENT::LCCollection*, EVENT::LCCollection**, EVENT::LCCollection**, EVENT::LCCollection**) () at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/v02-03-02/MarlinReco/v01-34/Analysis/RecoMCTruthLink/src/RecoMCTruthLinker.cc:1282
#7  0x00007f870d985562 in RecoMCTruthLinker::processEvent(EVENT::LCEvent*) () at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/v02-03-02/MarlinReco/v01-34/Analysis/RecoMCTruthLink/src/RecoMCTruthLinker.cc:383
#8  0x00007f8733faddd6 in marlin::ProcessorMgr::processEvent(EVENT::LCEvent*) () at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/v02-03-02/Marlin/v01-19/source/src/ProcessorMgr.cc:494
#9  0x00007f8733ee4371 in SIO::SIOReader::processEvent (this=0x354e840, event=...) at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/v02-03-02/lcio/v02-20/src/cpp/src/SIO/SIOReader.cc:204
#10 0x00007f8733eeeee3 in operator() (recdata=..., recinfo=..., __closure=<synthetic pointer>) at /cvmfs/sft.cern.ch/lcg/releases/gcc/10.3.0-f5826/x86_64-centos7/include/c++/10.3.0/ext/atomicity.h:100
#11 read_records<MT::LCReader::readStream(const LCReaderListenerList&, int)::<lambda(const sio::record_info&)>, MT::LCReader::readStream(const LCReaderListenerList&, int)::<lambda(const sio::record_info&, const sio::buffer_span&)> > (func=..., valid=..., outbuf=..., stream=...) at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/sio/v00-01/include/sio/api.h:419
#12 MT::LCReader::readStream(std::unordered_set<MT::LCReaderListener*, std::hash<MT::LCReaderListener*>, std::equal_to<MT::LCReaderListener*>, std::allocator<MT::LCReaderListener*> > const&, int) [clone .localalias] () at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/v02-03-02/lcio/v02-20/src/cpp/src/MT/LCReader.cc:557
#13 0x00007f8733eef8a0 in MT::LCReader::readStream(MT::LCReaderListener*, int) () at /cvmfs/sft.cern.ch/lcg/releases/gcc/10.3.0-f5826/x86_64-centos7/include/c++/10.3.0/initializer_list:79
#14 0x000000000040878f in main () at /cvmfs/ilc.desy.de/sw/x86_64_gcc103_centos7/v02-03-02/Marlin/v01-19/source/src/Marlin.cc:458
#15 0x00007f873300b555 in __libc_start_main () from /usr/lib64/libc.so.6
#16 0x0000000000408fdf in _start () at /cvmfs/sft.cern.ch/lcg/releases/gcc/10.3.0-f5826/x86_64-centos7/include/c++/10.3.0/bits/basic_string.tcc:206
===========================================================
@tmadlener
Contributor

Hi @ggrenier, thanks for the report. Would it be possible for you to run Marlin again with a slightly higher verbosity, at least for the RecoMCTruthLinker (i.e. set the Verbosity steering parameter to DEBUG in the steering file)? Just from looking at the code, it is not entirely clear to me how this would crash at the point it does, because there seem to be checks that should in principle prevent that. However, there should be a printout in that case, and I am currently not sure whether it is simply missing because the output level is too low, or because the code does not do what I think it does.

This is where the crash happens (line 1282):

    if ( it->first == 0 ) { // ( if == 0, this cluster contains some (but not all) sim-hits with unknown origin.
                            // If *all* sim-hits would have had unknown origin, we would already have "continue":ed above)
      streamlog_out( MESSAGE ) << " SimHit with unknown origin in cluster " << clu << " ( " << clu->id() << " ) " << std::endl;
      continue;
    }
    if (it->first->getGeneratorStatus() == 1 ) { // genstat 1 particle, ie. it is a bona fide
                                                 // creating true particle: enter it into
                                                 // the list, and note how much energy it
                                                 // contributed.
      theMCPs.push_back(it->first); MCPes.push_back(it->second); ifound++;
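
One way the null check above could pass and the next line still crash is a key that is non-null but does not point to a valid MCParticle (for example a stale pointer). This is only a hypothesis, not a diagnosis. A minimal, self-contained sketch of a diagnostic guard follows; it assumes the set of valid MCParticle pointers for the event can be collected beforehand. `FakeMCParticle`, `LinkResult` and `collectGenStat1` are hypothetical stand-ins for illustration, not part of MarlinReco or LCIO.

```cpp
#include <iostream>
#include <map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for EVENT::MCParticle; only the accessor used in the excerpt.
struct FakeMCParticle {
    int generatorStatus;
    int getGeneratorStatus() const { return generatorStatus; }
};

// Hypothetical result type: the accepted particles, their energies,
// and counters for the entries that were skipped.
struct LinkResult {
    std::vector<const FakeMCParticle*> particles;
    std::vector<double> energies;
    int skippedNull = 0;
    int skippedUnknown = 0;
};

// Walks the (particle -> energy) map as in the excerpt, but never dereferences
// a key unless it is non-null AND contained in the set of known particles.
LinkResult collectGenStat1(const std::map<const FakeMCParticle*, double>& energyPerMCP,
                           const std::unordered_set<const FakeMCParticle*>& known) {
    LinkResult result;
    for (auto it = energyPerMCP.begin(); it != energyPerMCP.end(); ++it) {
        if (it->first == nullptr) {  // same check as in the excerpt
            ++result.skippedNull;
            continue;
        }
        if (known.count(it->first) == 0) {  // extra guard: non-null but not a known particle
            std::cerr << "key " << static_cast<const void*>(it->first)
                      << " is not in the set of known particles" << std::endl;
            ++result.skippedUnknown;
            continue;
        }
        if (it->first->getGeneratorStatus() == 1) {  // safe to dereference here
            result.particles.push_back(it->first);
            result.energies.push_back(it->second);
        }
    }
    return result;
}
```

If such a guard reported unknown keys for the failing event, that would point to the map being filled incorrectly upstream rather than to the loop itself.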

@ggrenier
Contributor Author

Hi @tmadlener
Running with the additional option --MyRecoMCTruthLinker.Verbosity=DEBUG gives the output in the attached file:
marlin_debug.log

@tmadlener
Contributor

Thanks. It looks a bit like some of the internal mapping goes wrong, but it is hard to say where just from the output. Do all the input clusters and hits have their MC links set properly? Alternatively, is it possible that one of the input LCRelation collections is missing from the configuration?
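
A quick way to check the second question is to compare the collection names present in the event against the names the processor is configured to read. The sketch below is self-contained and only does the comparison. `missingCollections` is a hypothetical helper, and the list of available names is assumed to come from the event (in LCIO, `LCEvent::getCollectionNames()` provides such a list).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns every required collection name that does not
// appear in the list of names available in the event, in the order given.
std::vector<std::string> missingCollections(const std::vector<std::string>& available,
                                            const std::vector<std::string>& required) {
    std::vector<std::string> missing;
    for (const auto& name : required) {
        // Linear search is fine here: an event holds a modest number of collections.
        if (std::find(available.begin(), available.end(), name) == available.end()) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

An empty result would rule out a missing input collection, leaving the content of the relations themselves as the thing to inspect.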

I will try to have a look at this with a debugger, but that will be after the Christmas break, in 2024.


2 participants