failure in RunManagerMTWorker::produce() when running with 256 Threads #31483

Closed
tommasoboccali opened this issue Sep 16, 2020 · 8 comments

Comments

@tommasoboccali

(please understand this is very LOW priority - I just thought it is worth reporting)

Ciao, in a test on HPC I was trying to run a GEN-SIM job with 256 threads and 256 streams.
It works, even with very decent CPU efficiency, but from time to time it prints

%MSG-i ThreadStreamSetup: (NoModuleName) 15-Sep-2020 15:04:10 CEST pre-events
setting # threads 256
setting # streams 256
%MSG
...
...
...
%MSG-w SimG4CoreApplication: OscarMTProducer:g4SimHits 15-Sep-2020 15:16:40 CEST Run: 1 Event: 5493
RunManagerMTWorker::produce(): stream 106 thread 323 initializing in the produce(..) method - there is a problem
%MSG

(by the way, there should not be a thread 323....)

Is there a known limit to the level of MT we can run?

ciao ciao

tom

@cmsbuild
Contributor

A new Issue was created by @tommasoboccali Tommaso Boccali.

@Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@silviodonato
Contributor

assign simulation, core

@cmsbuild
Contributor

New categories assigned: core,simulation

@Dr15Jones,@smuzaffar,@mdhildreth,@makortel,@civanch you have been requested to review this Pull request/Issue and eventually sign? Thanks

@Dr15Jones
Contributor

In the past we've run simulation jobs on KNLs using 256 threads without seeing that issue. Granted, that was done near the beginning of Run 2, so it has been a while.

@Dr15Jones
Contributor

@tommasoboccali which CMSSW release were you using?

@makortel
Contributor

(by the way, there should not be a thread 323....)

There can be, because that number counts the unique threads that have run (roughly) any method of RunManagerMTWorker, and TBB does add and remove threads from its thread pool.
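To illustrate why the index can exceed the configured thread count, here is a minimal sketch. It assumes the index is handed out from a shared counter the first time a thread asks for it; `threadIndex()` and `largestIndexAfter()` are hypothetical names for this example, not the actual CMSSW code, and plain `std::thread` stands in for the TBB pool.

```cpp
#include <atomic>
#include <thread>
#include <vector>

namespace thread_index_sketch {
  // Hypothetical stand-in for getThreadIndex(): every thread that calls this
  // for the first time takes the next value of a shared counter and keeps it.
  int threadIndex() {
    static std::atomic<int> counter{0};
    thread_local const int index = counter.fetch_add(1);
    return index;
  }

  // Runs `rounds` batches of `poolSize` short-lived threads (mimicking a pool
  // that retires and replaces its threads) and returns the largest index seen.
  int largestIndexAfter(int rounds, int poolSize) {
    std::atomic<int> largest{-1};
    for (int r = 0; r < rounds; ++r) {
      std::vector<std::thread> pool;
      for (int i = 0; i < poolSize; ++i) {
        pool.emplace_back([&largest] {
          const int mine = threadIndex();
          int seen = largest.load();
          // Atomic "max": retry until our index is stored or a larger one is.
          while (mine > seen && !largest.compare_exchange_weak(seen, mine)) {
          }
        });
      }
      for (auto& t : pool) {
        t.join();
      }
    }
    return largest.load();
  }
}  // namespace thread_index_sketch
```

With only 4 threads alive at any time, two rounds already produce indices up to 7, because replaced threads never give their index back. The same effect would explain a "thread 323" in a 256-thread job.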

In the printout

RunManagerMTWorker::produce(): stream 106 thread 323 initializing in the produce(..) method - there is a problem

the "there is a problem" part is misleading, because the message only says that the per-thread initialization had not yet been done for that particular thread (i.e. there is no real problem, as far as I can tell)

if (!(m_tls && m_tls->threadInitialized)) {
  // This thread has not run the per-thread Geant4 setup yet: report it, then initialize.
  edm::LogVerbatim("SimG4CoreApplication")
      << "RunManagerMTWorker::produce(): stream " << inpevt.streamID() << " thread " << getThreadIndex()
      << " initializing in the produce(..) method - there is a problem";
  initializeG4(&runManagerMaster, es);
  m_tls->threadInitialized = true;
}
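The pattern in that snippet is lazy per-thread initialization. A self-contained sketch of the idea follows, assuming a `thread_local` state object in place of `m_tls`; `ThreadState`, `produceOneEvent()` and `lateInitsAfter()` are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

namespace lazy_init_sketch {
  // Hypothetical stand-in for the per-thread state kept in m_tls.
  struct ThreadState {
    bool initialized = false;
    int eventsProcessed = 0;
  };

  // Counts how many threads had to initialize inside produceOneEvent().
  std::atomic<int>& lateInitCount() {
    static std::atomic<int> count{0};
    return count;
  }

  void produceOneEvent() {
    thread_local ThreadState state;
    if (!state.initialized) {
      // First event on this thread: do the setup here. This is not an error,
      // only a sign that no earlier call initialized this thread.
      ++lateInitCount();
      state.initialized = true;
    }
    ++state.eventsProcessed;
  }

  // Starts `threads` fresh threads, each processing `eventsPerThread` events,
  // and returns how many late initializations were recorded in total.
  int lateInitsAfter(int threads, int eventsPerThread) {
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
      workers.emplace_back([eventsPerThread] {
        for (int e = 0; e < eventsPerThread; ++e) {
          produceOneEvent();
        }
      });
    }
    for (auto& t : workers) {
      t.join();
    }
    return lateInitCount().load();
  }
}  // namespace lazy_init_sketch
```

Each new thread triggers the branch exactly once, however many events it then processes, so in a real job the message would appear once per thread that joins the pool late.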

@tommasoboccali
Author

Thanks Chris!
Indeed the job ends without problems; I was wondering how serious this is.

It is not 👍

tom

@civanch
Contributor

civanch commented Sep 16, 2020

I will double-check that the printout gets corrected (Matti and I agreed on this some time ago). There is no problem; the message only indicates that there is a stream in which the beginRun() initialisation was not done.
