-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrote checkForModuleDependencyCorrectness #34735
Rewrote checkForModuleDependencyCorrectness #34735
Conversation
The IBs were showing the old algorithm, using boost graph library, could hit some pathological cases and take >10 minutes to run. The new algorithm simulates how the framework would run the modules and checks to see if a deadlock would occur.
The function is no longer needed as the dependency checks are now done using a different algoritm.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34735/24382
Code check has found code style and quality issues which could be resolved by applying following patch(s)
|
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34735/24383
|
A new Pull Request was created by @Dr15Jones (Chris Jones) for master. It involves the following packages:
@makortel, @smuzaffar, @cmsbuild, @Dr15Jones can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
I tested this change on step2 of workflow 11725.0 under CMSSW_12_0_X_2021-07-28-2300. On the machine I tested, the original job took > 10 minutes to do the job initialization. Using this code, it took less than 2 minutes. However, the new code gave an error stating a dependent module was later on a path. The problem was the same modules appear multiple times on the same path which confused the part of the algorithm that is meant to enforce policy, not the part that tests for runnability. I'll modify the algorithm to ignore duplicate modules on the same path. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34735/24384
|
Pull request #34735 was updated. @makortel, @smuzaffar, @cmsbuild, @Dr15Jones can you please check and sign again. |
please test |
@Martin-Grunewald @fwyzard The new module dependency checking algorithm explicitly adds some enforcements of policy which the old algorithm appeared to be doing. One such policy is if a module 'a' consumes data from 'module 'b' then if module 'b' appears on at least 1 path with module 'a' it must appear on ALL paths with module 'b'. We believe that policy was requested by the HLT. NOTE: that is strictly a policy enforcement as in the cases mentioned above, the framework could actually properly schedule the modules even if the paths were not completely consistent. The reason I mention this is I was just testing step2 of workflow 11725.0 and it is failing with
Therefore strict enforcement of this policy will likely break the IBs as the old algorithm was not catching all the cases. So we need to know if this policy must actually be enforced and if so, who will clean up the existing problems. |
@fwyzard |
Sure ! The error reported by Chris
mentions the module .7 is the workflow modifier used by |
See #33802 and the discussion around #33802 (comment) . By the way, I should correct myself: the agreement was that the mkFit customisation should not block any HLT-related work from being merged - for the framework changes, it's up to the Core Software group. |
Thanks Andrea! Sorry to say, but my first reaction is that this sucks: That customisation is alien to HLT, why is an HLT modification done by Reco? In view of integration into HLT 'some time in the future'? At this stage it may help Reco but is no good for HLT. Who is the proponent to push this into HLT? On what time scale? Hmm, I would prefer if that modification of HLT would be removed alltogether from IB and other 'official' tests, and kept private for now until it gets proposed and approved for integration into HLT. |
Eh... I don't disagree.
I would say @mmasciov, @slava77, and @makortel, based on the previous presentations, discussion, and work on PRs. However it's not clear (to me) if it will be useful for the HLT, at least on the timescale of the beginning of Run 3. |
-1 Failed Tests: RelVals RelVals-INPUT RelVals
RelVals-INPUT
|
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-76ce15/17580/summary.html Comparison SummarySummary:
|
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy, @perrotta (and backports should be raised in the release meeting by the corresponding L2) |
PR description:
PR validation:
Code compiles. Framework unit tests (including new ones) pass.
fixes #34633
fixes #31199
fixes cms-sw/framework-team#210