significantly speed up parsing of easyconfig files by only extracting comments from an easyconfig file when they're actually needed #3498
While trying to figure out why there's a long delay before actually starting to install extensions, especially with `R` easyconfigs which include 100s of extensions, I noticed that 85% of the time is spent in the `extract_comments` method (for `R-4.0.0-foss-2020a.eb`):

Part of this is that for every extension in the easyconfig file the easyconfig file itself is re-parsed, to ensure the extension installation starts with a clean slate. It may be possible to avoid that, but ensuring this is done correctly (that is, while avoiding that stuff leaks from the "parent" installation (e.g. `R`) to extensions, or between extensions themselves) is not trivial.

An easy win is to only extract comments from easyconfig files when needed, i.e. when `self.comments` is used (which is basically only done when calling the `dump()` method to dump a parsed easyconfig file). This makes a huge difference; here's the profile for `eb` getting ready to install extensions for `R-4.0.0-foss-2020a.eb` (same as what is covered by the profile above):

Without `--debug`, it's even significantly faster.
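
For illustration, the "only extract comments when `self.comments` is used" idea can be sketched as a lazily evaluated property. This is a minimal sketch, not the actual change in this PR: the class name `LazyCommentsParser`, the `extract_calls` counter, and the simplified full-line comment extraction are all assumptions made up for the example.

```python
class LazyCommentsParser:
    """Hypothetical parser that defers comment extraction until first use."""

    def __init__(self, rawtxt):
        self.rawtxt = rawtxt
        self._comments = None   # nothing extracted yet
        self.extract_calls = 0  # counter, only here to make the laziness visible

    def _extract_comments(self):
        """Simplified stand-in for the expensive step: collect full-line comments."""
        self.extract_calls += 1
        lines = (line.strip() for line in self.rawtxt.splitlines())
        return [line for line in lines if line.startswith('#')]

    @property
    def comments(self):
        """Extract comments on first access, then reuse the cached result."""
        if self._comments is None:
            self._comments = self._extract_comments()
        return self._comments

    def dump(self):
        """Only a dump-like operation needs the comments, so only it pays the cost."""
        lines = (line.strip() for line in self.rawtxt.splitlines())
        body = [line for line in lines if line and not line.startswith('#')]
        return '\n'.join(self.comments + body)


parser = LazyCommentsParser("# example header\nname = 'R'\nversion = '4.0.0'\n")
print(parser.extract_calls)  # parsing alone did not trigger extraction
print(parser.dump())
print(parser.extract_calls)  # extraction ran once, when dump() needed it
```

With this pattern, re-parsing the same file once per extension costs nothing for comment extraction unless something actually reads `comments`.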