This repository has been archived by the owner on Jan 27, 2019. It is now read-only.

a little faster hash computation #177

Merged
merged 5 commits into from
Nov 3, 2016

Conversation

Villemoes
Contributor

This seems to cut hash computation time by around 12%.

Rasmus Villemoes added 5 commits October 21, 2016 11:29
First of all, a non-greedy ".*?" with no zero-width assertions
or anything else following is always guaranteed to match the empty
string, so it contributes nothing.
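The point about a trailing non-greedy `.*?` can be checked with a small sketch. The patterns below are hypothetical stand-ins, since the actual TASKFUNC_RE is not quoted in this conversation; the only claim illustrated is that dropping the trailing `.*?` leaves the matched string unchanged.

```python
import re

# Hypothetical patterns for illustration; not the project's actual TASKFUNC_RE.
WITH_TAIL = re.compile(r"do_[a-z]+.*?")
WITHOUT_TAIL = re.compile(r"do_[a-z]+")


def matched(pattern, text):
    """Return the full matched string, or None if there is no match."""
    m = pattern.match(text)
    return m.group(0) if m else None


# With nothing after it, the non-greedy ".*?" always settles for the empty
# string, so both patterns match exactly the same text.
SAMPLES = ["do_compile", "do_install extra", "do_fetch[dirs]", "other"]
for text in SAMPLES:
    assert matched(WITH_TAIL, text) == matched(WITHOUT_TAIL, text)
```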

The once-compiled TASKFUNC_RE is only used in a single place, the
.meta() method. There, we only extract match_group(0), which is the
entire string matched, so there's no reason for the regexp to have
capturing parentheses. They were probably there due to some confusion as to
whether the strings in the "emit" list we're building should contain the
"do_" prefix or not. They should, as also witnessed by the

        if not emit_task.startswith("do_"):
            emit_task = "do_" + emit_task

and this is indeed also what the current regexp and its use
achieve. This just makes the code a little less misleading.
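As a minimal sketch of why the capturing parentheses are redundant when only group 0 is read: the two patterns here are hypothetical (the real TASKFUNC_RE is not shown in this conversation), and `full_match` is a helper invented for the example.

```python
import re

# Hypothetical patterns; the real TASKFUNC_RE is not shown in this conversation.
CAPTURING = re.compile(r"do_([a-z]+)")
PLAIN = re.compile(r"do_[a-z]+")


def full_match(pattern, text):
    """Return group 0 (the entire matched string), or None if no match."""
    m = pattern.match(text)
    return m.group(0) if m else None


# Group 0 is the whole match and includes the "do_" prefix either way;
# the parentheses only matter to callers that read group 1.
assert full_match(CAPTURING, "do_compile") == full_match(PLAIN, "do_compile")
```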

Nobody includes the "do_" prefix in the META_EMIT_PREFIX, so we might as
well make that a rule. Also, rather than prepending "do_" in the tight
inner loop, it might as well be done at split time.

This cuts about 1-2% of hash computation time.
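A rough sketch of moving the prefixing to split time. It assumes, purely for illustration, that META_EMIT_PREFIX is a whitespace-separated list of `task:prefix` entries; that format and the function name `split_emit_prefixes` are assumptions, not taken from the oe-lite source.

```python
def split_emit_prefixes(raw):
    """Parse a hypothetical 'task:prefix task:prefix ...' string once.

    The "do_" prefix is added here, so the hot loop that later walks the
    pairs never has to call startswith("do_") or build a new string.
    """
    pairs = []
    for entry in raw.split():
        task, prefix = entry.split(":", 1)
        pairs.append(("do_" + task, prefix))
    return pairs


# Assumed example value; real configurations may look different.
pairs = split_emit_prefixes("compile:CFLAGS install:FILES_")
```

The trade-off is that entries are now required to omit "do_"; an entry that already carried it would come out as "do_do_...".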

In order for x to start with y (both of which are known to be non-empty
strings), we must have x[0]==y[0]. Using that, we can split the list
of (taskname, prefix) pairs according to prefix[0] and, when deciding
whether to keep var, only loop over those with var[0]==prefix[0].

The benefit of this extra complexity is ~10% faster hash computation.
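The first-character bucketing described above might look roughly like this. The names `build_index` and `tasks_for` and the sample pairs are hypothetical; this is a sketch of the idea, not the code from the commit.

```python
from collections import defaultdict


def build_index(pairs):
    """Group (taskname, prefix) pairs by the first character of the prefix."""
    index = defaultdict(list)
    for task, prefix in pairs:
        index[prefix[0]].append((task, prefix))  # prefixes are non-empty
    return index


def tasks_for(var, index):
    """Return the tasks whose prefix var starts with.

    Only pairs with prefix[0] == var[0] are examined, since no other
    prefix can possibly be a prefix of var.
    """
    candidates = index.get(var[0], ())  # .get() avoids creating empty buckets
    return [task for task, prefix in candidates if var.startswith(prefix)]


# Assumed sample data for illustration only.
PAIRS = [("do_compile", "CFLAGS"), ("do_install", "FILES_"), ("do_compile", "CC")]
INDEX = build_index(PAIRS)
```

The result is the same as looping over every pair, but most variables only hit a small bucket, or none at all.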

Profiling shows these lines to be semi-hot (being hit hundreds of
thousands of times), and also that in 100% of cases, the to-be-added
string is not already in the list. Adding the string twice would be
harmless, so remove the conditionals.

Nobody has ever used the ability to blacklist a variable from tasks, and
this is a rather hot path during hash computation, so eliminate some
effectively-dead-but-expensive code.
@sknsean
Contributor

sknsean commented Nov 1, 2016

I have been running this code for 14 days now and have seen no issues with it.
Nice to get oe bake a little faster 👍

@esben esben modified the milestone: 6.3.0 Nov 3, 2016
@esben esben merged commit 053f2b9 into oe-lite:master Nov 3, 2016
@Villemoes Villemoes deleted the ravi/faster_hash branch November 3, 2016 20:29