Optimizations for similar_variables.py #1945

0xGusMcCrae · 2023-06-03T15:02:01Z

Optimizations to the algorithm for detecting similar variable names in similar_variables.py.

Runtime for slither/dev on the current C4 competition MAIA codebase:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)

40      0.005    0.000    39.760   0.994   .../slither/detectors/variables/similar_variables.py:76(_detect)

Runtime for optimized version on same codebase:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)

40      0.004    0.000    17.769   0.444   .../slither/detectors/variables/similar_variables.py:83(_detect)

Cumulative time in _detect drops from 39.760 s to 17.769 s (a reduction of about 55%), and the issues detected are identical.
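The tables above have the column layout that Python's pstats module prints for cProfile data. The exact command used for these measurements is not stated in this thread, so the following is only a minimal sketch of how comparable numbers could be collected; `profile_call` is a hypothetical helper, not part of Slither.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return its result and a text report (hypothetical helper)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time, the column compared in the tables above.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(sum, range(1000))
print(report)
```

Sorting by cumulative time puts functions such as _detect, whose cost lies mostly in the calls they make, near the top of the report.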

Changes (in detect_sim):

The inner loop is reworked so that it doesn't repeat comparisons. There is no need to compare the current outer loop's variable to anything with a lower index, as that pair has already been compared.
v1.name.lower() is cached in the outer loop, so it is computed once per outer iteration rather than once per inner iteration.
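The two changes listed above can be sketched on plain strings as follows. This is a minimal illustration, not the detector's code: `similar_pairs` and the sample names are hypothetical, `similar` is an assumed stand-in modeled on the behavior described in this thread (equal length, ratio above 0.90 on lowercased names), and `itertools.combinations` stands in for the index-based inner loop.

```python
import difflib
from itertools import combinations


def similar(seq1: str, seq2: str) -> bool:
    """Assumed stand-in for the detector's helper: equal length and ratio > 0.90."""
    if len(seq1) != len(seq2):
        return False
    return difflib.SequenceMatcher(a=seq1.lower(), b=seq2.lower()).ratio() > 0.90


def similar_pairs(names):
    """Return similar (name1, name2) pairs, visiting each unordered pair once."""
    # Lowercase every name once up front instead of once per comparison.
    lowered = [(name, name.lower()) for name in names]
    pairs = []
    # combinations() yields each pair (i < j) exactly once: n * (n - 1) / 2 pairs.
    for (name1, low1), (name2, low2) in combinations(lowered, 2):
        if low1 != low2 and similar(name1, name2):
            pairs.append((name1, name2))
    return pairs


print(similar_pairs(["ownerBalance1", "ownerBalance2", "Total", "total"]))
# → [('ownerBalance1', 'ownerBalance2')]
```

"Total" and "total" are not reported because their lowercase forms are identical, which matches the inequality check in detect_sim.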

0xGusMcCrae · 2023-06-03T16:42:24Z

There were some further optimizations (down to ~14 seconds for the MAIA codebase) that I didn't include in this PR, since I wasn't 100% sure they were safe: something external might depend on the similar function having its exact current behavior.

It's possible to feed already-lowercased variable names into similar and to move the length equality check out of similar and into detect_sim.

And AFAIK this line can be deleted:

slither/slither/detectors/variables/similar_variables.py

Line 75 in 79ff12a

if (v2, v1) not in ret:

since v2 won't be re-compared to v1 with the new inner loop setup.

Something like this:

@staticmethod
def similar(seq1: str, seq2: str) -> bool:
    """
    ...
    Returns:
        bool: true if names are similar
    """
    val = difflib.SequenceMatcher(a=seq1, b=seq2).ratio()
    ret = val > 0.90
    return ret

@staticmethod
def detect_sim(contract: Contract) -> Set[Tuple[LocalVariable, LocalVariable]]:
    ...
    all_var = list(set(all_var + contract_var))

    ret = []
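    for i in range(len(all_var)):
        v1 = all_var[i]
        _v1_name_lower = v1.name.lower()  # lowercase once per outer iteration
        # Start at i + 1: lower indices were already compared, and j == i is v1 itself.
        for j in range(i + 1, len(all_var)):
            v2 = all_var[j]
            # Length check moved here from similar(); names of different length are skipped.
            if len(v1.name) != len(v2.name):
                continue
            _v2_name_lower = v2.name.lower()
            if _v1_name_lower != _v2_name_lower:
                if SimilarVarsDetection.similar(_v1_name_lower, _v2_name_lower):
                    ret.append((v1, v2))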

    return set(ret)

So if further optimization is desired, these are possibilities.
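The claim that the `(v2, v1) not in ret` guard becomes redundant can be checked on a small sample. The sketch below uses plain strings instead of Slither variable objects; `pairs_full_loop`, `pairs_half_loop`, and the sample names are hypothetical, and `similar` is an assumed stand-in that expects lowercased input. It shows agreement for this sample only: the difflib documentation cautions that `ratio()` may depend on argument order, so agreement is not guaranteed for every possible pair of names.

```python
import difflib


def similar(seq1: str, seq2: str) -> bool:
    """Assumed stand-in: expects already-lowercased names of equal length."""
    return difflib.SequenceMatcher(a=seq1, b=seq2).ratio() > 0.90


def pairs_full_loop(names):
    """Baseline: every ordered pair, with a guard against reversed duplicates."""
    ret = []
    for v1 in names:
        for v2 in names:
            if len(v1) != len(v2) or v1.lower() == v2.lower():
                continue
            if similar(v1.lower(), v2.lower()) and (v2, v1) not in ret:
                ret.append((v1, v2))
    return set(ret)


def pairs_half_loop(names):
    """Reworked: only later indices are visited, so no guard is needed."""
    ret = []
    for i, v1 in enumerate(names):
        low1 = v1.lower()
        for v2 in names[i + 1:]:
            if len(v1) != len(v2):
                continue
            low2 = v2.lower()
            if low1 != low2 and similar(low1, low2):
                ret.append((v1, v2))
    return set(ret)


sample = ["ownerBalance1", "ownerBalance2", "OwnerBalance1", "fee", "x"]
assert pairs_full_loop(sample) == pairs_half_loop(sample)
```

Dropping the guard also avoids a membership test on a list, which costs time proportional to the number of pairs found so far.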

montyly · 2023-06-09T13:29:39Z

that's awesome, thanks @0xGusMcCrae

montyly · 2023-06-09T13:30:32Z

I think we can reformat similar to follow your recommendation; it's only used in this detector as far as I can tell.

0xGusMcCrae added 3 commits June 2, 2023 16:12

initial optimization

a826831

reduced num iterations in inner loop

1b4c0b9

remove changes to how similar() works

15e8ce9

0xGusMcCrae requested review from montyly, 0xalpharush and smonicas as code owners June 3, 2023 15:02

linting

173698d

0xalpharush reviewed Jun 6, 2023

0xalpharush approved these changes Jun 9, 2023

0xalpharush added this to the 0.9.4 milestone Jun 9, 2023

montyly merged commit 3f8d719 into crytic:dev Jun 9, 2023

montyly mentioned this pull request Jun 9, 2023

similar variables detector is extremely slow #1630

Open

0xGusMcCrae mentioned this pull request Jun 21, 2023

Additional optimizations for similar_variables.py #1980

Merged
