Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Fix lsra memory consumption #11233

Merged
merged 7 commits into from
Apr 28, 2017

Conversation

sandreenko
Copy link

Do not allocate new buffer each time when we call Clear for bitset.
It removes quadratic memory consumption for predecessors in lsra (N * N/ 64 size* were allocated, where N is the number of blocks).

Use the same preallocated buffer for lsra preds.

There are no spmi asm diffs.
Fix #11142, it allocated 3GB of memory, now it takes 300 MB,

PTAL @dotnet/jit-contrib , as I understand I have to merge it into the release branch too.

@russellhadley
Copy link

Pretty cool. Does this move the needle on memstats or TP?

src/jit/lsra.h Outdated
@@ -1131,7 +1131,7 @@ class LinearScan : public LinearScanInterface
int compareBlocksForSequencing(BasicBlock* block1, BasicBlock* block2, bool useBlockWeights);
BasicBlockList* blockSequenceWorkList;
bool blockSequencingDone;
void addToBlockSequenceWorkList(BlockSet sequencedBlockSet, BasicBlock* block);
void addToBlockSequenceWorkList(BlockSet sequencedBlockSet, BasicBlock* block, BlockSet predSet);
void removeFromBlockSequenceWorkList(BasicBlockList* listNode, BasicBlockList* prevNode);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should really use BlockSet_ValArg_T instead of BlockSet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I guess you don't want to pass by value, do you? Should it be BlockSet&?


In reply to: 113554764 [](ancestors = 113554764)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

void addToBlockSequenceWorkList(const BlockSet& sequencedBlockSet, BasicBlock* block, BlockSet& predSet);

since sequencedBlockSet is const/unchanged, and predSet is changed. If you don't use & then for short form bitsets you'll pass by value. It turns out that doesn't (currently) matter, since they don't allocate memory and you only care about reducing memory, but it would be weird for short bitsets to be passed by value and long bitsets to be passed by reference, accidentally.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A value arg should be fine here, since the bitset is cleared upon each call to addBlockSequenceWorklist. This should probably use BlockSet_ValArg_T, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually, we don't want to be passing these bitsets "by value".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Your suggestion to use a reference parameter should suffice, then.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good, with additional desktop testing.

@@ -180,11 +180,10 @@ class BitSetUint64

inline void ClearD(Env env)
{
// Recall that ClearD does *not* require "*this" to be of the current epoch.
// Recall that ClearD requires "*this" to be of the current epoch.
Uint64BitSetOps::ClearD(env, m_bits);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is changing the requirements on ClearD, have you examined all the callsites to ensure that they are consistent with this? I would suggest that this undergo additional testing (e.g. desktop SPMI) to ensure that we don't encounter this assert.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked that there are no spmi failures or spmi diffs.

@briansull
Copy link

@sandreenko
I would like to take this fix for the Desktop 4.7.1 release of the JIT as well.
@CarolEidt
Can you confirm that it is low risk.

src/jit/lsra.cpp Outdated
@@ -1540,6 +1540,7 @@ int LinearScan::compareBlocksForSequencing(BasicBlock* block1, BasicBlock* block
// Arguments:
// sequencedBlockSet - the set of blocks that are already sequenced
// block - the new block to be added
// predSet - the buffer to save predessesors set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: predessesors

I think you should elaborate, e.g.: "a block set allocated by the caller used here as a temporary block set for constructing a predecessor set. Allocated by the caller to avoid reallocating a new block set with every call to this function"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks.

Copy link

@pgavlin pgavlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to LSRA look fine, but the changes to BitSet::ClearDLong are too risky.

@@ -664,9 +664,11 @@ void BitSetOps</*BitSetType*/ BitSetShortLongRep,
/*BitSetTraits*/ BitSetTraits>::ClearDLong(Env env, BitSetShortLongRep& bs)
{
assert(!IsShort(env));
// Recall that ClearD does *not* require "bs" to be of the current epoch.
// Therefore, we must allocate a new representation.
bs = MakeEmptyArrayBits(env);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes me very nervous, as we do not seem to have any way to verify that the size of the bitset has not changed since we created it (i.e. there appears to be no way to check that the epoch at creation is the same as the epoch at the clear). I think you should introduce a new method (perhaps ClearDUnsafe? I don't have a great name for it) with the semantics you've introduced here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sandreenko We had talked about you converting existing ClearD() callers to call MakeEmpty() directly, for existing callers that could not easily be proven to work with the new semantics. I presume the existing semantics, and huge comment you deleted for ClearD(), were there for a reason.

Is there some reason you decided not to do that? Could you somehow convince yourself the new semantics work for every call site?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I manually checked all calls to ClearD, the initial idea was to replace them with MakeEmpty when it is necessary. But It appeared that all current users of ClearD do not change epoch.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can assert(CheckEpoch(env)) which is being done below in bitsetasuint64inclass.h. I believe that assert could be made here as well.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But It appeared that all current users of ClearD do not change epoch.

I see twenty uses of ClearD in the compiler today:

src/jit/assertionprop.cpp:4559:    BitVecOps::ClearD(apTraits, fgFirstBB->bbAssertionIn);
src/jit/codegenlegacy.cpp:12585:        VarSetOps::ClearD(compiler, compiler->compCurLife);
src/jit/copyprop.cpp:299:        VarSetOps::ClearD(this, optCopyPropKillSet);
src/jit/emit.cpp:1475:    VarSetOps::ClearD(emitComp, emitInitGCrefVars);
src/jit/emit.cpp:1476:    VarSetOps::ClearD(emitComp, emitPrevGCrefVars);
src/jit/emit.cpp:4567:    VarSetOps::ClearD(emitComp, emitThisGCrefVars); // This is initialized to Empty at the start of codegen.
src/jit/flowgraph.cpp:1818:        // Initialize the per-block bbReach sets. (Note that we can't just call BlockSetOps::ClearD()
src/jit/flowgraph.cpp:10014:        BlockSetOps::ClearD(this, bNext->bbReach);
src/jit/liveness.cpp:489:        VarSetOps::ClearD(this, fgCurUseSet);
src/jit/liveness.cpp:490:        VarSetOps::ClearD(this, fgCurDefSet);
src/jit/liveness.cpp:725:            VarSetOps::ClearD(this, inScope);
src/jit/liveness.cpp:915:        VarSetOps::ClearD(this, initVars);
src/jit/liveness.cpp:1151:        VarSetOps::ClearD(m_compiler, m_liveOut);
src/jit/liveness.cpp:1265:            VarSetOps::ClearD(m_compiler, m_liveIn);
src/jit/liveness.cpp:1266:            VarSetOps::ClearD(m_compiler, m_liveOut);
src/jit/lsra.cpp:1741:    BlockSetOps::ClearD(compiler, bbVisitedSet);
src/jit/lsra.cpp:9317:            VarSetOps::ClearD(compiler, sameResolutionSet);
src/jit/lsra.h:1109:        BlockSetOps::ClearD(compiler, bbVisitedSet);
src/jit/regalloc.cpp:1343:            VarSetOps::ClearD(this, varAsSet);
src/jit/regalloc.cpp:6351:            VarSetOps::ClearD(this, raLclRegIntf[i]);

How did you verify that the epoch of each bitset does not change between the creation of each bitset and the point at which they are cleared? If the epoch has changed, the failure mode could be pretty bad, as if the number of bits has increased we could end up scribbling over arbitrary memory.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a much safer change to me.

I think that in this case if we replace all ClearD with MakeEmpty we will lose a valuable information.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that in this case if we replace all ClearD with MakeEmpty we will lose a valuable information.

Yes, this is a good point. You should instead create a separate ClearD method with the desired semantics.

FWIW, I can attest that liveness would also be able to use the new clear method in all cases that it currently uses ClearD.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We could replace it with a new Clear() function that is non-destructive.

Now, it's possible that all the ClearD functions today can use the new semantics, meaning we don't need a new Clear function. It's just very suspicious that is true, given the huge comments about why ClearD is (currently) special.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just very suspicious that is true, given the huge comments about why ClearD is (currently) special.

This is my concern as well, especially given that the potential impacts of a mismatch are rather severe.

Copy link
Author

@sandreenko sandreenko Apr 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I first looked at ClearD I thought it just clear bits without this epoch magic.
It was unique function, but nobody used it in such way.
For example:
flowgraph.cpp:

// Initialize the per-block bbReach sets. (Note that we can't just call BlockSetOps::ClearD()
// when re-running this computation, because if the epoch changes, the size and representation of the
// sets might change).

So I prefer not to create the second function, that will confuse us in the future, but to replace the first.

However: liveness, lsra, regalloc, assertionprop, copyprop and emit do not change the epoch.
flowgraph does, but it calls MakeEmpty before ClearD. But let me check that with noway_asserts.

Uint64BitSetOps::ClearD(env, m_bits);
#ifdef DEBUG
// But it updates it to of the current epoch.
m_epoch = BitSetTraits::GetEpoch(env);
assert(CheckEpoch(env));
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for #ifdef DEBUG. Just make it:

    inline void ClearD(Env env)
    {
        assert(CheckEpoch(env));
        Uint64BitSetOps::ClearD(env, m_bits);
    }

Sergey Andreenko added 3 commits April 27, 2017 12:27
The new ClearD reuses existing buffer and assumes that epoch wasn't
changed.
@sandreenko sandreenko force-pushed the fix-lsra-memory-consumption branch from dafa057 to 4701304 Compare April 27, 2017 21:40
@sandreenko
Copy link
Author

@pgavlin @BruceForstall I updated PR according to our discussion.

VarSetOps::ClearD(emitComp, emitInitGCrefVars);
VarSetOps::ClearD(emitComp, emitPrevGCrefVars);
VarSetOps::OldStyleClearD(emitComp, emitInitGCrefVars);
VarSetOps::OldStyleClearD(emitComp, emitPrevGCrefVars);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This begs the question of whether these should not be initialized until CodeGen.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but we can make that change (if necessary) with a post-2.0 PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -1815,7 +1815,7 @@ void Compiler::fgComputeReachabilitySets()

for (block = fgFirstBB; block != nullptr; block = block->bbNext)
{
// Initialize the per-block bbReach sets. (Note that we can't just call BlockSetOps::ClearD()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is incorrect. I'm fine with leaving the code as-is, but at least the comment should be fixed.

Copy link

@pgavlin pgavlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from @CarolEidt's suggested comment change.

Thanks for doing this!

@CarolEidt
Copy link

I would like to see an issue to either remove all uses of OldStyleClearD or rename it, and to generally review the usage of the two flavors of ClearD and MakeEmpty

@pgavlin
Copy link

pgavlin commented Apr 27, 2017

I would like to see an issue to either remove all uses of OldStyleClearD or rename it, and to generally review the usage of the two flavors of ClearD and MakeEmpty

@sandreenko filed https://github.com/dotnet/coreclr/issues/11263. Does that suffice?

@CarolEidt
Copy link

@sandreenko filed #11263. Does that suffice?

Yes, thanks.

@sandreenko sandreenko merged commit 70014f8 into dotnet:master Apr 28, 2017
@sandreenko sandreenko deleted the fix-lsra-memory-consumption branch April 28, 2017 01:07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants