CT: above our memoryLimit #207

Closed
KatieWoe opened this issue Oct 29, 2018 · 35 comments

Comments

@KatieWoe
Contributor

KatieWoe commented Oct 29, 2018

unit-rates : fuzz : built : run
Uncaught Error: Average memory used (1016MB) is above our memoryLimit (1000MB). Current memory: 1709MB.
Error: Average memory used (1016MB) is above our memoryLimit (1000MB). Current memory: 1709MB.
    at t.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1831192)
    at e.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1968678)
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:151135)
    at e.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979642)
    at e.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979448)
    at e.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979308)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : built : run
Uncaught Error: Average memory used (1026MB) is above our memoryLimit (1000MB). Current memory: 1783MB.
Error: Average memory used (1026MB) is above our memoryLimit (1000MB). Current memory: 1783MB.
    at t.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1831192)
    at e.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1968678)
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:151135)
    at e.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979642)
    at e.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979448)
    at e.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979308)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : built : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1703MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1703MB.
    at t.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1831192)
    at e.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1968678)
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:151135)
    at e.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979642)
    at e.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979448)
    at e.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979308)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : require.js : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1693MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1693MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540772386577:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540772386577:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : require.js-canvas : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1810MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1810MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540799275115:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540799275115:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : require.js-canvas : run
Uncaught Error: Average memory used (1078MB) is above our memoryLimit (1000MB). Current memory: 1850MB.
Error: Average memory used (1078MB) is above our memoryLimit (1000MB). Current memory: 1850MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540814119581:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540814119581:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : xss-fuzz : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1842MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1842MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540792615422:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540792615422:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM
@pixelzoom
Contributor

Prior to publishing, this sim had extensive memory leak testing on 3/6/17, see #170, and the heap size never grew past 55MB.

I looked through all commits since 3/6/17. The only significant change in master since then was to convert from TWEEN to Animation. So I guess I'll start there.

@pixelzoom
Contributor

pixelzoom commented Oct 30, 2018

Commits related to conversion from TWEEN to Animation, in chronological order:

8/24/18 eb9598a
8/24/18 a5deac3
10/27/18 a294a7f

@pixelzoom
Contributor

pixelzoom commented Oct 30, 2018

According to @jonathanolson, CT fuzz test for requirejs is 120 seconds with query parameters:
?brand=phet&ea&fuzz&memoryLimit=1000.

So if I run for > 2 minutes with ?brand=phet&ea&fuzz, then I should see a heap snapshot > 1000MB.
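The error text in the CT reports shows an average that is well below the current reading (for example 1016MB average vs. 1709MB current), which suggests the check is applied to an average over recent samples rather than to a single measurement. A minimal sketch of that kind of check follows. The class name `AverageMemoryMonitor`, the `windowSize` parameter, and the sliding-window averaging are assumptions for illustration; this is not the code in joist/js/MemoryMonitor.js.

```javascript
// Hypothetical sketch of an averaging memory check. Not the actual MemoryMonitor.
class AverageMemoryMonitor {

  // memoryLimitMB: threshold for the average; windowSize: number of samples kept (assumed).
  constructor( memoryLimitMB, windowSize = 10 ) {
    this.memoryLimitMB = memoryLimitMB;
    this.windowSize = windowSize;
    this.samplesMB = [];
  }

  // Records one sample (in MB), returns the current average,
  // and throws if that average is above the limit.
  measure( currentMB ) {
    this.samplesMB.push( currentMB );
    if ( this.samplesMB.length > this.windowSize ) {
      this.samplesMB.shift(); // drop the oldest sample
    }
    const totalMB = this.samplesMB.reduce( ( sum, mb ) => sum + mb, 0 );
    const averageMB = totalMB / this.samplesMB.length;
    if ( averageMB > this.memoryLimitMB ) {
      throw new Error(
        `Average memory used (${Math.round( averageMB )}MB) is above our memoryLimit ` +
        `(${this.memoryLimitMB}MB). Current memory: ${Math.round( currentMB )}MB.`
      );
    }
    return averageMB;
  }
}
```

With averaging like this, a single spike does not fail the test on its own; the heap has to stay high long enough to pull the average over the limit.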

@pixelzoom
Contributor

With Chrome 70.0.3538.77 on macOS 10.11.6, running in requirejs mode with ?brand=phet&ea&fuzz for 3 minutes, the heap size was a whopping 1411 MB. It took over 30 minutes to build the snapshot. And meanwhile, the sim "paused before potential out-of-memory crash".

@pixelzoom
Contributor

pixelzoom commented Oct 30, 2018

I don't see anything in the 1411 MB snapshot that looks related to Animation. But I do see a huge number of Emitter instances, apparently changeEmitter in Color.

I took another snapshot at startup, 76MB. (That's bigger than the 55MB size after 25 minutes of runtime in the original memory tests, see #170). A second snapshot 10 seconds later was 123MB. Comparing snapshots, I again see a lot of Emitter instances.

@jonathanolson is it possible that phetsims/scenery#879 is not resolved?

@pixelzoom
Contributor

This issue most definitely blocks publication of this (and possibly other) sims.

@jonathanolson
Contributor

Hmm, this looks related to phetsims/sun#362.

At least one case in unit-rates is:

  • KeypadLayer results in creating a lot of NumberKeypads
  • NumberKeypad creates a lot of buttons with createKey() that won't get disposed when things get disposed (even if unit-rates disposes them)
  • The buttons create PaintColorProperties to listen to Color objects (whose listeners would get disposed, but the buttons don't get disposed)
  • The SUN/ColorConstants.LIGHT_GRAY color object is used as the default for the disabled color, so most buttons will add a listener to the color to see if the color object will change.

I'd like to tag for the developer meeting, because any of the ways of solving this generally have consequences:

  • If every component properly calls dispose() on its subcomponents, then we don't run into this type of memory leak. This might be something in general good to do.
  • However it still feels like a flaw that we have to treat Color objects as mutable. If I've ever wanted a Color that can change, a Property.<Color> usually seems better. Having to add listeners to Color objects really seems like overkill, and having an immutable Color object sounds desirable. It's been a pain to work around for very little benefit.

Long-term, doing both things sounds correct, but I'm curious what others think.
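The chain described above (many buttons, each adding a listener to one shared Color object, none of them disposed) can be sketched in a few lines. Everything here is a hypothetical stand-in: `MutableColor`, `LeakyButton`, and `createKeypad` are illustration names, not the scenery or sun classes.

```javascript
// Hypothetical stand-in for a mutable color that notifies listeners on change.
class MutableColor {
  constructor( css ) {
    this.css = css;
    this.listeners = new Set();
  }
  addChangeListener( listener ) { this.listeners.add( listener ); }
  removeChangeListener( listener ) { this.listeners.delete( listener ); }
}

// One long-lived color shared by every button, like a default "disabled" color.
const SHARED_DISABLED_COLOR = new MutableColor( 'lightgray' );

// Hypothetical button that listens to the shared color for as long as it lives.
class LeakyButton {
  constructor() {
    // The closure references `this`, so the color's listener set keeps the button reachable.
    this.repaint = () => { this.needsRepaint = true; };
    SHARED_DISABLED_COLOR.addChangeListener( this.repaint );
  }
  dispose() {
    SHARED_DISABLED_COLOR.removeChangeListener( this.repaint );
  }
}

// Creates a keypad's worth of buttons; the caller drops them without calling dispose().
function createKeypad( numberOfKeys ) {
  const buttons = [];
  for ( let i = 0; i < numberOfKeys; i++ ) {
    buttons.push( new LeakyButton() );
  }
  return buttons;
}

// Simulate creating and discarding 100 keypads of 12 keys each.
for ( let i = 0; i < 100; i++ ) {
  createKeypad( 12 );
}

// Every discarded button is still referenced through its listener.
const leakedListenerCount = SHARED_DISABLED_COLOR.listeners.size;
```

In this sketch the garbage collector cannot reclaim any of the buttons, because the shared color outlives them and still holds their listeners. An immutable color would need no listener set at all, which is the appeal of the second option above.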

@jonathanolson
Contributor

I've had two CT browser tabs crash as part of phetsims/aqua#54 while testing this. I'm bumping priority to high so we ensure it is talked about in dev meeting.

@jonathanolson
Contributor

I'll create an issue for immutable colors.

@pixelzoom
Contributor

11/1/18 dev meeting notes:

  • create a scenery issue for longterm fix, assigned to @jonathanolson
  • in the meantime, @pixelzoom will apply a workaround -- add KeyPanel.dispose that calls dispose on any buttons that are created.
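As a rough illustration of the workaround, a container can keep references to the buttons it creates and dispose each of them in its own dispose. The names `SharedColor`, `KeyButton`, and `KeyPanelSketch` below are hypothetical and only mirror the shape of the fix, not the actual unit-rates code.

```javascript
// Hypothetical shared color that tracks its listeners.
class SharedColor {
  constructor() {
    this.listeners = new Set();
  }
}

// Hypothetical button that registers a listener on the shared color.
class KeyButton {
  constructor( color ) {
    this.color = color;
    this.listener = () => {};
    color.listeners.add( this.listener );
  }
  // Removes the listener so the color no longer references this button.
  dispose() {
    this.color.listeners.delete( this.listener );
  }
}

// Hypothetical panel that owns its buttons and is responsible for disposing them.
class KeyPanelSketch {
  constructor( color, numberOfKeys ) {
    this.buttons = [];
    for ( let i = 0; i < numberOfKeys; i++ ) {
      this.buttons.push( new KeyButton( color ) );
    }
  }
  // Disposes every button that this panel created, then releases the references.
  dispose() {
    this.buttons.forEach( button => button.dispose() );
    this.buttons.length = 0;
  }
}
```

The design point is ownership: whichever component creates a subcomponent is the one that calls dispose on it, so callers only need to dispose the outermost component.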

@pixelzoom
Contributor

pixelzoom commented Nov 1, 2018

This isn't isolated to sim-specific code. I'll need to add dispose to SCENERY_PHET/NumberKeypad too.

pixelzoom changed the title from "CT Memory" to "CT: above our memoryLimit" Nov 2, 2018
pixelzoom added a commit that referenced this issue Nov 2, 2018
@pixelzoom
Contributor

In the above commit, I added dispose calls for all buttons, and a call to disposeSubtree for SCENERY_PHET/NumberKeypad. Tested with Chrome 70.0.3538.77 on macOS 10.11.6, running in requirejs mode with ?brand=phet&ea&fuzz.

The good news is that I can now take a heap snapshot without waiting 20 minutes, and CT shouldn't be complaining. Startup heap size was 86MB. After 1 minute, heap size was 126MB. After 5 minutes, heap size was 98MB, which seems to indicate some GC. After 10 minutes, heap size was back up to 133MB.

The bad news is that these heap sizes are still much larger than expected. In prior memory leak testing (#170), startup heap size was 45MB, and the sim stabilized at ~52MB after 25 minutes. We are nowhere near those numbers.

Also worth noting is that the sim becomes sluggish/jerky almost immediately during fuzz testing, behavior that's indicative of a leak.

@jonathanolson any ideas?

@pixelzoom
Contributor

Note to self: Each line of code that I added as a workaround for this issue has a TODO next to it.

@pixelzoom
Contributor

Adding dev meeting label to discuss timeframe for addressing this. Since we have a number of sims being published with new release branches (graphing-quadratics, graphing-lines, resistance-in-a-wire,...) it would be good to get to the bottom of this.

@jonathanolson
Contributor

I made the change for color listeners, so I'll retest for memory leaks right now.

@jonathanolson
Contributor

2 minutes of fuzzing and the sim is still <100MB, so the "major" memory leak was resolved by changes made.

@samreid
Member

samreid commented Nov 9, 2018

The preceding commit reduces requirejs phet-brand Unit Rates memory from 52.4MB to 43.6MB.

samreid added a commit to phetsims/scenery-phet that referenced this issue Nov 9, 2018
samreid added a commit to phetsims/axon that referenced this issue Nov 9, 2018
samreid added a commit to phetsims/scenery that referenced this issue Nov 9, 2018
@samreid
Member

samreid commented Nov 9, 2018

After above commits, we are down to 43.0MB.

samreid added a commit to phetsims/scenery-phet that referenced this issue Nov 9, 2018
@samreid
Member

samreid commented Nov 9, 2018

Down to 42.6 MB

@pixelzoom
Contributor

@samreid looks like the source of the memory increase was identified. Can you summarize here in a comment, for posterity?

@samreid
Member

samreid commented Nov 9, 2018

The changes I made were predominantly about factoring out instantiated parametric types. For instance, phetsims/axon@4a9762e factors out EmitterIO( [] ) instead of creating an equivalent but different one for each Emitter. I cannot say for certain that this was "the source of the memory increase", but it seemed like a safe and straightforward way to significantly reduce the memory footprint. We can probably reduce size further by eliminating closures in common code, or by other strategies.
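To make "factoring out instantiated parametric types" concrete, here is a minimal sketch that contrasts building an equivalent type object per instance with sharing a single one. The factory `createEmitterIO`, the counter, and both emitter classes are hypothetical illustration names; the real EmitterIO in axon is not reproduced here.

```javascript
// Counts how many parametric type objects have been built (for illustration only).
let instantiationCount = 0;

// Hypothetical parametric type factory: each call allocates a new, equivalent object.
function createEmitterIO( parameterTypes ) {
  instantiationCount++;
  return {
    typeName: `EmitterIO<${parameterTypes.join( ',' )}>`,
    parameterTypes: parameterTypes
  };
}

// Before: every emitter instantiates its own copy of the same parametric type.
class EmitterWithOwnType {
  constructor() {
    this.phetioType = createEmitterIO( [] );
  }
}

// After: the no-argument case is instantiated once and shared by all emitters.
const SHARED_EMPTY_EMITTER_IO = createEmitterIO( [] );

class EmitterWithSharedType {
  constructor() {
    this.phetioType = SHARED_EMPTY_EMITTER_IO;
  }
}
```

In this sketch the per-instance version allocates one type object per emitter, while the shared version allocates one in total, so the savings scale with the number of emitters in the sim.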

@zepumph
Member

zepumph commented Nov 9, 2018

For clarity, @jonathanolson and I (and mainly @jonathanolson) worked out that the source of the memory leak came from the commit when I moved phetioInherit from the phet-io repo into tandem. This consequently made it so that phetioInherit was not just "stubbing" out TypeIOs and returning no-op functions like function(){}, but instead it was actually creating TypeIOs. This exposed the memory leak that @samreid explained above to all sims, even in phet brand.

Re his comment:
I think that there are likely other places where we can decrease memory consumption, seeing as 150% is still a large size increase to all sims, and that is likely from things like default parameters, some of which @samreid was able to factor out.

zepumph added a commit to phetsims/scenery that referenced this issue Nov 9, 2018
zepumph added a commit to phetsims/sun that referenced this issue Nov 9, 2018
@zepumph
Member

zepumph commented Nov 9, 2018

In the above commits, I found that unit rates phet brand went down another 0.3 MB for DragListener.draggedEmitter and not much for Checkbox.toggledEmitter, but that likely affected other sims (with a lot of checkboxes maybe?).

I then looked at all usages of EmitterIO( and didn't see any others that looked like they were being used systemically. I'm now a bit more unsure about where the other MB have come from.

@pixelzoom
Contributor

Somehow we went from @jonathanolson saying:

I'm zeroing in on the exact series of commits that triggered the issues.

To this email from @zepumph:

To be sure everyone is on the same page:

We know why and how this happened. Since moving phetioInherit out of phet-io (support phetsims/axon#190), it is no longer hidden behind the ifphetio! plugin. Therefore each Type, even simple value types like Vector2, gained a fully separate TypeIO object and a reference to it in any brand. So now each Vector2 has a Vector2IO that extends ObjectIO, whereas before each Vector2 instance would have had a reference to function(){} (the stub returned from the ifphetio! plugin).

Now with the question of where to go from here. . .
A few thoughts:

  • Eventually unit-rates will use this much memory because it will be instrumented for phet-io, and run in phet-io.
  • In today's dev meeting, we just talked about aligning a piece of phet-io with phet pretty heavily, in that all value validation will be done through phetioType. Though that isn't set in stone, to me it sets some precedent of tolerance towards adding this memory to phet brand.
  • This is one step closer to phet-io and phet branded sims being the same code. Perhaps one way to think about this is to strive for a "single sim" rather than two different ones depending on brand.
  • If we decide that the increased memory in phet brand is unacceptable, then we may need to rethink multiple decisions about phet-io, and integrating pieces into the main code base rather than keeping things separate.

I asked:

@samreid looks like the source of the memory increase was identified. Can you summarize here in a comment, for posterity?

Then @samreid described specific commits, which was not what I was looking for.

So to clarify.... What was the general problem, and what is the general solution that you're pursuing?

Also highly recommended to create a general issue somewhere to be dealing with this, not deal with it in this sim-specific issue.

@pixelzoom
Contributor

And yes, I've read #207 (comment) and #207 (comment), but I don't understand.

@samreid
Member

samreid commented Nov 9, 2018

And yes, I've read #207 (comment) and #207 (comment), but I don't understand.

Perhaps a short call would be very helpful? I'll be on slack.

@pixelzoom
Contributor

pixelzoom commented Nov 9, 2018

@samreid and I discussed on Slack.

The cause of this issue is now moved to phetsims/tandem#71. Please comment and commit there.

@pixelzoom
Contributor

I tested and got startup heap size of 45MB on Chrome + macOS, so that correlates with what @samreid is seeing. Closing this issue. Further memory improvements related to IO types will continue in phetsims/tandem#71.

jbphet pushed a commit to phetsims/axon that referenced this issue Nov 20, 2018
jbphet pushed a commit to phetsims/axon that referenced this issue Nov 20, 2018
jbphet pushed a commit to phetsims/axon that referenced this issue Nov 20, 2018
jbphet pushed a commit to phetsims/scenery-phet that referenced this issue Nov 20, 2018
jbphet pushed a commit to phetsims/scenery-phet that referenced this issue Nov 20, 2018
jbphet pushed a commit to phetsims/scenery that referenced this issue Nov 20, 2018
jbphet pushed a commit to phetsims/scenery that referenced this issue Nov 20, 2018
marlitas pushed a commit to phetsims/sun that referenced this issue Jun 28, 2022