Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ParametricSpringNode optimizations #3

Closed
pixelzoom opened this issue Jul 6, 2015 · 29 comments
Closed

ParametricSpringNode optimizations #3

pixelzoom opened this issue Jul 6, 2015 · 29 comments
Assignees

Comments

@pixelzoom
Copy link
Contributor

Hooke's Law feels a bit sluggish to me on iPad, especially after we started using the more complicated ParametricSpringNode for the spring view.

To demonstrate, compare this version that uses a simple spring view:
http://www.colorado.edu/physics/phet/dev/html/hookes-law/1.0.0-dev.17/hookes-law_en.html

... to this version with ParametricSpringNode:
http://www.colorado.edu/physics/phet/dev/html/hookes-law/1.0.0-dev.18/hookes-law_en.html

@jonathanolson Could you please review ParametricSpringNode for potential optimizations?

@pixelzoom
Copy link
Contributor Author

More info... HookesLawSpringNode is the subtype of ParametricSpringNode that's used by the sim.

One optimization would be to remove this feature from HookesLawSpringNode:

    // shrink the coil radius (y dimension) a bit when stretching the spring, to simulate radius change
    spring.displacementProperty.link( function( displacement ) {
      if ( displacement <= 0 ) {
        // compressed
        propertySet.aspectRatioProperty.set( options.aspectRatio );
      }
      else {
        // stretched
        propertySet.aspectRatioProperty.set( options.aspectRatio * ( 1 - 0.05 * displacement / spring.displacementRange.max ) );
      }
    } );

This is expensive, as it causes the gradients to be recomputed every time the displacement changes. Without this, gradients would be computed only once, at construction. Since this effect is subtle, I'm not sure that it's worth the cost.

@arouinfar and @ariel-phet, how would you feel about dropping this bit of behavior? The coil radius would not shrink subtly when the spring is stretched.

@pixelzoom
Copy link
Contributor Author

Btw... The performance improvement for dropping the above feature is noticeable, but not enough to make performance feel snappy on iPad.

@pixelzoom
Copy link
Contributor Author

More info for @jonathanolson ... ParametricSpringNode has a lot of properties. But there are only 3 that change in this sim, as can be seen in HookesLawSpringNode (search for ".set"). The properties that change are: xScale, lineWidth and aspectRatio.

@arouinfar
Copy link
Contributor

@pixelzoom, I would be fine with dropping this behavior, especially since it has a noticeable impact on performance. While it's a nice touch, I doubt that many users will notice such a subtle change in the coil radius. I don't think it's worth sacrificing performance for it.

@jonathanolson
Copy link
Contributor

CPU time is being taken up predominantly by computing bounds of the stroked part of the shape (which is redone every frame).

I'll probably add a flag to do bounds computation in a way that doesn't figure out the exact stroked bounds (which should be much faster), and possibly an option to not compute bounds at all (see phetsims/scenery#433).

@pixelzoom
Copy link
Contributor Author

@ariel-phet said via email:

If you think it feels sluggish, it should probably be improved. I have not yet seen the sim on ipad2. Feel free to enlist some of JO's time if you think it would help for performance improvements.

@jonathanolson
Copy link
Contributor

After using boundsMethod:'unstroked', notes about Chrome desktop performance (relative numbers important):

110ms lineToPoint
  47ms Events constructor
  11ms addSegmentDirectly (adding invalidation listener?)
72ms self time (in the multilink)
65ms setTailAndTip
54ms Shape bounds on main path
44ms Events constructor in Segment
27ms moveTos

@jonathanolson
Copy link
Contributor

On desktop, looks like the multilink is typically being called twice per frame while I drag (it's unnecessarily doing twice the work, since only the last update before the next animation frame is displayed):

hookes-law-parametric-1

For this type of thing, I'd recommend attempting to update it only on the view step in general (since it's so potentially expensive).

@jonathanolson
Copy link
Contributor

boundsMethod: 'none' instead of 'unstroked' could have a speed-up, since presumably we could calculate the approximate width based on just the inputs to the spring shape.

On Chrome, profiling information notes that this probably isn't the biggest bottleneck however.

Updating to iOS 8.4, will test on the iPad 3.

@jonathanolson
Copy link
Contributor

Bottom-up profiling reveals Events.trigger0 to be the biggest contributor to CPU by 2x (next closest is lineToPoint itself).

Second is kite.svgNumber. I'll potentially look into getting direct function references for each file, instead of requiring the kite.X lookup every time.

@jonathanolson
Copy link
Contributor

Profiling from the iPad (using the web inspector from Safari, weinre/Valence/etc. were insufficient for profiling information):

Most expensive codepath to inner loop (there seem to be 2 different paths to the inner loop):

< 16ms self time in loops/etc >
156.8ms lineToPoint
  71ms addSegmentAndBounds
    52ms addSegment
      32ms <tmp name/anonymous>
        20ms addSegmentDirectly
          10ms onStatic (add listener)
          2ms push
      8ms getNondegenerate....
      2ms invalidate
    7ms getLastSubpath
    2ms invalidate
  25ms Line
    9ms Segment
      5ms Events
    8ms invalidate/trigger0 on construction
  17ms getLastSubpath
  7ms getLastPoint
  7ms addPoint
95ms setShape
  94ms Shape.getBounds
6ms < some listener of something that triggers, that uses setRect/x/left/right/centerY >
4ms moveToPoint

Note that this is just the first screen, moving the single hook.

@jonathanolson
Copy link
Contributor

Also a note to others about how annoying profiling with Safari can be. Here's the call tree (flattened because there is a width limit, so you can't see the tree structure):
screen shot 2015-07-14 at 12 44 49 am

This is only for one "Record", specifically one touchmove. It looks like the animation frames were quick themselves. You don't seem to be able to analyze across time, just independently for each event.

The call tree gets so deep due to the way we use properties:
screen shot 2015-07-14 at 12 45 17 am

And if you miss clicking the arrow and click on the row instead, it freezes for about 7 seconds and shows you the entire minified code on the right. Fun times.

@jonathanolson
Copy link
Contributor

My first proposed change is to not compute the bounds of the coil (use boundsMethod: 'none'), and my guess is that would have about a 35% speedup in execution time over boundsMethod: 'unstroked', which itself seemed to have been at least 2x faster than boundsMethod: 'accurate' (the default and latest dev version). For layout, instead of using the bounds, compute how big the spring will be.

My second proposed change is more invasive, and will require a few more changes to Shape/Path. It would involve creating one Shape/Path, and on changes, mutating the Vector2s used in the Shape (and then invalidating the shape). Presumably we'd create a SpringNode with a certain number of points, and if the number of points needed to change (which I was told didn't need to happen during an animation) it would need to build a new Shape.

So if you have a lineToPoint( p ), you would modify the x/y values of p to the new place.

Otherwise, there are various inner-loop optimizations that can be made (factoring out 2_pi/pointsPerLoop, or xScale_radius/pointsPerLoop, etc.) in case the JIT isn't taking care of that.

Let me know what you think about the proposed changes.

@pixelzoom
Copy link
Contributor Author

7/14/15 call with @jonathanolson

(1) I'll use boundsMethod: 'none' and change layout of springs to use only x and y options.

(2) jonathan will add Shape.invalidatePoints. The number of points remains constant for the springs, so I'll store Vector2[], mutate them, and call Shape.invalidatePoints. Do this once per frame.

(3) Add a guard so that multilink is called once per frame (see #3 (comment)). Might wait to do this until we see what the performance is after (1) and (2).

@jonathanolson
Copy link
Contributor

Kite/Scenery parts completed, feel free to assign me for further iPad 3 testing.

@pixelzoom
Copy link
Contributor Author

More stuff I'll need to do:

• Rewrite some of HookesLawIconFactory, which relies on accurate bounds of the springNode. For example, on my first attempt at using boundsMethod:'none', createIntroScreenIcon was failing an assertion in Node with "x not finite".

• Figure out how to accurately attach the right spring to the left spring in SeriesSystemNode.

@ariel-phet
Copy link

One comment...if buttery smooth on iPad2 and a little sluggish on iPad 3, no need to optimize further (we can accept sluggishness on iPad 3)

@pixelzoom
Copy link
Contributor Author

Roger. Butter ok on iPad2, molasses OK on iPad3.

pixelzoom added a commit that referenced this issue Aug 12, 2015
…ly, in preparation for using boundsMethod 'none'
pixelzoom added a commit that referenced this issue Aug 12, 2015
…ONSTANT, where CONSTANT is determined empirically (in preparation for using boundsMethod:'none')
@pixelzoom
Copy link
Contributor Author

I've completed the first major optimization, that is using boundsMethod: 'none' for the scenery Paths that make up ParametricSpringNode.

Here's a dev version that uses boundsMethod: 'none':
http://www.colorado.edu/physics/phet/dev/html/hookes-law/1.0.0-dev.27/hookes-law_en.html

Compare to the previous dev version, which used boundsMethod: 'accurate':
http://www.colorado.edu/physics/phet/dev/html/hookes-law/1.0.0-dev.26/hookes-law_en.html

@arouinfar @ariel-phet How does this feel to you? Sufficiently responsive? Should I proceed with additional (more invasive) optimizations (2) and (3), as described in #3 (comment)?

Also interested in @jonathanolson's feedback on #3 (comment). And if you could profile again as you did above, that would be appreciated. I'm interested to see if we really got a >2x speed up by using boundsMethod: 'none'.

@pixelzoom
Copy link
Contributor Author

Skype conversation with @jonathanolson:

[8/12/15, 12:45:19 PM] Chris Malley: hookes-law optimization question... In #3 (comment), you said "I'd recommend attempting to update it only on the view step in general". How do I go about doing that?
[8/12/15, 12:45:42 PM] Chris Malley: add a flag?
[8/12/15, 12:46:10 PM] Jonathan Olson: You can have a "this needs to be redrawn" dirty flag, and check on view step
[8/12/15, 12:46:18 PM] Jonathan Olson: and rebuild/redraw whatever in the view step
[8/12/15, 12:46:21 PM] Chris Malley: i'm also not clear on why multilink callback is being called twice per frame.
[8/12/15, 12:46:43 PM] Jonathan Olson: I'm not sure either, try inserting debuggers to see?
[8/12/15, 12:48:16 PM] Chris Malley: by "rebuild/redraw whatever in the view step", do you mean in the ScreenView's optional step function?
[8/12/15, 12:48:24 PM] Jonathan Olson: yes
[8/12/15, 12:48:51 PM] Chris Malley: hmmm. sounds complicated.
[8/12/15, 12:49:10 PM] Chris Malley: think i'll investigate the root cause first.

So I think I should investigate why the callback to ParametricSpringNode.multilink is being called multiple times per animation frame.

@arouinfar
Copy link
Contributor

@pixelzoom I don't notice a difference in responsiveness on my Mac. I've been working remotely the last few days (out sick) so I don't have access to an iPad to test on. Perhaps @ariel-phet can take a look at this on iPad tomorrow.

@pixelzoom
Copy link
Contributor Author

@arouinfar Unlikely that you'd notice much of a difference on a desktop computer, where the sim was already quite snappy.

@jonathanolson
Copy link
Contributor

Will plan to test tomorrow with better access to devices.

@pixelzoom
Copy link
Contributor Author

Since this is going to be migrate to scenery-phet (#16), I recommend implementing optimization (2) (see #3 (comment)) now.

@jonathanolson
Copy link
Contributor

Tested, CPU is mostly involved with the Kite overhead, I believe optimization (2) should be the most helpful.

pixelzoom added a commit that referenced this issue Aug 18, 2015
@pixelzoom
Copy link
Contributor Author

Playing around with optimization (2), mutating vectors...

Here's the signature of the multilink callback for the various Properties:

function( loops, radius, aspectRatio, pointsPerLoop, phase, deltaPhase, xScale ) {

It's only possible to mutate vectors if loops, pointsPerLoop, phase and deltaPhase remain constant. If loops or pointsPerLoop change, then the number of vectors changes. If phase or deltaPhase changes, then the number of total vectors is the same, but they are distributed differently between the "front" and "back" Shapes for the coil.

So I'll probably split this up into 2 multilink callbacks, where this one mutates vectors and calls Shape.invalidatePoints:

function( radius, xScale ) {

...and this one creates new Shapes and vectors:

function( loops, pointsPerLoop, aspectRatio, phase, deltaPhase ) {

... with some common code factored out that they both use.

@pixelzoom pixelzoom changed the title review ParametricSpringNode for potential optimizations ParametricSpringNode optimizations Sep 1, 2015
@pixelzoom
Copy link
Contributor Author

Here's a dev version that mutates vectors (optimization 2 above):
http://www.colorado.edu/physics/phet/dev/html/hookes-law/1.0.0-dev.28/hookes-law_en.html

Compare to previous dev version:
http://www.colorado.edu/physics/phet/dev/html/hookes-law/1.0.0-dev.27/hookes-law_en.html

The latest version is noticeably snappier, more responsive feeling. I'm satisfied with this.

@ariel-phet Do you want to test drive, since @arouinfar is out? If this looks good, we can proceed to dev testing for hookes-law.

@pixelzoom pixelzoom assigned ariel-phet and unassigned pixelzoom Sep 1, 2015
@ariel-phet
Copy link

@pixelzoom - I should be near an ipad later today, I will look at it then, but I am guessing that if you are satisfied it will be great.

pixelzoom added a commit that referenced this issue Sep 1, 2015
@ariel-phet
Copy link

Tested on iPad2, looks awesome. Closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants