Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: hoist static objects/arrays in templates #2589

Merged
merged 15 commits into from
Jan 4, 2022
Merged

Conversation

nolanlawson
Copy link
Collaborator

Details

Partially fixes #2559.

This adds an optimization to the template compiler where it identifies fully-static arrays and objects and hoists them into const variables outside of the tmpl function.

E.g.:

+const stc0 = { static: true };
+const stc1 = { really: { static: ['yep', 'totally', 'static' ] } };
 function tmpl() {
   return {
     foo: dynamic(),
-    bar: { static: true },
-    baz: { really: { static: ['yep', 'totally', 'static' ] } }
+    bar: stc0,
+    baz: stc1
   };
 }

This improves our "geometric mean" score on the js-framework-benchmark from 1.61 to 1.55. The main improvement is in "select row" which improves from 53.1 ±1.7 to 48.0 ±2.1.

Screen Shot 2021-12-01 at 1 46 43 PM

This also improves many of our benchmarks, although two are modestly regressed.

Screen Shot 2021-12-01 at 3 36 09 PM

I should probably investigate to see why these two are regressed, but the regression is small in absolute terms (0.85ms on one benchmark and 4.56ms on the other), so I'm not too concerned.

This PR is deliberately focused and conservative to avoid a risky change. In the future, we can do additional optimizations, e.g. to extract the VNodes themselves when they are inside of an iteration loop.

Does this pull request introduce a breaking change?

  • ✅ No, it does not introduce a breaking change.

Does this pull request introduce an observable change?

  • ⚠️ Yes, it does include an observable change.

The format of compiled template code has changed. I.e., this is a template compiler change.

GUS work item

W-10113376

@@ -10,6 +10,7 @@ import { TEMPLATE_FUNCTION_NAME, TEMPLATE_MODULES_PARAMETER } from '../../shared

import CodeGen from '../codegen';
import { identifierFromComponentName, generateTemplateMetadata } from '../helpers';
import { optimizeStaticExpressions } from '../optimize';
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're reading this in the GitHub UI, you can skip to here because everything above is just snapshot changes.

@nolanlawson nolanlawson changed the title perf: hoist static object/arrays in templates perf: hoist static objects/arrays in templates Dec 2, 2021
Copy link
Member

@pmdartus pmdartus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am extremely surprised that some benchmarks regressed with this change. Is the regression consistent across multiple runs?

@nolanlawson
Copy link
Collaborator Author

I am extremely surprised that some benchmarks regressed with this change. Is the regression consistent across multiple runs?

Yes, I actually did two runs to verify the results. I also bumped the sample size to min 300 and set the horizon at 1% to really be sure. My hunch is that the update-slotted-content test is so small that it's exposing a one-time setup cost that is not amortized across many iterations (e.g. in the 1k or 10k tests).

@nolanlawson
Copy link
Collaborator Author

/nucleus test

1 similar comment
@nolanlawson
Copy link
Collaborator Author

/nucleus test

@nolanlawson
Copy link
Collaborator Author

This actually broke a test in LWR (outlet.spec.ts) because an error callback is being fired twice. The root issue is that these two vnodes being compared have the exact same object for vnode.data, which is not the case in master. I assume somewhere in our vdom code we are mutating the data object, which is safe when we're dealing with copies but not when the objects may be the same.

Screen Shot 2021-12-02 at 11 10 22 AM
Screen Shot 2021-12-02 at 11 07 16 AM

This PR may need some work to avoid downstream breakages.

@nolanlawson
Copy link
Collaborator Author

The error only happens with dynamic components. I should write a test for this, but here is the fix to see if it unbreaks Nucleus ^

@nolanlawson
Copy link
Collaborator Author

OK, so when I bump the slot-update-slotted-content benchmark from 500 rows to 10k, there is no difference between master and this PR. So I think it is indeed just measuring a one-time setup cost.

@nolanlawson
Copy link
Collaborator Author

Re-ran the benchmarks with the new array fix, and the results are almost exactly the same as before. js-framework-benchmark has also not changed since the compiled template is the same.

@nolanlawson
Copy link
Collaborator Author

I did not add any new tests, because it turns out I can just use readonly in TypeScript to avoid calling setters on VNodeData/VElementData.

return c(sel, Ctor, data, children);
// Shallow clone is necessary here becuase VElementData may be shared across VNodes due to
// hoisting optimization.
const newData = { ...data, key: `dc:${idx}:${data.key}` };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LWC compiler has been emitting unique keys to each VNode to speed up DOM diffing. The main downside with this is that it produces a lot of garbage in the generated template code.

I think there is certainly a better way to handle this without adding more junk to the generated code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was wondering what the key is for. If it's only to speed up DOM diffing, then what about removing it and relying on object equality? (Since objects are effectively immutable.) Or is the optimization still necessary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keys are used to checking if vnodes are equal. it is primarily used whenever nodes are dynamic (if block, each block). In those cases, we can't use object equality as those VNodes can't be hoisted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keys are to denotate placeables for elements in the template, that's the way I think about them, and we use that information in multiple places to determine whether or not a vnode is the result of a particular placeable or another.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another alternative here is to not use ... due to the perf implications, and instead just use obj inheritance via Object.create(data, { key: { value: dc:${idx}:${data.key} } }). Considering that we never use hasOwnProperty with such object, we should be fine with this as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, this is not a critical path since dynamic components are almost never used, just saying! :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I would personally prefer to keep ... because it's idiomatic. Also, I'm guessing the perf implications are only from Locker, and we disallow lwc:dynamic on the platform, so I'm assuming that this line of code will never run with Locker enabled?

Comment on lines +207 to +211
// Do a shallow clone since VNodeData may be shared across VNodes due to hoist optimization
vnode.data = {
...vnode.data,
props: cloneAndOmitKey(props, 'innerHTML'),
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When reaching this code path, there is no way the vnode.data is hoisted as the innerHTML prop requires an invocation to api_sanitize_html_content. It is safe to mutate it in place by case it to any.

By the way, this block of code looks a little misplaced.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would kind of prefer to leave it as a clone. It seems unsafe for a developer to rely on quirks of the optimization algorithm. In the future, we may very well want to hoist:

api_sanitize_html_content("static string")

In fact, we are already caching the output on the $ctx anyway:

props: {
innerHTML:
$ctx._sanitizedHtml$0 ||
($ctx._sanitizedHtml$0 = api_sanitize_html_content(
"Hello <b>world</b>!"
)),
},

So if we apply the hosting optimization, arguably we may not need the $ctx optimization anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uff, uff! elm.innerHTML === props.innerHTML broke my heart! I will have never approve that change. The whole point of the diffing algo is to avoid touching the DOM at all cost to compare things... reading the innerHTML from the element to compare it with the vnode is just wrong. We should fix that.

Additionally, the innerHTML should NOT be a regular property, it should have been a snabbdom module who's job is to compare two vnodes to determine whether or not the innerHTML needs to be set, rather than tricking the props module to not see the innerHTML for some cases when we know that it is the same. This is obviously not related to this PR, but needs to be fixed at some point. I might be missing something obvious here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caridy Opened an issue to track this: #2603

@@ -139,7 +152,6 @@ export function updateDynamicChildren(parentElm: Node, oldCh: VNodes, newCh: VNo
newStartVnode.hook.insert(newStartVnode, parentElm, oldStartVnode.elm!);
} else {
patchVnode(elmToMove, newStartVnode);
oldCh[idxInOld] = undefined as any;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to VNodes being read-only now, I had to remove this line. AFAICT, it is not needed. I don't understand why we would need to set this to undefined in the oldCh array. The old children should just be garbage-collected, as they either belong to the old VNode:

fn(vnode.elm!, oldVnode.children, children);

Or are replaced on the current VM:

const children = renderComponent(vm);
patchShadowRoot(vm, children);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't yet fully understand the implications of removing this line here. The fact that it breaks no test is good news to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You CANNOT remove this line! :)

Basically, this line actuates when there is a new element who's key is found on the old vnodes, there are 2 possibilities:

  1. the old node is in a position prior to the new position.
  2. the old node is in a position after the new position.

The posibility of being in the same position is not possible in this branch of the code.

On top of that, there is another problem: "whether or not the tagName (sel) matches.

So, the problem is that if the sel matches, and the position is prior the new position (remember that this algo goes backward), since we are in a loop, there is a possibility that later in the game you encounter another element that must be inserted in the position marked by the old position, and the diffing algo tries to remove the element associated to the old vnodes.

My hunch is that this will result on the moved node to disappear from the DOM, and it is very hard to write a test that validate this assumption, but we can certainly try.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caridy Thanks for the explanation. I'm looking into it, and it turns out that this else block is never hit in either our Jest tests or Karma tests:

idxInOld = oldKeyToIdx[newStartVnode.key!];
if (isUndef(idxInOld)) {
// New element
newStartVnode.hook.create(newStartVnode);
newStartVnode.hook.insert(newStartVnode, parentElm, oldStartVnode.elm!);
newStartVnode = newCh[++newStartIdx];
} else {
elmToMove = oldCh[idxInOld];
if (isVNode(elmToMove)) {
if (elmToMove.sel !== newStartVnode.sel) {
// New element
newStartVnode.hook.create(newStartVnode);
newStartVnode.hook.insert(newStartVnode, parentElm, oldStartVnode.elm!);
} else {
patchVnode(elmToMove, newStartVnode);
oldCh[idxInOld] = undefined as any;
newStartVnode.hook.move(elmToMove, parentElm, oldStartVnode.elm!);
}
}
newStartVnode = newCh[++newStartIdx];

In other words, isUndef(idxInOld) is always true. I'll try to write a test where it is false.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I managed to find a Karma test to hit this line of code (I'll open a separate PR to improve our test coverage). Based on the usage patterns, I think we can do this:

  1. Instead of using oldCh directly, use a clone ([...oldCh]) inside of updateDynamicChildren. AFAICT, oldCh is only used inside of that function – afterwards, it is discard and GC'ed.
  2. To avoid the perf hit of cloning arrays, we can do the clone only on line 142 (i.e. when calling oldCh[idxInOld] = undefined as any; ).

@caridy How does this sound?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the clone solves the problem, unless you always rely on the cloned array for the different paths of the algo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also closing via [...oldCh] will trigger the iterable protocol that is not ideal.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caridy I'm saying we should do:

- oldCh[idxInOld] = undefined as any;
+ const oldChClone = [...oldCh];
+ oldChClone[idxInOld] = undefined as any;
+ oldCh = oldChClone;

It solves the problem because the oldCh array is Readonly – although only because of the empty array [], which may be shared between VNodes due to the optimization in this PR. (A non-empty array cannot be hoisted because it will contain VNodes, not static objects. This is why we only need a shallow clone, not a deep clone.)

also closing via [...oldCh] will trigger the iterable protocol that is not ideal.

Can you elaborate? Are you concerned about performance? It seems to me that this code path is very infrequent, so I'm not really concerned about the perf implications there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm fine with this approach. In locker code, we try to stay away from [...array] due to the perf implications of the iterables. cc @jdalton, but as you said, this is a very rare code path anyways.

@nolanlawson nolanlawson force-pushed the nolan/hoist-static branch 2 times, most recently from a933dea to 4d5cd91 Compare December 8, 2021 19:14
@nolanlawson
Copy link
Collaborator Author

Re-ran against the updated benchmarks (#2596) and everything is green now!

Screen Shot 2021-12-08 at 1 01 57 PM

Copy link
Member

@pmdartus pmdartus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am conditionally approving this PR until we figure out what the oldCh[idxInOld] = undefined as any; used to do and explain why this isn't needed anymore.

@nolanlawson
Copy link
Collaborator Author

@pmdartus Sounds good, let me do that research. I'm encouraged by the fact that none of our downstreams seem to be failing, but I'm also nervous that removing that line of code seems to have no impact.

@nolanlawson nolanlawson merged commit 0ea4856 into master Jan 4, 2022
@nolanlawson nolanlawson deleted the nolan/hoist-static branch January 4, 2022 22:12
nolanlawson added a commit that referenced this pull request Jan 5, 2022
nolanlawson added a commit that referenced this pull request Jan 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Perf] Memoize static objects in template iterator
3 participants