MutationObserver with querySelector for elements #77

zbraniecki · 2015-09-18T22:02:11Z

MutationObserver is a great tool for writing small libraries that operate on specific types of elements.
Examples:

Libraries that extend the behavior of particular element
** Date/Time pickers on <input type="datetime">
** Additional API on some types of <link> elements
Libraries that modify custom elements (from outside extension of web-components)
Libraries that modify elements with certain attributes:
** Libraries that aid with specific aria-* values
** Libraries that handle localization of data-l10n-* elements

Currently, MutationObserver API is designed to aid code that wants to operate on all subtree operations of a given root.
But in all those cases the basic premise is that they want to handle specific set of elements. Their additions/removals and possibly modifications.

The result is that in all those cases that I encountered in my work there's a massive amount of noise generated because the library is fishing for a particular element type, while it has to retrieve all node additions/removals and filter them out manually.

In many cases, the ratio of nodes inserted (dynamically or by the parser) to elements the library wants to operate on is 100/1.

My understanding is that the platform is in much better position to optimize filtering the elements for MutationObserver than anything that the MutationObserver callback can do.
Additionally, even if the code logic would be the same, the sole fact that the mutations argument has to be constructed and then the callback for them has to be called, for many mutations that are not relevant to the MutationObserver is a waste of CPU time and memory/GC allocation.

My initial proposal would look something like this:

const observerConfig = {
  attributes: true,
  attributeFilter: ['data-l10n-id', 'data-l10n-args'],
  childNodes: true,
  querySelector: '*[data-l10n-id], *[data-l10n-args]'
}; 

var mo = new MutationObserver(onMutations);
mo.observe(this.doc, observerConfig);

function onMutations(mutations) {
  const targets = new Set();
  const removedTargets = new Set();

  for (let mutation of mutations) {
    switch (mutation.type) {
      case 'attributes':
        targets.add(mutation.target);
        break;
      case 'childList':
        mutation.addedNodes.forEach(node => targets.add(node));
        mutation.removedNodes.forEach(node => removedTargets.add(node));
        break;
    }
  }

  translateElements(targets);
  cleanupAfterElements(removedTargets);
}

The text was updated successfully, but these errors were encountered:

smaug---- · 2015-10-09T15:43:47Z

So could you perhaps clarify the proposal a bit. How should adding an attribute work? what about removal? At which point should the selector run in each case?

smaug---- · 2015-10-09T22:08:10Z

Oh, silly me, you want selector based filtering for child nodes only, not attributes (the example just happens to do something with attributes too). But similar question still: at which point should the selector run, especially in case of child node removal?

zbraniecki · 2015-10-10T00:54:26Z

I don't know, and I don't think it matters that much. For all cases I can imagine, the only reason I want to observe removals is to clean up my own objects, so I just need to know that an element with certain data-l10n-id or certain data-l10n-format has been removed.

@smaug---- Does it answer your question?

zbraniecki · 2015-10-10T00:55:55Z

And the reason I have both attributes and childNodes is that usually, when the task is to "cover nodes with certain attribute", then I will need to know when they are added/removed and when an already existing node has this attribute added/removed/modified. Which ends up looking like the example, and I think it's fine and allows for fairly good level of customization.

smaug---- · 2015-10-10T10:36:56Z

It does matter quite a bit when the selector runs. When adding a child node selector can run
after we've added the node to child or it can actually run as late as just before delivering the MutationRecords, That might even be the case you want, since what if the attributes in the element have been modified after the element was added to the parent node but before MutationRecord delivery - the same selector might not select the element anymore. But then, running the selector so late would be rather big difference to how MutationObserver otherwise works... so I guess selector would need to run right after the element has been added to its parent, and the user of the MutationObserver needs to validate that the selector still applies when MutationRecord is delivered.

Removing a node would require, I guess, running selector before any DOM mutation has happened.

How to make this all perform well? selector matching isn't exactly cheap.
The performance of selectors was one criticism the old Shadow DOM spec got and it is now
using just simple slots.
@heycam @rniwa may have ideas about how to make this work fast.

But, do we want to use selectors. I think I'd be more comfortable with "raw DOM" data.
Filter out based on attributes in the element. childNodeAttributeFilter or some such.
That way the same list of filters could be used for two things, for attribute changes and child node changes.

var filter = ['data-l10n-id', 'data-l10n-args'];
var mo = new MutationObserver(yourCallback);
mo.observe(document, {
attributes: true,
attributeFilter: filter,
childNodes: true,
childNodeAttributeFilter: filter
});

That would filter out child nodes which don't have the relevant attribute, even in case innerHTML is used.

heycam · 2015-10-12T07:33:11Z

It might be simpler to have a separate selector observer mechanism, i.e. something that just watches for changes to the set of elements in a subtree that match the given selectors, and not be a filter on the existing fields on a MutationRecord.

Then we can just require the mutation observer records be generated whenever the element is restyled, and use the restyle process to do the "element now matching" / "element no longer matching" checks. It would mean that a sequence like:

observe e for ".x" change;
e.className.toggle("x");
e.className.toggle("x");

wouldn't generate a record, while:

observe e for ".x" change;
e.className.toggle("x");
e.offsetTop;
e.className.toggle("x");

would generate two (and possibly be coalesced? I think the way MutationObserver works allows for this kind of coalescing).

I don't think selector observers need to be more expensive than the selector matching we do for restyles, especially if we're modelling whether an element matches one of these selectors or not is done by pretending we have a style rule in the document with the relevant selector (and some hack to make it rooted at the root of the observation). I think the performance impact of adding a real style rule to a document being about the same as one for selector observers is reasonable.

The observer probably needs to maintain a set of elements that currently match, so that when an element is removed from the tree we remove it (and its descendants) from the set.

We'd also need to do something different for elements in display:none subtrees, which Gecko doesn't give styles unless script does a getComputedStyle, and we'd probably need to adjust the restyle process to get that to work...

rniwa · 2015-10-12T08:18:12Z

I don't think we want to use CSS selectors for filtering purposes. Evaluating selectors every time a node is added, removed, or modified is going to be unacceptably expensive.

Then we can just require the mutation observer records be generated whenever the element is restyled, and use the restyle process to do the "element now matching" / "element no longer matching" checks.

That would expose the timing of styling recalc. Since that's an area where different browsers optimize things differently, designing an interoperable API that doesn't degrade performance on all browsers is going to very hard.

In addition, all the use cases I've seen for filtering mutation records so far involved filtering nodes based on element names or attribute names (or combinations of the both in some rare cases). Given that I agree with @smaug---- that DOM-based approach is better. Also, it's not necessary for the filtering to be perfect. If we can weed out, say, 90% of all other nodes, then scripts can quite efficiently pick the one they need after that.

smaug---- · 2015-10-12T13:09:14Z

Ah, indeed we may want to filter also based on nodeNames.

so something like
childNodeAttributeFilter: "class"
childNodeFilter: ["div", "span"]
Though, there is the issue with namespaces, since attribute filters are for null-namespace attributes only and element name based filters would need to be something else.

But perhaps for now we could just add attribute name based filter for child nodes.
childNodeAttributeFilter is a bit long name, any better suggestions?

zbraniecki · 2015-10-12T16:07:22Z

If we can weed out, say, 90% of all other nodes, then scripts can quite efficiently pick the one they need after that.

Totally agree. :) All my use cases are about attribute based system. I can imagine HTML shims to use node filter as well, but we can probably do without full query selector for now.

domenic · 2015-10-12T16:08:23Z

My understanding is that the platform is in much better position to optimize filtering the elements for MutationObserver than anything that the MutationObserver callback can do.

I question this premise. Do we have any data, e.g. from a custom build of a popular rendering engine, to show this to be the case?

smaug---- · 2015-10-12T16:22:35Z

Well, we would create a lot less garbage/cycle collectable objects if browser engine would do filtering.

domenic · 2015-10-12T16:25:32Z

I have no doubt about the theoretical benefits. (Indeed, for pretty much any API that allows authors to do things in JavaScript, you can say, "if we moved some of this to C++/Rust, it would generate less garbage.") I am questioning whether it actually shows up in real-world scenarios.

zbraniecki · 2015-10-12T16:34:56Z

I instrumented Firefox OS localization framework that localizes nodes registered with data-l10n-id to record numbers.

Launching a single app (Settings) gave me:

117 mutations
254 added nodes
5 removed nodes
67 nodes that contain data-l10n-id

Navigating around the app (between 5 panels) gave me:

427 mutations
1264 added nodes
36 removed nodes
231 nodes that contain data-l10n-id

The additional problem we face is that the addedNode may be in fact a DOMFragment and we have to find nodes with data-l10n-id inside it, so we use... querySelector.

I just started working on a counter part to that API that brings Intl capabilities for elements. It will use data-intl-format attribute and will require pretty much the same MutationObserver heuristics.
The only problem is that the signal/noise ratio will be even lower.

And we have a shim for a color picker that only wants to register input[type=color], but has to listen to all mutations.

Does it bring an examples you've been looking for @domenic ?
If not, can you provide more information on what you are looking for? I'll try to get it.

smaug---- · 2015-10-12T16:38:52Z

I don't know what you mean with DOMFragment, but you would still need to still look for
data-l10n-id nodes under the addedNodes even if childNodeAttributeFilter was added.
MutationObserver doesn't know about the subtrees of the added nodes.

domenic · 2015-10-12T16:40:33Z

@zbraniecki thanks for investigating, but that wasn't quite what I was wondering. I was wondering about speed (in seconds) for filtering in C++, as this issue proposes, vs. filtering in JS using suitably-efficient strategies (so e.g. looping and checking el.localName instead of using querySelector).

It's not worth it to add an API to the platform if the only benefit is less "noise" to the mutation observer callbacks; you can write a trivial wrapper library to handle that. The benefit would be if there are real performance advantages, which is why I was hoping to see benchmarks.

zbraniecki · 2015-10-12T16:41:24Z

I launched a light-workload marionette test scenario on the app:

mutations: 667
addednodes: 2464
removednodes: 450
affectedl10nnodes: 407

that's signal/noise ratio of 0.14. That's a lot of resources going nowhere.

zbraniecki · 2015-10-12T16:42:17Z

It's not worth it to add an API to the platform if the only benefit is less "noise" to the mutation observer callbacks; you can write a trivial wrapper library to handle that. The benefit would be if there are real performance advantages, which is why I was hoping to see benchmarks.

Gotcha. Ok. I'd need help from someone with platform experience to write a shim for me and I'll be happy to test that!

zbraniecki · 2015-10-12T16:44:29Z

But also, notice that it's not only about speed in seconds. It's also about the amount of allocations and matching GCs.

My naive understanding is that there are four benefits:

clearner code
faster code
less CPU cycles wasted
less allocations/GCs

Wrapper could help with 1), you are asking about 2). Do we agree on 3 and 4?

domenic · 2015-10-12T16:46:21Z

Well, 3) is just the same as 2). But yes, 2/3 and 4 are both worth investigating.

I guess my issue is that 4 is a general problem with every web API ever; if you move portions of it into C++, there are less allocations and GCs. In the end why don't we just write all our code in webasm...

zbraniecki · 2015-10-12T16:52:16Z

Yeah, I see your point. I believe that the problem with MutationObserver is that most use cases I saw in our platform are the opposite of what MutationObserver API is designed for.

It seems to be designed to help in cases where the mutation observer callback is interested in vast majority of nodes under given root element.

I see MutationObserver as a tool for scenarios where the number of nodes of interest compared to amount of "traffic" is very, very low.
L10n here is an exceptionally high ratio scenario, because one can argue that a lot of elements are localizable in a fully localizable UI.

But MutationObserver is also the only way to properly shim a color picker, file picker, any other widget, mentioned intl API etc. and those will literally affect 4-15 elements in a document that may contain hundreds of nodes.

In order to use MutationObserver efficiently as a shim, one would also want to enable it before parser is done with DOM and get the DOM from the Parser filtered. But that increases the number of nodes that can get mutations unless we can improve the filtering.

So, while I agree with the concern, I believe that MutationObserver is special here - an API with a great potential to enhance the Web stack, that currently enables doing so at an unreasonable cost.

domenic · 2015-10-12T16:55:20Z

Yeah, it makes sense to add one-off hacks for general problems if the payoff is worthwhile. I just think it's worth measuring first. So yeah, we want speed and memory usage benchmarks to show this would be a worthwhile platform feature for every browser to implement. That seems like the bare minimum to get cross-vendor interest.

rniwa · 2015-10-12T18:29:22Z

I don't think such an investigation is necessary to justify the API. A high noise-to-signal ratio itself is a problem worth solving, and we've always wanted to add a filtering mechanism for MutationObserver from the beginning. We never got around to it because each of us were distracted by other things.

In addition, without doing any investigation, I can tell you that creating a JS array with thousands of entries in the browser engine is expensive and will cause a GC churn. The fact MutationObserver records are objects is already bad, and we spent a lot of time minimizing the amount of information being collected for performance. This is why collecting old values for CharacterData and attribute value is optional. There is literally nothing controversial / mysterious here about the cost.

smaug---- · 2015-10-12T22:52:25Z

So I was looking at http://www.w3.org/2008/webapps/wiki/MutationReplacement#NodeWatch_.28A_Microsoft_Proposal.29 again, and discussing with zbraniecki and I think one possible option here is to have a way to observe mutations, without actually creating the MutationRecords.
Then one can know that there is some mutation in DOM tree, and use querySelectorAll and effectively implement NodeWatch in JS. One could for example schedule querySelectorAll to run using rAF if there is a DOM mutation. In this l10n case, all the elements which have been handled could be marked with some attribute, say l10nHandled="true" and the selector would skip such elements.

zbraniecki · 2015-10-13T00:07:36Z

This is an early POC of how it might look like: https://bug1214026.bmoattachments.org/attachment.cgi?id=8672860 - I'll try to plug it into our OS and measure impact on performance. If it's good, then we can look into noRecordDelivery as a way to cut memory usage.

ukari · 2016-07-23T02:21:28Z

zbraniecki, I thing I meet the same problem like yours.
In some case, you could use mutation.target to instead of 'querySelector' to decrease the calling numbers of mutation.
Here is my code

new MutationObserver(
    function (mutations)
    {
      mutations.map(
        function (mutation)
        {
          if (mutation.target != '[object HTMLBodyElement]' &&
              mutation.target != '[object HTMLScriptElement]' &&
              (mutation.removedNodes.length != 0 ||
              mutation.addedNodes.length != 0)) {
            Languages.update(); //Here is what you want to do
          }
        }
      );
    }
  ).observe(
    document.querySelector('body'),
    {attributes: false, childList: true, characterData: false, subtree:true}
  );

rniwa · 2016-07-23T02:33:48Z

I still think we should just add a filter based on element name & attribute name but adding an option to not receive individual records also seems like a good idea as well. I think we should do all those things.

LeaVerou · 2018-03-01T00:32:15Z

A huge use case for this would be attaching events to elements before DOMContentLoaded.
Currently, most scripts add events to elements on DOMContentLoaded which on slow networks and/or large pages can take quite a while. In the meantime, users see a UI that doesn’t work, and don’t know when it will start working (since DOMContentLoaded is not something a user can train to recognize).

MutationObserver has the potential to fix this very pervasive problem and give developers the power to attach events to elements as they are rendered. However, currently that would require looping over every single element and using .matches() on it, which would be very expensive in exactly the case where it matters most (pages with lots of HTML).

If the platform could do most of this filtering efficiently, then using MO for this would be a feasible solution. It doesn’t have to be the entire selector syntax, but at least tags, classes, ids, attributes (partial too), commas.

I wonder if we could mine jQuery code (since that has a very specific pattern of $(selector).eventName(callback) while in vanilla JS it’s done in many different ways) to see what types of selectors people use. It might be an even more restricted set.

annevk · 2018-03-13T07:14:24Z

I'm going to close this on performance grounds as well as it having objections from implementers. #398 is still something we can (and I think should) pursue. Extensions that are similar in scope we could also consider in a new issue, but full selector support is out.

The research @LeaVerou suggests seems like a great first step for any such issue. Lifting some of the burden from libraries and frameworks is definitely something implementers are interested in.

annevk added the needs implementer interest Moving the issue forward requires implementers to express interest label Aug 23, 2016

annevk mentioned this issue Jan 15, 2017

Consider adding localNameFilter to MutationObserver #398

Open

annevk closed this as completed Mar 13, 2018

LeaVerou mentioned this issue Jan 31, 2019

Dynamic Awesomplete? LeaVerou/awesomplete#17174

Open

bkardell mentioned this issue Apr 30, 2024

Proposal: SelectorObserver #1285

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MutationObserver with querySelector for elements #77

MutationObserver with querySelector for elements #77

zbraniecki commented Sep 18, 2015

smaug---- commented Oct 9, 2015

smaug---- commented Oct 9, 2015

zbraniecki commented Oct 10, 2015

zbraniecki commented Oct 10, 2015

smaug---- commented Oct 10, 2015

heycam commented Oct 12, 2015

rniwa commented Oct 12, 2015

smaug---- commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

domenic commented Oct 12, 2015

smaug---- commented Oct 12, 2015

domenic commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

smaug---- commented Oct 12, 2015

domenic commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

domenic commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

domenic commented Oct 12, 2015

rniwa commented Oct 12, 2015

smaug---- commented Oct 12, 2015

zbraniecki commented Oct 13, 2015

ukari commented Jul 23, 2016

rniwa commented Jul 23, 2016

LeaVerou commented Mar 1, 2018 •

edited

Loading

annevk commented Mar 13, 2018

MutationObserver with querySelector for elements #77

MutationObserver with querySelector for elements #77

Comments

zbraniecki commented Sep 18, 2015

smaug---- commented Oct 9, 2015

smaug---- commented Oct 9, 2015

zbraniecki commented Oct 10, 2015

zbraniecki commented Oct 10, 2015

smaug---- commented Oct 10, 2015

heycam commented Oct 12, 2015

rniwa commented Oct 12, 2015

smaug---- commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

domenic commented Oct 12, 2015

smaug---- commented Oct 12, 2015

domenic commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

smaug---- commented Oct 12, 2015

domenic commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

domenic commented Oct 12, 2015

zbraniecki commented Oct 12, 2015

domenic commented Oct 12, 2015

rniwa commented Oct 12, 2015

smaug---- commented Oct 12, 2015

zbraniecki commented Oct 13, 2015

ukari commented Jul 23, 2016

rniwa commented Jul 23, 2016

LeaVerou commented Mar 1, 2018 • edited Loading

annevk commented Mar 13, 2018

LeaVerou commented Mar 1, 2018 •

edited

Loading