Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(utils): greatly improve the speed of querySelectorAll #3423

Merged
merged 14 commits into from
Jun 8, 2022
3 changes: 3 additions & 0 deletions lib/core/base/virtual-node/virtual-node.js
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ import AbstractVirtualNode from './abstract-virtual-node';
import { isXHTML, validInputTypes } from '../../utils';
import { isFocusable, getTabbableElements } from '../../../commons/dom';
import cache from '../cache';
import { cacheNodeSelectors } from '../../utils/selector-cache';

let isXHTMLGlobal;
let nodeIndex = 0;
Expand Down Expand Up @@ -50,6 +51,8 @@ class VirtualNode extends AbstractVirtualNode {
if (cache.get('nodeMap')) {
cache.get('nodeMap').set(node, this);
}

cacheNodeSelectors(this);
}

// abstract Node properties so we can run axe in DOM-less environments.
Expand Down
8 changes: 8 additions & 0 deletions lib/core/core.js
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,10 @@ import v2Reporter from './reporters/v2';

import * as commons from '../commons';
import * as utils from './utils';
import {
cacheNodeSelectors,
getNodesMatchingExpression
} from './utils/selector-cache';

axe.constants = constants;
axe.log = log;
Expand All @@ -70,6 +74,10 @@ axe._thisWillBeDeletedDoNotUse.base = {
axe._thisWillBeDeletedDoNotUse.public = {
reporters
};
axe._thisWillBeDeletedDoNotUse.utils =
axe._thisWillBeDeletedDoNotUse.utils || {};
axe._thisWillBeDeletedDoNotUse.utils.cacheNodeSelectors = cacheNodeSelectors;
axe._thisWillBeDeletedDoNotUse.utils.getNodesMatchingExpression = getNodesMatchingExpression;

axe.imports = imports;

Expand Down
5 changes: 5 additions & 0 deletions lib/core/utils/matches.js
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,7 @@ function convertAttributes(atts) {
return {
key: attributeKey,
value: attributeValue,
type: typeof att.value === 'undefined' ? 'attrExist' : 'attrValue',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added to tell the difference between the selectors [href] and [href=""] (both returned the same structure of value: '' even though one is only checking existence of the attribute and the other is checking for an empty string).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a test I think. Should have caught that first time around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the convertSelector API does not get exported to a public axe function and is only used internally. We could add it to the _thisWillBeDeletedDoNotUse test api, but I think adding a suite of tests for it should be a separate pr.

test: test
};
});
Expand Down Expand Up @@ -224,6 +225,10 @@ export function convertSelector(selector) {
* @returns {Boolean}
*/
function optimizedMatchesExpression(vNode, expressions, index, matchAnyParent) {
if (!vNode) {
return false;
}

const isArray = Array.isArray(expressions);
const expression = isArray ? expressions[index] : expressions;
let matches = matchExpression(vNode, expression);
Expand Down
12 changes: 12 additions & 0 deletions lib/core/utils/query-selector-all-filter.js
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
import { matchesExpression, convertSelector } from './matches';
import { getNodesMatchingExpression } from './selector-cache';

function createLocalVariables(
vNodes,
Expand Down Expand Up @@ -124,6 +125,17 @@ function matchExpressions(domTree, expressions, filter) {
function querySelectorAllFilter(domTree, selector, filter) {
domTree = Array.isArray(domTree) ? domTree : [domTree];
const expressions = convertSelector(selector);

// see if the passed in node is the root node of the tree and can
// find nodes using the cache rather than looping through the
// the entire tree
const nodes = getNodesMatchingExpression(domTree, expressions, filter);
if (nodes) {
return nodes;
}

// if the selector cache is not set up or if not passed the
// top level node we default back to parsing the whole tree
return matchExpressions(domTree, expressions, filter);
}

Expand Down
196 changes: 196 additions & 0 deletions lib/core/utils/selector-cache.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,196 @@
import { matchesExpression } from './matches';
import tokenList from './token-list';

// since attribute names can't contain whitespace, this will be
// a reserved list for ids so we can perform virtual id lookups
const idsKey = ' [idsMap]';
let selectorMap = {};
straker marked this conversation as resolved.
Show resolved Hide resolved

/**
* Non-tag selectors use `*` for the tag name so a global selector won't have any other properties of the expression.
* @param {Object} exp Selector Expression
* @returns {Boolean}
*/
function isGlobalSelector(exp) {
return exp.tag === '*' && !exp.attributes && !exp.id && !exp.classes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about psuedo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't cache the content of a pseudo selector, they are still considered a global selector such that *:not([foo]) will use the global selector cache and then matchesExpression to match the pseudo selector.

}

/**
* Save a selector and vNode to the selectorMap
* @param {String} key
* @param {VirtualNode} vNode
* @param {Object} map
*/
function cacheSelector(key, vNode, map = selectorMap) {
map[key] = map[key] || [];
map[key].push(vNode);
}

/**
* Inner join two arrays (find all nodes in a that are also in b).
* @param {Mixed[]} a
* @param {Mixed[]} b
* @returns {Mixed[]}
*/
function innerJoin(a, b) {
straker marked this conversation as resolved.
Show resolved Hide resolved
if (!b.length) {
return a;
straker marked this conversation as resolved.
Show resolved Hide resolved
}

return a.filter(node => b.includes(node));
}

/**
* Cache selector information about a VirtalNode.
* @param {VirtualNode} vNode
*/
export function cacheNodeSelectors(vNode) {
if (vNode.props.nodeType !== 1) {
return;
}

// cache the selector map to the root node of the tree
if (vNode.nodeIndex === 0) {
selectorMap = {};
vNode._selectorMap = selectorMap;
}

cacheSelector(vNode.props.nodeName, vNode);
cacheSelector('*', vNode);

vNode.attrNames.forEach(attrName => {
// element ids are the only values we'll match
if (attrName === 'id') {
selectorMap[idsKey] = selectorMap[idsKey] || {};
tokenList(vNode.attr(attrName)).forEach(value => {
cacheSelector(value, vNode, selectorMap[idsKey]);
});
}

cacheSelector(`[${attrName}]`, vNode);
});
}

/**
* Get nodes from the selector cache that match the selector.
* @param {VirtualTree[]} domTree flattened tree collection to search
* @param {Object} expressions
* @param {Function} filter function (optional)
* @return {Mixed} Array of nodes that match the selector or undefined if the selector map is not setup
*/
export function getNodesMatchingExpression(domTree, expressions, filter) {
// check to see if the domTree is the root and has the selector
// map. if not we just return and let our QSA code do the finding
const selectorMap = domTree[0]._selectorMap;
straker marked this conversation as resolved.
Show resolved Hide resolved
if (!selectorMap) {
return;
}

const shadowId = domTree[0].shadowId;

// if the selector uses a global selector with a combinator
// (e.g. A *, A > *) it's actually faster to use our QSA code than
// getting all nodes and using matchesExpression
for (let i = 0; i < expressions.length; i++) {
if (
expressions[i].length > 1 &&
expressions[i].some(expression => isGlobalSelector(expression))
) {
return;
}
}

const nodeSet = new Set();

// find nodes that match just a part of the selector in order
// to speed up traversing the entire tree
expressions.forEach(expression => {
straker marked this conversation as resolved.
Show resolved Hide resolved
// use the last part of the expression to find nodes as it's more
// specific. e.g. for `body h1` use `h1` and not `body`
const exp = expression[expression.length - 1];
let nodes = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to start off with the entire list and then filter down. Rather than start with an empty list and then either add or remove depending on if the list is currently empty. The way you've done this now can create problems if at one point you filter down to 0 nodes half way through the operation.

For example, lets say you have a[name].foo.

  1. Start with an empty nodes array
  2. Put all a elements into nodes
  3. Filter nodes, removing any that doesn't have a name attribute
  4. Filter nodes, removing any that don't have a foo class

If during step 3, the nodes array becomes empty, in step 4 you'll get all elements with the foo class in your results. And because this isn't a complex selector, your matchExpression fallback won't catch this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch


// a complex selector is one that will require using
// matchesExpression to determine if it matches. these include
// pseudo selectors (:not), combinators (A > B), and any
// attribute value ([class=foo]).
let isComplexSelector =
expression.length > 1 || !!exp.pseudos || !!exp.classes;

if (isGlobalSelector(exp)) {
nodes = selectorMap['*'];
} else {
if (exp.id) {
// a selector must match all parts, otherwise we can just exit
// early
if (!selectorMap[idsKey] || !selectorMap[idsKey][exp.id]?.length) {
return;
}

// when using id selector (#one) we only find nodes that
// match the shadowId of the root
nodes = selectorMap[idsKey][exp.id].filter(
node => node.shadowId === shadowId
);
}

if (exp.tag && exp.tag !== '*') {
if (!selectorMap[exp.tag]?.length) {
return;
}

nodes = innerJoin(selectorMap[exp.tag], nodes);
}

if (exp.classes) {
if (!selectorMap['[class]']?.length) {
return;
}

nodes = innerJoin(selectorMap['[class]'], nodes);
}

if (exp.attributes) {
for (let i = 0; i < exp.attributes.length; i++) {
const attr = exp.attributes[i];

// an attribute selector that looks for a specific value is
// a complex selector
if (attr.type === 'attrValue') {
isComplexSelector = true;
}

if (!selectorMap[`[${attr.key}]`]?.length) {
return;
}

nodes = innerJoin(selectorMap[`[${attr.key}]`], nodes);
}
}
}

nodes.forEach(node => {
// for complex selectors we need to verify that the node
// actually matches the entire selector since we only have
// nodes that partially match the last part of the selector
if (isComplexSelector && !matchesExpression(node, expression)) {
return;
}

nodeSet.add(node);
});
});

// ie11 does not work with Array.from without a polyfill (missing
// `.entries`) but does have forEach. using a set saves time when
// wanting to make sure the same node isn't added twice (~3 seconds
// total over using array.includes)
let matchedNodes = [];
nodeSet.forEach(node => matchedNodes.push(node));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to return an array, why bother with a set? Wouldn't it be easier to just use an array and then filter it for unique values before you return?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it was actually way more performant to use a Set and convert it to an array at the end than to use an array and check if the element already exists in the array before adding.


if (filter) {
matchedNodes = matchedNodes.filter(filter);
}

return matchedNodes.sort((a, b) => a.nodeIndex - b.nodeIndex);
}
14 changes: 14 additions & 0 deletions test/core/base/virtual-node/virtual-node.js
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,20 @@ describe('VirtualNode', function() {
assert.equal(vNode.props.nodeName, 'foobar');
});

it('should add selectorMap to root element', function() {
node.setAttribute('data-foo', 'bar');
var vNode = new VirtualNode(node);

assert.exists(vNode._selectorMap);
});

it('should not add selectorMap to non-root element', function() {
node.setAttribute('data-foo', 'bar');
var vNode = new VirtualNode(node, new VirtualNode(node.cloneNode()));

assert.notExists(vNode._selectorMap);
});

describe('attr', function() {
it('should return the value of the given attribute', function() {
node.setAttribute('data-foo', 'bar');
Expand Down
Loading