Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(utils): greatly improve the speed of querySelectorAll #3423

Merged
merged 14 commits into from
Jun 8, 2022

Conversation

straker
Copy link
Contributor

@straker straker commented Mar 24, 2022

Following the QSA proposal, this greatly improves the speed of QSA as it no longer has to scan the entire DOM tree for top-level queries. For large sites the speed increase is substantial.

For https://js.tensorflow.org/api/latest/, the top slowest performanceTimer results were as follow:

Timer Name Time (ms)
rule_heading-order#gather 136.40
rule_empty-heading#gather 137.20
rule_empty-heading 137.70
rule_list 139.30
rule_aria-command-name#gather 140.40
rule_aria-command-name 140.70
rule_aria-allowed-attr#gather 148.30
runchecks_heading-order 151.10
runchecks_page-has-heading-one 151.30
rule_page-has-heading-one 151.40
rule_identical-links-same-purpose#gather 172.60
runchecks_link-name 197.30
rule_valid-lang#gather 197.70
rule_valid-lang 198.10
rule_landmark-unique#gather 208.90
rule_landmark-unique 210.80
rule_aria-allowed-attr 216.40
rule_identical-links-same-purpose#matches 273.50
rule_link-name 278.60
rule_heading-order 287.60
runchecks_bypass 338.80
rule_bypass 405.80
rule_identical-links-same-purpose 575.70
runchecks_region 871.80
rule_color-contrast#matches 967.10
rule_region 997.40
runchecks_color-contrast 11,530.40
rule_color-contrast 12,497.60
audit_start_to_end 21,541.50

We can see that the #gather step is what causes most rules to be slow (and not so much the #runchecks step), causing axe-core to take 20 seconds to just run through all the rules.

By caching node information ahead of time, the top slowest results are now as follows:

Timer Name Time (ms)
rule_page-has-heading-one 31.90
runchecks_aria-hidden-body 31.90
rule_aria-hidden-body 32.00
rule_autocomplete-valid 36.60
rule_aria-valid-attr-value 36.70
rule_aria-valid-attr 38.80
runchecks_list 52.90
rule_list 53.90
rule_scrollable-region-focusable#matches 69.00
rule_nested-interactive#matches 71.30
runchecks_bypass 75.30
rule_bypass 75.50
rule_aria-allowed-attr#gather_axe.utils.isHidden 81.50
rule_nested-interactive 91.80
rule_aria-allowed-attr#gather 94.70
rule_scrollable-region-focusable 99.20
rule_region#gather 102.60
rule_aria-allowed-attr 115.80
runchecks_identical-links-same-purpose 121.90
rule_identical-links-same-purpose#matches 181.30
runchecks_link-name 186.10
rule_link-name 193.50
rule_identical-links-same-purpose 308.60
runchecks_region 806.00
rule_region 914.70
rule_color-contrast#matches 1,054.00
runchecks_color-contrast 11,780.90
rule_color-contrast 12,835.00
audit_start_to_end 15,093.50

This change shaves off about 7-10 seconds from the time needed to run all of axe rules on the site (audit_start_to_end).

As the page size increases, so too does the savings. Testing an extremely large site (152,000 nodes), this is the final time difference:

Timer Name Normal Run Time (ms) Cache Run Time (ms)
audit_start_to_end 62,128.60 45,251.90

This is about a 20 second savings. On medium size pages, the savings is minimal (almost no difference).

To ensure the results are the same, I ran axe-core against the tensorflow site, recorded the results, then ran the caching verison of axe core and compared the two results for equality. Here's the code I used.

const { AxePuppeteer } = require('@axe-core/puppeteer');
const puppeteer = require('puppeteer');
const fs = require('fs');
const chai = require('chai');
const processArgs = require('./process-args');

const { assert } = chai;
const args = processArgs(process.argv.slice(2));

if (!args.url) {
  throw new Error('Must include a url to test');
}
if (!args.numTests) {
  args.numTests = 1;
}

function assertShallowEqual(obj1, obj2, parentKey = '') {
  let keys;
  if (Array.isArray(obj1)) {
    keys = obj1.map((value, index) => index);

  }
  else {
    keys = Object.keys(obj1);
  }

  keys.forEach(key => {
    const valueObj1 = obj1[key];
    const valueObj2 = obj2[key];

    // don't compare html output, just selector
    if (key === 'html') return;

    // don't compare url props (can contain date-time...)
    if (key === 'urlProps') return;

    // don't compare axe testEngine since they are named differently
    // to ensure correct tests
    if (key === 'testEngine') return;

    // don't look for null
    if (valueObj1) {
      assert.exists(valueObj2, `${key} does not exist on ${parentKey}`);
    }

    if (valueObj1 && (Array.isArray(valueObj1) || typeof valueObj1 === 'object')) {
      assertShallowEqual(valueObj1, valueObj2, `${parentKey}.${key}`);
    }
    else {
      assert.equal(valueObj1, valueObj2, `${parentKey}.${key} not equal`);
    }
  });
}

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setBypassCSP(true);

  const axe = new AxePuppeteer(page);
  const results = [];

  const axeCorePath = require.resolve('axe-core');
  const origAxeCore = fs.readFileSync(axeCorePath, 'utf8');
  const devAxeCore = fs.readFileSync('./axe-develop.js', 'utf8');
  const perfAxeCore = fs.readFileSync('./axe-perf.js', 'utf8');

  fs.rmSync('./control', { recursive: true, force: true });
  fs.rmSync('./test', { recursive: true, force: true });

  fs.mkdirSync('./control');
  fs.mkdirSync('./test');

  async function runAxe(dir) {
    for (let i = 0; i < args.numTests; i++) {
      page.goto(args.url);

      // wait for content to settle
      await new Promise((resolve) => {
        setTimeout(resolve, 5000)
      });

      const result = await axe.analyze();
      delete result.timestamp;
      fs.writeFileSync(`./${dir}/result-${i}.json`, JSON.stringify(result, null, 2), 'utf8');
      results.push(result);
    }
  }

  fs.writeFileSync(axeCorePath, devAxeCore, 'utf8');
  await runAxe('control');
  fs.writeFileSync(axeCorePath, perfAxeCore, 'utf8');
  await runAxe('test');
  fs.writeFileSync(axeCorePath, origAxeCore, 'utf8');

  try {
    for (let i = 0; i < args.numTests; i++) {
      assertShallowEqual(results[i], results[i+1]);
    }
  } finally {
    await page.close();
    await browser.close();
  }
})();

NOTE: This selector cache can still produce slower speeds with certain selectors. For example, the code short circuits the body * selector for the region rule, but won't do that for a similar selector such as body p. We don't currently have any such selector, but a user could write a custom rule that uses that. If we are so inclined, we could try to make make sure any selector is fast, but currently this is optimized for our own selectors for rules.

@straker straker requested a review from a team as a code owner March 24, 2022 22:48
) {
return node.actualNode.nodeName !== 'UL';

var tests = ['without cache', 'with cache'];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first 30 lines are new code to the file, the rest is whitespace difference.

// top-level node. but since the top-level node is the one
// with the cache, this only works when we are testing the full
// tree (i.e. without cache)
if (describeName === 'without cache') {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wrapping if is also new to handle the test for shadowDOM id querying (which only works when not using the cache for this particular test).

@@ -128,6 +128,7 @@ function convertAttributes(atts) {
return {
key: attributeKey,
value: attributeValue,
type: typeof att.value === 'undefined' ? 'attrExist' : 'attrValue',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added to tell the difference between the selectors [href] and [href=""] (both returned the same structure of value: '' even though one is only checking existence of the attribute and the other is checking for an empty string).

@straker straker changed the title fix(utils): greatly improve the speed of our querySelectorAll code fix(utils): greatly improve the speed of querySelectorAll Mar 25, 2022
* @returns {Boolean}
*/
function isGlobalSelector(exp) {
return exp.tag === '*' && !exp.attributes && !exp.id && !exp.classes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about psuedo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't cache the content of a pseudo selector, they are still considered a global selector such that *:not([foo]) will use the global selector cache and then matchesExpression to match the pseudo selector.

lib/core/utils/selector-cache.js Outdated Show resolved Hide resolved
lib/core/utils/selector-cache.js Outdated Show resolved Hide resolved
// use the last part of the expression to find nodes as it's more
// specific. e.g. for `body h1` use `h1` and not `body`
const exp = expression[expression.length - 1];
let nodes = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to start off with the entire list and then filter down. Rather than start with an empty list and then either add or remove depending on if the list is currently empty. The way you've done this now can create problems if at one point you filter down to 0 nodes half way through the operation.

For example, lets say you have a[name].foo.

  1. Start with an empty nodes array
  2. Put all a elements into nodes
  3. Filter nodes, removing any that doesn't have a name attribute
  4. Filter nodes, removing any that don't have a foo class

If during step 3, the nodes array becomes empty, in step 4 you'll get all elements with the foo class in your results. And because this isn't a complex selector, your matchExpression fallback won't catch this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

lib/core/utils/selector-cache.js Show resolved Hide resolved
// wanting to make sure the same node isn't added twice (~3 seconds
// total over using array.includes)
let matchedNodes = [];
nodeSet.forEach(node => matchedNodes.push(node));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to return an array, why bother with a set? Wouldn't it be easier to just use an array and then filter it for unique values before you return?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it was actually way more performant to use a Set and convert it to an array at the end than to use an array and check if the element already exists in the array before adding.

lib/core/utils/selector-cache.js Outdated Show resolved Hide resolved
lib/core/utils/selector-cache.js Show resolved Hide resolved
test/core/utils/selector-cache.js Show resolved Hide resolved
lib/core/utils/get-flattened-tree.js Show resolved Hide resolved
lib/core/utils/selector-cache.js Outdated Show resolved Hide resolved
// use the last part of the expression to find nodes as it's more
// specific. e.g. for `body h1` use `h1` and not `body`
const exp = expression[expression.length - 1];
let nodes = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was addressed: #3423 (comment)

Copy link
Contributor Author

@straker straker May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the function no longer returns the empty array if passed in and only returns nodes that are in both arrays, it fixed the issue at the same time.

At step 3 when nothing is found, the empty array is passed to step 4, but since there are no nodes in the array the script won't add any nodes from the .foo array.

function getSharedValues(a, b) {
  return a.filter(node => b.includes(node));
}

getSharedValues([], ['a', 'b', 'c']);  // []
getSharedValues(['a', 'b', 'c'], []);  // []

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is....? If somewhere along this algorithm you end up with the array getting emptied again, because of the ternary, you're not calling getSharedValues. You're doing nodes = cachedNodes, effectively.

This doesn't feel like a safe algorithm. It's hard to track. If you want this setup, have nodes be null until it is first assigned, and have your ternary be:

 nodes = nodes === null ? cachedNodes : getSharedValues(nodes, cachedNodes);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. Added a test to catch this problem now.

@@ -128,6 +128,7 @@ function convertAttributes(atts) {
return {
key: attributeKey,
value: attributeValue,
type: typeof att.value === 'undefined' ? 'attrExist' : 'attrValue',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a test I think. Should have caught that first time around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the convertSelector API does not get exported to a public axe function and is only used internally. We could add it to the _thisWillBeDeletedDoNotUse test api, but I think adding a suite of tests for it should be a separate pr.

test/core/utils/flattened-tree.js Show resolved Hide resolved
lib/core/utils/selector-cache.js Outdated Show resolved Hide resolved
// use the last part of the expression to find nodes as it's more
// specific. e.g. for `body h1` use `h1` and not `body`
const exp = expression[expression.length - 1];
let nodes = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is....? If somewhere along this algorithm you end up with the array getting emptied again, because of the ternary, you're not calling getSharedValues. You're doing nodes = cachedNodes, effectively.

This doesn't feel like a safe algorithm. It's hard to track. If you want this setup, have nodes be null until it is first assigned, and have your ternary be:

 nodes = nodes === null ? cachedNodes : getSharedValues(nodes, cachedNodes);

@straker straker merged commit 1cae5ea into develop Jun 8, 2022
@straker straker deleted the qsa-perf-2022 branch June 8, 2022 17:22
@straker
Copy link
Contributor Author

straker commented Jun 8, 2022

Tested latest changes using my compare script, tensorflow.org passed, and so did the few random sites I tested.

straker added a commit that referenced this pull request Jul 13, 2022
* fix(utils): greatly improve the speed of our querySelectorAll code

* fixes

* finalize

* tests

* fix ie11

* const...

* changes

* tie map to cache

* fix test

* remove test

* fixes

* revert rule descriptions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants