-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new_audit(hreflang): document has a valid hreflang code #3815
Changes from 3 commits
8b5f8d1
fdba9e4
c9f1afe
6860d84
6c3ff27
95699f5
a55fccd
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,11 +31,14 @@ module.exports = [ | |
'is-crawlable': { | ||
score: true, | ||
}, | ||
'hreflang': { | ||
score: true, | ||
}, | ||
}, | ||
}, | ||
{ | ||
initialUrl: 'http://localhost:10200/seo/seo-failure-cases.html?status_code=403&extra_header=x-robots-tag:none', | ||
url: 'http://localhost:10200/seo/seo-failure-cases.html?status_code=403&extra_header=x-robots-tag:none', | ||
initialUrl: 'http://localhost:10200/seo/seo-failure-cases.html?status_code=403&extra_header=x-robots-tag:none&extra_header=link:' + encodeURI('<http://example.com>;rel="alternate";hreflang="xx"'), | ||
url: 'http://localhost:10200/seo/seo-failure-cases.html?status_code=403&extra_header=x-robots-tag:none&extra_header=link:' + encodeURI('<http://example.com>;rel="alternate";hreflang="xx"'), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is getting quite nasty. Maybe we should move headers to an external file (e.g. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
@paulirish just brought up that sending the headers as request headers won't really work since smokehouse invokes lighthouse with just a URL...there's no way to add request headers (until/if #2746 :) This is a
|
||
audits: { | ||
'viewport': { | ||
score: false, | ||
|
@@ -72,6 +75,14 @@ module.exports = [ | |
}, | ||
}, | ||
}, | ||
'hreflang': { | ||
score: false, | ||
details: { | ||
items: { | ||
length: 3, | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
]; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
/** | ||
* @license Copyright 2017 Google Inc. All Rights Reserved. | ||
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 | ||
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. | ||
*/ | ||
'use strict'; | ||
|
||
const Audit = require('../audit'); | ||
const LinkHeader = require('http-link-header'); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. new dependency - https://github.com/jhermsmeier/node-http-link-header. It's a simple (~300LOC), well tested, link header value parser with no dependencies. |
||
const axeCore = require('axe-core'); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. turns out I can't import 'axe-core' as it wasn't imported anywhere else before and since it's huge, it blows our bundle budget 🤔 Since I'm using only a tiny part of 'axe-core' it seems like tree shaking should help here, but unfortunately we are not using import/export syntax ATM(which is required for tree shaking to work). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. any chance we can just import the languages file?
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. meta observation: there's only ~17k possible 3-character codes and ~8k of them are valid how much value is the list really providing over a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Didn't realize 45% of 3-character codes and 25% of 2-character codes are valid - I agree that it's a lot, but IMO it still leaves a huge margin for people to make "catchable" mistakes. We may also extend this audit with region and script codes validation in the future. @rviscomi WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not a fan of regex pattern matching as a substitute for whitelist checking. Too many false positives. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One option is putting it in a different data structure when browserifying. A quick test of a pretty naive trie got it down to about 1/2 the size raw, 1/5 the size after gzip (19915 -> 3751 bytes). Being more clever (char codes, exclusions vs inclusions) can get the trie down to about 1/4 the size of the current file, but the gzip size doesn't really budge after that first drop. Probably not worth pursuing at this point, but maybe worth leaving a note for when we go back looking for the low hanging fruit of shrinking bundle size. |
||
const LINK_HEADER = 'link'; | ||
const NO_LANGUAGE = 'x-default'; | ||
|
||
/** | ||
* @param {String} hreflang | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. lowercase There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated! Can't wait for your PR to land 👌 |
||
* @returns {Boolean} | ||
*/ | ||
function isValidHreflang(hreflang) { | ||
if (hreflang.toLowerCase() === NO_LANGUAGE) { | ||
return true; | ||
} | ||
|
||
// hreflang can consist of language-script-region, we are validating only language | ||
const [lang] = hreflang.split('-'); | ||
return axeCore.utils.validLangs().includes(lang.toLowerCase()); | ||
} | ||
|
||
/** | ||
* @param {String} headerValue | ||
* @returns {Boolean} | ||
*/ | ||
function headerHasValidHreflangs(headerValue) { | ||
const linkHeader = LinkHeader.parse(headerValue); | ||
|
||
return linkHeader.get('rel', 'alternate') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. always returns an array? strange api 🤔 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Single |
||
.every(link => link.hreflang && isValidHreflang(link.hreflang)); | ||
} | ||
|
||
class Hreflang extends Audit { | ||
/** | ||
* @return {!AuditMeta} | ||
*/ | ||
static get meta() { | ||
return { | ||
name: 'hreflang', | ||
description: 'Document has a valid `hreflang`', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. seems to be a property of an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point! @rviscomi WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's been a while since I wrote the text, but I think I was trying to strike a balance between the hreflang being used as both an HTTP header value and HTML attribute value. Other audits also use "document" as the owner of the thing being audited, for example:
So I'm ok with the text as it is currently written but open to suggestions. |
||
failureDescription: 'Document doesn\'t have a valid `hreflang`', | ||
helpText: 'hreflang allows crawlers to discover alternate translations of the ' + | ||
'page content. [Learn more]' + | ||
'(https://support.google.com/webmasters/answer/189077).', | ||
requiredArtifacts: ['Hreflang'], | ||
}; | ||
} | ||
|
||
/** | ||
* @param {!Artifacts} artifacts | ||
* @return {!AuditResult} | ||
*/ | ||
static audit(artifacts) { | ||
const devtoolsLogs = artifacts.devtoolsLogs[Audit.DEFAULT_PASS]; | ||
|
||
return artifacts.requestMainResource(devtoolsLogs) | ||
.then(mainResource => { | ||
/** @type {Array<{source: String|Object}>} */ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated, although it's actually |
||
const invalidHreflangs = []; | ||
|
||
if (artifacts.Hreflang) { | ||
artifacts.Hreflang.forEach(({href, hreflang}) => { | ||
if (!isValidHreflang(hreflang)) { | ||
invalidHreflangs.push({ | ||
source: { | ||
type: 'node', | ||
snippet: `<link name="alternate" hreflang="${hreflang}" href="${href}" />`, | ||
}, | ||
}); | ||
} | ||
}); | ||
} | ||
|
||
mainResource.responseHeaders | ||
.filter(h => h.name.toLowerCase() === LINK_HEADER && !headerHasValidHreflangs(h.value)) | ||
.forEach(h => invalidHreflangs.push({source: `${h.name}: ${h.value}`})); | ||
|
||
const headings = [ | ||
{key: 'source', itemType: 'code', text: 'Source'}, | ||
]; | ||
const details = Audit.makeTableDetails(headings, invalidHreflangs); | ||
|
||
return { | ||
rawValue: invalidHreflangs.length === 0, | ||
details, | ||
}; | ||
}); | ||
} | ||
} | ||
|
||
module.exports = Hreflang; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
/** | ||
* @license Copyright 2017 Google Inc. All Rights Reserved. | ||
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 | ||
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. | ||
*/ | ||
'use strict'; | ||
|
||
const Gatherer = require('../gatherer'); | ||
|
||
class Hreflang extends Gatherer { | ||
/** | ||
* @param {{driver: !Object}} options Run options | ||
* @return {!Promise<?Array<string>>} Array with hreflang and href values of all link[rel=alternate] nodes found in HEAD, or null | ||
*/ | ||
afterPass(options) { | ||
const driver = options.driver; | ||
|
||
return driver.querySelectorAll('head link[rel="alternate" i][hreflang]') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My bad, it always returns an array. I've updated the code (and jsdoc) accordingly. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh ok, no worries! |
||
.then(nodes => nodes && | ||
Promise.all( | ||
nodes.map(node => Promise.all([node.getAttribute('href'), node.getAttribute('hreflang')])) | ||
) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't have strong opinions here, so I'll defer to others evaluate async seems fine to me, but this also doesn't really seem that bad either |
||
).then(attributeValues => attributeValues && | ||
attributeValues.map(values => { | ||
const [href, hreflang] = values; | ||
return {href, hreflang}; | ||
}) | ||
); | ||
} | ||
} | ||
|
||
module.exports = Hreflang; | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,164 @@ | ||
/** | ||
* @license Copyright 2017 Google Inc. All Rights Reserved. | ||
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 | ||
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. | ||
*/ | ||
'use strict'; | ||
|
||
const HreflangAudit = require('../../../audits/seo/hreflang.js'); | ||
const assert = require('assert'); | ||
|
||
/* eslint-env mocha */ | ||
|
||
describe('SEO: Document has valid hreflang code', () => { | ||
it('fails when language code provided in hreflang via link element is invalid', () => { | ||
const hreflangValues = [ | ||
'xx', | ||
'XX-be', | ||
'XX-be-Hans', | ||
'', | ||
' es', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is this invalid because of the whitespace? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, AFAIK it should be invalid in this case |
||
]; | ||
|
||
const allRuns = hreflangValues.map(hreflangValue => { | ||
const mainResource = { | ||
responseHeaders: [], | ||
}; | ||
const artifacts = { | ||
devtoolsLogs: {[HreflangAudit.DEFAULT_PASS]: []}, | ||
requestMainResource: () => Promise.resolve(mainResource), | ||
Hreflang: [{ | ||
hreflang: hreflangValue, | ||
href: 'https://example.com', | ||
}], | ||
}; | ||
|
||
return HreflangAudit.audit(artifacts).then(auditResult => { | ||
assert.equal(auditResult.rawValue, false); | ||
assert.equal(auditResult.details.items.length, 1); | ||
}); | ||
}); | ||
|
||
return Promise.all(allRuns); | ||
}); | ||
|
||
it('succeeds when language code provided via link element is valid', () => { | ||
const mainResource = { | ||
responseHeaders: [], | ||
}; | ||
const artifacts = { | ||
devtoolsLogs: {[HreflangAudit.DEFAULT_PASS]: []}, | ||
requestMainResource: () => Promise.resolve(mainResource), | ||
Hreflang: [ | ||
{hreflang: 'pl'}, | ||
{hreflang: 'nl-be'}, | ||
{hreflang: 'zh-Hans'}, | ||
{hreflang: 'x-default'}, | ||
{hreflang: 'FR-BE'}, | ||
], | ||
}; | ||
|
||
return HreflangAudit.audit(artifacts).then(auditResult => { | ||
assert.equal(auditResult.rawValue, true); | ||
}); | ||
}); | ||
|
||
it('succeeds when there are no rel=alternate link elements nor headers', () => { | ||
const mainResource = { | ||
responseHeaders: [], | ||
}; | ||
const artifacts = { | ||
devtoolsLogs: {[HreflangAudit.DEFAULT_PASS]: []}, | ||
requestMainResource: () => Promise.resolve(mainResource), | ||
Hreflang: null, | ||
}; | ||
|
||
return HreflangAudit.audit(artifacts).then(auditResult => { | ||
assert.equal(auditResult.rawValue, true); | ||
}); | ||
}); | ||
|
||
it('fails when language code provided in hreflang via header is invalid', () => { | ||
const linkHeaders = [ | ||
[ | ||
{name: 'Link', value: '<http://es.example.com/>; rel="alternate"; hreflang="xx"'}, | ||
], | ||
[ | ||
{name: 'link', value: '<http://es.example.com/>; rel="alternate"; hreflang=""'}, | ||
], | ||
[ | ||
{name: 'LINK', value: '<http://es.example.com/>; rel="alternate"'}, | ||
], | ||
[ | ||
{name: 'Link', value: '<http://es.example.com/>; rel="alternate"; hreflang="es",<http://xx.example.com/>; rel="alternate"; Hreflang="xx"'}, | ||
], | ||
[ | ||
{name: 'link', value: '<http://es.example.com/>; rel="alternate"; hreflang="es"'}, | ||
{name: 'Link', value: '<http://xx.example.com/>; rel="alternate"; hreflang="x"'}, | ||
], | ||
]; | ||
|
||
const allRuns = linkHeaders.map(headers => { | ||
const mainResource = { | ||
responseHeaders: headers, | ||
}; | ||
const artifacts = { | ||
devtoolsLogs: {[HreflangAudit.DEFAULT_PASS]: []}, | ||
requestMainResource: () => Promise.resolve(mainResource), | ||
Hreflang: null, | ||
}; | ||
|
||
return HreflangAudit.audit(artifacts).then(auditResult => { | ||
assert.equal(auditResult.rawValue, false); | ||
assert.equal(auditResult.details.items.length, 1); | ||
}); | ||
}); | ||
|
||
return Promise.all(allRuns); | ||
}); | ||
|
||
it('succeeds when language codes provided via Link header are valid', () => { | ||
const mainResource = { | ||
responseHeaders: [ | ||
{name: 'link', value: ''}, | ||
{name: 'link', value: 'garbage'}, | ||
{name: 'link', value: '<http://es.example.com/>; rel="example"; hreflang="xx"'}, | ||
{name: 'link', value: '<http://es.example.com/>; rel="alternate"; hreflang="es"'}, | ||
{name: 'Link', value: '<http://fr.example.com/>; rel="alternate"; hreflang="fr-be"'}, | ||
{name: 'LINK', value: '<http://es.example.com/>; rel="alternate"; hreflang="es",<http://fr.example.com/>; rel="alternate"; Hreflang="fr-be"'}, | ||
], | ||
}; | ||
const artifacts = { | ||
devtoolsLogs: {[HreflangAudit.DEFAULT_PASS]: []}, | ||
requestMainResource: () => Promise.resolve(mainResource), | ||
Hreflang: null, | ||
}; | ||
|
||
return HreflangAudit.audit(artifacts).then(auditResult => { | ||
assert.equal(auditResult.rawValue, true); | ||
}); | ||
}); | ||
|
||
it('returns all failing items', () => { | ||
const mainResource = { | ||
responseHeaders: [ | ||
{name: 'link', value: '<http://xx1.example.com/>; rel="alternate"; hreflang="xx1"'}, | ||
{name: 'Link', value: '<http://xx2.example.com/>; rel="alternate"; hreflang="xx2"'}, | ||
], | ||
}; | ||
const artifacts = { | ||
devtoolsLogs: {[HreflangAudit.DEFAULT_PASS]: []}, | ||
requestMainResource: () => Promise.resolve(mainResource), | ||
Hreflang: [{ | ||
hreflang: 'xx3', | ||
}, { | ||
hreflang: 'xx4', | ||
}], | ||
}; | ||
|
||
return HreflangAudit.audit(artifacts).then(auditResult => { | ||
assert.equal(auditResult.rawValue, false); | ||
assert.equal(auditResult.details.items.length, 4); | ||
}); | ||
}); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mind starting to mark these with
PASS(audit-name): details if possible and succinct
(orFAIL
) like e.g. inlighthouse/lighthouse-cli/test/fixtures/byte-efficiency/tester.html
Lines 69 to 73 in 6cd1b41
helpful for going back later and knowing what touches what test/audit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! Added two comments here and one in the
seo-tester.html