new_audit(hreflang): document has a valid hreflang code #3815

Merged
merged 7 commits into from
Nov 29, 2017
4 changes: 4 additions & 0 deletions lighthouse-cli/test/fixtures/seo/seo-failure-cases.html
@@ -12,6 +12,10 @@
<meta name="viewport" content="invalid-content=should_have_looked_it_up">
<!-- no <meta name="description" content=""> -->
<meta name="robots" content="nofollow, NOINDEX, all">
<!-- FAIL(hreflang): invalid language code -->
<link rel="alternate" hreflang="xx" href="https://xx.example.com" />
<!-- FAIL(hreflang): space before a valid code -->
<link rel="alternate" href="http://example.com/" hreflang=" x-default" />
</head>
<body>
<h1>SEO</h1>
6 changes: 6 additions & 0 deletions lighthouse-cli/test/fixtures/seo/seo-tester.html
@@ -11,6 +11,12 @@
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0">
<meta name="Description" content="The premiere destination for testing your SEO audit gathering">
<!-- PASS(hreflang): valid language codes -->
<link rel="alternate" hreflang="es" href="https://lat.example.com" />
<link rel="alternate" Hreflang="en-PH" href="https://ph.example.com" />
<LINK REL="ALTERNATE" HREFLANG="ru-RU" HREF="https://ru.example.com" />
<LINK REL="alternate" HREFLANG="zh-Hans-TW" HREF="https://zh.example.com" />
<link rel="alternate" href="http://example.com/" hreflang="x-default" />
</head>
<body>
<h1>SEO</h1>
2 changes: 1 addition & 1 deletion lighthouse-cli/test/fixtures/static-server.js
@@ -12,7 +12,7 @@ const path = require('path');
const fs = require('fs');
const parseQueryString = require('querystring').parse;
const parseURL = require('url').parse;
-const HEADER_SAFELIST = new Set(['x-robots-tag']);
+const HEADER_SAFELIST = new Set(['x-robots-tag', 'link']);

const lhRootDirPath = path.join(__dirname, '../../../');

34 changes: 30 additions & 4 deletions lighthouse-cli/test/smokehouse/seo/expectations.js
@@ -4,14 +4,29 @@
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
*/
'use strict';
const BASE_URL = 'http://localhost:10200/seo/';

function headersParam(headers) {
return headers
.map(({name, value}) => `extra_header=${name}:${encodeURI(value)}`)
.join('&');
}

const failureHeaders = headersParam([{
name: 'x-robots-tag',
value: 'none',
}, {
name: 'link',
value: '<http://example.com>;rel="alternate";hreflang="xx"',
}]);

/**
* Expected Lighthouse audit values for seo tests
*/
module.exports = [
{
-initialUrl: 'http://localhost:10200/seo/seo-tester.html',
-url: 'http://localhost:10200/seo/seo-tester.html',
+initialUrl: BASE_URL + 'seo-tester.html',
+url: BASE_URL + 'seo-tester.html',
audits: {
'viewport': {
score: true,
@@ -31,11 +46,14 @@ module.exports = [
'is-crawlable': {
score: true,
},
'hreflang': {
score: true,
},
},
},
{
-initialUrl: 'http://localhost:10200/seo/seo-failure-cases.html?status_code=403&extra_header=x-robots-tag:none',
-url: 'http://localhost:10200/seo/seo-failure-cases.html?status_code=403&extra_header=x-robots-tag:none',
+initialUrl: BASE_URL + 'seo-failure-cases.html?status_code=403&' + failureHeaders,
+url: BASE_URL + 'seo-failure-cases.html?status_code=403&' + failureHeaders,
audits: {
'viewport': {
score: false,
@@ -72,6 +90,14 @@ module.exports = [
},
},
},
'hreflang': {
score: false,
details: {
items: {
length: 3,
},
},
},
},
},
];
112 changes: 112 additions & 0 deletions lighthouse-core/audits/seo/hreflang.js
@@ -0,0 +1,112 @@
/**
* @license Copyright 2017 Google Inc. All Rights Reserved.
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
*/
'use strict';

const Audit = require('../audit');
const LinkHeader = require('http-link-header');
Collaborator Author

New dependency: https://github.com/jhermsmeier/node-http-link-header. It's a simple (~300 LOC), well-tested Link header value parser with no dependencies.
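
For readers unfamiliar with the header format, here is a minimal, dependency-free sketch of what parsing a `Link` header value involves. `parseLinkHeader` is a hypothetical helper written only for illustration; it is not the `http-link-header` API, and it assumes no commas or semicolons appear inside URIs or quoted values, which a real parser has to handle.

```javascript
'use strict';

/**
 * Simplified Link header parser (illustration only, hypothetical helper).
 * Assumes no commas or semicolons appear inside URIs or quoted values.
 * @param {string} value
 * @return {!Array<!Object<string, string>>}
 */
function parseLinkHeader(value) {
  return value
    .split(',')
    .map(part => part.trim())
    .filter(part => part.length > 0)
    .map(part => {
      const [target, ...params] = part.split(';').map(s => s.trim());
      // Strip the angle brackets around the target URI.
      const link = {uri: target.replace(/^<|>$/g, '')};
      params.forEach(param => {
        const eq = param.indexOf('=');
        if (eq === -1) return;
        // Parameter names are case-insensitive, so normalize them.
        const name = param.slice(0, eq).trim().toLowerCase();
        const raw = param.slice(eq + 1).trim();
        // Drop surrounding double quotes, if any.
        link[name] = raw.replace(/^"|"$/g, '');
      });
      return link;
    });
}

const links = parseLinkHeader(
  '<http://es.example.com/>; rel="alternate"; hreflang="es",' +
  '<http://fr.example.com/>; rel="alternate"; Hreflang="fr-be"'
);
console.log(links);
```

Even in this toy version, a single header value yields a list of link objects, one per comma-separated entry.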

const VALID_LANGS = importValidLangs();
const LINK_HEADER = 'link';
const NO_LANGUAGE = 'x-default';

/**
* Import the list of valid languages from axe-core without including the whole axe-core package.
* This is a huge array of language codes that could be stored more efficiently if we ever need to
* shrink the bundle size.
*/
function importValidLangs() {
const axeCache = global.axe;
global.axe = {utils: {}};
require('axe-core/lib/commons/utils/valid-langs.js');
const validLangs = global.axe.utils.validLangs();
global.axe = axeCache;

return validLangs;
}

/**
* @param {string} hreflang
* @returns {boolean}
*/
function isValidHreflang(hreflang) {
if (hreflang.toLowerCase() === NO_LANGUAGE) {
return true;
}

// hreflang can consist of language-script-region, we are validating only language
const [lang] = hreflang.split('-');
return VALID_LANGS.includes(lang.toLowerCase());
}

/**
* @param {string} headerValue
* @returns {boolean}
*/
function headerHasValidHreflangs(headerValue) {
const linkHeader = LinkHeader.parse(headerValue);

return linkHeader.get('rel', 'alternate')
Collaborator

always returns an array? strange api 🤔

Collaborator Author

A single Link header can carry multiple rel=alternate links (e.g. <http://es.example.com/>; rel="alternate"; hreflang="es",<http://fr.example.com/>; rel="alternate"; Hreflang="fr-be"), so IMO it makes sense to always return an array.
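
As a rough sketch of how the `every` check treats a header with several alternates: one invalid entry is enough to fail the whole header. The `SAMPLE_LANGS` list, the `allHreflangsValid` name and the pre-parsed `links` arrays below are made up for this example; the audit itself takes its language list from axe-core and its links from the parsed header.

```javascript
'use strict';

// Tiny stand-in for the axe-core language list (assumption for this sketch).
const SAMPLE_LANGS = ['en', 'es', 'fr', 'ru', 'zh'];

function isValidHreflang(hreflang) {
  if (hreflang.toLowerCase() === 'x-default') {
    return true;
  }
  // Only the language subtag (before the first "-") is checked.
  const [lang] = hreflang.split('-');
  return SAMPLE_LANGS.includes(lang.toLowerCase());
}

// Hypothetical helper mirroring the shape of the check in the diff.
function allHreflangsValid(links) {
  return links
    .filter(link => link.rel === 'alternate')
    .every(link => Boolean(link.hreflang) && isValidHreflang(link.hreflang));
}

const goodHeader = [
  {uri: 'http://es.example.com/', rel: 'alternate', hreflang: 'es'},
  {uri: 'http://fr.example.com/', rel: 'alternate', hreflang: 'fr-be'},
];
const mixedHeader = [
  {uri: 'http://es.example.com/', rel: 'alternate', hreflang: 'es'},
  {uri: 'http://xx.example.com/', rel: 'alternate', hreflang: 'xx'},
];

console.log(allHreflangsValid(goodHeader)); // true
console.log(allHreflangsValid(mixedHeader)); // false
```

Note that a value with leading whitespace, such as `' x-default'`, fails this check because the string is compared as-is.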

.every(link => link.hreflang && isValidHreflang(link.hreflang));
}

class Hreflang extends Audit {
/**
* @return {!AuditMeta}
*/
static get meta() {
return {
name: 'hreflang',
description: 'Document has a valid `hreflang`',
Collaborator

seems to be a property of an alternate link rather than the entire document, or am I misunderstanding?

Collaborator Author

Good point! @rviscomi WDYT?

Member

It's been a while since I wrote the text, but I think I was trying to strike a balance between the hreflang being used as both an HTTP header value and HTML attribute value. Other audits also use "document" as the owner of the thing being audited, for example:

  • Document has a <meta name="viewport"> tag with width or initial-scale.
  • Document uses legible font sizes.
  • Document has a <title> element.
  • Document has a meta description.
  • Document has a valid rel=canonical.
  • Document avoids plugins.

So I'm ok with the text as it is currently written but open to suggestions.

failureDescription: 'Document doesn\'t have a valid `hreflang`',
helpText: 'hreflang allows crawlers to discover alternate translations of the ' +
'page content. [Learn more]' +
'(https://support.google.com/webmasters/answer/189077).',
requiredArtifacts: ['Hreflang'],
};
}

/**
* @param {!Artifacts} artifacts
* @return {!AuditResult}
*/
static audit(artifacts) {
const devtoolsLogs = artifacts.devtoolsLogs[Audit.DEFAULT_PASS];

return artifacts.requestMainResource(devtoolsLogs)
.then(mainResource => {
/** @type {Array<{source: string|{type: string, snippet: string}}>} */
const invalidHreflangs = [];

if (artifacts.Hreflang) {
artifacts.Hreflang.forEach(({href, hreflang}) => {
if (!isValidHreflang(hreflang)) {
invalidHreflangs.push({
source: {
type: 'node',
snippet: `<link rel="alternate" hreflang="${hreflang}" href="${href}" />`,
},
});
}
});
}

mainResource.responseHeaders
.filter(h => h.name.toLowerCase() === LINK_HEADER && !headerHasValidHreflangs(h.value))
.forEach(h => invalidHreflangs.push({source: `${h.name}: ${h.value}`}));

const headings = [
{key: 'source', itemType: 'code', text: 'Source'},
];
const details = Audit.makeTableDetails(headings, invalidHreflangs);

return {
rawValue: invalidHreflangs.length === 0,
details,
};
});
}
}

module.exports = Hreflang;
2 changes: 2 additions & 0 deletions lighthouse-core/config/default.js
@@ -39,6 +39,7 @@ module.exports = {
'seo/meta-description',
'seo/crawlable-links',
'seo/meta-robots',
'seo/hreflang',
],
},
{
@@ -152,6 +153,7 @@ module.exports = {
'seo/http-status-code',
'seo/link-text',
'seo/is-crawlable',
'seo/hreflang',
],

groups: {
3 changes: 3 additions & 0 deletions lighthouse-core/config/seo.js
@@ -13,13 +13,15 @@ module.exports = {
'seo/meta-description',
'seo/crawlable-links',
'seo/meta-robots',
'seo/hreflang',
],
}],
audits: [
'seo/meta-description',
'seo/http-status-code',
'seo/link-text',
'seo/is-crawlable',
'seo/hreflang',
],
groups: {
'seo-mobile': {
@@ -47,6 +49,7 @@ module.exports = {
{id: 'http-status-code', weight: 1, group: 'seo-crawl'},
{id: 'link-text', weight: 1, group: 'seo-content'},
{id: 'is-crawlable', weight: 1, group: 'seo-crawl'},
{id: 'hreflang', weight: 1, group: 'seo-content'},
],
},
},
32 changes: 32 additions & 0 deletions lighthouse-core/gather/gatherers/seo/hreflang.js
@@ -0,0 +1,32 @@
/**
* @license Copyright 2017 Google Inc. All Rights Reserved.
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
*/
'use strict';

const Gatherer = require('../gatherer');

class Hreflang extends Gatherer {
/**
* @param {{driver: !Object}} options Run options
* @return {!Promise<!Array<{href: string, hreflang: string}>>} Array with hreflang and href values of all link[rel=alternate] nodes found in HEAD
*/
afterPass(options) {
const driver = options.driver;

return driver.querySelectorAll('head link[rel="alternate" i][hreflang]')
Collaborator

querySelectorAll returns null instead of []!? we should fix that 😆

Collaborator Author

My bad, it always returns an array. I've updated the code (and jsdoc) accordingly.

Collaborator

Oh ok, no worries!
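
To make the array-based flow discussed in this thread concrete, here is a small sketch that runs the same promise chain against mock nodes. `mockNode` and `collectHreflangs` are hypothetical names used only here; the real gatherer gets its nodes from `driver.querySelectorAll`, and this sketch assumes (as stated above) that an array is always returned, including an empty one.

```javascript
'use strict';

// Hypothetical stand-in for the element handles the driver returns;
// getAttribute resolves asynchronously, as in the gatherer.
function mockNode(attrs) {
  return {getAttribute: name => Promise.resolve(attrs[name])};
}

// Same shape as the gatherer's chain: read both attributes per node,
// then turn each [href, hreflang] pair into an object.
function collectHreflangs(nodes) {
  return Promise.all(nodes.map(node =>
    Promise.all([node.getAttribute('href'), node.getAttribute('hreflang')])
  )).then(attributeValues =>
    attributeValues.map(([href, hreflang]) => ({href, hreflang}))
  );
}

const result = collectHreflangs([
  mockNode({href: 'https://lat.example.com', hreflang: 'es'}),
  mockNode({href: 'https://ph.example.com', hreflang: 'en-PH'}),
]);
const emptyResult = collectHreflangs([]);

result.then(items => console.log(items));
emptyResult.then(items => console.log(items));
```

With an empty node list the chain resolves to an empty array, so no null check is needed downstream under that assumption.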

.then(nodes => Promise.all(nodes.map(node =>
Promise.all([node.getAttribute('href'), node.getAttribute('hreflang')]))
)
).then(attributeValues => attributeValues &&
attributeValues.map(values => {
const [href, hreflang] = values;
return {href, hreflang};
})
);
}
}

module.exports = Hreflang;
