Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: no-missing-label-refs should not crash on undefined labels #290

Merged
merged 5 commits into from
Oct 16, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
109 changes: 40 additions & 69 deletions src/rules/no-missing-label-refs.js
Original file line number Diff line number Diff line change
Expand Up @@ -21,106 +21,77 @@ import { findOffsets, illegalShorthandTailPattern } from "../util.js";
// Helpers
//-----------------------------------------------------------------------------

const labelPatterns = [
// [foo][bar]
/\]\[([^\]]+)\]/u,

// [foo][]
/(\]\[\])/u,

// [foo]
/\[([^\]]+)\]/u,
];

const shorthandTailPattern = /\]\[\]$/u;

/**
* Finds missing references in a node.
* @param {TextNode} node The node to check.
* @param {string} docText The text of the node.
* @param {string} nodeText The text of the node.
* @returns {Array<{label:string,position:Position}>} The missing references.
*/
function findMissingReferences(node, docText) {
function findMissingReferences(node, nodeText) {
const missing = [];
let startIndex = 0;
const offset = node.position.start.offset;
const nodeStartLine = node.position.start.line;
const nodeStartColumn = node.position.start.column;

/*
* This loop works by searching the string inside the node for the next
* label reference. If there is, it reports an error.
* It then moves the start index to the end of the label reference and
* continues searching the text until the end of the text is found.
* Matches substrings like "[foo]", "[]", "[foo][bar]", "[foo][]", "[][bar]", or "[][]".
* `left` is the content between the first brackets. It can be empty.
* `right` is the content between the second brackets. It can be empty, and it can be undefined.
*/
while (startIndex < node.value.length) {
const value = node.value.slice(startIndex);
const labelPattern = /\[(?<left>[^\]]*)\](?:\[(?<right>[^\]]*)\])?/dgu;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pull this regex out into the module scope? There doesn't seem to be anything context-sensitive that would require it to be in here. I'd rather parse regexes once vs. every time the function is called.

Also, it would be helpful to add a comment explaining what the regex does and how it is used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it would be helpful to add a comment explaining what the regex does and how it is used.

Added in 60d5735.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pull this regex out into the module scope? There doesn't seem to be anything context-sensitive that would require it to be in here. I'd rather parse regexes once vs. every time the function is called.

The g flag makes it mutable - each .exec() modifies lastIndex. I think it's a good practice to keep those regex instances scoped to the search instance that's using them, especially if they are used in a loop, because a small change in the code, like adding an early exit in the loop, can cause bugs that are difficult to reproduce and figure out.


const match = labelPatterns.reduce((previous, pattern) => {
if (previous) {
return previous;
}

return value.match(pattern);
}, null);

// check for array instead of null to appease TypeScript
if (!Array.isArray(match)) {
break;
}
let match;

/*
* This loop searches the text inside the node for sequences that
* look like label references and reports an error for each one found.
*/
while ((match = labelPattern.exec(nodeText))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining how this loop works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in bcee538.

// skip illegal shorthand tail -- handled by no-invalid-label-refs
if (illegalShorthandTailPattern.test(match[0])) {
startIndex += match.index + match[0].length;
continue;
}

// Calculate the match index relative to just the node.
let columnStart = startIndex + match.index;
let label = match[1];

// need to look backward to get the label
if (shorthandTailPattern.test(match[0])) {
// adding 1 to the index just in case we're in a ![] and need to skip the !.
const startFrom = offset + startIndex + 1;
const lastOpenBracket = docText.lastIndexOf("[", startFrom);

if (lastOpenBracket === -1) {
startIndex += match.index + match[0].length;
continue;
}

label = docText
.slice(lastOpenBracket, match.index + match[0].length)
.match(/!?\[([^\]]+)\]/u)?.[1];
columnStart -= label.length;
} else if (match[0].startsWith("]")) {
columnStart += 2;
const { left, right } = match.groups;

// `[][]` or `[]`
if (!left && !right) {
continue;
}

let label, labelIndices;

if (right) {
label = right;
labelIndices = match.indices.groups.right;
} else {
columnStart += 1;
label = left;
labelIndices = match.indices.groups.left;
}

const { lineOffset: startLineOffset, columnOffset: startColumnOffset } =
findOffsets(node.value, columnStart);

const startLine = nodeStartLine + startLineOffset;
const startColumn = nodeStartColumn + startColumnOffset;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Columns were wrong in some cases, for two reasons:

  1. TextNode#value is not always the raw text of the node - some lines are trimmed left, and some aren't.
  2. findOffsets(): columnOffset is the column offset only when it's on the same line. For other lines, columnOffset is 0-based column number, not offset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point on 2. Can you add comments to findOffsets() to document that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a note on how columnOffset should be used in 63d3b4d.

findOffsets(nodeText, labelIndices[0]);
const { lineOffset: endLineOffset, columnOffset: endColumnOffset } =
findOffsets(nodeText, labelIndices[1]);

missing.push({
label: label.trim(),
position: {
start: {
line: startLine,
column: startColumn,
line: nodeStartLine + startLineOffset,
column:
startLineOffset > 0
? startColumnOffset + 1
: nodeStartColumn + startColumnOffset,
},
end: {
line: startLine,
column: startColumn + label.length,
line: nodeStartLine + endLineOffset,
column:
endLineOffset > 0
? endColumnOffset + 1
: nodeStartColumn + endColumnOffset,
},
},
});

startIndex += match.index + match[0].length;
}

return missing;
Expand Down Expand Up @@ -164,7 +135,7 @@ export default {

text(node) {
allMissingReferences.push(
...findMissingReferences(node, sourceCode.text),
...findMissingReferences(node, sourceCode.getText(node)),
);
},

Expand Down
3 changes: 3 additions & 0 deletions src/util.js
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,9 @@ export const illegalShorthandTailPattern = /\]\[\s+\]$/u;
* @param {string} text The text to search.
* @param {number} offset The offset to find.
* @returns {{lineOffset:number,columnOffset:number}} The location of the offset.
* Note that `columnOffset` should be used as an offset to the column number
* of the given text in the source code only when `lineOffset` is 0.
* Otherwise, it should be used as a 0-based column number in the source code.
*/
export function findOffsets(text, offset) {
let lineOffset = 0;
Expand Down
182 changes: 182 additions & 0 deletions tests/rules/no-missing-label-refs.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,15 @@ ruleTester.run("no-missing-label-refs", rule, {
"[ foo ][]\n\n[foo]: http://bar.com/image.jpg",
"[foo][ ]\n\n[foo]: http://bar.com/image.jpg",
"[\nfoo\n][\n]\n\n[foo]: http://bar.com/image.jpg",
"[]",
"][]",
"[][]",
"[] []",
"[foo",
"foo]",
"foo][bar]\n\n[bar]: http://bar.com",
"foo][bar][baz]\n\n[baz]: http://baz.com",
"[][foo]\n\n[foo]: http://foo.com",
],
invalid: [
{
Expand Down Expand Up @@ -149,5 +158,178 @@ ruleTester.run("no-missing-label-refs", rule, {
},
],
},
{
code: "foo][bar]\n\n[baz]: http://baz.com",
errors: [
{
messageId: "notFound",
data: { label: "bar" },
line: 1,
column: 6,
endLine: 1,
endColumn: 9,
},
],
},
{
code: "foo][bar][baz]\n\n[bar]: http://bar.com",
errors: [
{
messageId: "notFound",
data: { label: "baz" },
line: 1,
column: 11,
endLine: 1,
endColumn: 14,
},
],
},
{
code: "[foo]\n[foo][bar]",
errors: [
{
messageId: "notFound",
data: { label: "foo" },
line: 1,
column: 2,
endLine: 1,
endColumn: 5,
},
{
messageId: "notFound",
data: { label: "bar" },
line: 2,
column: 7,
endLine: 2,
endColumn: 10,
},
],
},
Comment on lines +187 to +207
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was another bug: the regex that looks for [foo][bar] matched first and then the loop just skipped the previous code so only one error was reported.

{
code: "[Foo][foo]\n[Bar][]",
errors: [
{
messageId: "notFound",
data: { label: "foo" },
line: 1,
column: 7,
endLine: 1,
endColumn: 10,
},
{
messageId: "notFound",
data: { label: "Bar" },
line: 2,
column: 2,
endLine: 2,
endColumn: 5,
},
],
},
{
code: "[Foo][]\n[Bar][]",
errors: [
{
messageId: "notFound",
data: { label: "Foo" },
line: 1,
column: 2,
endLine: 1,
endColumn: 5,
},
{
messageId: "notFound",
data: { label: "Bar" },
line: 2,
column: 2,
endLine: 2,
endColumn: 5,
},
],
},
{
code: "[Foo][foo]\n[Bar][bar]\n[Hoge][]",
errors: [
{
messageId: "notFound",
data: { label: "foo" },
line: 1,
column: 7,
endLine: 1,
endColumn: 10,
},
{
messageId: "notFound",
data: { label: "bar" },
line: 2,
column: 7,
endLine: 2,
endColumn: 10,
},
{
messageId: "notFound",
data: { label: "Hoge" },
line: 3,
column: 2,
endLine: 3,
endColumn: 6,
},
],
},
{
code: "[Foo][]\n[Bar][bar]\n[Hoge][hoge]",
errors: [
{
messageId: "notFound",
data: { label: "Foo" },
line: 1,
column: 2,
endLine: 1,
endColumn: 5,
},
{
messageId: "notFound",
data: { label: "bar" },
line: 2,
column: 7,
endLine: 2,
endColumn: 10,
},
{
messageId: "notFound",
data: { label: "hoge" },
line: 3,
column: 8,
endLine: 3,
endColumn: 12,
},
],
},
{
code: "[][foo]",
errors: [
{
messageId: "notFound",
data: { label: "foo" },
line: 1,
column: 4,
endLine: 1,
endColumn: 7,
},
],
},
{
code: " foo\n [bar]",
errors: [
{
messageId: "notFound",
data: { label: "bar" },
line: 2,
column: 4,
endLine: 2,
endColumn: 7,
},
],
},
],
});