Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Desktop: Fixes #10946: Stop crashing HTML/MD importer when content has link with very long name #10947

Closed
wants to merge 12 commits into from
23 changes: 23 additions & 0 deletions packages/lib/services/interop/InteropService_Importer_Md.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -195,4 +195,27 @@ describe('InteropService_Importer_Md', () => {
// The invalid image is imported as-is
expect(resource.title).toBe('invalid-image.jpg');
});

it.each([
'https://example.com',
'http://example.com',
'https://example.com/image.png',
'mailto:[email protected]?subject=test',
'onenote:Title of the note',
'tel:554799992292910',
])('should filter paths to external files', async (link: string) => {
const importer = new InteropService_Importer_Md();
expect(importer.isLinkToLocalFile(link)).toBe(false);
});

it.each([
'asdfasf',
'asdfasf.png',
'base/path/asdfasf.png',
'./base/path/asdfasf.png',
'/base/path/asdfasf.pdf',
])('should consider local file', async (link: string) => {
const importer = new InteropService_Importer_Md();
expect(importer.isLinkToLocalFile(link)).toBe(true);
});
});
85 changes: 47 additions & 38 deletions packages/lib/services/interop/InteropService_Importer_Md.ts
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@ import htmlUtils from '../../htmlUtils';
import { unique } from '../../ArrayUtils';
const { pregQuote } = require('../../string-utils-common');
import { MarkupToHtml } from '@joplin/renderer';
import { isDataUrl } from '@joplin/utils/url';
import { stripBom } from '../../string-utils';

export default class InteropService_Importer_Md extends InteropService_Importer_Base {
Expand Down Expand Up @@ -112,48 +111,47 @@ export default class InteropService_Importer_Md extends InteropService_Importer_
for (const encodedLink of fileLinks) {
const link = decodeURI(encodedLink);

if (isDataUrl(link)) {
// Just leave it as it is. We could potentially import
// it as a resource but for now that's good enough.
} else {
// Handle anchor links appropriately
const trimmedLink = this.trimAnchorLink(link);
const attachmentPath = filename(`${dirname(filePath)}/${trimmedLink}`, true);
const pathWithExtension = `${attachmentPath}.${fileExtension(trimmedLink)}`;
const stat = await shim.fsDriver().stat(pathWithExtension);
const isDir = stat ? stat.isDirectory() : false;
if (stat && !isDir) {
const supportedFileExtension = this.metadata().fileExtensions;
const resolvedPath = shim.fsDriver().resolve(pathWithExtension);
let id = '';
// If the link looks like a note, then import it
if (supportedFileExtension.indexOf(fileExtension(trimmedLink).toLowerCase()) >= 0) {
// If the note hasn't been imported yet, do so now
if (!this.importedNotes[resolvedPath]) {
await this.importFile(resolvedPath, parentFolderId);
}

id = this.importedNotes[resolvedPath].id;
} else {
const resource = await shim.createResourceFromPath(pathWithExtension, null, { resizeLargeImages: 'never' });
id = resource.id;
// Just leave it as it is. We could potentially import 'data:' links
// as a resource but for now that's good enough.
if (!this.isLinkToLocalFile(link)) continue;

// Handle anchor links appropriately
const trimmedLink = this.trimAnchorLink(link);
const attachmentPath = filename(`${dirname(filePath)}/${trimmedLink}`, true);
const pathWithExtension = `${attachmentPath}.${fileExtension(trimmedLink)}`;
const stat = await shim.fsDriver().stat(pathWithExtension);
const isDir = stat ? stat.isDirectory() : false;
if (stat && !isDir) {
const supportedFileExtension = this.metadata().fileExtensions;
const resolvedPath = shim.fsDriver().resolve(pathWithExtension);
let id = '';
// If the link looks like a note, then import it
if (supportedFileExtension.indexOf(fileExtension(trimmedLink).toLowerCase()) >= 0) {
// If the note hasn't been imported yet, do so now
if (!this.importedNotes[resolvedPath]) {
await this.importFile(resolvedPath, parentFolderId);
}

// The first is a normal link, the second is supports the <link> and [](<link with spaces>) syntax
// Only opening patterns are consider in order to cover all occurrences
// We need to use the encoded link as well because some links (link's with spaces)
// will appear encoded in the source. Other links (unicode chars) will not
const linksToReplace = [this.trimAnchorLink(link), this.trimAnchorLink(encodedLink)];
id = this.importedNotes[resolvedPath].id;
} else {
const resource = await shim.createResourceFromPath(pathWithExtension, null, { resizeLargeImages: 'never' });
id = resource.id;
}

// The first is a normal link, the second is supports the <link> and [](<link with spaces>) syntax
// Only opening patterns are consider in order to cover all occurrences
// We need to use the encoded link as well because some links (link's with spaces)
// will appear encoded in the source. Other links (unicode chars) will not
const linksToReplace = [this.trimAnchorLink(link), this.trimAnchorLink(encodedLink)];

for (let j = 0; j < linksToReplace.length; j++) {
const linkToReplace = pregQuote(linksToReplace[j]);
for (let j = 0; j < linksToReplace.length; j++) {
const linkToReplace = pregQuote(linksToReplace[j]);

// Markdown links
updated = markdownUtils.replaceResourceUrl(updated, linkToReplace, id);
// Markdown links
updated = markdownUtils.replaceResourceUrl(updated, linkToReplace, id);

// HTML links
updated = htmlUtils.replaceResourceUrl(updated, linkToReplace, id);
}
// HTML links
updated = htmlUtils.replaceResourceUrl(updated, linkToReplace, id);
}
}
}
Expand Down Expand Up @@ -184,4 +182,15 @@ export default class InteropService_Importer_Md extends InteropService_Importer_

return this.importedNotes[resolvedPath];
}

public isLinkToLocalFile(path: string) {
try {
new URL(path);
Copy link
Owner

@laurent22 laurent22 Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way I've just checked c:\\test.txt or c:/test.txt and new URL doesn't throw an error which means they won't be considered link to files. Not to mention that file:///path/to/file.txt is both a valid URL and valid link to a local file. Actually I'm wondering what we're doing here testing for URLs when we are not looking for URLs?

Are there no other reasonable ways to know if something look like a valid local path? I feel like this is a solved problem and there must be plenty of libraries or code snippets out there that can do this. Maybe even in Joplin source code we already have this.

} catch (error) {
// if it fails it probably is a relative path (local file)
if (error && error.code === 'ERR_INVALID_URL') return true;
throw error;
}
return false;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your logic now is that if the protocol is defined you return false, and otherwise you return false. Basically I don't think your check on the protocol is doing anything

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way is the URL object defined on mobile?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was that if the link has a protocol it isn't a local file (https:, mailto:, tel:, onenote:, etc), so it would fail if we tried to instantiate the URL object.

I don't think it is an issue if the URL exists on mobile or not because this Importer_Md is one of the importer classes that are imported dynamically (because it isn't used by mobile anyways, we only use Importer/Exporter for Jex on mobile). Should I avoid using it anyway?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I avoid using it anyway?

I don't know, would you mind checking? Do we use URL elsewhere on mobile, how do we usually parse URLs on mobile?

The idea was that if the link has a protocol it isn't a local file (https:, mailto:, tel:, onenote:, etc), so it would fail if we tried to instantiate the URL object.

That's not what your code does though - it instantiate URL and either returns true if it fails, or false otherwise. Protocol has no relevance to your code.

}
}
1 change: 1 addition & 0 deletions packages/tools/cspell/dictionary2.txt
Original file line number Diff line number Diff line change
Expand Up @@ -325,6 +325,7 @@ ondoctype
onedrive
onelink
onend
onenote
onfoo
onformat
onmatch
Expand Down
4 changes: 0 additions & 4 deletions packages/utils/url.ts
Original file line number Diff line number Diff line change
Expand Up @@ -97,10 +97,6 @@ export const fileUriToPath = (path: string, platform = 'linux') => {
return output;
};

export const isDataUrl = (path: string) => {
return path.startsWith('data:');
};

export const hasProtocol = (url: string, protocol: string | string[]) => {
if (!url) return false;

Expand Down
Loading