Is any escaping of URIs within "3.11.6 Messages with embedded links" needed? #657

Open
davidmalcolm opened this issue Aug 19, 2024 · 4 comments

Comments

@davidmalcolm

"3.11.6 Messages with embedded links" has:

link destination = ? Any valid URI ?;
embedded link = "[", link text, "](", link destination, ")";

My reading of the spec is that no escaping is needed or done on the link destination, and thus it is implicitly required that URIs do not contain ) so that a consumer can detect where the link destination ends.

Am I correct? Is this lack of ) guaranteed by RFC 3896, an additional constraint in SARIF, or is some kind of matching of ( and ) pairs assumed for SARIF consumers? (and is that guaranteed by RFC 3896?)
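
To make the question concrete, here is a minimal sketch of the two consumer strategies asked about above. Both function names and the example message are hypothetical and not taken from the SARIF spec; the spec does not say which strategy, if either, a consumer should use.

```python
def first_paren_destination(message: str) -> str:
    """Hypothetical consumer: the link destination ends at the first ')'."""
    start = message.index("](") + 2
    return message[start:message.index(")", start)]


def balanced_destination(message: str) -> str:
    """Hypothetical consumer: match '(' and ')' pairs inside the destination
    and stop at the ')' that closes the '(' of the '](' token."""
    start = message.index("](") + 2
    depth = 1
    for i in range(start, len(message)):
        if message[i] == "(":
            depth += 1
        elif message[i] == ")":
            depth -= 1
            if depth == 0:
                return message[start:i]
    raise ValueError("no closing ')' found")


# Assumed example message; the URI contains a balanced pair of parentheses.
message = "See [the docs](https://example.com/a_(b)) for details."
print(first_paren_destination(message))  # https://example.com/a_(b
print(balanced_destination(message))     # https://example.com/a_(b)
```

The first strategy truncates the URI, while the second recovers it only because the parentheses happen to be balanced.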

@davidmalcolm davidmalcolm changed the title Escaping of URIs within "3.11.6 Messages with embedded links" Is any escaping of URIs within "3.11.6 Messages with embedded links" needed? Aug 19, 2024
@sthagen
Contributor

sthagen commented Aug 30, 2024

Interesting question, thank you. I added the "question" and "to be discussed" labels, as I think the TC should consider this question.

@davidmalcolm
Author

"On RFC 3896: I don’t believe lack of ) is guaranteed. e.g., https://en.wikipedia.org/wiki/Parenthesis_(disambiguation)"

@sthagen
Contributor

sthagen commented Nov 14, 2024

Clarification (attempt): When talking about URIs, we will want to compare notes with RFC 3986, "Uniform Resource Identifier (URI): Generic Syntax", and not with the (transposed digits?) RFC 3896, "Definitions of Managed Objects for the DS3/E3 Interface Type".
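
As a quick check against RFC 3986, the sketch below records the `sub-delims` characters listed in section 2.2 of that RFC, which are allowed unescaped in a path segment, and confirms that Python's standard `urllib.parse.urlsplit` leaves parentheses in the path untouched. The Wikipedia URL is the one quoted earlier in this thread; the variable names are only illustrative.

```python
from urllib.parse import urlsplit

# sub-delims as listed in RFC 3986, section 2.2; these may appear
# unescaped in a path segment (pchar includes sub-delims).
SUB_DELIMS = set("!$&'()*+,;=")

url = "https://en.wikipedia.org/wiki/Parenthesis_(disambiguation)"
path = urlsplit(url).path

print(path)                                      # /wiki/Parenthesis_(disambiguation)
print("(" in SUB_DELIMS and ")" in SUB_DELIMS)   # True
```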

@sthagen
Contributor

sthagen commented Nov 14, 2024

Answer (attempt): @davidmalcolm

> My reading of the spec is that no escaping is needed/done on the link destination,

Yes, as the path part of a URI can (and will; cf. Wikipedia links, for example) contain unescaped parentheses.
These can even be unbalanced (imagine a link-shortening service that uses such characters in its base character set without the semantics of "embracing" something else).

Solving the [unbalanced](https://example.org/aFgH)x_) class of problems is a doable task for the markdown parser consuming the text. Markdown is intentionally able to convey enough layout through spacing and minimal extra tokens to be rendered in our brains when scanned through our eyes.
Also, there are many ways in which markdown parsers add magic when amending what could be URIs (guessing, accepting HTML syntax, and, I think, angle brackets, ...).
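
To illustrate why the unbalanced case is left to the consumer, here is a small sketch that lists every destination a parser could pick for the example above. The function name is hypothetical, and stopping at the first whitespace is an assumption of this sketch, not a rule from the spec.

```python
def candidate_destinations(message: str) -> list[str]:
    """List every possible link destination for the first '](' in message,
    one candidate per ')' that could close the link."""
    start = message.index("](") + 2
    candidates = []
    for i in range(start, len(message)):
        ch = message[i]
        if ch.isspace():
            break  # assumption: an unescaped URI contains no whitespace
        if ch == ")":
            candidates.append(message[start:i])
    return candidates


message = "Solving some [unbalanced](https://example.org/aFgH)x_) class of problems"
for destination in candidate_destinations(message):
    print(destination)
# https://example.org/aFgH
# https://example.org/aFgH)x_
```

Both candidates are syntactically plausible URIs, so the text alone does not tell the consumer which one the producer meant.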

> and thus it is implicitly required that URIs do not contain ) so that a consumer can detect where the link destination ends.

No, as the semantics of unescaped parentheses and other "interesting" characters in the path portion of a URI are left as an exercise to the reader (consumer).

Does that make sense?
