Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

leave more characters unescaped for segments #75

Closed
wants to merge 1 commit into from

Conversation

Guria
Copy link
Contributor

@Guria Guria commented Apr 19, 2016

I am using path-to-regexp in url-mapper which allows to map almost any json compatible object to url with path and query. The key feature is preserving type of values, so assert.equal(parse(stringify(object)), object).
Routes defined with path-to-regexp format, keys of object defined in route stringified with path-to-regexp, rest of keys goes to query part with help of urlon.

Tonic demo.

url-mapper allows map only simple values (strings, numbers and booleans) to path parameters. To allow preserving type for numbers and booleans they are prepended with colon character. Here is where issue starts: path-to-regexp encodes segments with encodeURIComponent function and converts colons to '%3A', which makes url to look more cryptic in path part.

Tonic demo

This PR tries to address this problem by treat ;,:@&=+$-_.!~*() characters as safe to be part of segment. According to MDN:

To avoid unexpected requests to the server, you should call encodeURIComponent on any user-entered parameters that will be passed as part of a URI.

It means there is no need to use encodeURIComponent to encode segment part, because it's too strict. I have introduced extended version on encodeURI with escaping /?#'" characters and added some tests as well.

I am suppose, that it should be treated as major change because it potentially could break matching and expectations.

treat `;,:@&=+$-_.!~*()` characters as safe to be part of segment
@blakeembrey
Copy link
Member

blakeembrey commented Apr 20, 2016

@Guria So what's the core issue here? It was a lot of text, but the expectation is that escaping is looser (to make it prettier with the colon)? I have no strong opinions, but I'd like to get another set of eyes on it first.

@Guria
Copy link
Contributor Author

Guria commented Apr 21, 2016

@Guria
Copy link
Contributor Author

Guria commented Apr 21, 2016

I'd like to get another set of eyes on it first.

Of course, no hurry here. We have a recommendation on how to deal with it to avoid %3A to appear in url. It is nice to have feature, so I don't mind that it will take some time to prove it harmless.

Also I was think about keep current escaping behaviour and introduce less strict as option. That's way it could be introduced without any harm to existing users.

@maxfrigge
Copy link

I personally think that readability in urls is a good thing - even if it just creates joy amongst developers :)

Don't really see why it should break matching though?

@blakeembrey
Copy link
Member

@maxfrigge Any change to a public facing API is a breaking change, someone may be relying on (however unlikely) the fact that : becomes %3A currently. An option allows this to be a minor opt-change, however.

@Guria You could also manage stringify yourself, for now - it's pretty simple to do and the current code doing it was very trivial to implement - just loop over the parsed string and concat together throwing errors on invalid data.

@blakeembrey
Copy link
Member

@dougwilson Any thoughts from you here?

I'm not particularly keen on merging, it feels like a simple application left to user-land for now. That said, it is a very trivial fix and the only possible bugs I can think it may introduce on users is if they are parsing generated URLs (it'd be possible to confuse the URL parser if it was previously a relative URL segment at the beginning by putting a colon in the segment).

@Guria
Copy link
Contributor Author

Guria commented Apr 27, 2016

Ohh, thanks for pointing that out. Seems I really can use own tokensToFunction implementation. Though it will be almost completely copypasted.

So no we have several options here:

  • accept fix as is to be available for every user. may require major release bump.
  • rewrite fix to be optional: pathToRegexp.compile(route, { softEscaping: true })
  • leave path-to-regexp as is and patched stringify function factory on my side: softTokensToFunction(parse(route))

I can start from 3rd option now, but will leave a pr open in case you will decide that this fix might be useful for other users too.

@dougwilson
Copy link

@blakeembrey so I have seen various people (online and in person) who do not like the encodeURIComponent behavior, simply because it's less aesthetically pleasing. Of course, the proposed reduction in this PR is technically valid (I believe) because when considered in the full URL context, those entities are only suggested to be encoded. Encoding them helps prevent ambiguity when you only have a fragment of the URL (which you pointed out), at the expensive of less aesthetics.

That said, I would suggest keeping the current safe behavior, but do not see a reason why adding an option for this new behavior would be bad (since it's optional, and more importantly, opt-in). I would leave the ultimate decision to you, but that is my opinion :)

blakeembrey pushed a commit that referenced this pull request May 8, 2016
treat `;,:@&=+$-_.!~*()` characters as safe to be part of segment
@blakeembrey
Copy link
Member

blakeembrey commented May 8, 2016

Merged as an option. Use { pretty: true } to get prettier URL outputs. Thanks @Guria! 😄

@blakeembrey blakeembrey closed this May 8, 2016
@Guria Guria deleted the escape-segment branch August 5, 2016 15:46
Guria added a commit to cerebral/url-mapper that referenced this pull request Nov 21, 2017
URLON updated to 3.0.1, which introduces improved readability by loose escaping,
better safety by fixing undefined and compatibility with utm tracking links.
See https://github.com/cerebral/urlon/releases/tag/3.0.0 for details.

path-to-regex update allows avoid unnecessary of colons used to preserve types.
See pillarjs/path-to-regexp#75 for details.

`null` values is now supprted in path parts.

BREAKING CHANGE:
Stringify format changed. Parsing values generated by previous version isn't possible anymore.

fixes #27, fixes #28
Guria added a commit to cerebral/url-mapper that referenced this pull request Nov 21, 2017
URLON updated to 3.0.1, which introduces improved readability by loose escaping,
better safety by fixing undefined and compatibility with utm tracking links.
See https://github.com/cerebral/urlon/releases/tag/3.0.0 for details.

path-to-regex update allows avoid unnecessary of colons used to preserve types.
See pillarjs/path-to-regexp#75 for details.

`null` values is now supprted in path parts.

BREAKING CHANGE:
Stringify format changed. Parsing values generated by previous version isn't possible anymore.

fixes #27, fixes #28
MUDev1994 pushed a commit to MUDev1994/path-to-regexp that referenced this pull request May 26, 2023
treat `;,:@&=+$-_.!~*()` characters as safe to be part of segment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants