Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate URL encoding/decoding #8

Closed
CathalMullan opened this issue Aug 2, 2024 · 0 comments · Fixed by #110
Closed

Investigate URL encoding/decoding #8

CathalMullan opened this issue Aug 2, 2024 · 0 comments · Fixed by #110

Comments

@CathalMullan
Copy link
Contributor

CathalMullan commented Aug 2, 2024

See: tokio-rs/axum#2678
See: tokio-rs/axum#2729
See: https://github.com/actix/actix-web/blob/master/actix-router/src/quoter.rs
See: dotnet/aspnetcore#11544

Insert

How should we handle encoding during inserts?

e.g.

  • router.insert("/😊");
  • router.insert("/%F0%9F%98%8A");

The above 2 routes should be considered identical, and therefore conflict.

Do we want to check if the inserted path is already encoded (even if only partially), and error? Can we know for sure a path has been encoded?

Do we only want to care about the inserted path when we have percent-encoding enabled for our router, or should we always care?

If we always care (and force non-encoded inserts), we could provide 2 functions for matching? Downside is that we couldn't support hand-encoded routes (would that ever be needed?).

Do we want to normalize the path, and just continue the insert process?

Our options are:

  1. do nothing
  2. enforce non-encoded paths
  3. normalize path

Going to go with 2 for now.

Handling Encoded '/'

Naive approach of decoding the entire path may result in unexpected behaviour when dealing with encoded '/' characters.

Literal '/' is meant to be a seperator.
Encoded '/' is not.

URL Spec vs RFC 3986

Neither of the two options feel more correct necessarily. And I don't want to force the user into one or the other.

Instead of having a bool to enable/disable percent-encoding, maybe we pass an enum that allows choosing which 'style' on encoding?

As mentioned on the README, allowing people to easily port APIs from other languages is a explicit goal of this library, even if it results in extra complexity.

Can always set a 'sane' default.

Do we even need to handle encoding at all?

PHP

JS

Python

Decoding Params

Return both raw and decoded?

@CathalMullan CathalMullan linked a pull request Aug 15, 2024 that will close this issue
@CathalMullan CathalMullan pinned this issue Aug 16, 2024
@CathalMullan CathalMullan unpinned this issue Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant