Treat URL! as normal strings with no encoding behavior #655

Merged 1 commit on Nov 20, 2017
hostilefork
Member

Rebol2, R3-Alpha, and Red each attempted some amount of decoding
(e.g. treating %20 as a space in http:// URL!s). This change instead
leaves URLs "as-is".

This serves the goal that a URL may be copied from a web browser bar,
printed/molded out, and pasted back round-trip. It also means that the
URL may be used with custom schemes (odbc://...) that have different
ideas of the meaning of characters like %.

The current concept is that URL!s typically represent the decoded
forms, and thus express Unicode codepoints directly, with either of
these preserved exactly as written:

 https://duckduckgo.com/?q=hergé+&+tintin
 https://duckduckgo.com/?q=hergé+%26+tintin

The encoded forms, with UTF-8 bytes written as %XX, would then be
expressed as STRING!, so that the datatype signals the encodedness:

 {https://duckduckgo.com/?q=herg%C3%A9+%26+tintin}

(This is similar to how local file paths, where e.g. slashes become
backslashes on Windows, are expressed as STRING! rather than FILE!.)
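
As a rough illustration of the decoded/encoded distinction (not the code in this PR, and not Ren-C's API), here is a Python sketch using the standard library's urllib.parse. The variable names and the choice of "safe" characters are assumptions made for the example:

```python
from urllib.parse import quote, unquote

# "Decoded" form as it would be held in a URL!: the Unicode codepoint is
# written directly, and the existing %26 is left exactly as typed.
decoded = "https://duckduckgo.com/?q=hergé+%26+tintin"

# Produce the "encoded" form: only non-ASCII characters become UTF-8 %XX
# bytes. The safe list (an assumption for this example) keeps URL
# delimiters and any existing '%' untouched.
encoded = quote(decoded, safe=":/?=+&%")
print(encoded)

# Blindly decoding every %XX escape loses information: %26 turns into a
# bare '&', which a server would read as a query-parameter separator.
eagerly_decoded = unquote(encoded)
print(eagerly_decoded)
```

The last step shows why eager decoding does not round-trip: once %26 has become &, the two URL forms listed earlier can no longer be told apart.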
