-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PSR-7] Scheme delimiters clarification #480
[PSR-7] Scheme delimiters clarification #480
Conversation
Pinging @simensen for review. |
👍 |
* provided includes the "://" delimiter, it MUST be removed. | ||
* provided includes either the closing scheme delimiter (":") or the | ||
* combination of the closing scheme delimiter and the opening authority | ||
* delimiter ("://"), these strings should be removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is should
written lowercase here but in other places in capital letters, like 2 lines below?
I really wonder why withScheme and getScheme enforce handling of these separators. RFC 3986 does explicitly never include these separators in the definition of the scheme, e.g. |
If this trimming is added, why not also add trimming for whitespace and so forth? This shows that this is arbitrary IMO. |
Also an URI must not have an authority part. So |
@Tobion Thanks for the feedback; I'll wait for others before incorporating any changes. I like the idea of just not mentioning the delimiters at all; it would simplify it greatly and remove ambiguity. |
@weierophinney do you know why this was incorparated in the first place? Never seen this handling before in any http library. |
What I would instead add, is that
This would actually be quite important for the usability. As a user you can simply do |
If this is still the direction we are heading, I'm 👍 on removing the discussion of the delimiters altogether and possibly moving that to the metadoc. |
@weierophinney The verbiage "If no scheme is present, this method MUST return an empty string." seems to conflict with section 3: "The scheme and path components are required, though the path may be empty (no characters)" Should a URI always have a scheme? I have an implementation that I was working on some time ago. I would be glad to share it if you are interested. In my case, I followed the relative resolution algorithm from the RFC to make a static constructor. Once the Uri instance is created, it had to have a scheme. I agree that value objects like these should be immutable as well. Thanks for all of your hard work :) Something like this:
The "strict" argument in my class was a flag for this part of the algorithm in 5.2.2:
|
Should also be removed IMO as a path is required but can be empty. So this sentence makes no sense to me. |
@weierophinney I've review the changes and @Tobion seems to highlights the same problems that I had with the previous text. My original take was to allow alert(window.location.protocol); //will returns 'http:' or 'https:' So allowing the latter representation may ease working with output from other languages. On the other hand, the This way we can ensure that:
|
Since UriInterface is made to be failure tolerant, I agree that withScheme should only remove a trailing
Edit: I just realize withPath without start slash is optional (MAY) which is kinda inconsistent. Either we do enforce failure tolerance everywhere or nowhere. |
@Tobion maybe adding
|
@nyamsprod yes why not. In theory |
A [discussion on the mailing list](https://groups.google.com/d/msgid/php-fig/db3d5421-30a5-4135-9f9e-fe432f03d719%40googlegroups.com?utm_medium=email&utm_source=footer) raised the point that `:` is the actual scheme delimiter. In reviewing [RFC 3986 section 3](https://tools.ietf.org/html/rfc3986#section-3), I noted that `:` is the end delimiter for a scheme, and `//` is the opening delimiter for the authority. Since the `UriInterface` does not allow setting the authority in a single go, the only time the `//` delimiter will likely be used is as part of the scheme, making the current wording valid. However, the wording should also note that a trailing `:` delimiter MUST be removed as well. Created an "Errata" section that adds this; we can vote on it post-acceptance.
Instead of errata, we'll now address this directly in the specification.
d1fea64
to
aec885a
Compare
- Addressed feedback from @Tobion about specifying that `getScheme()` MUST normalize to lowercase. - Removed information about `//` authority delimiter altogether. Kept information about `:` delimiter per feedback from @nyamsprod. - Removed details abou scheme delimiter and case from `withScheme()`, and kept it specifically in only `getScheme()`.
@Tobion and @nyamsprod — I've incorporated your feedback at this time; please review! :) |
👍 |
* an instance that contains the specified scheme. If the scheme | ||
* provided includes the "://" delimiter, it MUST be removed. | ||
* This method MUST retain the state of the current instance, and return a | ||
* new instance that contains the specified scheme. | ||
* | ||
* Implementations SHOULD restrict values to "http", "https", or an empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this restriction is pointless. If it's about validation then it's useless since one cannot rely on it as it may accept other schemes as well.
So I would reverse the logic: "Implementations MUST support schemes "http", "https" and an empty string but MAY accept other schemes if required."
- MUST support http and https and empty schemes. Credit to @Tobion
@Tobion : Updated. |
* an instance that contains the specified scheme. If the scheme | ||
* provided includes the "://" delimiter, it MUST be removed. | ||
* This method MUST retain the state of the current instance, and return a | ||
* new instance that contains the specified scheme. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this wrongly reverts stuff: new vs an
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh - bad rebase. I'll get it fixed.
- `s/a new instance/an instance/`
@Tobion updated; please review. |
@@ -1161,10 +1161,14 @@ interface UriInterface | |||
* Implementations SHOULD restrict values to "http", "https", or an empty | |||
* string but MAY accommodate other schemes if required. | |||
* | |||
* If no scheme is present, this method MUST return an empty string. | |||
* The string MUST omit the closing scheme delimiter (":"), if present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be moved to withScheme to be consistent with withQuery and withFragment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question, @Tobion: should I move it there, or remove that verbiage from withQuery()/withFragment() (as it's now in the getQuery()/getFragment() methods)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Tobion I chose to remove the verbiage from each of withQuery()
and withFragment()
, as normalization is already properly detailed in the get*()
methods for each component.
- Per comment from @Tobion, removing the verbiage for trimming the leading delimiter from each of the `withQuery()` and `withFragment()` methods, as the relevant location for detailing normalization is in the `get*()` methods (and the verbiage already exists there).
- for consistency with RFC 3986
The more I think about this the more I'm sure we should NOT add this auto-removal of delimiters. But when you add the auto-trimming there are 4 different possible outcomes which will result in different parameters:
So IMO such magic does not belong in a standard. And if this trimming is about helping users minimize errors, then it's a different layer where such thing can be handled. For example asking if one really intends to use a question mark in the query string, e.g. in a route. Or logging a warning or whatever. |
@Tobion Respectfully, I disagree on this point. The point is to ensure that you can construct the full URI and know that the delimiters are not present when you fetch each component. In other words, the data present is the component data only, no delimiters. What we're specifically suggesting trimming here are the delimiters as specified for the given component, in the given position only. This is, in fact, precisely what $url = 'https://mwop.net:9993/path?foo=bar#fragment';
var_export(parse_url($url));
/*
Results in:
array (
'scheme' => 'https',
'host' => 'mwop.net',
'port' => 9993,
'path' => '/path',
'query' => 'foo=bar',
'fragment' => 'fragment',
)
*/ I'm leaving this verbiage in, with the changes as noted in this PR. |
If it's just about saying that the query does not include the |
I guess you misunderstood me. Otherwise this whole discussion so far doesn't make sense. |
@Tobion Then please clarify your position. The thread started with @nyamsprod 's suggestion that we strip a trailing Additionally, the discussion of normalization was moved only to the I've continued to make changes as you find new issues. However, honestly, it feels like any change I make, you're specifically trying to find a reason why it's wrong or how I misinterpreted your intent. As such: please clarify EXACTLY what you would like to see. It may be best to do this in a new pull request, so that there can be no room for me misinterpreting your intent. I already have a 👍 from @nyamsprod , and, considering this was raised on their behalf, I consider it ready. Give me a PR with the changes you would like to see so we can compare. |
I'll provide a PR. |
Closing in favor of #503. |
A discussion on the mailing list raised the point that
:
is the actual scheme delimiter. In reviewing RFC 3986 section 3, I noted that:
is the end delimiter for a scheme, and//
is the opening delimiter for the authority. Since theUriInterface
does not allow setting the authority in a single go, the only time the//
delimiter will likely be used is as part of the scheme, making the current wording valid. However, the wording should also note that a trailing:
delimiter MUST be removed as well.This patch updates the
UriInterface
'swithScheme()
andgetScheme()
methods to clarify that either the:
or://
, when present, MUST be removed.