Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

percent-encoding: Allow to remove character from AsciiSet #528

Merged
merged 4 commits into from
Aug 5, 2019

Conversation

DoumanAsh
Copy link
Contributor

@DoumanAsh DoumanAsh commented Jul 24, 2019

Allow to remove character from AsciiSet

I liked trait approach better, but if you're going with something like AsciiSet it would be better to allow to remove bytes, as sometimes it is easier to take NON_ALPHANUMERIC and remove few necessary characters


This change is Reviewable

@DoumanAsh DoumanAsh changed the title oercent-encoding: Allow to remove character from AsciiSet percent-encoding: Allow to remove character from AsciiSet Jul 24, 2019
@SimonSapin
Copy link
Member

remove looks good to me. What’s the motivation for exposing contains? If you’d like this on crates.io, please also bump the version number to 2.1.0.

@DoumanAsh
Copy link
Contributor Author

DoumanAsh commented Jul 24, 2019

@SimonSapin for contains I have use case when I want to determine if I'm going to use percent_encoding or not at all at runtime.
And previous major version had such ability, so I think it would be convenient to leave it out for user to use too.

Yes, I'll bump version too

UPD: To clarify, my use case is determining whether I should use filename or filename* in Content-Disposition

@SimonSapin
Copy link
Member

Sorry, I still have a hard time seeing how calling contains is useful. Are you writing functions or methods that take a &AsciiSet parameter as input?

What does filename* refer to?

@DoumanAsh
Copy link
Contributor Author

filename referes to https://tools.ietf.org/html/rfc6266#section-4.3

If you're planning to encode file name in content-disposition header you should use different attribute instead of regular filename and provide encoding information.
I only want to percent encode if file name doesn't contain valid set of symbols for path and/or outside of ASCII.
So originally I always checked ahead if I'm planning to encode file name https://github.com/DoumanAsh/yukikaze/blob/1.0/src/header/content_disposition.rs#L38 like that for 1.0 version

@SimonSapin
Copy link
Member

Simply making contains public and using it in the code above would be incorrect as you’d also need to look for non-ASCII bytes.

It looks like what you’re trying to do is find out whether percent_encode would make any change to a given input string. To do that you could also use std::borrow::Cow::<str>::from(percent_encode(…)). If you get a Cow::Borrowed, the output is the same as the input and none of the bytes are encoded.

@SimonSapin
Copy link
Member

Oh sorry that’s not accurate. That conversion can also return Cow::Borrowed if the input is a single byte that needs to be encoded. (In that case it borrows from a static string.)

Would branching on output == input work for you?

@DoumanAsh
Copy link
Contributor Author

@SimonSapin Yes, I'd need to check both for non-ASCII and for AsciiSet::contains.

I guess if comparing output with input would be the only choice I'd go for it too, but it would require for me to percent ahead instead of doing it later.
Which may be ok in this case, but I'm not sure about other cases.

Nevertheless, if you feel like you don't want to expose this method, I can remove it from PR.

@DoumanAsh
Copy link
Contributor Author

@SimonSapin ping

@benesch
Copy link
Contributor

benesch commented Aug 1, 2019

This would simplify the AttrCharEncodeSet in seanmonstar/reqwest#583 quite a bit, so I'm greatly in support of this landing as well.

@SimonSapin
Copy link
Member

Looks good, thanks!

@bors-servo r+

@bors-servo
Copy link
Contributor

📌 Commit ed7b991 has been approved by SimonSapin

@bors-servo
Copy link
Contributor

⌛ Testing commit ed7b991 with merge 0e874dd...

bors-servo pushed a commit that referenced this pull request Aug 5, 2019
percent-encoding: Allow to remove character from AsciiSet

Allow to remove character from AsciiSet

I liked trait approach better, but if you're going with something like AsciiSet it would be better to allow to remove bytes, as sometimes it is easier to take `NON_ALPHANUMERIC` and remove few necessary characters

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-url/528)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

☀️ Test successful - checks-travis
Approved by: SimonSapin
Pushing 0e874dd to master...

@bors-servo bors-servo merged commit ed7b991 into servo:master Aug 5, 2019
@SimonSapin
Copy link
Member

@DoumanAsh DoumanAsh deleted the ascii_set_remove branch August 5, 2019 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants