Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add examples to ParseOptions #308

Closed
brson opened this issue May 4, 2017 · 14 comments
Closed

Add examples to ParseOptions #308

brson opened this issue May 4, 2017 · 14 comments

Comments

@brson
Copy link
Contributor

brson commented May 4, 2017

And an explanation of why you would use it. Seems like it's mostly for hooking into parse warnings with log_syntax_violation. Not sure what base_url would be used for - some performance optimization? Ask on #servo.

Might want to do this as the same time as #301

@SimonSapin
Copy link
Member

log_syntax_violation is something that is implemented because it exists in the spec but that, as far as I know, nobody has every used. I could in theory be used in an URL "validator" / conformance checker.

A base URL has nothing to do with performance. It is required to resolve relative URL references like ../foo.png. Parsing an input string like this without specifying a base URL returns an error.

ParseOptions exists to enable a builder pattern, which is a way to leave the possibility of adding more options/parameters to URL parsing without breaking the API. This in turn is a work-around for the lack of named an defaulted parameters in the language.

Since base_url is the only option really in use today and it can be more conveniently specified with the join method (some_base.join(some_string) is the same as Url::options().base_url(&some_base).parse(some_string)), I expect ParseOptions to be used very rarely.

@marti1125
Copy link

Hello @brson I would like to contribute with this =D

@dtolnay
Copy link
Contributor

dtolnay commented May 12, 2017

Thanks @marti1125! Let us know here or in the #servo IRC channel if you need any clarification.

@brson
Copy link
Contributor Author

brson commented May 13, 2017

@marti1125 Awesome! I saw you pinged my on IRC today. Sorry I missed you. IRC is the best way to get ahold of me usually, though note I'll be on vacation next week. Please work with @dtolnay.

@marti1125
Copy link

ok =D @dtolnay what is you nick at irc ?

@dtolnay
Copy link
Contributor

dtolnay commented May 13, 2017

dtolnay

@marti1125
Copy link

at #servo ? I don't find you =(

@brson
Copy link
Contributor Author

brson commented Jun 3, 2017

@marti1125 no you can find @dtolnay on #rust-libs though.

@marti1125
Copy link

@SimonSapin Can we talk at irc for fix this issue ? my nick is marti_

@dekellum
Copy link
Contributor

dekellum commented Jan 16, 2018

Hi, I'm (mostly) successfully using log_syntax_violation in some crawling-related code of my own, so I can offer a (perhaps unexpected) usage example and some usage feedback which I hope you will consider.

I'm normalizing/validating URLs beyond what is strictly required by the spec, and in some cases this means wanting to promote a syntax violation warning to an Error result for filtering purposes. To work within the Fn closure limitations, I've had to use a Cell as shown below, and I'm only grabbing the last violation found. Not shown, but to grab all the violations, I'm flurting with an unsafe push into a Vec, which is awkward at best using a RefCell<Vec<String>> instead.

    /// Parse raw_url with optional base using parse options to get the
    /// last syntax violation, if any.
    fn parse_v(base: Option<&Url>, raw_url: &str)
        -> Result<(Url,Option<String>)>
    {
        let violation: Cell<Option<String>> = Cell::new(None);
        let url = {
            let vfn = & |s: &str| {
                violation.set(Some(s.to_owned()));
            };
            let options = Url::options().
                log_syntax_violation(Some(vfn)).
                base_url(base);
            options.parse(raw_url)
        }?;
        Ok((url,violation.into_inner()))
    }

So I could extract and PR doc example from this if desired? Usage feedback:

  • This could be more usable for a validator as suggested, as well as help my use, if the violations were simply enumerated, not strings, and as a bonus, included an offset into the parsed original raw url str.

  • Consider dropping the closure interface for an alternative parse like:

fn parse_with_violations(input: &str) -> Result<(Url,Vec<Violation>), ParseError>

Its all related, but let me know if any of this would be better in its own issue and I can create it. Thanks for your efforts!

@SimonSapin
Copy link
Member

@dekellum You might be the first to actually use log_syntax_vidolation. In retrospect, it’s obvious that Fn rather than FnMut is rather inflexible for users. (I just tried changing it to &mut FnMut but besides being a breaking change I quickly hit many borrowck issue in the parser implementation.) As you noted, the best is probably to use RefCell.

A dedicated method goes against the idea of the ParseOptions struct with its builder pattern: log and encoding and base are multiple options available, and you might want to use any combination of them. But we don’t want a combinatorial explosion of the number of method by adding a specific method for each combination.

@dekellum
Copy link
Contributor

Thanks @SimonSapin. I understand the builder pattern. I was only looking for a way around the awkward Fn usage for this one case. Thanks also for pointing out the encoding_override option. That's not in the rustdoc, presumably because its feature gated? Can that feature possibly be turned on for rustdoc?

Could you/someone offer hope that a PR would be considered, if I attempted to improve the general form and flexibility of this syntax violation feature? Not much risk of breakage, if I'm the only one using it? Right? :)

bors-servo pushed a commit that referenced this issue Jan 21, 2018
Enable rustdoc for feature query_encoding on docs.rs

So that ParseOptions::encoding_override shows up on docs.rs  This relates to and was requested by me in #308.  I believe this is how to implement.

See onur/docs.rs@62fdcf605 for the docs.rs support of this config.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-url/427)
<!-- Reviewable:end -->
bors-servo pushed a commit that referenced this issue Feb 16, 2018
Improve syntax violations interface and preserve Copy

This is an alternative implementation of #430 which avoids introducing an `Rc<Fn>` and preserves `Copy` for `ParseOptions`.   Thus, based on my current understanding with the prior review, it is completely backward compatible and with no breaking changes.  Its a little less elegant in that, without the `Rc<Fn>`, I can not simple wrap the deprecated log_syntax_violation Fn with an adapting closure.  But by moving the two possible function references into a new enum `ViolationFn` it is reasonably well contained.  The interface for users is effectively the same, so I think the backward compatibility advantage makes this preferable to #430.  Updated summary:

### Summary of changes

This makes programmatic use of syntax violations more practical while further centralizing this concern in the parser, which should be a maintenance win.

* New `SyntaxViolation` enum with descriptions the same as the prior violation static strings.  This enables programmatic use of these violations by variant, e.g. ignoring some, warning on others, providing translated descriptions, etc. Enum implementation uses the same macro expansion pattern as was used for `ParseError`.

* New `syntax_violation_callback` signature in `ParseOptions` and `Parser`.

* Deprecated `log_syntax_violation` and re-implemented, passing the same violation descriptions for backward compatibility.

* Unit tests for old `log_syntax_violation` compatibility, and new `syntax_violation_callback`

* Includes rustdoc for old and new interfaces including tested example usage of new callback, as requested in #308.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-url/433)
<!-- Reviewable:end -->
@dekellum
Copy link
Contributor

The docs for ParseOptions have generally been improved, incl. for the query_encoding optional feature (#427) and for the syntax violation callback, with example usage (#433). Unless there is something actionable remaining here that I'm missing, may I suggest this be closed?

@SimonSapin
Copy link
Member

Closing as fixed, thanks @dekellum.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants