Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include data to help match partial CSS values using a feature #3366

Closed
antross opened this issue Feb 1, 2019 · 20 comments
Closed

Include data to help match partial CSS values using a feature #3366

antross opened this issue Feb 1, 2019 · 20 comments
Labels
data:css 🎨 Compat data for CSS features. https://developer.mozilla.org/docs/Web/CSS schema ⚙️ Isses or pull requests regarding the JSON schema files used in this project.

Comments

@antross
Copy link
Contributor

antross commented Feb 1, 2019

In webhint we've recently started integrating BCD to automatically warn users about potential cross-browser support issues.

One place we currently have a blind spot is on support for features which can make up part of a CSS value (e.g. 3D transforms). We know other efforts have previously maintained their own lists to work around this, but we'd like to try doing this in a way that other tools can benefit from as well.

As such I'm proposing adding a matches annotation (or similar) containing a regular expression that can be run against a CSS value to identify if that value uses a given feature. For 3D transforms this might look something like:

{
  "css": {
    "properties": {
      "transform": {
        ...
        "3d": {
          "__compat": {
            "description": "3D support",
            "matches": "\\b(matrix3d|perspective|rotate3d|rotateZ|scale3d|scaleZ|translate3d|translateZ)\\b",
            ...

Alternatively (or additionally), perhaps even just a list of keywords to look for in a value would be sufficient (depending on what other CSS properties might need):

    "keywords": [
        "matrix3d",
        "perspective",
        ...
    ]
@Elchi3 Elchi3 added schema ⚙️ Isses or pull requests regarding the JSON schema files used in this project. data:css 🎨 Compat data for CSS features. https://developer.mozilla.org/docs/Web/CSS labels Feb 4, 2019
@Elchi3
Copy link
Member

Elchi3 commented Feb 4, 2019

I think this would be a great addition and would maybe help us make more features machine-readable / machine-friendly.

@eduardoboucas tried to work around this by introducing "transforms" https://github.com/eduardoboucas/compat-report#transforms. See also eduardoboucas/compat-report#4 for CSS sub-features that have this problem.

Looks like webhint.io runs into a similar problem now, so we should fix this in our data schema so that hopefully data consumers like these two can work with a solution here.

@eduardoboucas do you have thoughts on the "matches" proposal above?

@eduardoboucas
Copy link

Hi! The proposal seems quite similar to the concept I put together at the time – which, sadly, I haven't had a chance to work on since. The main difference appears to be the format. I was going with individual JS files at the time, because I was worried that a regular expression just wouldn't cut it for certain edge cases, but maybe that's not the case.

If we're talking about adding this information to BCD instead of keeping it somewhere else, then a regular expression probably is the best approach, as it keeps things simple and language-agnostic.

At the time, I made a list of all the sub-features that would require some form of detection. Perhaps it's worth going through it and seeing if a regular expression would do the job for most cases?

TL;DR – Looks great! 👍

@jpmedley
Copy link
Contributor

jpmedley commented Feb 4, 2019 via email

@antross
Copy link
Contributor Author

antross commented Feb 4, 2019

@eduardoboucas Thanks for sharing the list!

Looking through I do see that a regular expression against full values would have certain edge cases for situations where a user-defined string can be included (and CSS variables likely extend this issue to most properties).

That said, I'd expect to continue to use something like postcss-value-parser as you do in compat-report and look at specific sub-values contained there as opposed to the entire value string (in which case the keywords part of my proposal is probably a better fit for most properties except things like alpha hex values).

@jpmedley I agree including tests for regular expressions would be a good idea, though if we preferred using the simpler list of keywords syntax and only used matches where absolutely necessary (e.g. alpha hex values) we'd probably only need tests for the latter.

I should note some items would still be difficult to cover (e.g. those which depend on the context in which they are applied such as transform-origin in SVG), but I think increasing the amount of coverage would be worth it even if some cases are still left out.

TL;DR I'd like to use keywords for simpler cases with the expectation that machine consumers are tokenizing values to match keywords, and use matches only when needed for cases like alpha hex values in background-color or the three-value-syntax for transform-origin.

@ddbeck
Copy link
Collaborator

ddbeck commented Feb 11, 2019

Hi @antross, @Elchi3 and I discussed this last week and we think this looks a promising addition to BCD. He asked me to take steps forward on this.

While I work on a more formal proposal for changing the schema, I wanted to ask: can you share any more example regexes or keyword lists that you're using with webhint? I'm guessing this is in the webhint source somewhere, I'm just not sure where to look.

In the meantime, I'm going to take another look at @eduardoboucas's prior art and start making a plan to bring this to BCD. Thanks everybody!

@antross
Copy link
Contributor Author

antross commented Feb 12, 2019

I wanted to ask: can you share any more example regexes or keyword lists that you're using with webhint?

Currently webhint only analyses property names and ignores values so it doesn't include any regexes or keyword lists for this scenario. My hope was to help get the data exposed in BCD before we had to start maintaining our own lists on the side.

@ddbeck
Copy link
Collaborator

ddbeck commented Feb 18, 2019

Proposal: Add feature matching hints to the schema with the matches property

OK, I've been mulling this over for a couple days (it turns out writing something down really brings some issues to light) and here's what I've come up with. I've leaned heavily on your proposal, @antross, but thinking of BCD as a whole, I've given it a bit of a twist. I'd love to have feedback (especially from @antross and @Elchi3)!

Schema changes

I propose we change the schema to allow simple support statements to have an optional property, matches, that defines an object containing properties containing an array of keywords or a regular expression. In JSON, it’d look like this:

Keywords

"description": "3D support",
"matches": {
  "keywords": [
    "matrix3d",
    "perspective",
    "and-so-on"
  ]
},

Regex

"description": "3D support",
"matches": {
  "regex": "\\b(matrix3d|perspective|and-so-on)\\b"
},

Formally, simple support statements may have a property named matches. If it exists, its value must be an object. The matches object must have one property, either keywords or regex. The value of keywords must be an array of one or more strings (and those strings shouldn’t be regular expressions). The value of regex must be a string containing a regular expression.

At least initially, I’d like to restrict the matches property to the css.properties.* hierarchy, until we’ve had some experience creating and reviewing this data.

Semantics

The meaning of the matches property should be understood (and documented) to be a hint about what source might correspond to the feature. Source matched against the information in the matches object should be considered best effort (i.e., we’re not making any guarantees, about matching or not matching the feature).

More specifically, we should define keywords or regexes that make sense at the narrowest scope for a given feature. For CSS, the matching data for a feature that corresponds to some possible values of a property should match against the list of tokens in a list of values, not a whole declaration (e.g., the regex above, not something like transform: (matrix3d|perspective|and-so-on)).

Finally, matches shouldn't be a substitute for just looking up the name of the feature itself. For example, it feels pretty pointless to do something like this:

{
  "css": {
    "properties": {
      "background-image": {
        "__compat": {
          "matches": {
            "keywords": ["background-image"]
          }

Let's not do that.

Other considerations

Testing

To sort of avoid the problem, we should probably avoid regexes where possible. In the event we must add a regex, I think we should require tests. Those tests should live in the /test/ directory. I think this requires less planning and may be easier to deduplicate (e.g., you only need one set of tests for the regex for #RRGGBBAA color values, but that regex will appear in multiple features).

Alternative: Use matches (or regex) and keywords as top-level simple support statement properties

This was tempting, as it's a flatter hierarchy, but I figured adding multiple top-level options to simple support statements would make it harder to find out if a feature has any matching data (i.e., you have to check for the existence of two properties). Also, if we wanted to add some additional way of matching, we'd have to add yet another top-level property.

Alternative: Use string or array types for the matches value

This would, again, be flatter, though at the expense of being slightly less self-descriptive. Also I think it might be slightly error prone for reviewing (imagine a single regex accidentally put inside an array).

Additionally, specifying a type (with the key in matches) gives us the opportunity to introduce additional ways of matching source to BCD features. For example, we could add hints for JavaScript BCD features with ESTree interface names or ESLint selectors without overloading strings or arrays with additional meaning (I’m not saying we should necessarily do those things but it seems easy to not immediately foreclose on the option).

@antross
Copy link
Contributor Author

antross commented Feb 19, 2019

I like it.

Do you think we might need two different types of regexes: one that's intended to match a full value (e.g. three-value syntax for transform-origin) and one that's intended to just match a token within a value (e.g. #RRGGBBAA)?

The latter could still pair with something like postcss-value-parser and would give tools more flexibility to avoid false positives when custom strings like CSS variable names may also be present in the full value.

@ddbeck
Copy link
Collaborator

ddbeck commented Feb 20, 2019

Hmm, that's a good point. And I like the idea of being specific about the scope of the regex. Then we'd have three options for matches objects: the two regexes and keywords.

I suppose the hard part is naming them. 😆 What do you think of regex_value and regex_token?

@antross
Copy link
Contributor Author

antross commented Feb 20, 2019

What do you think of regex_value and regex_token?

Sounds good to me. 👍

@jpmedley
Copy link
Contributor

jpmedley commented Feb 20, 2019 via email

@ddbeck
Copy link
Collaborator

ddbeck commented Feb 21, 2019

@jpmedley, I'd probably write up the currently discussed allowed properties of the matches object something like this:

matches

A matches object may have one of the following properties (in order of preference):

  • keywords: an array of literal strings that correspond to the feature.

    In CSS, these may be, for example, a list of values for a property (matrix3d, perspective).

  • regex_to_be_named_1: a string containing a regular expression that matches a single token (i.e., text delimited by characters that are excluded from the text to be matched) corresponding to the feature.

    For example, in CSS, these may be regular expressions that match component value types (1px, #00ff00, 2fr).

  • regex_to_be_named_2: a string containing a regular expression that matches a value corresponding to the feature.

    For example, in CSS, these may be regular expressions that match declaration values (1px solid #00ff00) or media query expressions (max-width: 1000px).


Having written this out, I can see a few more possibilities for naming these:

  • (as before) keywords, regex_token, regex_value
  • keywords, regex_token, regex (showing increasing generality)
  • keywords, regex_word, regex_value (borrowing from postcss-value-parser)

Or we could go really narrow and name for the specific scope: component_value_keywords, component_value_regex, value_regex, media_query_keywords, media_query_expression_regex, etc. I don't love this idea—it comes at the expense of having a lot more options for the matches object and partially reiterating the hierarchy of the data itself—but I thought I'd mention it for completeness.

@jpmedley
Copy link
Contributor

jpmedley commented Feb 21, 2019 via email

@ddbeck
Copy link
Collaborator

ddbeck commented Feb 22, 2019

I'm happy to discuss this stuff in advance, @jpmedley. LIke you, I'd rather find problems at this stage than after we've added to the schema.

This concerns me because I expect that many decisions about what to include will be based on too brief a glance at the definitions and not enough thought about what they mean.

Yep, that's a good point and we can certainly refine the wording here. On top of that, I plan to cross reference real data, once we have some.

Also, on the reviewing side of things, it should help that we'll require tests for regular expressions. That will force submitters to illustrate the regex's scope with some example input to match against (and not match against).

In any case, I'd like your position on this: is this OK to proceed to a PR? If you think that there is a distinction between the two regex forms that I just haven't written out well enough, then the answer is probably yes (and we can work out the final wording in the PR). If you think we have not yet made the case that there is or should be a distinction, then some more discussion is probably in order.

@jpmedley
Copy link
Contributor

If you plan to change the wording of the component example, and I hope you are, I'd like to see that. If you plan to reflect it in the PR that will be fine.

@ddbeck
Copy link
Collaborator

ddbeck commented Feb 22, 2019

OK, thank you! I definitely plan to improve the wording on the example (in fact, I was thinking that the examples might be helped by showing the actual regexes, but I didn't want to get bogged down in writing regexes for non-existent features). And of course, I'll be interested in your feedback on the PR, when it's ready.

And unless there are no further questions at this stage, I think the next step is to get started on that PR. Thanks everyone. 😄

@ddbeck
Copy link
Collaborator

ddbeck commented Mar 13, 2019

A heads up for anyone following this issue: I've opened PR #3631 to carry out the proposal I put forward before.

@Elchi3
Copy link
Member

Elchi3 commented Mar 25, 2019

The schema changes from #3631 landed. Thanks so much for making that happen, @ddbeck 👍

@eduardoboucas has a list of features that he thinks require matches (transforms in his project). See eduardoboucas/compat-report#4. Should we file a BCD issue so that we can work on getting matches into our data? Do you have a different list for WebHint, @antross? How can we prioritize these?

@antross
Copy link
Contributor Author

antross commented Mar 25, 2019

Fantastic! 🎉

We're not tracking anything for webhint that's not already covered by the list @eduardoboucas put together, so that seems like the right one to base an issue on.

I've filed webhintio/hint#2114 to track work to start consuming the data in webhint.

@Elchi3
Copy link
Member

Elchi3 commented Mar 25, 2019

Thanks @antross! I've filed #3709 for adding more matches. It is based on the issue by @eduardoboucas.

Closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data:css 🎨 Compat data for CSS features. https://developer.mozilla.org/docs/Web/CSS schema ⚙️ Isses or pull requests regarding the JSON schema files used in this project.
Projects
None yet
Development

No branches or pull requests

5 participants