[Breaking] Make Reffy skip discontinued specs by default #1341

Merged on Aug 1, 2023 (2 commits)

@tidoust tidoust commented Jul 26, 2023

If we start tracking specs that move elsewhere more thoroughly in browser-specs, as described in w3c/browser-specs#1006, browser-specs will likely list more "discontinued" entries, some of which will redirect to other specs (e.g., HTML). While we'll want to keep crawling a handful of discontinued specs in the context of Webref for historical reasons, there is no point in crawling these specs by default, and doing so might mean crawling the same target spec multiple times.

This update makes Reffy skip discontinued specs by default. Reffy users can still force a crawl of discontinued specs by listing their shortnames explicitly (in addition to specifying all):

node reffy.js --output report/ed --spec all --spec DOM-Level-2-Style

This is a breaking change, but only in the sense that Reffy no longer crawls by default the 5 specs in browser-specs that are currently flagged as "discontinued": DOM-Level-2-Style, selectors-non-element-1, tracking-dnt, wpub-ann, and wpub.
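
The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not Reffy's actual code: the `selectSpecsToCrawl` helper and the sample entries are hypothetical, and the sketch only assumes that each browser-specs entry exposes a `shortname` and a `standing` value.

```javascript
// Hypothetical helper (not part of Reffy): decide which specs to crawl,
// given the list of specs and the values passed through --spec.
function selectSpecsToCrawl(specs, requested) {
  // Shortnames listed explicitly are always crawled, whatever their standing
  const explicit = new Set(requested.filter(name => name !== 'all'));
  const wantsAll = requested.includes('all');
  return specs.filter(spec =>
    explicit.has(spec.shortname) ||
    // "all" no longer covers discontinued specs
    (wantsAll && spec.standing !== 'discontinued'));
}

// Sample data for illustration only
const specs = [
  { shortname: 'dom', standing: 'good' },
  { shortname: 'direct-sockets', standing: 'pending' },
  { shortname: 'DOM-Level-2-Style', standing: 'discontinued' },
  { shortname: 'wpub', standing: 'discontinued' }
];

// "--spec all" skips both discontinued entries
console.log(selectSpecsToCrawl(specs, ['all']).map(s => s.shortname));

// "--spec all --spec DOM-Level-2-Style" adds the explicitly listed one back
console.log(selectSpecsToCrawl(specs, ['all', 'DOM-Level-2-Style']).map(s => s.shortname));
```

In this sketch an explicit shortname wins over the standing check, which mirrors the command line shown above.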

tidoust added a commit to w3c/webref that referenced this pull request Jul 31, 2023
This leverages the `standing` property in browser-specs to exclude specs that
don't have a good standing from data curation.

This makes it possible to add specs to browser-specs at an earlier stage for
cross-referencing purposes (Specref and terms) without having to worry too much
about their impact on CSS, elements, events, and IDL definitions. It also makes
it possible to keep discontinued specs in browser-specs without having to worry
about extracts becoming obsolete, invalid, or conflicting.

This is intended to replace #712 with a different exclusion logic. In #712,
specs that were excluded from data curation were the ones that did not target
browsers, based on the `categories` property. This does not help with the main
source of CSS, events and IDL hiccups, which are more common in early API
proposals. Plus I still think that filtering specs based on their categories is
not the right approach.

This will remove the following curated extracts:
- CSS extract of CSS Conditional Values Module Level 1
- CSS extract of Non-element Selectors Module Level 1
- IDL extract of Direct Sockets API. If we want to keep the IDL, the right
mechanism would be to drop the "pending" standing, but the spec itself says that
it is an unofficial draft.
- IDL extract of Web Publications, which seems a good thing given that the spec
has been discontinued.
- IDL extract of Document Object Model (DOM) Level 2 Style, which we were
previously doing through a patch.

This does not add any mechanism to create exceptions to the rule. That is on
purpose. Let's be optimistic ;)

Note the plan to also make Reffy skip "discontinued" specs by default, in
w3c/reffy#1341. With these two updates, the workflow
becomes:
1. Specs that are in good standing are crawled and curated
2. Specs that have a pending standing are crawled but not curated
3. Specs that have been discontinued are not crawled by default (but may be for
legacy/cross-referencing purposes) and are not curated.
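
As a rough illustration of the three cases above, the mapping from standing to workflow could look like the sketch below. The `workflowFor` function and its `forceCrawl` option are hypothetical names introduced for this example; they are not part of Reffy or Webref.

```javascript
// Hypothetical sketch: map a spec's standing to what happens to it.
// forceCrawl stands for a discontinued spec being requested explicitly.
function workflowFor(spec, { forceCrawl = false } = {}) {
  switch (spec.standing) {
    case 'good':
      return { crawl: true, curate: true };
    case 'pending':
      return { crawl: true, curate: false };
    case 'discontinued':
      // Skipped by default, never curated even when crawled
      return { crawl: forceCrawl, curate: false };
    default:
      throw new Error(`Unknown standing: ${spec.standing}`);
  }
}

console.log(workflowFor({ standing: 'pending' }));
console.log(workflowFor({ standing: 'discontinued' }, { forceCrawl: true }));
```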

The rule for inclusion in NPM packages does not change: only specs targeted at
browsers are included.
@tidoust tidoust merged commit 0eeb789 into main Aug 1, 2023
@tidoust tidoust deleted the discontinued branch August 1, 2023 09:04
tidoust added a commit that referenced this pull request Aug 1, 2023
Breaking change:
- Make Reffy skip discontinued specs by default (#1341)

Specs in browser-specs will be more consistently preserved in the list, even when they get abandoned or replaced by other proposals, so that browser-specs can act as a useful source for Specref. Reffy will no longer crawl specs that have a "discontinued" standing in browser-specs. At the time of the change, this affects 5 specs, which used to be crawled by default, and no longer are: DOM-Level-2-Style, selectors-non-element-1, tracking-dnt, wpub-ann, wpub.

Feature patches:
- Bump action versions in job (#1342)
- [tests] Adapt to mock headers structure (#1343)

Dependency bumps:
- Bump rollup from 3.26.2 to 3.27.0 (#1345)
- Bump semver from 7.5.3 to 7.5.4 (#1330)
- Bump respec from 34.1.4 to 34.1.6 (#1339)
- Bump webidl2 from 24.4.0 to 24.4.1 (#1332)
- Bump puppeteer from 20.8.0 to 20.9.0 (#1338)
- Bump web-specs from 2.63.0 to 2.65.0 (#1346)
tidoust added a commit to w3c/webref that referenced this pull request Aug 1, 2023