Disambiguating uncaught errors #10853
Comments
Single page tests were added as a convenience specifically because we got feedback about tests being too verbose compared to other test harnesses. The assertion that "they shorten test code by two lines" misses the case where the test is for something async, in which case you can avoid all function wrapping and significantly increase the convenience. I think the statement that the confusing results are more significant than the convenience requires substantially more support. I think this kind of problem is common when there's no visibility into cross-browser results and uncommon when authors get feedback about their tests. If we can find a way to disambiguate that doesn't depend on undermining the reason for the feature by adding boilerplate, that seems like a reasonable improvement. Otherwise I think that we should prioritise authoring experience over infra concerns.
Would you consider requiring metadata as undermining the motivation for the feature?
Yes, I would expect people to forget, it is by definition impossible to lint for (or we could just use that criterion rather than the metadata), and in any case requiring extra verbosity in the mode specifically designed to reduce verbosity seems to defeat the point.
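For concreteness, the verbosity trade-off under discussion looks roughly like this. The stub harness functions below are illustrative stand-ins so the sketch runs standalone under Node; the real `test()`, `done()`, and `assert_equals()` come from testharness.js, and these stubs only approximate their behaviour:

```javascript
// Stub harness so this sketch runs standalone; the real test(), done(),
// and assert_equals() come from testharness.js (approximations only).
const log = [];
function test(fn, name) { fn(); log.push("PASS " + name); }
function done() { log.push("done"); }
function assert_equals(actual, expected) {
  if (actual !== expected) throw new Error(actual + " !== " + expected);
}

// Conventional style: the test body is wrapped in test(...), costing
// a wrapper line at the top and one at the bottom.
test(function () {
  assert_equals(1 + 1, 2);
}, "addition works");

// Single-page style: assertions sit at the top level and the file just
// ends with done(); the whole page is one test, so no wrapper is needed.
assert_equals(1 + 1, 2);
done();

console.log(log.join("\n"));
```

For async tests the gap widens further, since the single-page style also avoids the `async_test`/callback wrapping.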
As a side-benefit, this change would also improve results accuracy for tests that only define sub-tests if the implementation is available. Currently, tests like that simply time out. This discussion has made me realize that this can't be implemented with the
If we were trying to save two lines of code as I originally suggested, then I would agree. Your feedback that the savings can be more substantial (both in line count and in cognitive overhead) is well-taken, though. I don't think the additional configuration detracts from this.
We could have an easier-to-remember opt-in like
@jgraham, do you think that a very simple opt-in would be an acceptable burden? I lean towards favoring this, because not being able to tell harness errors from failing single-page tests means that it'll be very hard to put an end to harness errors by means of CI checks. (What I'd like is for Travis to fail if there are harness errors in Chrome or Firefox.)
I made an issue for that, #10877 |
I really dislike the idea of forcing an opt-in syntax on people who are intentionally using the features designed to reduce verbosity. From a global perspective I think it makes sense to favour the needs of test authors over the needs of people working on infrastructure. I can imagine putting information about whether the test used single-page mode in the results, so sufficiently clever infrastructure could infer that something producing a single subtest with an ERROR in a subset of browsers (in single-page mode), but multiple or different subtests in other browsers (not in single-page mode), is probably actually an uncaught exception. Or it might be possible to propagate the error to the harness for a single page test where no
As noted above, the implicit behavior also allows people to use the feature unintentionally. In other words: this would also serve the needs of test authors. That said, it seems like an over-simplification to label this issue a conflict between test authors and infrastructure maintainers. That suggests that the issue is a subjective matter of convenience between the two groups. The issue we're discussing is more fundamental because it concerns ambiguity in the way tests are interpreted. It limits the project's ability to identify errors, and that affects test authors just as much as infrastructure maintainers.
I'll also note that an opt-in makes the feature discoverable and searchable, which serves test authors yet unfamiliar with the feature.
I think that something like #11024 achieves what you want here. |
As @zcorpan said on the PR, it's an improvement, but the ability to actually have tests fail when there are exceptions that should fail the test would be even better. I think a one-liner opt-in like
I don't think that adding extra syntax really helps people understand the feature or helps them use it correctly. On the other hand it does directly undermine the major reason for having the feature, which is to reduce boilerplate in cases where that is possible. I am much more inclined to proceed with the zero-boilerplate improvements to address the most serious issues than to take this approach.
I was asked for my opinion as a test author in IRC. I am happy to use a small opt-in of this sort if I use single-page tests. However, I have been leaning away from single-page tests because it's already hard enough to remember to add
How do you prevent reviewers from being confused when reviewing single page tests, when the reviewer is not familiar with the feature? How do you prevent tests from accidentally invoking the feature (a failing assert outside test) and failing to report the intended tests? I understand that the use case is to minimize boilerplate, but I think this is pretty minor overhead and there are several advantages.
So the points that I have seen brought up here are
My interpretation of this is that in terms of the way errors are handled and displayed, single page tests need not be any worse than ordinary tests. If #11024 lands then both tests will be in exactly the same situation where if you do something like
then you will get a harness-level error. Of course there are ways to avoid that; you can put more of the test inside

The remainder of the points seem to be about the overall ergonomics of the single page test feature. Feedback indicating that the (as far as I'm aware unavoidable) burden of having to signal when the test is done and provide a meaningful title puts people off using the feature doesn't suggest to me that adding more required syntax is going to be helpful. Note also that if we are surfacing test results to authors, both of the errors mentioned result in obvious errors; either the test unexpectedly times out, or the test has no title. So I don't think the ability to record an

Whilst I agree that the presence of syntax is more obvious than its absence in terms of catching people's attention, if there is a widespread belief that the feature is simply not valuable enough for people to use, then it seems like another option that should be considered is dropping it entirely. That at least has the considerable advantage of reducing the complexity of the implementation.
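For illustration, the kind of pattern being discussed — an assertion outside any `test()` wrapper — can be sketched like this. The stub functions and the `someUnimplementedFeature` name are illustrative; the real testharness.js catches uncaught top-level exceptions itself (modeled here with a try/catch):

```javascript
// Stubs standing in for testharness.js (assumption: the real harness
// reports an uncaught top-level exception as a harness-level ERROR).
const subtests = [];
function test(fn, name) {
  try { fn(); subtests.push({ name, status: "PASS" }); }
  catch (e) { subtests.push({ name, status: "FAIL" }); }
}
function assert_true(value, message) {
  if (!value) throw new Error(message || "assert_true failed");
}

let harnessStatus = "OK";
try {
  // Problematic pattern: the assertion runs outside any test() wrapper,
  // so a failure escapes to the harness instead of failing a subtest.
  assert_true(typeof globalThis.someUnimplementedFeature !== "undefined",
              "feature must exist");
  test(() => assert_true(true), "subtest that never gets to run");
} catch (e) {
  harnessStatus = "ERROR";
}
console.log(harnessStatus, subtests.length); // ERROR 0
```

The intended subtest is never declared, so the run reports a harness error with zero subtests — the ambiguous shape this whole issue is about.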
I'm OK with dropping the feature if too few people care about it. I'd also be OK with making the title optional and using the file name as the test name in that case, instead of "Untitled". I've done some work to fix tests using the problematic pattern you showed in #11269, and #10877 is about avoiding it in new tests.
I think that the scope of #10877 is necessarily smaller; I don't think we can block tests on

In general I think the desire to get rid of all

It is annoying if a test throws an error that prevents it running some subtests that would otherwise have passed. But since that case explicitly doesn't apply to single-page tests, for the purposes of this issue it just balances the fact that it's slightly easier to end up with

It's important to remember that web-platform-tests still have relatively limited visibility for browser developers; for example we still don't have a dashboard showing test results in development versions. Given that, it's unsurprising that tests written 10 years ago for features that aren't in active development haven't received much attention and are often rather broken. As the tests become more tightly integrated into development workflows and the visibility of results in multiple browsers continues to improve, I would expect the overall quality of the tests to trend upwards. I expect there to be a lot fewer tests written in 2018 that are broken in 2025 than there are tests written in 2011 broken today.

I think the idea of falling back to the filename if
Title is #11403 -- I'll also do another PR to get the title from |
This was a problem in #15577, where an error in setup script caused a test to be treated as a single page test, clearly unintentionally. With an opt in to single page tests, that would instead have been treated as a harness error. Deleting these lines would also solve the problem, however: Lines 3304 to 3306 in a9f087c
|
commit 0032bc0
Author: Simon Pieters <[email protected]>
Date: Sun Jun 10 12:28:37 2018 +0200

    Add functional selftests, set status to ERROR when mixing single_test and test

commit c3c3625
Author: Simon Pieters <[email protected]>
Date: Sun Jun 10 09:56:40 2018 +0200

    Docs

commit bff2d71
Author: Simon Pieters <[email protected]>
Date: Sun Jun 10 09:44:46 2018 +0200

    Rename single() to single_test() and allow passing test name

commit 0bc6450
Author: Simon Pieters <[email protected]>
Date: Tue Jun 5 16:59:23 2018 +0200

    [testharness.js] Implement `single()` opt-in to single page test

    Fixes #10853. Closes #11024.
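A toy model of the status rule mentioned in the commit messages above ("set status to ERROR when mixing single_test and test"). The function and its names are illustrative, not the real testharness.js code:

```javascript
// Toy model: a file that opts into single-test mode but also declares
// subtests via test() is contradictory, so the harness status becomes
// ERROR (per the commit message; names here are illustrative).
function harnessStatus(optedIntoSingleTest, declaredSubtests) {
  if (optedIntoSingleTest && declaredSubtests > 0) {
    return "ERROR"; // opt-in mixed with test() subtests is a mistake
  }
  return "OK";
}

console.log(harnessStatus(true, 0)); // a pure single page test
console.log(harnessStatus(true, 2)); // opt-in mixed with subtests
```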
Single page tests are convenient because they shorten test code by two lines (i.e. `test(function(){` and `});`). However, they also introduce an ambiguity which makes differentiating harness errors from failing single-page tests difficult (or maybe impossible).

The ambiguity is apparent any time a test produces an uncaught error without defining any tests. This may represent one of two states: the file is an intentional single-page test whose assertion failed, or the file is a broken test that errored before it could define its tests.
At this moment, it is not possible for testharness.js to differentiate between these two states. Today, it conservatively assumes the former. This is a problem because it means WPT cannot detect (or reject) tests that are invalid in this way. gh-10706 is the most recent example we have identified.
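One way to see why an explicit opt-in would resolve this: in the hypothetical classifier below (illustrative names only, not testharness.js logic), the opt-in bit is exactly the piece of information that is currently missing:

```javascript
// Hypothetical classifier for a finished test run. Without the
// optedIntoSinglePage bit, the two zero-subtest branches at the
// bottom are indistinguishable — the ambiguity described above.
function classify(result) {
  // result: { uncaughtError, subtests, optedIntoSinglePage }
  if (!result.uncaughtError) return "OK";
  if (result.subtests > 0) return "ERROR"; // subtests exist: clearly a harness error
  // Zero subtests plus an uncaught error is the ambiguous case:
  // the opt-in tells us whether this is a failing single-page test
  // or a file that is simply broken.
  return result.optedIntoSinglePage ? "SINGLE_PAGE_FAIL" : "ERROR";
}

console.log(classify({ uncaughtError: true, subtests: 0, optedIntoSinglePage: true }));
console.log(classify({ uncaughtError: true, subtests: 0, optedIntoSinglePage: false }));
```

Without the opt-in, both calls above would have to collapse to the same answer, which is the conservative guess testharness.js makes today.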
As far as I know, "single page tests" are only offered as a convenience to test authors. The ambiguity described above trumps the convenience, so if the motivation really is so simple, then we should remove the feature.
There may be cases where the semantics under test necessitate that functionality. In that case, we could preserve the feature and eliminate the ambiguity by requiring an explicit "opt-in" pattern. This might take the form of metadata in the test code (e.g. `// META: single=true`) or a filename-based pattern (e.g. `.single.html`).

In either case, all "single page tests" defined in WPT today would need to be updated. My attempts to estimate the number of single-page tests have been woefully misguided. I think I'll need to instrument WPT and do a complete run to get an accurate number. Before investing that time, I'd like to get feedback from the folks who would know better than me.
@jgraham @gsnedders @zcorpan @foolip What do you think about eliminating single page tests? Or requiring explicit opt-in?