Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Test Waiters #581

Merged
merged 16 commits into from
Feb 21, 2020
Merged

New Test Waiters #581

merged 16 commits into from
Feb 21, 2020

Conversation

scalvert
Copy link
Contributor

@scalvert scalvert commented Jan 14, 2020

Copy link
Member

@rwjblue rwjblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on the RFC for this, I'm very excited to see this move forward!!

I've left a few suggestions in line, but I've also got a few additional (non-inline) questions/tweaks:

  • Should we suggest migrating ember-test-waiters to @ember/test-waiters? I personally would prefer that, but not sure what your thoughts are...
  • The RFC should explain the layered API approach that we took in the addon:
    • IWaiter - This is the lowest level primitive in the addon
    • buildWaiter - The current content
    • waitForPromise - Nice "sugary" API for the very common "hey, wait for this promise" case
  • The RFC should describe the behaviors with addons
    • That they should include it as a dependency
    • Usages of ember-test-waiters should be left in the build (but the cost is effectively nothing)

text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
@rwjblue rwjblue self-assigned this Jan 14, 2020
@scalvert scalvert requested a review from rwjblue January 15, 2020 02:41
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
@scalvert scalvert requested a review from rwjblue January 15, 2020 18:36
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
text/0000-new-test-waiters.md Outdated Show resolved Hide resolved
@scalvert scalvert requested a review from rwjblue January 15, 2020 20:20
@scalvert
Copy link
Contributor Author

scalvert commented Feb 1, 2020

Thanks for the feedback, @jenweber and @mixonic. Updated the RFC accordingly.

@jherdman
Copy link

jherdman commented Feb 1, 2020

I recently had to use the current test waiter system with a third party evented API, e.g.

SomeSystem.on('someEvent', function() {
  // ...
});

SomeSystem.on('error', function() {
  // ...
});

SomeSystem.asyncThing(arg1, arg2);

In this case you could call the async API function many times and end up with either the error or the handler hook. Maybe my imagination is failing me a little at this moment, but I'm not really sure how you connect one async call to a waiter token. An example might be helpful for this sort of problem.

@scalvert
Copy link
Contributor Author

scalvert commented Feb 3, 2020

@jherdman this is a really good question, and a tricky problem to solve.

The test waiter system is predicated on the supposition that you are expected to execute balanced calls. That means that there needs to be a reasonable expectation of the endAsync call happening after beginAsync. Depending on the library, that's not always the case.

I think for your example above, the primitives in the ember-test-waiters package could be utilized to come up with test waiter implementation that was more fault tolerant. You'd want to make sure that whatever the implementation, your waiter reasonably covers your async behaviors without unintentionally timing your tests out (endAsync not being called for a specific beginAsync).

If there was some identifying feature of the unique calls, such as unique parameters, I could see creating a mechanism that stores tokens identified by call. This could be a simple map that's keyed on the call, and retrieves the token to call endAsync on.

If there isn't an identifying feature of the unique calls, then this poses a trickier problem.

Copy link

@OmShiv OmShiv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This addon is great given a complex state of unmanaged component behavior. I however feel that this add-on will encourage devs to go more upwards in the testing pyramid, widening the middle (Integration) and top (Acceptance) layers.
For eg., a developer may be encouraged to only test the parent component from an app-user's consumption point of view, where the test is waiting for some blackbox functionality (using this waiter) to complete before asserting behavior. For this we already have application tests. Thoughts?!

@OmShiv
Copy link

OmShiv commented Feb 3, 2020

This addon is great given a complex state of unmanaged component behavior. I however feel that this add-on will encourage devs to go more upwards in the testing pyramid, widening the middle (Integration) and top (Acceptance) layers.
For eg., a developer may be encouraged to only test the parent component from an app-user's consumption point of view, where the test is waiting for some blackbox functionality (using this waiter) to complete before asserting behavior. For this we already have application tests. Thoughts?!

This was a summarization. Left comments on specific lines in the .md file, but can't see them. LMK if you can see, else, can paste here.

@scalvert
Copy link
Contributor Author

scalvert commented Feb 3, 2020

@OmShiv I don't believe this to be the case.

The test waiter system already exists; we're further refining it to make it more usable and to provide a better path to debugging async behavior.

While I believe that what you say could happen, we can intentionally call this out in the documentation. That said, testing and the philosophy people use to test their apps, can be a contentious discussion. My preference would be to provide great tools and APIs to help developers succeed, without being overly prescriptive about how they get there.

@scalvert scalvert requested review from jenweber and mixonic February 6, 2020 18:05
@OmShiv
Copy link

OmShiv commented Feb 6, 2020

@scalvert
Totally. I guess better would have been to ask this:
With this pull-request, are you not expecting review on the proposition itself and taking for granted that it has inherent value, while only looking for review on the implementation/detail/documentation?

In other words, are we taking for granted that "New Test Waiters" do add value - and this fact itself is exempt from this RFC? If YES, ignore from here...

If NO, read on - here's my concern.

It's debatable that this add-on adds value to developers concerned with async problems in tests. Also, it's a 3-cost jump:

  1. This add-on doesn't fix async tests (as we saw in our previous efforts), but only helps debugging. The existing waiter support from Ember is sufficient to pin-point the async and apply the fix
  2. It adds test-related code that you have to ship in Production
  3. Most importantly, it will encourage writing Integration tests more top-to-bottom (acceptance like) - misbalancing the test pyramid by widening the middle and top layers, and shrinking the Unit layer at the bottom. It prevents devs from refactoring and decoupling the async - that in many cases (like we saw in ours) can be the root cause of tests timing out.

Consider this:

// Parent component
<ChildImageComponent props=imageProps/>

// Child Image component
init() {
  workOnReceivedProps(); // async operation
  drawImage(); // async operation
}

// Parent test
await render(hbs`<ParentComponent .../>`)
"do some new test waiter magic" // since we can't control child's async
assert.dom(image was drawn...) // likely to timeout 

Such tests should be broken down into

  • parent component only tests it's own stuff
  • parent tests that child was rendered (but not the drawing of the image)
  • child, in its own separate test, verifies that when options are given, image is drawn
  • if possible, child should (in some cases, must) mock the async when testing.

This add-on however will promote testing more of child behavior directly from parent. For a stable test pyramid, you should be able to break tests from Acceptance into Integration, and further break them into more unit tests - which is best both for faster build time and rugged tests. (Imagine a pyramid with wider base and thinner apex, extremely difficult to fall and is a rock stable structure, compared to a thin base pyramid).

With any tool/methodology, a little prescription (not a lot) may be needed, and ember test strategy shouldn't be an exception. Otherwise it can be the case that async timeout issues are a result of greatly deviating from the prescription in the first place and we may be overlooking the opportunity to fix them by promoting wait helpers.

@scalvert
Copy link
Contributor Author

scalvert commented Feb 7, 2020

@OmShiv

Precisely. This isn't really a debate about the merits of test waiters themselves. They're an established tool that currently exist in the test infrastructure, and are already used by a number of libraries to manage async.

As to the latter part, this library doesn't profess to fix any async in tests magically, but merely to provide a tool, as previously stated, that helps you address patterns in your code that aren't easily handled another way. The waiters don't merely help with debugging; they also help with signaling to the test system when to wait in a deterministic fashion.

I don't disagree with your assertion about the correct orientation of the test system, and I also think that discussion and what follows it is best reserved for a separate forum.

@rwjblue
Copy link
Member

rwjblue commented Feb 7, 2020

Are we taking for granted that "New Test Waiters" do add value - and this fact itself is exempt from this RFC?

I don't think things are quite this simple. Test waiters are already a concept in Ember (and have been for quite a long time), and that is basically not going to change (though this RFC does make significant enhancements).

It's debatable that this add-on adds value to developers concerned with async problems in tests. Also, it's a 3-cost jump:

  • This add-on doesn't fix async tests (as we saw in our previous efforts), but only helps debugging. The existing waiter support from Ember is sufficient to pin-point the async and apply the fix

Nothing we can do will magically fix all async issues in tests. We shouldn't even really strive for that. Hiding async (e.g. via mocking) is fundamentally flawed, and will become a never ending game of whack-a-mole. The goal of the current waiter system in Ember is to allow the specific unit that introduces asynchrony to properly instruct the test environment when it is completed. If everything in the system that introduces it's own async leverages the waiter system (either the current one, or the one proposed in this RFC) the tests will not exhibit issues stemming from uncontrolled async.

  • It adds test-related code that you have to ship in Production

Indeed, this is true. The size increase this infers is mitigated by ensuring that we ship as little code as possible in production. One thing to keep in mind here is that, while this does propose actually shipping some test related bits in your app (or addon) code that "cost" has a very specific and distinct benefit: we will be able to make it possible to properly run your tests in production builds (which we cannot do with today's system).

  • Most importantly, it will encourage writing Integration tests more top-to-bottom (acceptance like) - misbalancing the test pyramid by widening the middle and top layers, and shrinking the Unit layer at the bottom. It prevents devs from refactoring and decoupling the async - that in many cases (like we saw in ours) can be the root cause of tests timing out.

I think this is actually massively misleading, and possibly the largest point of disagreement here. Let's be super clear: NOTHING in this RFC suggests that tests should not be properly composed. Extracting logic to simpler pieces (stand alone functions, simple classes, etc) that are directly unit tested is absolutely a great practice, and this RFC certainly does not suggest otherwise. That sort of extraction leads to very nice composition in actual use and enables the very test pyramid that you are describing.

Consider this:

// Parent component
<ChildImageComponent props=imageProps/>

// Child Image component
init() {
  workOnReceivedProps(); // async operation
  drawImage(); // async operation
}

// Parent test
await render(hbs`<ParentComponent .../>`)
"do some new test waiter magic" // since we can't control child's async
assert.dom(image was drawn...) // likely to timeout 

I don't really understand the inference you are making here. If the child component encapsulates the async that it is introducing, no changes are needed in the parents tests (e.g. that await render(hbs('<ParentComponent ... />')) will properly wait until the child component's async is completed). Specifically, the two inline comments RE: "do some new test waiter magic" and "likely to timeout" seem invalid.

@scalvert - We should consider how to update the RFC to make sure readers do not fall into this faulty assumption...

Such tests should be broken down into

I do not believe this is the correct way to break down tests for this, and in fact if you did do this your tests would be extremely brittle (where a refactoring that resulted in the same DOM but different underlying implementation, would fail many of these tests).

Let's break it down in detail though.

  • parent component only tests it's own stuff
  • parent tests that child was rendered (but not the drawing of the image)

A significant part of the contract provided by the parent component is that it renders the child component. Testing its own stuff means rendering the child component by definition. If you refactored these tests to provide a replacement for ChildComponent (which is pretty easy to do, and has nothing to do with this RFC at all), you would very easily end up in situations where you would do an internal refactor to ParentComponents template that results in passing tests (because you've essentially provided a stub for ChildComponent) but a completely broken reality, or alternatively that the tests all fail but things actually do work properly. Both cases are a sign of brittle tests.

  • child, in its own separate test, verifies that when options are given, image is drawn

Totally agree! Components should test what they do. 😃

  • if possible, child should (in some cases, must) mock the async when testing.

🙀 This sounds good, but is actually a bit concerning and takes care to do properly. I do agree with some form of statement around the child "mocking async", but I think we disagree with what that means. What I mean is that you mock your API requests with something like Pretender, what I suspect you mean is that you monkey patch the methods on your component to avoid doing the async in tests. These two approaches are very different.


@OmShiv - Thank you very much for chatting through these things here. I think it is extremely good to have these types of conversations, so that hopefully others can learn from our mistakes (and from what we do correctly). In the end, this RFC is not about the theology of testing or about what kinds of tests that you write (which is where it seems that our disagreement lies), but is instead about providing good primitives that allow each piece of the system to properly indicate to the rest of the test infrastructure that there is currently async underway. Nothing in this RFC prevents you from authoring tests in the style that you seem to prefer (though as you might imagine, I strongly suggest you rethink that style).

function buildWaiter(name: string): ITestWaiter
```

In anything but a production build, this function will return a `TestWaiter` instance. When in production mode, we make this instance _inert_ and essentially no cost to invoke. Since test waiters are intended to be called from application or addon code, but are only required to be _active_ when in tests, this process of making the instance _inert_ is important. Even though code is still invoked, this has a negligible impact on performance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we feel about the plausibility of a system which would make this configurable. Basically users who want to optimize for test-ability of production builds can leave the inert code, and users who want to optimize for every byte can strip the inert code out?

Or is this already designed in such a way that it would be trivial to author an addon which stripped the inert code from production builds, allowing byte-watchers to opt-in to that size optimization? (I think it would be easy, but I'd like to hear from the authors)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are two parts to this:

  1. ensuring that the addon's code itself is not shipped outside of testing
  2. authoring a babel plugin that allows the statements in app and addon code to be removed

Both are doable and should be investigated during implementation (though I don't think it's particularly important to happen before we land the RFC).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good callouts, @mixonic. Early on @rwjblue and I discussed the possibility of stripping code from prod builds that weren't destined to be tested. I think as Rob describes above, we could absolutely do this.

We also need to add a separate flag other than DEBUG, such as FOR_TESTING (don't worry about the name; naming is hard yada yada), but that will allow us to do prod builds that will allow for testing. Obviously this is a broader change that impacts a number of things, but is something we've also discussed.

@rwjblue
Copy link
Member

rwjblue commented Feb 7, 2020

After review and discussion with the core team at today's meeting, we are moving this RFC into final comment period.

@scalvert
Copy link
Contributor Author

@jherdman I wanted to circle back on your comment, as my reply to your comment really only touched on what a possible solution could be WRT addressing using waiters with unique async call patterns.

I've also opened a PR on ember-test-waiters to ensure that endAsync won't error if called repeatedly after a previously used token is used again. Hopefully this will help address some of your concerns when using this library with evented systems.

@dgeb
Copy link
Member

dgeb commented Feb 14, 2020

Sorry for the last minute comment, but the core team discussed the names of the interfaces used in this RFC during today's meeting. It occurred to me that we are almost exclusively using interfaces in both Glimmer and Ember that avoid the I prefix (It seems that an IRegistry interface has snuck in to the codebase, but it's not exposed as a public interface).

There is clearly a healthy debate in the TS community (and many others) about this topic, and there are even linting rules that favor each side. Hopefully we can all agree that consistency is the most important part of this decision. And in the Ember ecosystem we've favored never-prefix.

@rwjblue @scalvert Can we explore refactoring IWaiter -> Waiter and ITestWaiter -> TestWaiter? The implication then is that the TestWaiter class would need to be renamed. A standard but perhaps unergonomic choice would be to append Impl to class names that otherwise conflict with interfaces. Perhaps BaseTestWaiter would differentiate this class better? The point is that there's a reason that all implementations of the TestWaiter interface are not also instances of this class, and that's the distinction that the name should get at. I'm trying to not be too prescriptive or opinionated here - just trying to enforce consistency. Hopefully we can arrive at a nice alternative for the class names.

@rwjblue
Copy link
Member

rwjblue commented Feb 14, 2020

The classes themselves are not exposed as public api, only the buildWaiter which returns (currently) the ITestWaiter interface. I don’t think removing the I prefix is a substantial change.

@scalvert
Copy link
Contributor Author

@dgeb I'm totally happy to go with whatever convention the rest of the codebase dictates. Interestingly, @stefanpenner and I just had a discussion regarding I vs. no I for interface prefixes. I largely do it because I have a decade of C# standards drilled into me, but I can absolutely see the argument both ways.

In this case, we'll rename it.

@dgeb
Copy link
Member

dgeb commented Feb 14, 2020

@rwjblue @scalvert thank you both for your quick responses and your willingness to be flexible here! I'm glad that neither of you think this will be a substantial change. Hopefully we'll be able to restart the FCP very soon.

@scalvert
Copy link
Contributor Author

scalvert commented Feb 14, 2020

@dgeb emberjs/ember-test-waiters#117. Once we finalize the new class name for TestWaiter, I'll update this RFC to represent the new names including removing I from interfaces.

@OmShiv
Copy link

OmShiv commented Feb 15, 2020

@rwjblue, agree with all of your thoughts here, with ONLY disagreement around brittleness. Firstly, I appreciate you giving detailed explanation, and appropriate consideration to my concerns. My disagreement probably is best suited for a different forum. But pardon my intent to consider this as an opportunity to describe here.


In short, I differ here per my understanding:

Testing its own stuff means rendering the child component by definition.

If you compared how child is tested for rendering v/s how parent consumes it - the markup is same (by design. In fact, former is more robust/thorough)

<div> parent's dom ...
{{!How parent uses child 👇}}
<ChildComponent prop1={{this.parentProp}}/>
// How child's test looks like: child-test.js
await render(hbs`<ChildComponent prop1={{testProp}}/>)

The way child is EVER going to be used, specifically, <ChildComponent prop1={{testProp}}/> - is thoroughly tested at child (including behavior). This includes parent's usage (what I assume you called "reality") as a subset of all possible usages. Otherwise, it would be un-scalable, i.e., more than 1 parent consumers testing child end-to-end, including covering for child's async via a waiter-mechanism. That is both slow and non DRY. Also, one can't keep that test in a particular component of choice either, and call that all contracts are verified because Parent-1's contract is verified.


With this approach, however, I agree we are missing one thing - the specific contract with given parent. Verification of that contract should move a layer up into Acceptance, coz otherwise the contract doesn't make sense. In my opinion, any waiter mechanism is best suited there, including this new Test waiters - where I definitely see value!

More so, all such contracts, since they are only important as long as the user story is, should be compacted into even slimmer acceptance criteria, constituting the basis for Test Pyramid where top layer is thinner. Acceptance should be broken into Integration, and further, Integration into more units since a wider pyramid at base is more stable and infallible.

"Acceptance tests should be written using the standard agile framework of a user story"
-BDD, Wikipedia


Following this, these statements stand incorrect acc. to me. Implementation wise:

a refactoring that resulted in the same DOM but different underlying implementation

Reason: As long as my user story (acceptance crit.) is verified. I don't care if underlying implementation was different. Then next, "Reality" wise:

you would do an internal refactor ... that results in passing tests ... but a completely broken reality

Reason: The acceptance test should fail if reality was changed. If it doesn't, I shouldn't care.

...alternatively that the tests all fail but things actually do work properly

Won't happen either. Because if all tests fail, I must have broken down a single user story into integration, and further into unit tests. They are failing is a hint of things are not working properly.


Brittleness, and mocking. Pardon me for further exemplifying this with a known dummy example.

  • testing an add function that takes in two numbers, downloads an image asynchronously, and then returns the sum
    • my understanding: test waiter provides mechanism that helps the add function to notify its consumers correctly when the image download is complete
    • my argument: this is helpful, but not needed for thorough testing
    • I would mock the image download service here, and test it separately
    • my test for add will only await the mocked download promise (which is resolved) and immediately verify the result
      • probably we are in agreement here
    • this is sufficient
      • probably we differ here
  • testing another function multiplyAfterAdd which consumes add
    • I would also mock add entirely and take for granted that it works
      • probably we differ here, and this is what you are pointing as brittleness - correct me if wrong

One may argue that the signature of add can change while tests for only add are updated. So all tests collectively pass, but things have changed, which is worrisome. This is incorrect for an application with dependencies like this, because if add function changed signature, it no longer adds. It is highly unlikely in an application that multiplyAfterAdd is not updated to use a different dependency. Because it no longer multiplies after adding, and it should use a different dependency.

In other words, to circumvent brittleness, tests for multiplyAfterAdd can not be the ideal place. I am understanding test-waiter acting as an aid to devs to write application-like tests for multiplyAfterAdd by correctly tracking add's async. I strongly believe it's not needed and a more scalable approach is to move up the pyramid to verify all add contracts at a single top-level place, once (may be as part of user story) where a test-waiter including existing ones can prove useful.

@rwjblue
Copy link
Member

rwjblue commented Feb 21, 2020

@OmShiv - Thanks for adding more detail there. It sounds like we agree on quite a few things, which is great!

The primary thing that I think we will continue to disagree about is leveraging mocks to replace internal collaborators of a given unit. I absolutely agree that it is important to be able to do this (and would consider blockers to this ability to be a bug that must be fixed), but I likely could not disagree much more with the idea that this is something that should be done in common practice.

The good news here is that I don't need to convince you that I'm right (and you don't need to convince me that you are right), because we both see the requirement for the system proposed in this RFC. We only differ in the specific circumstances that we'd individually reach for it. I'm glad that we were able to iron out that agreement here, thank you again for your time in explaining your position.

@rwjblue
Copy link
Member

rwjblue commented Feb 21, 2020

We discussed this at todays core team meeting, and are very much in favor of moving forward!

@rwjblue rwjblue merged commit e3d18e6 into emberjs:master Feb 21, 2020
@scalvert scalvert deleted the new-test-waiters branch February 23, 2020 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants