Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mock time during integration tests #507

Merged
merged 7 commits into from
May 23, 2022

Conversation

olivierphi
Copy link

@olivierphi olivierphi commented May 13, 2022

Finally, it seems to work! (even on Windows, even though it was harder there because of asyncio 😔 )

Why I used classic mocking rather than dependencies injection

In the end I didn't opt for a passing of a get_time function in our class constructors, because in order to be able to simulate the time we also have to mock await asyncio.wait([duration]), so that would be one more function to pass to all these constructors which would probably make the code less nice to work with.

That's why I opted for good old mocks instead, which limits the impact of this PR on existing code - the only changes I had to make were to factorise a bit more our calls to time.monotonic().

No use of time_machine for the moment

I could have used time_machine; however, we're clearly not in the case of a classic stateless Web backend, where the state is entirely in databases rather than in the code.
In our own case, on top of calls to time.monotonic() we also use await asyncio.wait([duration]) to control the time, which to me makes tools such a time_machine less relevant?

closes #499
closes #500

@olivierphi olivierphi force-pushed the try-mocking-time-during-integration-tests branch 6 times, most recently from d54cc11 to a194c29 Compare May 17, 2022 10:39
from typing import Any, Callable, TypeVar

from dataclasses import dataclass

from . import log
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't mind me, removing some unused imports while I'm there... 😄

@@ -3,14 +3,11 @@
from abc import ABC, abstractmethod
import asyncio
import sys
from time import monotonic
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing occurrences of this as much as possible in the code, so we have less stuff to mock during our integration tests

@@ -186,10 +179,13 @@ def animate(
raise AttributeError(
f"Can't animate attribute {attribute!r} on {obj!r}; attribute does not exist"
)
assert not all(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just an additional check while I was there

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both values could be 0. This check should probably be explicit when checking for None. You probably don't want to use all for something like this, an expressions is probably more efficient.

@@ -139,10 +136,6 @@ def __init__(self, target: MessageTarget, frames_per_second: int = 60) -> None:
pause=True,
)

def get_time(self) -> float:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, I don't think we have a reason to maintain this as a public method?

@@ -5,4 +5,9 @@
if TYPE_CHECKING:
from .app import App


class NoActiveAppError(RuntimeError):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're catching plain Python LookupError in some of our code in order to not interrupt what we're doing when there is no "active App" registered in the current asyncio context - this is the raw error that is raised by ContextVar.get() under the hood.

It works of course, but as I was running in cases where this was happening in the integration tests I noticed that it was not super clear for me as to why we do this in the code that does this errors interception, like there:

        try:
            self.app.stylesheet.update(self.app, animate=True)
            self.refresh()
        except LookupError: # why would a LookupError pop here? Not crystal clear (to me at least 😅)
            pass

That's why I changed that to to the following:

        try:
            self.app.stylesheet.update(self.app, animate=True)
            self.refresh()
        except NoActiveAppError: # more explicit. hopefully?
            pass

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be necessary to catch this case, except internally. But it could leak out, and I like that it's explicit

[SCREEN_SIZE, 10, "placeholder_7", False, 0.01, (3, 4, 5, 6, 7), "others"],
[SCREEN_SIZE, 10, "placeholder_9", False, 0.01, (5, 6, 7, 8, 9), "others"],
[SCREEN_SIZE, 10, None, None, 0.01, (0, 1, 2, 3, 4)],
[SCREEN_SIZE, 10, "placeholder_3", False, 0.01, (0, 1, 2, 3, 4)],
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that this test gives deterministic results, thanks to our mocking of the time, we can remove this last_screen_expected_out_of_viewport_placeholder_ids parameter form the test params, and simply check that all the elements that are not in the "supposed to be visible" list are well and truly not visible :-)

tests/test_integration_scrolling.py Show resolved Hide resolved
run_task = asyncio.create_task(run_app())

# We have to do this because `run_app()` is running in its own async task, and our test is going to
# run in this one - so the app must also be the active App in our current context:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

took me a while to figure this one out! 😅

# > The resolution of the monotonic clock on Windows is usually around 15.6 msec.
# > The best resolution is 0.5 msec.
# @link https://docs.python.org/3/library/asyncio-platforms.html:
ASYNCIO_EVENTS_PROCESSING_REQUIRED_PERIOD = 0.025 if WINDOWS else 0.002
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very elegant, but it definitely seem that the granularity of asyncio is not the same for Windows than for Unix-y OSes - as specific by the Python documentation.
So after we triggered some Timers in our "mocked time" machinery it seems that we have to await asyncio.wait() for a much longer period in Windows if we want asyncio pending events to be processed during this time.

And uvloop is not even available for Windows, so we cannot rely on it to make the integration tests faster on Windows 😔
(which is a bit ironic as uvloop is a Python wrapper of libuv, which was especially created to make Node.js work on Windows, back in the days)


ticks_count = ceil(seconds * ticks_granularity_fps)
activated_timers_count_total = 0
for tick_counter in range(ticks_count):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naive implementation for the moment: for each simulated second we loop 60 times, and for each tick we check if we have some mocked calls to asyncio.wait([duration]) to trigger

@olivierphi olivierphi force-pushed the try-mocking-time-during-integration-tests branch from a194c29 to 6669216 Compare May 17, 2022 11:09

return activated_timers_count

with mock.patch("textual._timer._TIMERS_CAN_SKIP", new=False), mock.patch(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we end up with 3 things to mock in our tests to simulate time in Textual; not too bad! 🙂

@olivierphi olivierphi marked this pull request as ready for review May 17, 2022 11:31
Copy link
Collaborator

@willmcgugan willmcgugan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a discussion, I think what we need is a Clock class with two methods; get_time and sleep which by default would return the monotonic time and call asyncio.sleep. This would essentially replicate the current behaviour.

The get_time method should probably be a coroutine.

When testing we can replace this with a idealize clock which can be mocked. An update_time method would advance the time, and the sleep method would wake coroutines according to the test time.

This should help us avoid some ackward logic in tests.

@@ -186,10 +179,13 @@ def animate(
raise AttributeError(
f"Can't animate attribute {attribute!r} on {obj!r}; attribute does not exist"
)
assert not all(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both values could be 0. This check should probably be explicit when checking for None. You probably don't want to use all for something like this, an expressions is probably more efficient.

animation_keys = list(self._animations.keys())
for animation_key in animation_keys:
animation = self._animations[animation_key]
if animation(animation_time):
del self._animations[animation_key]

def _get_time(self) -> float:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

@@ -5,4 +5,9 @@
if TYPE_CHECKING:
from .app import App


class NoActiveAppError(RuntimeError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be necessary to catch this case, except internally. But it could leak out, and I like that it's explicit

@olivierphi olivierphi force-pushed the try-mocking-time-during-integration-tests branch from 303f964 to f5969fb Compare May 17, 2022 16:15
@olivierphi
Copy link
Author

@willmcgugan As discussed, I added a new _clock module, exposing 3 functions that should now be the single source of truth for anything related to time in Textual: f5969fb 🙂

My initial implementation was made of a Clock class in this module, and a singleton of it exposed as _clock.clock .
However, the issue with this is that mocking the singleton was not effective, because other Python modules were using from ._clock import clock, according to the existing code style where we use direct imports from modules.
Because of this it was too late for the patch, as a direct reference to this non-patched singleton were already created in all these modules here and there - that would not be impacted by the mocking of the singleton later on.

To solve this I opted for a slightly different approach, where I expose only 3 plain functions in this _clock module, which are simple proxys for calls to such a _clock singleton - the difference being that the singleton and its underlying class are hidden APIs, so we can mock the singleton and be sure that other Python modules didn't have any direct reference to it before the patch.
What do you think?

@olivierphi olivierphi requested a review from willmcgugan May 17, 2022 16:19
Copy link
Collaborator

@willmcgugan willmcgugan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some requests...

return _clock.get_time()


async def aget_time() -> float:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not keen on the name. I have a policy of no abbreviations.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan at all either to be honest 😅 - I just opted for that a prefix because it seems to become the idiomatic way to make things "asyncio-related" in Python?

i.e. object.__enter__ and object.__exit__ become object.__aenter__ and object.__aexit__ for async contexts in the Python data model.

And we find the same pattern in Django now: aget_or_create, acount, aaggregate, etc .

django/django#14843

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re absolutely right, but I think we can avoid it in Textual.

Perhaps get_time for the coroutine and get_time_no_wait for the sync function? It’s consistent with some other Textual methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using the "await me maybe" pattern 👀

https://simonwillison.net/2020/Sep/2/await-me-maybe/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using the "await me maybe" pattern eyes

https://simonwillison.net/2020/Sep/2/await-me-maybe/

This pattern is good, and we use it already in Textual (for event listeners for example) 🙂 , but the problem with it is that the caller must be in an async context itself to use it - which is not the case for some calls we need to make to these clock-related functions, as they can be called in some classes' constructors for example

# (which can only be "unlocked" by calls to `move_clock_forward()`)
await event.wait()

async def move_clock_forward(self, *, seconds: float) -> tuple[float, int]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest "advance_clock"

Does it need to be a keyword only argument? We generally exclusively use seconds for time.


# let's make the time advance _very_ slightly between 2 consecutive calls of this function,
# within the same order of magnitude than 2 consecutive calls to ` timer.monotonic()`:
self._current_time += 1.1e-06
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not keen on this. I think it may create spin loops and it may create non-deterministic output.

The get_time should bean idealized version of time that advances in predictable increments.

return # the app may not have a screen yet

# We artificially tell the Compositor that the whole area should be refreshed
screen._compositor._dirty_regions = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I might expose this on the Screen object to avoid digging in to the internals.

self._pending_sleep_events.append((target_event_monotonic_time, event))
# Ok, let's wait for this Event
# (which can only be "unlocked" by calls to `move_clock_forward()`)
await event.wait()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@olivierphi
Copy link
Author

@willmcgugan here you go, feedback hopefully addressed! 🙂 ca6d7be

event.set()
activated_timers_count += 1
# ...and remove it from our pending sleep events list:
del self._pending_sleep_events[i]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mutating a list being iterated over, which may cause unexpected side-effects. Deleting indexes from a list can be problematic. Suggest a different structure, dict or set...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I converted the list into a dict, which is also a much more elegant design 👌 0ee9c57

Oh and while I was there I also (arbitrarily) increased the non-Windows duration of the "atomic waiting time to let asyncio do its stuff after we advanced the clock", hopefully this will make the test suite pass on your local own environment! 🤞

Olivier Philippon added 6 commits May 20, 2022 14:49
I had to add a flag in the `_timer` module that allows us to completely disable the  "skip" feature of Timers, though - but it shouldn't cause too much trouble 🤞
…time

This makes time quite easier to mock during integration tests :-)
@olivierphi olivierphi force-pushed the try-mocking-time-during-integration-tests branch from ca6d7be to 0ee9c57 Compare May 20, 2022 13:51
@olivierphi olivierphi requested a review from willmcgugan May 20, 2022 13:57
@olivierphi
Copy link
Author

@willmcgugan could that be the last commit of this PR before its merging? Time will tell! 😄 631f7a5

@willmcgugan
Copy link
Collaborator

Thanks for sticking with it!

@willmcgugan willmcgugan merged commit c212fd5 into css May 23, 2022
@willmcgugan willmcgugan deleted the try-mocking-time-during-integration-tests branch May 23, 2022 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants