Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preload teardown failure interferes with scheduler idle timeout #6437

Closed
jrbourbeau opened this issue May 25, 2022 · 0 comments · Fixed by #6458
Closed

Preload teardown failure interferes with scheduler idle timeout #6437

jrbourbeau opened this issue May 25, 2022 · 0 comments · Fixed by #6458
Labels
bug Something is broken

Comments

@jrbourbeau
Copy link
Member

Earlier today I ran into a situation where a scheduler preload script had a teardown method failure which caused the scheduler's automatic idle timeout to not work. Here's a quick reproducer in the form of a test:

import asyncio

from distributed.utils_test import gen_cluster
from distributed.core import Status

bad_preload_text = """
def dask_setup(scheduler):
    scheduler.foo = 'setup'

def dask_teardown(scheduler):
    raise ValueError('ASDF')
"""

@gen_cluster(
    client=True,
    config={
        "distributed.scheduler.idle-timeout": "5s",
        "distributed.scheduler.preload": [bad_preload_text],
    },
)
async def test_idle_timeout_preload_error(c, s, a, b):
    assert s.foo == "setup"  # Confirm preload was run on setup

    while s.status != Status.closed:
        print(f"{s.status = }")
        await asyncio.sleep(0.1)

Currently this test hangs due to the ValueError raised in the preload dask_teardown method. If you replace, for example, raise ValueError('ASDF') with a simple return then the test passes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is broken
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant