resumed->rescheduled is an invalid transition #6685

Closed
hendrikmakait opened this issue Jul 7, 2022 · 1 comment · Fixed by #6913

hendrikmakait commented Jul 7, 2022

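from distributed.worker_state_machine import (
    ComputeTaskEvent,
    FreeKeysEvent,
    RescheduleEvent,
)


# ws_with_running_task is a pytest fixture: a worker state with task "x" executing
def test_resumed_executing_task_releases_resources_on_reschedule(ws_with_running_task):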
    ws = ws_with_running_task

    ws.handle_stimulus(FreeKeysEvent("cancel", ["x"]))
    assert ws.tasks["x"].state == "cancelled"
    assert ws.available_resources == {"R": 0}

    instructions = ws.handle_stimulus(
        ComputeTaskEvent.dummy(
            key="y",
            who_has={"x": ["127.0.0.1:1235"]},
            nbytes={"x": 8},
            stimulus_id="compute",
        )
    )
    assert ws.tasks["x"].state == "resumed"
    assert not instructions

    ws.handle_stimulus(
        RescheduleEvent(key="x", stimulus_id="reschedule")
    )
    assert ws.tasks["x"].state == "rescheduled"
    assert ws.available_resources == {"R": 1}

This test fails with distributed.worker_state_machine.InvalidTransition: InvalidTransition: x :: resumed->rescheduled. However, to my understanding, this may be a legitimate sequence of events.

@crusaderky crusaderky changed the title resumed->rescheduled is an invalid transition. resumed->rescheduled is an invalid transition Jul 8, 2022
@crusaderky

  1. task x is started on w1
  2. task x is cancelled
  3. task x is started on w2 and terminates successfully
  4. task y, which depends on x, is started on w1 -> task x is transitioned to resumed(fetch)
  5. task x on w1 raises Reschedule()

There's no such thing as a ("resumed", "rescheduled") transition.
The worker will shut itself down through @fail_hard, losing all data stored on it.

This is tightly related to #6709.
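
To make the failure mode concrete, here is a minimal toy sketch of a state machine that looks transitions up in a (start, finish) table. It is not distributed's implementation: the TRANSITIONS set, the transition() helper, and replay() are hypothetical names for illustration, and the listed pairs are only an assumed subset. It shows why the sequence above ends in an error once the task is in "resumed" and a reschedule arrives.

```python
# Toy model only. Names echo the issue, but none of this is distributed's code.
class InvalidTransition(Exception):
    def __init__(self, key, start, finish):
        super().__init__(f"{key} :: {start}->{finish}")
        self.key, self.start, self.finish = key, start, finish


# Hypothetical, assumed subset of allowed (start, finish) pairs.
# Note that ("resumed", "rescheduled") is absent.
TRANSITIONS = {
    ("executing", "cancelled"),
    ("cancelled", "resumed"),
    ("executing", "rescheduled"),
}


def transition(tasks, key, finish):
    """Move tasks[key] to `finish`, or raise if the pair is not in the table."""
    start = tasks[key]
    if (start, finish) not in TRANSITIONS:
        raise InvalidTransition(key, start, finish)
    tasks[key] = finish


def replay():
    """Replay the sequence from the issue and return the error message, if any."""
    tasks = {"x": "executing"}
    transition(tasks, "x", "cancelled")  # scheduler frees key x
    transition(tasks, "x", "resumed")    # y, which depends on x, arrives
    try:
        transition(tasks, "x", "rescheduled")  # x raises Reschedule()
    except InvalidTransition as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(replay())
```

In this toy model the fix would amount to adding a handler for the missing pair; how the real worker should treat a reschedule from "resumed" is the open question of this issue.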
