Commit

[test] smoke test fixes for managed jobs (#4217)
* [test] don't wait for old pending jobs controller messages

`sky jobs queue` used to output a temporary "waiting" message while the managed
jobs controller was still being provisioned/starting. Since #3288 this is not
shown, and instead the queued jobs themselves will show PENDING/STARTING.

This also requires some changes to tests to permit the PENDING and STARTING
states for managed jobs.

* fix default aws region

* [test] wait for RECOVERING more quickly

Smoke tests were failing because some managed jobs were fully recovering back
to the RUNNING state before the smoke test could catch the RECOVERING case (see
e.g. #4192 `test_managed_jobs_cancellation_gcp`). Change tests that manually
terminate a managed job instance, so that they will wait for the managed job to
change away from the RUNNING state, checking every 10s.

* address PR comments

* fix
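
The polling behavior described under "wait for RECOVERING more quickly" can be sketched roughly as below. This is only an illustrative sketch, not the code from this commit: `wait_until_not_running`, its parameters, and the `get_status` callable are hypothetical names, and the status is assumed to be available as a plain string such as `'RUNNING'` or `'RECOVERING'`.

```python
import time
from typing import Callable, Optional


def wait_until_not_running(
        get_status: Callable[[], str],
        timeout_s: float = 300.0,
        interval_s: float = 10.0,
        sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Poll a job status until it is no longer 'RUNNING'.

    Hypothetical helper: `get_status` stands in for whatever the test uses
    to read the managed job's current status.

    Returns the first non-RUNNING status seen, or None if the job is still
    RUNNING once `timeout_s` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status != 'RUNNING':
            # The job has left RUNNING (e.g. RECOVERING); stop right away
            # so the state is observed before the job can recover.
            return status
        if time.monotonic() >= deadline:
            return None
        # Check again after a short interval (10s by default).
        sleep(interval_s)
```

Polling on "anything other than RUNNING" at a short interval, rather than sleeping for a fixed period and then checking once, narrows the window in which a job could pass through RECOVERING and return to RUNNING unobserved.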
cg505 authored Oct 30, 2024
1 parent 276035b commit 8568ac4
Showing 2 changed files with 108 additions and 115 deletions.
2 changes: 1 addition & 1 deletion tests/conftest.py
@@ -206,7 +206,7 @@ def enable_all_clouds(monkeypatch: pytest.MonkeyPatch) -> None:
 @pytest.fixture
 def aws_config_region(monkeypatch: pytest.MonkeyPatch) -> str:
     from sky import skypilot_config
-    region = 'us-west-2'
+    region = 'us-east-2'
     if skypilot_config.loaded():
         ssh_proxy_command = skypilot_config.get_nested(
             ('aws', 'ssh_proxy_command'), None)