You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It's been a while since I built flux on this ubuntu 22.04 LTS desktop but I am reliably hitting this test failure (on @grondosmake-deb-fix branch for #1082):
30/85 Test #30: t1024-alloc-check.t ...................***Failed 22.05 sec
expecting success:
load_test_resources ${excl_1N1B}
{"version":1,"execution":{"R_lite":[{"rank":"0","children":{"core":"0-15"}}],"starttime":0.0,"expiration":0.0,"nodelist":["cab1234"]}}
ok 1 - load test resources
expecting success:
load_resource &&
load_qmanager_sync
{
"tx": {
"request": 14,
"response": 0,
"event": 0,
"control": 0
},
"rx": {
"request": 1,
"response": 5,
"event": 0,
"control": 0
}
}
ok 2 - load fluxion modules
expecting success:
flux config load <<-EOT &&
[job-manager]
epilog.command = [ "flux", "perilog-run", "epilog", "-e", "sleep,2" ]
EOT
flux jobtap load perilog.so
ok 3 - configure epilog with delay
expecting success:
(for i in $(seq 5); do \
flux run -N1 -x -t1s sleep 30 || true; \
done) 2>joberr
ok 4 - submit node-exclusive jobs that exceed their time limit
expecting success:
grep "job.exception type=timeout" joberr
1.029s: job.exception type=timeout severity=0 resource allocation expired
1.027s: job.exception type=timeout severity=0 resource allocation expired
1.061s: job.exception type=timeout severity=0 resource allocation expired
1.030s: job.exception type=timeout severity=0 resource allocation expired
1.028s: job.exception type=timeout severity=0 resource allocation expired
ok 5 - some jobs received timeout exception
expecting success:
test_must_fail grep "job.exception type=alloc-check" joberr
0.023s: job.exception type=alloc-check severity=0 resources already allocated
0.057s: job.exception type=alloc-check severity=0 resources already allocated
test_must_fail: command succeeded: grep job.exception type=alloc-check joberr
not ok 6 - no jobs received alloc-check exception
#
# test_must_fail grep "job.exception type=alloc-check" joberr
#
expecting success:
flux job cancelall -f &&
flux queue idle &&
(flux resource undrain 0 || true)
flux-job: Canceled 2 jobs (0 errors)
0 jobs
flux-resource: ERROR: rank 0 not drained
ok 7 - clean up
expecting success:
(for i in $(seq 10); do \
flux run --ntasks=1 --cores-per-task=8 -t1s sleep 30 || true; \
done) 2>joberr2
ok 8 - submit non-exclusive jobs that exceed their time limit
expecting success:
grep "job.exception type=timeout" joberr2
1.029s: job.exception type=timeout severity=0 resource allocation expired
1.028s: job.exception type=timeout severity=0 resource allocation expired
1.026s: job.exception type=timeout severity=0 resource allocation expired
1.027s: job.exception type=timeout severity=0 resource allocation expired
1.027s: job.exception type=timeout severity=0 resource allocation expired
1.027s: job.exception type=timeout severity=0 resource allocation expired
1.029s: job.exception type=timeout severity=0 resource allocation expired
1.027s: job.exception type=timeout severity=0 resource allocation expired
1.028s: job.exception type=timeout severity=0 resource allocation expired
1.028s: job.exception type=timeout severity=0 resource allocation expired
ok 9 - some jobs received timeout exception
expecting success:
test_must_fail grep "job.exception type=alloc-check" joberr2
0.024s: job.exception type=alloc-check severity=0 resources already allocated
0.023s: job.exception type=alloc-check severity=0 resources already allocated
test_must_fail: command succeeded: grep job.exception type=alloc-check joberr2
not ok 10 - no jobs received alloc-check exception
#
# test_must_fail grep "job.exception type=alloc-check" joberr2
#
expecting success:
cleanup_active_jobs
Scheduling is stopped
flux-job: Canceled 2 jobs (0 errors)
0 jobs
ok 11 - clean up
expecting success:
remove_qmanager &&
remove_resource
ok 12 - remove fluxion modules
# failed 2 among 12 test(s)
1..12
Sep 28 19:19:10.826176 broker.err[0]: rc2.0: sh /home/garlick/proj/flux-sched/t/t1024-alloc-check.t --verbose Exited (rc=1) 21.4s
flux-start: 0 (pid 3410903) exited with rc=1
The text was updated successfully, but these errors were encountered:
Looks like you're right, I was thinking of something else. It makes me a bit wary about this. I haven't been able to reproduce the failure though so I'm not sure where to go with it. Maybe close pending a reappearance?
It's been a while since I built flux on this ubuntu 22.04 LTS desktop but I am reliably hitting this test failure (on @grondos
make-deb-fix
branch for #1082):The text was updated successfully, but these errors were encountered: