test_statistical_profiling flaky #4974

Open
jrbourbeau opened this issue Jun 24, 2021 · 0 comments
Labels: flaky test (Intermittent failures on CI.)

Comments

@jrbourbeau (Member)

Seen over in #4971 in this CI build

Details:
```
================================== FAILURES ===================================
_________________________ test_statistical_profiling __________________________

outer_args = (), kwargs = {}, result = None
coro = <function gen_cluster.<locals>._.<locals>.test_func.<locals>.coro at 0x000002382A9FA040>

    def test_func(*outer_args, **kwargs):
        result = None
        workers = []
        with clean(timeout=active_rpc_timeout, **clean_kwargs) as loop:

            async def coro():
                with dask.config.set(config):
                    s = False
                    for _ in range(60):
                        try:
                            s, ws = await start_cluster(
                                nthreads,
                                scheduler,
                                loop,
                                security=security,
                                Worker=Worker,
                                scheduler_kwargs=scheduler_kwargs,
                                worker_kwargs=worker_kwargs,
                            )
                        except Exception as e:
                            logger.error(
                                "Failed to start gen_cluster: "
                                f"{e.__class__.__name__}: {e}; retrying",
                                exc_info=True,
                            )
                            await asyncio.sleep(1)
                        else:
                            workers[:] = ws
                            args = [s] + workers
                            break
                    if s is False:
                        raise Exception("Could not start cluster")
                    if client:
                        c = await Client(
                            s.address,
                            loop=loop,
                            security=security,
                            asynchronous=True,
                            **client_kwargs,
                        )
                        args = [c] + args
                    try:
                        future = func(*args, *outer_args, **kwargs)
                        if timeout:
                            future = asyncio.wait_for(future, timeout)
                        result = await future
                        if s.validate:
                            s.validate_state()
                    finally:
                        if client and c.status not in ("closing", "closed"):
                            await c._close(fast=s.status == Status.closed)
                        await end_cluster(s, workers)
                        await asyncio.wait_for(cleanup_global_workers(), 1)

                    try:
                        c = await default_client()
                    except ValueError:
                        pass
                    else:
                        await c._close(fast=True)

                    def get_unclosed():
                        return [c for c in Comm._instances if not c.closed()] + [
                            c
                            for c in _global_clients.values()
                            if c.status != "closed"
                        ]

                    try:
                        start = time()
                        while time() < start + 60:
                            gc.collect()
                            if not get_unclosed():
                                break
                            await asyncio.sleep(0.05)
                        else:
                            if allow_unclosed:
                                print(f"Unclosed Comms: {get_unclosed()}")
                            else:
                                raise RuntimeError("Unclosed Comms", get_unclosed())
                    finally:
                        Comm._instances.clear()
                        _global_clients.clear()

                    return result

>           result = loop.run_sync(
                coro, timeout=timeout * 2 if timeout else timeout
            )

distributed\utils_test.py:976:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\Miniconda3\envs\dask-distributed\lib\site-packages\tornado\ioloop.py:530: in run_sync
    return future_cell[0].result()
distributed\utils_test.py:935: in coro
    result = await future
C:\Miniconda3\envs\dask-distributed\lib\asyncio\tasks.py:494: in wait_for
    return fut.result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

c = <Client: No scheduler connected>
s = <Scheduler: "tcp://127.0.0.1:56127" workers: 0 cores: 0, tasks: 0>
a = <Worker: 'tcp://127.0.0.1:56128', 0, Status.closed, stored: 0, running: 0/1, ready: 0, comm: 0, waiting: 0>
b = <Worker: 'tcp://127.0.0.1:56130', 1, Status.closed, stored: 0, running: 0/2, ready: 0, comm: 0, waiting: 0>

    @pytest.mark.flaky(reruns=10, reruns_delay=5, condition=MACOS)
    @gen_cluster(client=True)
    async def test_statistical_profiling(c, s, a, b):
        futures = c.map(slowinc, range(10), delay=0.1)
        await wait(futures)

        profile = a.profile_keys["slowinc"]
>       assert profile["count"]
E       assert 0

distributed\tests\test_worker.py:1079: AssertionError
---------------------------- Captured stderr call -----------------------------
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO -   Scheduler at:     tcp://127.0.0.1:56127
distributed.scheduler - INFO -   dashboard at:           127.0.0.1:56126
distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:56128
distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:56128
distributed.worker - INFO -          dashboard at:            127.0.0.1:56129
distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:56127
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -               Threads:                          1
distributed.worker - INFO -                Memory:                   7.00 GiB
distributed.worker - INFO -       Local Directory: D:\a\distributed\distributed\dask-worker-space\worker-5u5l3qhr
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:56130
distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:56130
distributed.worker - INFO -          dashboard at:            127.0.0.1:56131
distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:56127
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -               Threads:                          2
distributed.worker - INFO -                Memory:                   7.00 GiB
distributed.worker - INFO -       Local Directory: D:\a\distributed\distributed\dask-worker-space\worker-cit4w7h1
distributed.worker - INFO - -------------------------------------------------
distributed.scheduler - INFO - Register worker <WorkerState 'tcp://127.0.0.1:56128', name: 0, memory: 0, processing: 0>
distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:56128
distributed.core - INFO - Starting established connection
distributed.scheduler - INFO - Register worker <WorkerState 'tcp://127.0.0.1:56130', name: 1, memory: 0, processing: 0>
distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:56130
distributed.core - INFO - Starting established connection
distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:56127
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:56127
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.core - INFO - Starting established connection
distributed.scheduler - INFO - Receive client connection: Client-fc8681fe-d536-11eb-8624-000d3ae50775
distributed.core - INFO - Starting established connection
distributed.scheduler - INFO - Remove client Client-fc8681fe-d536-11eb-8624-000d3ae50775
distributed.scheduler - INFO - Remove client Client-fc8681fe-d536-11eb-8624-000d3ae50775
distributed.scheduler - INFO - Close client connection: Client-fc8681fe-d536-11eb-8624-000d3ae50775
distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:56128
distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:56130
distributed.scheduler - INFO - Remove worker <WorkerState 'tcp://127.0.0.1:56128', name: 0, memory: 0, processing: 0>
distributed.core - INFO - Removing comms to tcp://127.0.0.1:56128
distributed.scheduler - INFO - Remove worker <WorkerState 'tcp://127.0.0.1:56130', name: 1, memory: 0, processing: 0>
distributed.core - INFO - Removing comms to tcp://127.0.0.1:56130
distributed.scheduler - INFO - Lost all workers
distributed.scheduler - INFO - Scheduler closing...
distributed.scheduler - INFO - Scheduler closing all comms
============================== warnings summary ===============================
```
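
For what it's worth, my read on the flakiness (a guess, not a confirmed root cause): the worker's statistical profiler samples thread stacks on a fixed interval, so if every `slowinc` task happens to finish between samples, `a.profile_keys["slowinc"]["count"]` can legitimately still be 0 when the assertion runs. Below is a minimal sketch of a hardened variant of the test, assuming the same helpers the failing test uses (`gen_cluster` and `slowinc` from `distributed.utils_test`, `wait` from `distributed`); the test name and the polling/deadline approach are mine, not the project's actual fix:

```python
from time import time

from distributed import wait
from distributed.utils_test import gen_cluster, slowinc


@gen_cluster(client=True)
async def test_statistical_profiling_hardened(c, s, a, b):
    # Hypothetical hardened variant: keep submitting short batches of slowinc
    # tasks until the sampling profiler records at least one hit for
    # "slowinc", instead of asserting right after a single batch.
    deadline = time() + 10  # stay under gen_cluster's own timeout
    i = 0
    while not a.profile_keys["slowinc"]["count"]:
        assert time() < deadline, "profiler never sampled slowinc"
        futures = c.map(slowinc, range(i, i + 10), delay=0.1)
        await wait(futures)
        i += 10  # fresh inputs each batch so the tasks actually re-run
```

The sampling cadence itself is configurable; if I am reading distributed's default config right (treat the exact key names as an assumption), `distributed.worker.profile.interval` (~10ms) controls how often stacks are sampled and `distributed.worker.profile.cycle` (~1s) how often samples are aggregated, so lowering the interval for this test would be another way to shrink the race window.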
jrbourbeau added the flaky test (Intermittent failures on CI.) label on Jun 24, 2021
fjetter mentioned this issue on Feb 16, 2022