Timeout waiting for IOPub output #426

Closed
gshiba opened this issue Sep 6, 2019 · 11 comments

@gshiba
Contributor

gshiba commented Sep 6, 2019

Hello!

Issue: When a cell produces output quickly enough to fill up the IOPub channel's ZMQ buffer, papermill (or nbconvert?) almost silently trims off the output and carries on to the next cell. I believe it should raise an error and exit with a non-zero code.

It appears this is addressed in jupyter/nbconvert#994, but I can't tell if the fix there will solve this issue.

Thank you!


To reproduce: Make a notebook with a single cell (adapted from jupyter/nbconvert#659 (comment)):

import sys
import time
str = '0'
for x in range(0, 10000):
    sys.stdout.write(str*100)
    sys.stdout.flush()
    time.sleep(0.0001)
print('hi')

Then execute it. A warning is printed, but the exit code is zero and 'hi' is not printed (see also the Python-API sketch after the output dump below).

$ papermill input.ipynb output.ipynb
Input Notebook:  /home/gosuke/tmp/input.ipynb
Output Notebook: output.ipynb
Executing:   0%|                                       | 0/2 [00:00<?, ?cell/s]
Timeout waiting for IOPub output
Executing: 100%|███████████████████████████████| 2/2 [00:27<00:00, 14.26s/cell]
$ echo $?
0
$ tail -n 60 output.ipynb
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
     ]
    }
   ],
   "source": [
    "import sys\n",
    "import time\n",
    "str = '0'\n",
    "for x in range(0, 10000):\n",
    "    sys.stdout.write(str*100)\n",
    "    sys.stdout.flush()\n",
    "    time.sleep(0.0001)\n",
    "print('hi')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:brick37]",
   "language": "python",
   "name": "conda-env-brick37-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "papermill": {
   "duration": 16.32673,
   "end_time": "2019-09-06T17:52:44.503019",
   "environment_variables": {},
   "exception": null,
   "input_path": "/home/gosuke/tmp/input.ipynb",
   "output_path": "output.ipynb",
   "parameters": {},
   "start_time": "2019-09-06T17:52:28.176289",
   "version": "1.1.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
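
For reference, the same behavior can be reproduced through papermill's Python API rather than the CLI. A minimal sketch (papermill.execute_notebook is the standard entry point; the paths are the files from above):

import papermill as pm

# At the time of this report, this call returns normally even though the
# cell's stream output was truncated after the IOPub timeout warning.
pm.execute_notebook(
    'input.ipynb',   # the single-cell notebook shown above
    'output.ipynb',
)
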
@MSeal
Member

MSeal commented Sep 7, 2019

Thanks for raising the issue and making a clearly reproducible case.

So the fix in jupyter/nbconvert#994 helps with the issue but doesn't make it impossible to occur. I can reproduce with the latest nbconvert, which includes the mentioned change. The part that's causing the issue is the sheer number of tiny messages per nbconvert cycle, which the zmq buffer can't absorb. If the message rate is reduced by sleeping longer:

import sys
import time
str = '0'
for x in range(0, 1000):
    sys.stdout.write(str*100)
    sys.stdout.flush()
    time.sleep(0.01)
print('hi')

or sys.stdout.flush() is commented out, then the message buffer stays within its limits and the cell executes as expected.

I've touched pretty much every part of the code up to the pyzmq layer now, and while we can make it slightly better, there is a fairly hard limit to the maximum message rate a kernel client can handle. It may be worth opening an issue on https://github.com/ipython/ipykernel to see if the kernel could apply backpressure or skip acting on sys flush calls, to prevent very high flush rates during kernel executions.

All that being said, I'd be amenable to making raise_on_iopub_timeout default to True in papermill, but I'd want to get @mpacer, @willingc, and @rgbkrk's opinions before we change the default for this field, in case they have a good objection to changing the failure mode.
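
For anyone hitting this before any default change lands, one workaround is to run the notebook through nbconvert directly and opt into the stricter behavior; raise_on_iopub_timeout and iopub_timeout are existing ExecutePreprocessor traits. A minimal sketch of what that might look like (paths and timeout values are placeholders):

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

nb = nbformat.read('input.ipynb', as_version=4)

# With raise_on_iopub_timeout=True, a dropped IOPub message raises a
# TimeoutError instead of being logged as a warning and silently truncated.
ep = ExecutePreprocessor(
    timeout=600,
    iopub_timeout=10,
    raise_on_iopub_timeout=True,
)
ep.preprocess(nb, {'metadata': {'path': '.'}})

nbformat.write(nb, 'output.ipynb')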

@gshiba
Contributor Author

gshiba commented Sep 10, 2019

My original use case isn't necessarily lots of small messages, but rather lots of pngs. Something like the following:

from time import sleep  # needed for the sleep() call below
from IPython.display import display, Markdown, Image

len(directories)  # roughly 900 directories, defined earlier in the notebook
for d in directories:
    display(Markdown(f'# {d}'))
    for png in ['a.png', 'b.png', 'c.png']:  # Roughly 200KB, 200KB, and 50KB on disk
        display(Image(filename=f'{d}/{png}'))
    sleep(0.5)
x = 'hello'

# next cell
print(x)

The papermill output .ipynb file is ~120MB and shows PNGs for roughly 500 of the 900 directories, and 'hello' is printed in the next cell. When I run the same notebook interactively (through Chrome), the browser tab crashes on my PC.

Is there a limit on size (in bytes) somewhere as well?

Either way, I'd be happy with an error being raised for now.

@MSeal
Member

MSeal commented Sep 16, 2019

While there's no explicit limit on notebook output size, I would say that anything above 100MB is in the realm of "this will crash browsers". Papermill will actually handle very large notebooks better than browsers (it's only limited by the rate of messages), but the format still doesn't support large files well.

@MSeal
Member

MSeal commented Sep 16, 2019

Other papermill devs @mpacer @captainsafia @rgbkrk @willingc: on the topic of raising an error for the buffer-overload case, I think this would be a reasonable change, but it would differ from the default that nbconvert has held for a long time.

@rgbkrk
Member

rgbkrk commented Sep 18, 2019

Gosh, anything above 20 MB will hang most browsers.

@rgbkrk
Member

rgbkrk commented Sep 18, 2019

I think raising an error in papermill's case makes sense. How reproducible is the notebook if data is dropped?

@MSeal
Member

MSeal commented Sep 18, 2019

Highly reproducible as far as I could tell from my local tests.

@willingc
Member

I think raising an error makes sense. We can clarify in the docs and mention the difference in default behavior from nbconvert in the docs/docstring.

@MSeal
Member

MSeal commented Sep 19, 2019

I'll work on making that change then.

@MSeal MSeal added this to the Papermill 2.0 milestone Jan 23, 2020
jsvine added a commit to jsvine/nbexec that referenced this issue Jan 24, 2020
See the following links for details on the issues with IOPub and
nbconvert's ExecutePreprocessor:

- nteract/papermill#426 (comment)
- jupyter/nbconvert#994
@MSeal
Member

MSeal commented Feb 11, 2020

Change made in the 2.0 release (it was a single config line, so I skipped the PR).

@mirekphd

We started getting papermill.execute_notebook errors on a notebook cell that triggers 'IOPub message rate exceeded.':

A cell timed out while it was being executed, after 4 seconds.
The message was: Timeout waiting for IOPub output.

This is almost certainly caused by this change from v2.0.0 finally kicking in:
IOPub timeouts now raise an exception instead of a warning.
[ https://papermill.readthedocs.io/en/latest/changelog.html ]
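
Under the 2.x behavior, callers that previously relied on the silent warning now need to handle the failure. A rough sketch of one way to do that (the exact exception class isn't pinned down here, only the message quoted above; the real remediation is usually to reduce the cell's output rate, e.g. by dropping explicit flush calls or batching output):

import papermill as pm

try:
    pm.execute_notebook('input.ipynb', 'output.ipynb')
except Exception as exc:
    # The message text is distinctive enough to tell an IOPub timeout
    # apart from an ordinary cell error.
    if 'Timeout waiting for IOPub output' in str(exc):
        print('Cell output outpaced the IOPub channel; reduce the output rate')
    raise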
