occasional segfaults on CI #7879

Closed
keewis opened this issue May 29, 2023 · 3 comments
Labels: Automation (Github bots, testing workflows, release automation)

Comments

@keewis
Collaborator

keewis commented May 29, 2023

The upstream-dev CI currently fails intermittently with a segfault (the normal CI crashes, too, but because it runs under pytest-xdist, all we get there is a message stating "worker x crashed").

I'm not sure why, and I can't reproduce locally, either. Given that dask's local scheduler is in the traceback and the failing test is test_open_mfdataset_manyfiles, I assume there's some issue with parallel disk access or the temporary file creation.

log of the segfaulting CI job
============================= test session starts ==============================
platform linux -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/runner/work/xarray/xarray
configfile: setup.cfg
testpaths: xarray/tests, properties
plugins: env-0.8.1, xdist-3.3.1, timeout-2.1.0, cov-4.1.0, reportlog-0.1.2, hypothesis-6.75.6
timeout: 60.0s
timeout method: signal
timeout func_only: False
collected 16723 items / 2 skipped

xarray/tests/test_accessor_dt.py ....................................... [  0%]
........................................................................ [  0%]
........................................................................ [  1%]
........................................................................ [  1%]
...............................                                          [  1%]
xarray/tests/test_accessor_str.py ...................................... [  1%]
........................................................................ [  2%]
........................................................................ [  2%]
.............................................                            [  3%]
xarray/tests/test_array_api.py ...........                               [  3%]
xarray/tests/test_backends.py ........................X........x........ [  3%]
...................................s.........................X........x. [  3%]
.........................................s.........................X.... [  4%]
....x.......................................X........................... [  4%]
....................................X........x.......................... [  5%]
.............X.......................................................... [  5%]
....X........x....................x.x................X.................. [  5%]
x..x..x..x...................................X........x................. [  6%]
...x.x................X..................x..x..x..x..................... [  6%]
..............X........x....................x.x................X........ [  7%]
..........x..x..x..x.......................................X........x... [  7%]
...........................................X........x................... [  8%]
...ss........................X........x................................. [  8%]
.................X........x............................................. [  8%]
..X........x.............................................X........x..... [  9%]
............................................X........x.................. [  9%]
.........................................................X........x..... [ 10%]
......................................................................X. [ 10%]
.......x................................................................ [ 11%]
Fatal Python error: Segmentation fault

Thread 0x00007f9c7b8ff640 (most recent call first):
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/concurrent/futures/thread.py", line 81 in _worker
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 953 in run
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x00007f9c81f1d640 (most recent call first):
  File "/home/runner/work/xarray/xarray/xarray/backends/file_manager.py", line 216 in _acquire_with_cache_info
  File "/home/runner/work/xarray/xarray/xarray/backends/file_manager.py", line 198 in acquire_context
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/contextlib.py", line 135 in __enter__
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 392 in _acquire
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 398 in ds
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 336 in __init__
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 389 in open
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 588 in open_dataset
  File "/home/runner/work/xarray/xarray/xarray/backends/api.py", line 566 in open_dataset
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/utils.py", line 73 in apply
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/core.py", line 121 in _execute_task
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 224 in execute_task
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 238 in <listcomp>
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 238 in batch_execute_tasks
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/concurrent/futures/thread.py", line 58 in run
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/concurrent/futures/thread.py", line 83 in _worker
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 953 in run
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00007f9c82f1e640 (most recent call first):
  File "/home/runner/work/xarray/xarray/xarray/backends/file_manager.py", line 216 in _acquire_with_cache_info
  File "/home/runner/work/xarray/xarray/xarray/backends/file_manager.py", line 198 in acquire_context
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/contextlib.py", line 135 in __enter__
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 392 in _acquire
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 398 in ds
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 336 in __init__
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 389 in open
  File "/home/runner/work/xarray/xarray/xarray/backends/netCDF4_.py", line 588 in open_dataset
  File "/home/runner/work/xarray/xarray/xarray/backends/api.py", line 566 in open_dataset
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/utils.py", line 73 in apply
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/core.py", line 121 in _execute_task
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 224 in execute_task
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 238 in <listcomp>
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 238 in batch_execute_tasks
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/concurrent/futures/thread.py", line 58 in run
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/concurrent/futures/thread.py", line 83 in _worker
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 953 in run
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00007f9ca575e740 (most recent call first):
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/threading.py", line 320 in wait
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/queue.py", line 171 in get
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 137 in queue_get
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py", line 500 in get_async
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/threaded.py", line 89 in get
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/base.py", line 595 in compute
  File "/home/runner/work/xarray/xarray/xarray/backends/api.py", line 1046 in open_mfdataset
  File "/home/runner/work/xarray/xarray/xarray/tests/test_backends.py", line 3295 in test_open_mfdataset_manyfiles
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/python.py", line 194 in pytest_pyfunc_call
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/python.py", line 1799 in runtest
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/runner.py", line 169 in pytest_runtest_call
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/runner.py", line 262 in <lambda>
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/runner.py", line 261 in call_runtest_hook
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/runner.py", line 222 in call_and_report
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/runner.py", line 133 in runtestprotocol
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/runner.py", line 114 in pytest_runtest_protocol
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/main.py", line 323 in _main
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/main.py", line 269 in wrap_session
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/config/__init__.py", line 166 in main
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/_pytest/config/__init__.py", line 189 in console_main
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/pytest/__main__.py", line 5 in <module>
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/runpy.py", line 86 in _run_code
  File "/home/runner/micromamba-root/envs/xarray-tests/lib/python3.10/runpy.py", line 196 in _run_module_as_main

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pandas._libs.ops, numexpr.interpreter, bottleneck.move, bottleneck.nonreduce, bottleneck.nonreduce_axis, bottleneck.reduce, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.indexing, pandas._libs.index, pandas._libs.internals, pandas._libs.join, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, cftime._cftime, yaml._yaml, cytoolz.utils, cytoolz.itertoolz, cytoolz.functoolz, cytoolz.dicttoolz, cytoolz.recipes, xxhash._xxhash, psutil._psutil_linux, psutil._psutil_posix, markupsafe._speedups, numpy.linalg.lapack_lite, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, scipy._lib._ccallback_c, _cffi_backend, unicodedata2, netCDF4._netCDF4, h5py._errors, h5py.defs, h5py._objects, h5py.h5, h5py.h5r, h5py.utils, h5py.h5s, h5py.h5ac, h5py.h5p, h5py.h5t, h5py._conv, h5py.h5z, h5py._proxy, h5py.h5a, 
h5py.h5d, h5py.h5ds, h5py.h5g, h5py.h5i, h5py.h5f, h5py.h5fd, h5py.h5pl, h5py.h5o, h5py.h5l, h5py._selector, pyproj._compat, pyproj._datadir, pyproj._network, pyproj._geod, pyproj.list, pyproj._crs, pyproj.database, pyproj._transformer, pyproj._sync, matplotlib._image, rasterio._version, rasterio._err, rasterio._filepath, rasterio._env, rasterio._transform, rasterio._base, rasterio.crs, rasterio._features, rasterio._warp, rasterio._io, numcodecs.compat_ext, numcodecs.blosc, numcodecs.zstd, numcodecs.lz4, numcodecs._shuffle, msgpack._cmsgpack, numcodecs.vlen, zstandard.backend_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._flinalg, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, 
scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._statlib, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.cluster._vq, scipy.cluster._hierarchy, scipy.cluster._optimal_leaf_ordering, shapely.lib, shapely._geos, shapely._geometry_helpers, cartopy.trace, scipy.fftpack.convolve, tornado.speedups, cf_units._udunits2, scipy.io.matlab._mio_utils, scipy.io.matlab._streams, scipy.io.matlab._mio5_utils (total: 241)
/home/runner/work/_temp/b3f3888c-5349-4d19-80f6-41d140b86db5.sh: line 3:  6114 Segmentation fault      (core dumped) python -m pytest --timeout=60 -rf --report-log output-3.10-log.jsonl
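
For anyone trying to get a traceback like the one above out of a pytest-xdist run, where only "worker x crashed" is reported, here is a minimal sketch. It assumes the dump comes from Python's standard `faulthandler` module and that writing it to a per-process file is acceptable. `enable_crash_log` and the file naming are hypothetical and not part of xarray's test setup.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(directory=None):
    """Enable faulthandler with output going to a per-process file.

    Hypothetical helper: the name and the file layout are illustrative only.
    Returns the log path and the open file object.
    """
    if directory is None:
        directory = tempfile.gettempdir()
    # One file per process, so parallel workers do not overwrite each other.
    path = os.path.join(directory, f"faulthandler-{os.getpid()}.log")
    # faulthandler keeps using the file descriptor, so the file must stay
    # open for as long as the handler is enabled.
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return path, log_file
```

Calling this once at import time from a `conftest.py` should enable it in each worker process; whether that fits the CI setup here is untested.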
keewis added the Automation and dependencies labels and removed the dependencies label on May 29, 2023
keewis changed the title from "occasional segfaults on the upstream-dev CI" to "occasional segfaults on CI" on May 29, 2023
@huard
Contributor

huard commented May 29, 2023

There are similar segfaults in an xncml PR: xarray-contrib/xncml#48

Googling around suggests it is related to netCDF not being thread-safe and recent versions of python-netcdf4 releasing the GIL.
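
To illustrate the suspected failure mode and the usual mitigation, here is a minimal sketch. It assumes the crash comes from several threads entering a non-thread-safe C library at the same time. `unsafe_open` is a hypothetical stand-in that only records whether two threads overlapped; it is not the netCDF4 API. `locked_open` shows every call being serialized behind one process-wide lock.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Bookkeeping for the stand-in below (hypothetical, for illustration only).
_active = 0
_overlap = False
_guard = threading.Lock()


def unsafe_open(path):
    """Stand-in for a library call that must not run concurrently."""
    global _active, _overlap
    with _guard:
        _active += 1
        if _active > 1:
            _overlap = True  # two threads were inside at the same time
    time.sleep(0.001)  # sleeping releases the GIL, so other threads can run
    with _guard:
        _active -= 1
    return f"handle:{path}"


def overlap_seen():
    """Report whether unsafe_open was ever entered by two threads at once."""
    return _overlap


# One process-wide lock that every caller takes before entering the library.
LIBRARY_LOCK = threading.Lock()


def locked_open(path):
    with LIBRARY_LOCK:
        return unsafe_open(path)


def open_many(paths, opener, workers=4):
    """Open all paths on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(opener, paths))
```

With `locked_open` the threads still run in parallel elsewhere, but only one at a time is inside the library, which is the property a non-thread-safe library needs.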

@keewis
Collaborator Author

keewis commented May 29, 2023

If I'm reading the different issues correctly, that means this is a duplicate of #7079

@max-sixty
Collaborator

Deferring to #7079

max-sixty closed this as not planned on Nov 6, 2023