Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: Failing test when built with Cython 3.0b2 #53125

Open
h-vetinari opened this issue May 7, 2023 · 11 comments
Open

TST: Failing test when built with Cython 3.0b2 #53125

h-vetinari opened this issue May 7, 2023 · 11 comments
Labels
Compat pandas objects compatability with Numpy or Python functions Testing pandas testing functions or related to the test suite

Comments

@h-vetinari
Copy link
Contributor

Encountered in conda-forge/pandas-feedstock#162 on linux when building pandas 2.0.1 against Cython 3.0.0b2 and running the test suite

=================================== FAILURES ===================================
_______________________ test_add_out_of_pydatetime_range _______________________
[gw0] linux -- Python 3.10.10 $PREFIX/bin/python

    @pytest.mark.xfail(is_numpy_dev, reason="result year is 1973, unclear why")
    def test_add_out_of_pydatetime_range():
        # GH#50348 don't raise in Timestamp.replace
        ts = Timestamp(np.datetime64("-20000-12-31"))
        off = YearEnd()
    
        result = ts + off
        expected = Timestamp(np.datetime64("-19999-12-31"))
>       assert result == expected
E       AssertionError: assert Timestamp('1973-12-31 00:00:00') == Timestamp('-19999-12-31 00:00:00')

Thankfully, it is the only error:

=========================== short test summary info ============================
FAILED ../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/site-packages/pandas/tests/tseries/offsets/test_year.py::test_add_out_of_pydatetime_range - AssertionError: assert Timestamp('1973-12-31 00:00:00') == Timestamp('-19999-12-31 00:00:00')
= 1 failed, 172055 passed, 23928 skipped, 912 xfailed, 9 xpassed, 445 warnings in 791.84s (0:13:11) =
Test environment
    _libgcc_mutex:      0.1-conda_forge            conda-forge
    _openmp_mutex:      4.5-2_gnu                  conda-forge
    attrs:              23.1.0-pyh71513ae_1        conda-forge
    backports.zoneinfo: 0.2.1-py39hf3d152e_7       conda-forge
    boto3:              1.26.129-pyhd8ed1ab_0      conda-forge
    botocore:           1.29.129-pyhd8ed1ab_0      conda-forge
    brotlipy:           0.7.0-py39hb9d737c_1005    conda-forge
    bzip2:              1.0.8-h7f98852_4           conda-forge
    ca-certificates:    2022.12.7-ha878542_0       conda-forge
    certifi:            2022.12.7-pyhd8ed1ab_0     conda-forge
    cffi:               1.15.1-py39he91dace_3      conda-forge
    click:              8.1.3-unix_pyhd8ed1ab_2    conda-forge
    colorama:           0.4.6-pyhd8ed1ab_0         conda-forge
    coverage:           7.2.5-py39hd1e30aa_0       conda-forge
    cryptography:       40.0.2-py39h079d5ae_0      conda-forge
    exceptiongroup:     1.1.1-pyhd8ed1ab_0         conda-forge
    execnet:            1.9.0-pyhd8ed1ab_0         conda-forge
    hypothesis:         6.75.2-pyha770c72_0        conda-forge
    idna:               3.4-pyhd8ed1ab_0           conda-forge
    importlib-metadata: 6.6.0-pyha770c72_0         conda-forge
    iniconfig:          2.0.0-pyhd8ed1ab_0         conda-forge
    jmespath:           1.0.1-pyhd8ed1ab_0         conda-forge
    ld_impl_linux-64:   2.40-h41732ed_0            conda-forge
    libblas:            3.9.0-16_linux64_openblas  conda-forge
    libcblas:           3.9.0-16_linux64_openblas  conda-forge
    libffi:             3.4.2-h7f98852_5           conda-forge
    libgcc-ng:          12.2.0-h65d4601_19         conda-forge
    libgfortran-ng:     12.2.0-h69a702a_19         conda-forge
    libgfortran5:       12.2.0-h337968e_19         conda-forge
    libgomp:            12.2.0-h65d4601_19         conda-forge
    liblapack:          3.9.0-16_linux64_openblas  conda-forge
    libnsl:             2.0.0-h7f98852_0           conda-forge
    libopenblas:        0.3.21-pthreads_h78a6416_3 conda-forge
    libsqlite:          3.40.0-h753d276_1          conda-forge
    libstdcxx-ng:       12.2.0-h46fd767_19         conda-forge
    libuuid:            2.38.1-h0b41bf4_0          conda-forge
    libzlib:            1.2.13-h166bdaf_4          conda-forge
    ncurses:            6.3-h27087fc_1             conda-forge
    numpy:              1.24.3-py39h6183b62_0      conda-forge
    openssl:            3.1.0-hd590300_3           conda-forge
    packaging:          23.1-pyhd8ed1ab_0          conda-forge
    pandas:             2.0.1-py39hb9e473a_0       local      
    pip:                23.1.2-pyhd8ed1ab_0        conda-forge
    pluggy:             1.0.0-pyhd8ed1ab_5         conda-forge
    psutil:             5.9.5-py39h72bdee0_0       conda-forge
    pycparser:          2.21-pyhd8ed1ab_0          conda-forge
    pyopenssl:          23.1.1-pyhd8ed1ab_0        conda-forge
    pysocks:            1.7.1-pyha2e5f31_6         conda-forge
    pytest:             7.3.1-pyhd8ed1ab_0         conda-forge
    pytest-asyncio:     0.21.0-pyhd8ed1ab_0        conda-forge
    pytest-cov:         4.0.0-pyhd8ed1ab_0         conda-forge
    pytest-xdist:       3.2.1-pyhd8ed1ab_0         conda-forge
    python:             3.9.16-h2782a2a_0_cpython  conda-forge
    python-dateutil:    2.8.2-pyhd8ed1ab_0         conda-forge
    python-tzdata:      2023.3-pyhd8ed1ab_0        conda-forge
    python_abi:         3.9-3_cp39                 conda-forge
    pytz:               2023.3-pyhd8ed1ab_0        conda-forge
    readline:           8.2-h8228510_1             conda-forge
    s3transfer:         0.6.1-pyhd8ed1ab_0         conda-forge
    setuptools:         67.7.2-pyhd8ed1ab_0        conda-forge
    six:                1.16.0-pyh6c4a22f_0        conda-forge
    sortedcontainers:   2.4.0-pyhd8ed1ab_0         conda-forge
    tk:                 8.6.12-h27826a3_0          conda-forge
    toml:               0.10.2-pyhd8ed1ab_0        conda-forge
    tomli:              2.0.1-pyhd8ed1ab_0         conda-forge
    typing_extensions:  4.5.0-pyha770c72_0         conda-forge
    tzdata:             2023c-h71feb2d_0           conda-forge
    urllib3:            1.26.15-pyhd8ed1ab_0       conda-forge
    wheel:              0.40.0-pyhd8ed1ab_0        conda-forge
    xz:                 5.2.6-h166bdaf_0           conda-forge
    zipp:               3.15.0-pyhd8ed1ab_0        conda-forge
@h-vetinari
Copy link
Contributor Author

CC @lithomas1 (saw you worked on this in #51640)
CC @matusvalo (who helped with Cython 3 compat for scipy)

@h-vetinari
Copy link
Contributor Author

Windows has a couple more failures than unix:

=========================== short test summary info ===========================
FAILED ..\_test_env\lib\site-packages\pandas\tests\tseries\offsets\test_common.py::test_apply_out_of_range[tzlocal()-BQuarterEnd]
FAILED ..\_test_env\lib\site-packages\pandas\tests\tseries\offsets\test_common.py::test_apply_out_of_range[tzlocal()-BQuarterBegin]
FAILED ..\_test_env\lib\site-packages\pandas\tests\tseries\offsets\test_common.py::test_apply_out_of_range[tzlocal()-QuarterEnd]
FAILED ..\_test_env\lib\site-packages\pandas\tests\tseries\offsets\test_year.py::test_add_out_of_pydatetime_range
= 4 failed, 171914 passed, 24049 skipped, 912 xfailed, 8 xpassed, 445 warnings in 1371.60s (0:22:51) =

Failure for the first three looks like:


self = tzlocal(), dt = datetime.datetime(4507, 12, 1, 0, 0, tzinfo=tzlocal())

    def _naive_is_dst(self, dt):
        timestamp = _datetime_to_timestamp(dt)
>       return time.localtime(timestamp + time.timezone).tm_isdst
E       OSError: [Errno 22] Invalid argument

..\_test_env\lib\site-packages\dateutil\tz\tz.py:260: OSError

@jbrockmendel
Copy link
Member

Same behavior we xfailed for npdev

@lithomas1 lithomas1 added Testing pandas testing functions or related to the test suite Compat pandas objects compatability with Numpy or Python functions labels May 7, 2023
@matusvalo
Copy link
Contributor

@h-vetinari I have couple of questions:

  • Was this issue introduced in Cython 3.0b2 or already present in Cython 3.0b1.
  • Have you tried cython 3 master?

@h-vetinari
Copy link
Contributor Author

Was this issue introduced in Cython 3.0b2 or already present in Cython 3.0b1.

First time I'm testing it, no idea about the stat of 3.0.0b1, though we have builds of that in conda-forge and I could answer that if you want.

Have you tried cython 3 master?

Due to the separation of all the various build artefacts, it's hard to use Cython master (though not absolutely impossible, if something is urgent). Otherwise I'd retest with the next beta.

@matusvalo
Copy link
Contributor

First time I'm testing it, no idea about the stat of 3.0.0b1, though we have builds of that in conda-forge and I could answer that if you want.

I did not went in deep with pandas and Cython3 compatibility. So It would be great to know which is last version of Cython without issue.

@h-vetinari
Copy link
Contributor Author

So It would be great to know which is last version of Cython without issue.

On conda-forge infra, the latest released pandas (2.0.1) passes its test suite when compiled with the latest released Cython (0.29.34), see conda-forge/pandas-feedstock#163

@matusvalo
Copy link
Contributor

I double checked pandas with multiple cython versions:

  • Cython 0.29.X - build successful, test test_add_out_of_pydatetime_range passes.
  • Cython <= 3.0b9 - build fails (did not check why)
  • Cython 3.0b10 and 3.b11 build successful, test test_add_out_of_pydatetime_range fails
  • Cython master - build fails (caused by PR Keep extern visibility in within context of struct/union cython/cython#5386), when patched test test_add_out_of_pydatetime_range still fails

Additional observation of pandas codebase compatibility with Cython 3.0:

pandas code base was not migrated to new noexcept keyword introduced in Cython beta 1. This should not be major problem in general - generated code will check for exception when dedicated return value occurs. But this can be issue mainly for functions returning void and inline functions because:

  1. void function will be checked everytime when function is called - this is most affecting nogil functions - see Document the nongil except* performance pitfall cython/cython#5327
  2. inline function should be more performance critical, hence here I suppose we should avoid all overhead.

@matusvalo
Copy link
Contributor

matusvalo commented May 10, 2023

OK. I have found the culprit. The issue is here:

year = stamp.year + dy

stamp variables is declared as datetime and hence the Cython (in version 3) uses low level access to year attribute/property:

  • Cython 3
     __pyx_t_1 = __pyx_f_7cpython_8datetime_8datetime_4year_year(__pyx_v_stamp); if (unlikely(__pyx_t_1 == ((int)-1) && PyErr_Occurred())) __PYX_ERR(0, 4548, __pyx_L1_error)
  • Cython 0.29.X:
    __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_stamp), __pyx_n_s_year); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 4548, __pyx_L1_error)

The simple reproducer is following:

from cpython.datetime cimport datetime

from pandas import Timestamp
import numpy as np

cdef datetime ts = Timestamp(np.datetime64("-20000-12-31"))
ts2 = Timestamp(np.datetime64("-20000-12-31"))
print(ts.year)
print(ts2.year)
  • Cython 3:
    1972
    -20000
    
  • Cython 0.29.X:
    -20000
    -20000
    

So the workaround is simple - just remove datetime annotation from stamp variable definition.

Note: I was not investigating what is better/more correct behaviour for this use-case, neither I did not check if it is a bug in cython or not.

@matusvalo
Copy link
Contributor

I played a bit with the issue. I am not sure whether the bug is on Cython side. Cython is using datetime API which should not support dates with year set to -2000:

>>> datetime(year=-2000, month=12, day=31)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: year -2000 is out of range

@matusvalo
Copy link
Contributor

I have investigated another bit. Here is a function definition generated by Cython:

static CYTHON_INLINE int __pyx_f_7cpython_8datetime_8datetime_4year_year(PyDateTime_DateTime *__pyx_v_self) {
  int __pyx_r;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("year", 0);

  /* "cpython/datetime.pxd":112
 *         @property
 *         cdef inline int year(self):
 *             return PyDateTime_GET_YEAR(self)             # <<<<<<<<<<<<<<
 * 
 *         @property
 */
  __pyx_r = PyDateTime_GET_YEAR(((PyObject *)__pyx_v_self));
  goto __pyx_L0;

  /* "cpython/datetime.pxd":111
 *     ctypedef extern class datetime.datetime[object PyDateTime_DateTime]:
 *         @property
 *         cdef inline int year(self):             # <<<<<<<<<<<<<<
 *             return PyDateTime_GET_YEAR(self)
 * 
 */

  /* function exit code */
  __pyx_L0:;
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

Basically it returns data directly from PyDateTime_GET_YEAR which is declared in python header file: https://github.com/python/cpython/blob/25db95d224d18fb7b7f53165aeaa87632b0230f2/Include/datetime.h#L122

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions Testing pandas testing functions or related to the test suite
Projects
None yet
Development

No branches or pull requests

4 participants