decouple pandera and pandas dtypes (#559)
* refactor PandasDtype into class hierarchy supported by engines

* refactor DataFrameSchema based on DataType hierarchy

* refactor SchemaModel based on DataType hierarchy

* revert fix coerce=True and dtype=None should be a noop

* apply code style

* fix running tests/core with nox

* consolidate dtype names

* consolidate engine internal naming

* disable inherited __init__ with immutable(init=False)

* delete duplicated immutable

* disambiguate dtype variables

* add warning on base pandas_engine, numpy_engine.DataType init

* fix pylint, mypy errors

* fix DataFrameSchema.dtypes return type

* enable CI on dtypes branch

* Refactor inference, schema_statistics, strategies and io using the DataType hierarchy (#504)

* fix pandas_engine.Interval

* fix Timedelta64 registration with pandas_engine.Engine

* add DataType helpers

* add DataType.continuous attribute

* add dtypes.is_numeric

* refactor schema_statistics based on DataType hierarchy

* refactor schema_inference based on DataType hierarchy

* fix numpy_engine.Timedelta64.type

* add is_subdtype helper

* add Engine.get_registered_dtypes

* fix Engine error when registering a base DataType

* fix pandas_engine DateTime string alias

* clean up test_dtypes

* fix test_extensions

* refactor strategies based on DataType hierarchy

* refactor io based on DataType hierarchy

* replace dtypes module by new DataType hierarchy

* fix black

* delete dtypes_.py

* drop legacy pandas and python 3.6 from CI

* fix mypy errors

* fix ci-docs

* fix conda dependencies

* fix lint, update noxfile

* simplify nox tests, fix test_io

* update ci build

* update nox

* pin nox, handle windows data types

* fix windows platform

* fix pandas_engine on windows platform

* fix test_dtypes on windows platform

* force pip on docs CI

* test out windows dtype stuff

* more messing around with windows

* more debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* revert ci

* increase cache

* testing

Co-authored-by: cosmicBboy <[email protected]>

* Add DataTypes documentation (#536)

* delete print statements

* pin furo

* fix generated docs not removed by nox

* re-organize API section

* replace aliased pandas_engine data types with their aliases

* drop warning when calling Engine.register_dtype without arguments

* add data types to api reference doc

* add document for DataType refactor

* unpin sphinx and drop sphinx_rtd_theme

* add xdoctest

* ignore prompt when copying example from doc

* add doctest builder when running sphinx-build locally

* fix dtypes doc examples

* fix pandas_engine.DataType.check

* fix pylint

* remove whitespaces in dtypes doc

* Update docs/source/dtypes.rst

* Update dtypes.rst

* update docs structure

* update nox file

* force pip on doctests

* update test_schemas

* fix docs session not overriding html with doctest output

Co-authored-by: Niels Bantilan <[email protected]>

* add deprecation warnings for pandas_dtype and PandasDtype enum (#547)

* remove auto-generated docs

* add deprecation warnings, support pandas>=1.3.0

* add deprecation warnings for PandasDtype enum

* fix sphinx

* fix windows

* fix windows

* add support for pyarrow backed string data type (#548)

* add support for pyarrow backed string data type

* fix regression for pandas < 1.3.0

* add verbosity to test run

* loosen strategies unit tests deadline, exclude windows ci

* loosen test_strategies.py tests

* use "dev" hypothesis profile for python 3.7

* add pandas==1.2.5 test

* fix ci

* ci typo

* don't install environment.yml on unit tests

* install nox in ci

* remove environment.yml

* update environment in ci

Co-authored-by: cosmicBboy <[email protected]>

Co-authored-by: Jean-Francois Zinque <[email protected]>
cosmicBboy and Jean-Francois Zinque committed Jul 22, 2021
1 parent 3a1436a commit 47c7f86
Showing 59 changed files with 4,326 additions and 2,738 deletions.
51 changes: 36 additions & 15 deletions .github/workflows/ci-tests.yml
@@ -2,16 +2,18 @@ name: CI Tests
  on:
  push:
  branches:
- - master
- - dev
- - bugfix
- - 'release/*'
+ - master
+ - dev
+ - bugfix
+ - "release/*"
+ - dtypes
  pull_request:
  branches:
- - master
- - dev
- - bugfix
- - 'release/*'
+ - master
+ - dev
+ - bugfix
+ - "release/*"
+ - dtypes

  env:
  DEFAULT_PYTHON: 3.8
@@ -71,7 +73,7 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- python-version: ["3.6", "3.7", "3.8", "3.9"]
+ python-version: ["3.7", "3.8", "3.9"]
  defaults:
  run:
  shell: bash -l {0}
@@ -133,16 +135,21 @@ jobs:
  tests:
  name: >
- CI Tests (${{ matrix.python-version }},
- ${{ matrix.os }},
- pandas-${{ matrix.pandas-version }})
+ CI Tests (${{ matrix.python-version }}, ${{ matrix.os }}, pandas-${{ matrix.pandas-version }})
  runs-on: ${{ matrix.os }}
  strategy:
  fail-fast: false
  matrix:
  os: ["ubuntu-latest", "macos-latest", "windows-latest"]
- python-version: ["3.6", "3.7", "3.8", "3.9"]
- pandas-version: ["latest", "0.25.3"]
+ python-version: ["3.7", "3.8", "3.9"]
+ pandas-version: ["1.2.5", "latest"]
+ # exclude these configurations until issue tracked here is fixed:
+ # https://github.com/pandera-dev/pandera/issues/555
+ exclude:
+ - os: windows-latest
+ python-version: "3.7"
+ - os: windows-latest
+ python-version: "3.9"

  defaults:
  run:
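
Note on the hunk above: the new `exclude` entries name only `os` and `python-version`, so each one drops every `pandas-version` variant of that pairing. The following is a minimal Python sketch of that expansion, not code from this commit. The `expand` helper is hypothetical, and it assumes GitHub Actions' documented partial-match behaviour for `exclude`.

```python
from itertools import product

# Axis values copied from the updated workflow matrix.
matrix = {
    "os": ["ubuntu-latest", "macos-latest", "windows-latest"],
    "python-version": ["3.7", "3.8", "3.9"],
    "pandas-version": ["1.2.5", "latest"],
}

# Exclusion rules copied from the workflow. Assumption: a rule removes every
# combination that matches all of the keys it names (partial match).
exclude = [
    {"os": "windows-latest", "python-version": "3.7"},
    {"os": "windows-latest", "python-version": "3.9"},
]


def expand(matrix, exclude):
    """Hypothetical helper: list the job configurations left after exclusion."""
    keys = list(matrix)
    combos = [dict(zip(keys, values)) for values in product(*matrix.values())]
    return [
        combo
        for combo in combos
        if not any(
            all(combo[key] == value for key, value in rule.items())
            for rule in exclude
        )
    ]


jobs = expand(matrix, exclude)
windows_jobs = [job for job in jobs if job["os"] == "windows-latest"]
print(len(jobs), len(windows_jobs))
```

Under these assumptions the 18 raw combinations shrink to 14 jobs, and Windows runs only on Python 3.8 (once per pandas version).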
@@ -174,6 +181,13 @@ jobs:
  use-only-tar-bz2: true
  auto-activate-base: false

+ - name: Install pandas
+ run: |
+ if [[ "${{ matrix.pandas-version }}" != 'latest' ]]
+ then
+ mamba install -c conda-forge pandas==${{ matrix.pandas-version }}
+ fi
  - name: Conda info
  run: |
  conda info
@@ -210,9 +224,16 @@ jobs:
  - name: Upload coverage to Codecov
  uses: "codecov/codecov-action@v1"

+ - name: Check Docstrings
+ run: >
+ nox
+ -db conda -r -v
+ --non-interactive
+ --session "doctests-${{ matrix.python-version }}"
  - name: Check Docs
  run: >
  nox
  -db conda -r -v
  --non-interactive
- --session "docs-${{ matrix.python-version }}(pandas='${{ matrix.pandas-version }}')"
+ --session "docs-${{ matrix.python-version }}"
2 changes: 1 addition & 1 deletion .gitignore
@@ -113,7 +113,7 @@ venv.bak/
  /asv_bench/results/

  # Docs
- docs/source/generated
+ docs/source/reference/generated

  # Nox
  .nox
2 changes: 1 addition & 1 deletion Makefile
@@ -21,7 +21,7 @@ requirements:
  pip install -r requirements-dev.txt

  docs:
- rm -rf docs/source/generated && \
+ rm -rf docs/**/generated docs/**/methods docs/_build && \
  python -m sphinx -E "docs/source" "docs/_build" -W && \
  make -C docs doctest

167 changes: 0 additions & 167 deletions docs/source/API_reference.rst

This file was deleted.

@@ -2,17 +2,6 @@

  .. currentmodule:: {{ module }}

- .. autoclass:: PandasDtype
- :show-inheritance:
- :exclude-members:
-
- .. autoattribute:: str_alias
- .. automethod:: from_str_alias
- .. automethod:: from_pandas_api_type
-
-
-
-
  .. autoclass:: {{ objname }}

  {% block attributes %}
@@ -37,15 +26,16 @@
:nosignatures:
:toctree: methods

{% for item in methods %}
{%- if item not in inherited_members %}
~{{ name }}.{{ item }}
{%- endif %}
{%- endfor %}
{% endif %}
{# Ignore the DateTime alias to avoid `WARNING: document isn't included in any toctree`#}
{% if objname != "DateTime" %}
{% for item in methods %}
~{{ name }}.{{ item }}
{%- endfor %}

{%- if '__call__' in members %}
~{{ name }}.__call__
{%- if members and '__call__' in members %}
~{{ name }}.__call__
{%- endif %}
{%- endif %}

{%- endif %}
{% endblock %}
25 changes: 0 additions & 25 deletions docs/source/_templates/pandas_dtype_class.rst

This file was deleted.

7 changes: 6 additions & 1 deletion docs/source/conf.py
@@ -162,7 +162,7 @@
  .. role:: green
  """

- autosummary_generate = ["API_reference.rst"]
+ autosummary_generate = True
  autosummary_filename_map = {
  "pandera.Check": "pandera.Check",
  "pandera.check": "pandera.check_decorator",
@@ -174,6 +174,11 @@
  "pandas": ("http://pandas.pydata.org/pandas-docs/stable/", None),
  }

+ # strip prompts
+ copybutton_prompt_text = (
+ r">>> |\.\.\. |\$ |In \[\d*\]: | {2,5}\.\.\.: | {5,8}: "
+ )
+ copybutton_prompt_is_regexp = True

  # this is a workaround to filter out forward reference issue in
  # sphinx_autodoc_typehints
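
Note on the hunk above: the `copybutton_prompt_text` pattern matches Python, shell, and IPython prompts at the start of a line. The sketch below uses Python's `re` module to approximate what a reader would get when copying a doctest-style example. It is not sphinx-copybutton's own code (the extension does this in the browser), `strip_prompts` is a hypothetical helper, and dropping lines without a prompt is assumed to match the extension's default behaviour.

```python
import re

# Pattern copied verbatim from the conf.py change above.
PROMPT = re.compile(r">>> |\.\.\. |\$ |In \[\d*\]: | {2,5}\.\.\.: | {5,8}: ")


def strip_prompts(block: str) -> str:
    """Hypothetical helper: keep only prompt lines, with the prompt removed.

    Assumption: lines that do not start with a prompt (for example doctest
    output) are dropped entirely.
    """
    kept = []
    for line in block.splitlines():
        match = PROMPT.match(line)  # match() anchors at the start of the line
        if match:
            kept.append(line[match.end():])
    return "\n".join(kept)


example = "\n".join(
    [
        ">>> total = sum(",
        "...     [1, 2, 3]",
        "... )",
        ">>> total",
        "6",
    ]
)

copied = strip_prompts(example)
print(copied)
```

In this approximation the `>>> ` and `... ` prefixes are removed, continuation-line indentation is preserved, and the output line `6` is not copied.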
4 changes: 2 additions & 2 deletions docs/source/data_synthesis_strategies.rst
@@ -4,8 +4,8 @@

  .. _data synthesis strategies:

- Data Synthesis Strategies (new)
- ===============================
+ Data Synthesis Strategies
+ =========================

  *new in 0.6.0*
