Fix package dependency issue and python 3.12 support #984

Merged 3 commits on Apr 3, 2024
44 changes: 31 additions & 13 deletions .github/workflows/ci.yml
@@ -15,26 +15,39 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python: ["3.8", "3.9"]
+        python: ["3.8", "3.9", "3.10", "3.11", "3.12"]
         os: [ubuntu-latest, macos-latest, windows-latest]
-        include:
-          - os: ubuntu-latest
-            install_graphviz:
-              sudo apt install graphviz graphviz-dev
-          - os: macos-latest
-            install_graphviz: brew install graphviz
-          - os: windows-latest
-            install_graphviz:
-              choco install graphviz --version=2.48.0;
-              poetry run pip install --global-option=build_ext --global-option="-IC:\Program Files\Graphviz\include" --global-option="-LC:\Program Files\Graphviz\lib" pygraphviz;
     runs-on: ${{ matrix.os }}
     steps:
-      - uses: actions/checkout@v2
+      - name: Checkout
+        uses: actions/checkout@v2

       - uses: actions/setup-python@v2
         with:
           python-version: ${{ matrix.python }}

+      - name: "Windows Graphviz install"
+        if: runner.os == 'Windows'
+        uses: crazy-max/ghaction-chocolatey@v3
+        with:
+          args: -h
+
+      - name: Install Graphviz for Windows
+        if: runner.os == 'Windows'
+        run: |
+          choco install graphviz --version=2.49.3
+
+      - name: Install pygraphviz for Windows
+        if: runner.os == 'Windows'
+        run: |
+          python -m pip install --use-pep517 --config-settings="--global-option=build_ext" --config-settings="--global-option=-IC:\\Program Files\\Graphviz\\include" --config-settings="--global-option=-LC:\\Program Files\\Graphviz\\lib" pygraphviz
+
+      - name: Install Graphviz for other platforms
+        if: runner.os != 'Windows'
+        uses: ts-graphviz/setup-graphviz@v2
+        with:
+          macos-skip-brew-update: 'true'
+
       - name: Cache venv
         uses: actions/cache@v2
         with:
@@ -47,7 +60,7 @@ jobs:
           ${{ matrix.install_graphviz }}
           echo "Cache Version ${{ secrets.CACHE_VERSION }}"
           poetry install
-          poetry run pip install ERAlchemy
+          poetry run pip install ERAlchemy2
           poetry config --list

       - name: Print tool versions
@@ -95,6 +108,9 @@ jobs:
     steps:
       - uses: actions/checkout@v2

+      - name: Setup Graphviz
+        uses: ts-graphviz/[email protected]
+
       - name: Install dependencies
         run: |
           pip install poetry
@@ -110,6 +126,8 @@ jobs:
         run: |
           pip install poetry
           poetry install
+          poetry run pip install ERAlchemy2
+

       - name: Build docs
         run: poetry run sphinx-build -M html docs/source docs/build
3 changes: 2 additions & 1 deletion dataprep/__init__.py
@@ -4,10 +4,11 @@

 Dataprep let you prepare your data using a single library with a few lines of code.
 """
+
 import logging

 DEFAULT_PARTITIONS = 1

 logging.basicConfig(level=logging.INFO, format="%(message)s")

-__version__ = "0.4.4"
+__version__ = "0.4.6"
1 change: 1 addition & 0 deletions dataprep/clean/address_utils.py
@@ -1,6 +1,7 @@
 """
 Constants used by the clean_address() and validate_address() functions
 """
+
 # pylint: disable=C0301, C0302, E1101

 from builtins import zip
1 change: 1 addition & 0 deletions dataprep/clean/clean_ad_nrt.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing
 Andorra NRT (Número de Registre Tributari, Andorra tax number).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument, E1101, E1133
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_al_nipt.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing
 NIPT (Numri i Identifikimit për Personin e Tatueshëm, Albanian VAT number).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument, E1101, E1133
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ar_cbu.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing
 CBU (Clave Bancaria Uniforme, Argentine bank account number).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument, E1101, E1133
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ar_cuit.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing
 CUIT (Código Único de Identificación Tributaria, Argentinian tax number).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument, E1101, E1133
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ar_dni.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing
 DNI (Documento Nacional de Identidad, Argentinian national identity nr.).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument, E1101, E1133
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_at_uid.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing
 UID (Umsatzsteuer-Identifikationsnummer, Austrian VAT number).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument, E1101, E1133
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_at_vnr.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing
 VNR, SVNR, VSNR (Versicherungsnummer, Austrian social security number).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument, E1101, E1133
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_au_abn.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Australian Business Numbers (ABNs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_au_acn.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Australian Company Numbers (ACNs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_au_tfn.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Australian Tax File Numbers (TFNs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_be_iban.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Belgian IBANs.
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_be_vat.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Belgian VAT numbers (VATs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_bg_egn.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Bulgarian national identification numbers (EGNs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_bg_pnf.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Bulgarian personal number of a foreigner.
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_bg_vat.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Bulgarian VAT numbers (VATs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_bic.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing ISO 9362 Business identifier codes.
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_bitcoin.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Bitcoin Addresses.
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_br_cnpj.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing CNPJ numbers, Brazilian company identifier.
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_br_cpf.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing CPF numbers, Brazilian national identifier.
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_by_unp.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Belarusian UNP numbers (UNPs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ca_bn.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Canadian Business Numbers (BNs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ca_sin.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Canadian Social Insurance Numbers(SINs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_casrn.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing CAS Registry Numbers.
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches, unused-argument
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ch_esr.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Swiss EinzahlungsSchein mit Referenznummer (ESRs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ch_ssn.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Swiss social security numbers (SSNs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ch_uid.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Swiss business identifiers (UIDs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_ch_vat.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Swiss VAT numbers (VATs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_cl_rut.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Chile RUT/RUN numbers (RUTs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_cn_ric.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Chinese Resident Identity Card Number (RICs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_cn_uscc.py
@@ -2,6 +2,7 @@
 Clean and validate a DataFrame column containing Chinese Unified Social Credit Code
 (China tax number) (USCCs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_co_nit.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Colombian identity codes (NITs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
5 changes: 2 additions & 3 deletions dataprep/clean/clean_country.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing country names.
 """
+
 from functools import lru_cache
 from operator import itemgetter
 from os import path
@@ -371,9 +372,7 @@ def _get_format_if_allowed(input_format: str, allowed_formats: Tuple[str, ...])
         return (
             "name"
             if "name" in allowed_formats
-            else "official"
-            if "official" in allowed_formats
-            else None
+            else "official" if "official" in allowed_formats else None
         )

     return input_format if input_format in allowed_formats else None
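Review note on the `clean_country.py` hunk above: the change to the nested conditional looks like a pure re-layout (possibly from a newer code formatter, though the PR does not say so). Python conditional expressions group right to left, so both layouts should parse as `"name" if ... else ("official" if ... else None)`. A minimal sketch to check that, using two hypothetical helper names that are not part of dataprep:

```python
from typing import Optional, Tuple


def pick_format_old_layout(allowed_formats: Tuple[str, ...]) -> Optional[str]:
    # Hypothetical helper: the expression as laid out before this PR,
    # one clause per line.
    return (
        "name"
        if "name" in allowed_formats
        else "official"
        if "official" in allowed_formats
        else None
    )


def pick_format_new_layout(allowed_formats: Tuple[str, ...]) -> Optional[str]:
    # Hypothetical helper: the expression as laid out after this PR,
    # with the inner conditional kept on a single line.
    return (
        "name"
        if "name" in allowed_formats
        else "official" if "official" in allowed_formats else None
    )


# Both layouts are the same expression, so they must agree on every input.
for allowed in [("name", "official"), ("official",), ("alpha-2",), ()]:
    assert pick_format_old_layout(allowed) == pick_format_new_layout(allowed)
```

If the loop finishes without an `AssertionError`, the two layouts agree on these inputs, which is consistent with the hunk being behavior-preserving.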
1 change: 1 addition & 0 deletions dataprep/clean/clean_cr_cpf.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Costa Rica physical person ID number (CPFs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_cr_cpj.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Costa Rica tax number (CPJs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_cr_cr.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Costa Rica foreigners ID number (CRs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter
1 change: 1 addition & 0 deletions dataprep/clean/clean_cu_ni.py
@@ -1,6 +1,7 @@
 """
 Clean and validate a DataFrame column containing Cuban identity card numbers (NIs).
 """
+
 # pylint: disable=too-many-lines, too-many-arguments, too-many-branches
 from typing import Any, Union
 from operator import itemgetter