REF: simplify CSVFormatter (pandas-dev#36046)
* REF: extract properties cols and has_mi_columns

* REF: extract property chunksize

* REF: extract property quotechar

* REF: extract properties data_index and nlevels

* REF: refactor _save_chunk

* REF: refactor _save

* REF: extract method _save_body

* REF: reorder _save-like methods

* REF: extract compression property

* REF: extract property index_label

* REF: extract helper properties

* REF: delete local variables in _save_header

* REF: extract method _get_header_rows

* REF: move check for header into _save function

* TYP: add several type annotations

* FIX: fix index labels

* FIX: fix multiindex

* FIX: fix test failures on compression

Needed to eliminate compression setter
due to the interdependencies between ioargs and compression.

* REF: eliminate preallocation of self.data

* REF: extract method _convert_to_native_types

* REF: rename regular -> flat as reviewed

* TYP: add type annotations as reviewed

* REF: refactor number formatting

Replace the _convert_to_native_types method
with a number formatting dictionary.

* FIX: mypy error with index_label

* FIX: reorder if-statements in index_label

To make sure that the newer mypy (v0.782) passes.

* TYP: move IndexLabel to pandas._typing

This eliminates repetition of the type annotations
for index label in multiple places.

* TYP: quotechar, has_mi_columns, _need_to_save...

* TYP: chunksize, but ignored assignment check

For some reason mypy does not recognize that chunksize
is narrowed from Optional[int] to int inside the setter.
Even adding an explicit assertion
``assert chunksize is not None``
does not help.

* TYP: cols property

Limitations:
 - The mypy [assignment] error is ignored.
 - An additional method, _refine_cols, was created to allow
 conversion from Optional[Sequence[Label]] to Sequence[Label].

* TYP: nlevels and _has_aliases

* CLN: move GH21227 check to pandas/io/common.py

* TYP: remove redundant bool from IndexLabel type

* TYP: add to _get_index_label... methods

* TYP: use Iterator instead of Generator

* TYP: explicitly use List type

* TYP: correct dict typing

* TYP: remaining properties
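
The chunksize item above describes a property whose setter accepts `Optional[int]` while the getter returns `int`. The sketch below illustrates that pattern only. `ChunkedWriter` is a hypothetical stand-in and not the pandas class, and the default of roughly 100000 cells per chunk is an assumption made for illustration.

```python
from typing import Optional, Sequence


class ChunkedWriter:
    """Hypothetical stand-in for a CSV formatter; not the pandas class."""

    def __init__(self, cols: Sequence[str], chunksize: Optional[int] = None) -> None:
        self.cols = cols
        # The setter takes Optional[int] but the getter returns int, so some
        # mypy versions flag this assignment; hence the ignore comment.
        self.chunksize = chunksize  # type: ignore[assignment]

    @property
    def chunksize(self) -> int:
        return self._chunksize

    @chunksize.setter
    def chunksize(self, chunksize: Optional[int]) -> None:
        if chunksize is None:
            # Assumed default: about 100000 cells per chunk, at least 1 row.
            chunksize = (100000 // (len(self.cols) or 1)) or 1
        self._chunksize = int(chunksize)
```

With this design the `None` default is resolved once, at assignment time, so callers reading `chunksize` never have to handle `None`.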
ivanovmg authored and Kevin D Smith committed Nov 2, 2020
1 parent a00ca8e commit fbebcc8
Showing 4 changed files with 202 additions and 179 deletions.
2 changes: 2 additions & 0 deletions pandas/_typing.py
@@ -15,6 +15,7 @@
     List,
     Mapping,
     Optional,
+    Sequence,
     Type,
     TypeVar,
     Union,
@@ -82,6 +83,7 @@

 Axis = Union[str, int]
 Label = Optional[Hashable]
+IndexLabel = Optional[Union[Label, Sequence[Label]]]
 Level = Union[Label, int]
 Ordered = Optional[bool]
 JSONSerializable = Optional[Union[PythonScalar, List, Dict]]
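
As a rough illustration of what the new `IndexLabel` alias admits, the sketch below redefines `Label` and `IndexLabel` locally, mirroring the lines in this hunk, and normalizes a value to a list of labels. The helper `as_label_list` is hypothetical and is not part of pandas.

```python
from typing import Hashable, List, Optional, Sequence, Union

# Local copies of the aliases shown in the hunk above.
Label = Optional[Hashable]
IndexLabel = Optional[Union[Label, Sequence[Label]]]


def as_label_list(index_label: IndexLabel) -> List[Label]:
    """Hypothetical helper: normalize an IndexLabel to a list of labels."""
    if index_label is None:
        return []
    # A str is itself a Sequence, so only list and tuple count as "many labels".
    if isinstance(index_label, (list, tuple)):
        return list(index_label)
    return [index_label]
```

A single alias keeps every signature that takes an index label in sync, which is the stated reason for moving it to `pandas._typing`.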
3 changes: 2 additions & 1 deletion pandas/core/generic.py
@@ -40,6 +40,7 @@
     CompressionOptions,
     FilePathOrBuffer,
     FrameOrSeries,
+    IndexLabel,
     JSONSerializable,
     Label,
     Level,
@@ -3160,7 +3161,7 @@ def to_csv(
         columns: Optional[Sequence[Label]] = None,
         header: Union[bool_t, List[str]] = True,
         index: bool_t = True,
-        index_label: Optional[Union[bool_t, str, Sequence[Label]]] = None,
+        index_label: IndexLabel = None,
         mode: str = "w",
         encoding: Optional[str] = None,
         compression: CompressionOptions = "infer",
15 changes: 15 additions & 0 deletions pandas/io/common.py
@@ -208,6 +208,21 @@ def get_filepath_or_buffer(
     # handle compression dict
     compression_method, compression = get_compression_method(compression)
     compression_method = infer_compression(filepath_or_buffer, compression_method)
+
+    # GH21227 internal compression is not used for non-binary handles.
+    if (
+        compression_method
+        and hasattr(filepath_or_buffer, "write")
+        and mode
+        and "b" not in mode
+    ):
+        warnings.warn(
+            "compression has no effect when passing a non-binary object as input.",
+            RuntimeWarning,
+            stacklevel=2,
+        )
+        compression_method = None
+
     compression = dict(compression, method=compression_method)

     # bz2 and xz do not write the byte order mark for utf-16 and utf-32
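
To make the GH21227 condition easier to try in isolation, here is a minimal standalone sketch of the same check. The function name `drop_compression_for_text_handle` and its signature are hypothetical; only the condition and the warning text are taken from the hunk above.

```python
import io
import warnings
from typing import Any, Optional


def drop_compression_for_text_handle(
    compression_method: Optional[str],
    filepath_or_buffer: Any,
    mode: Optional[str],
) -> Optional[str]:
    """Hypothetical standalone version of the check in the hunk above."""
    if (
        compression_method
        and hasattr(filepath_or_buffer, "write")  # a handle, not a path
        and mode
        and "b" not in mode  # text mode
    ):
        warnings.warn(
            "compression has no effect when passing a non-binary object as input.",
            RuntimeWarning,
            stacklevel=2,
        )
        return None
    return compression_method


# A text-mode handle triggers the warning and disables compression.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = drop_compression_for_text_handle("gzip", io.StringIO(), "w")
```

A plain path string has no `write` attribute, and a binary handle is opened with `"b"` in its mode, so both keep their compression method under this check.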