DOCS: Updated NDFrame.astype docs #17203

Merged
merged 5 commits into from
Aug 9, 2017
50 changes: 48 additions & 2 deletions pandas/core/generic.py
@@ -3610,8 +3610,7 @@ def blocks(self):
mapping={True: 'raise', False: 'ignore'})
def astype(self, dtype, copy=True, errors='raise', **kwargs):
"""
Cast object to input numpy.dtype
Return a copy when copy = True (be really careful with this!)
Cast a pandas object to a specified dtype ``dtype``.

Parameters
----------
@@ -3620,6 +3619,10 @@ def astype(self, dtype, copy=True, errors='raise', **kwargs):
the same type. Alternatively, use {col: dtype, ...}, where col is a
column label and dtype is a numpy.dtype or Python type to cast one
or more of the DataFrame's columns to column-specific types.
copy : bool, default True.
Return a copy when ``copy=True`` (be very careful setting
``copy=False`` as changes to values then may propagate to other
pandas objects).
Member

Maybe add something like ", and copy=False only has an effect when the specified dtype is equivalent to the existing dtype"?

Contributor Author

Hmm, I deliberately made it clearer but not too specific, as I don't really know how copy works internally, and I doubt it is used very often.

I suggest leaving it as it is. Is that OK?

Member

yes, that is fine

errors : {'raise', 'ignore'}, default 'raise'.
Control raising of exceptions on invalid data for provided dtype.

@@ -3636,6 +3639,49 @@ def astype(self, dtype, copy=True, errors='raise', **kwargs):
Returns
-------
casted : type of caller

Contributor

Could add a See Also entry for numpy.ndarray.astype here.

Contributor Author

Ok, done.

Examples
Member

Let's provide an example using the copy argument given that it says the parameter should be handled with care.

Contributor Author

OK, done. I added an example with copy=False, where a change to the result propagates back to the original object.

I could only get it to work with categoricals and not numpy dtypes, so the example is a bit contrived.

Contributor

copy is not really that useful here, but OK since you have already done it.

Contributor Author

Yeah, especially if copy=False has no effect with numpy dtypes.

Unless someone can find an effect on columns with numpy dtypes, I wouldn't mind pulling this out again, as my example is maybe a bit silly.

Member

For me it works as well with numpy dtypes:

In [77]: s1 = pd.Series([1,2])

In [78]: s2 = s1.astype('int', copy=False)

In [79]: s2[0] = 10

In [80]: s1
Out[80]: 
0    10
1     2
dtype: int64

It's just that the requested dtype needs to be equivalent to the existing one (otherwise astype always takes a copy).

So I would change the example.

Contributor Author

Ok, I've changed it to your example.
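
The point made in this thread, that copy=False only avoids a copy when the requested dtype is equivalent to the existing one, can be checked without mutating anything by asking NumPy whether the two Series share a buffer. This is a minimal sketch, not part of the PR. It assumes a pandas version whose astype still accepts the copy keyword (newer releases may warn that it is deprecated), and the variable names are illustrative. With Copy-on-Write enabled, a write to the new Series is not expected to propagate to the original even when the buffers start out shared.

```python
import warnings

import numpy as np
import pandas as pd

s1 = pd.Series([1, 2], dtype="int64")

with warnings.catch_warnings():
    # Assumption: newer pandas releases may warn that ``copy`` is deprecated.
    warnings.simplefilter("ignore")
    same_dtype = s1.astype("int64", copy=False)     # dtype unchanged
    other_dtype = s1.astype("float64", copy=False)  # dtype changes, new data needed

# Compare the underlying buffers instead of mutating the Series.
shares_same = np.shares_memory(s1.to_numpy(), same_dtype.to_numpy())
shares_other = np.shares_memory(s1.to_numpy(), other_dtype.to_numpy())
print(shares_same, shares_other)
```

Checking buffers rather than assigning into the result keeps the sketch independent of whether writes propagate in a given pandas version.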

--------
>>> ser = pd.Series([1, 2], dtype='int32')
>>> ser
0    1
1    2
dtype: int32
>>> ser.astype('int64')
0    1
1    2
dtype: int64

Convert to categorical type:

>>> ser.astype('category')
0    1
1    2
dtype: category
Categories (2, int64): [1, 2]

Convert to ordered categorical type with custom ordering:

>>> ser.astype('category', ordered=True, categories=[2, 1])
0    1
1    2
dtype: category
Categories (2, int64): [2 < 1]

Note that using ``copy=False`` and changing data on a new
pandas object may propagate changes:

>>> s1 = pd.Series([1,2])
>>> s2 = s1.astype('int', copy=False)
>>> s2[0] = 10
>>> s1 # note that s1[0] has changed too
0    10
1     2
dtype: int64

See also
--------
numpy.ndarray.astype : Cast a numpy array to a specified type.
"""
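
A note on the ordered-categorical example in the docstring above: later pandas releases deprecated passing categories and ordered directly to astype in favour of a CategoricalDtype instance. The following is a hedged sketch of the equivalent conversion under that assumption. It is not part of this PR, and the variable names are illustrative.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

ser = pd.Series([1, 2], dtype="int32")

# Describe the categories and their ordering once, then cast to that dtype.
ordered_dtype = CategoricalDtype(categories=[2, 1], ordered=True)
cat = ser.astype(ordered_dtype)

print(list(cat.cat.categories))  # category order as declared
print(cat.cat.ordered)           # whether comparisons respect that order
print(cat.min())                 # smallest value under the custom ordering
```

Because the declared order is 2 < 1, the minimum under this ordering is 2 rather than 1.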
if is_dict_like(dtype):
if self.ndim == 1: # i.e. Series
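
The {col: dtype, ...} form described under the dtype parameter, which the is_dict_like(dtype) branch handles, can be sketched as follows. This is an illustrative example with made-up column names, not code from the PR. Columns not named in the mapping are expected to keep their existing dtype.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["3", "4"], "c": [0.5, 1.5]})

# Cast only the listed columns; "c" is not mentioned and is left alone.
converted = df.astype({"a": "float64", "b": "int64"})

print(converted.dtypes)
```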