API: str.cat will align on Series (#20347)
h-vetinari authored and TomAugspurger committed May 2, 2018
1 parent 3471b98 commit f851699
Showing 4 changed files with 729 additions and 86 deletions.
80 changes: 70 additions & 10 deletions doc/source/text.rst
@@ -247,27 +247,87 @@ Missing values on either side will result in missing values in the result as wel
s.str.cat(t)
s.str.cat(t, na_rep='-')
Series are *not* aligned on their index before concatenation:
Concatenating a Series and something array-like into a Series
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. versionadded:: 0.23.0

The parameter ``others`` can also be two-dimensional. In this case, the number of rows must match the length of the calling ``Series`` (or ``Index``).

.. ipython:: python
u = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
# without alignment
d = pd.concat([t, s], axis=1)
s
d
s.str.cat(d, na_rep='-')
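The two-dimensional case can be sketched as follows. This is a minimal illustration rather than part of the commit: the definitions of ``s`` and ``t`` are assumed here (they are set earlier in the file, outside this hunk), and ``arr`` is a hypothetical array introduced only for the example.

```python
import numpy as np
import pandas as pd

# Assumed definitions; the originals live earlier in text.rst.
s = pd.Series(['a', 'b', 'c', 'd'])
t = pd.Series(['a', 'b', np.nan, 'd'])

# A DataFrame with one row per element of s: each column is appended in turn.
d = pd.concat([t, s], axis=1)
from_frame = s.str.cat(d, na_rep='-')
print(from_frame.tolist())

# A two-dimensional ndarray (hypothetical data) is handled the same way.
arr = np.array([['1', 'x'], ['2', 'y'], ['3', 'z'], ['4', 'w']])
from_array = s.str.cat(arr)
print(from_array.tolist())
```

Because ``d`` shares the index of ``s`` and ``arr`` has no index at all, no alignment is involved in either call.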
Concatenating a Series and an indexed object into a Series, with alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. versionadded:: 0.23.0

For concatenation with a ``Series`` or ``DataFrame``, the indexes can be aligned before concatenation by setting
the ``join`` keyword.

.. ipython:: python
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
s
u
s.str.cat(u)
# with separate alignment
v, w = s.align(u)
v.str.cat(w, na_rep='-')
s.str.cat(u, join='left')
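To make the difference between positional and aligned concatenation concrete, here is a small sketch. It assumes ``s`` is the four-element Series used throughout this section. The ``u.to_numpy()`` call is only a way to force positional behavior regardless of the installed version's default for ``join``.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])  # assumed definition of s
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])

# Positional: an ndarray carries no index, so values are paired in order.
positional = s.str.cat(u.to_numpy())

# Aligned: u is reindexed to the index of s before concatenation.
aligned = s.str.cat(u, join='left')

# Equivalent manual alignment, then positional concatenation.
left, right = s.align(u, join='left')
manual = left.str.cat(right.to_numpy())

print(positional.tolist())
print(aligned.tolist())
```

The aligned and manual results should agree, which is the point of the ``join`` keyword: it folds the separate ``align`` step into ``str.cat`` itself.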
.. warning::

If the ``join`` keyword is not passed, :meth:`~Series.str.cat` currently falls back to the behavior before version 0.23.0 (i.e. no alignment),
but issues a ``FutureWarning`` if any of the involved indexes differ, since this default will change to ``join='left'`` in a future version.

The usual options are available for ``join`` (one of ``'left', 'outer', 'inner', 'right'``).
In particular, alignment also means that the lengths of the calling ``Series`` and of ``others`` no longer need to match.

.. ipython:: python
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])
s
v
s.str.cat(v, join='left', na_rep='-')
s.str.cat(v, join='outer', na_rep='-')
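As a rough side-by-side of the four ``join`` options, the sketch below runs the same concatenation with each of them. It assumes the same ``s`` as in the rest of this section; the ``results`` dictionary is just a convenience for the illustration.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])  # assumed definition of s
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

# Run the same call once per join option; na_rep fills labels missing on either side.
results = {
    how: s.str.cat(v, join=how, na_rep='-')
    for how in ('left', 'right', 'inner', 'outer')
}

for how, result in results.items():
    print(how, result.to_dict())
```

``'left'`` keeps the labels of ``s``, ``'right'`` those of ``v``, ``'inner'`` only the shared labels, and ``'outer'`` all labels from both.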
The same alignment can be used when ``others`` is a ``DataFrame``:

.. ipython:: python
f = d.loc[[3, 2, 1, 0], :]
s
f
s.str.cat(f, join='left', na_rep='-')
Concatenating a Series and many objects into a Series
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

List-likes (excluding iterators, ``dict``-views, etc.) can be arbitrarily combined in a list.
All elements of the list must match in length to the calling ``Series`` (resp. ``Index``):
All one-dimensional list-likes can be arbitrarily combined in a list-like container (including iterators, ``dict``-views, etc.):

.. ipython:: python
s
u
s.str.cat([u, pd.Index(u.values), ['A', 'B', 'C', 'D'], map(int, u.index)], na_rep='-')
All elements must match in length to the calling ``Series`` (or ``Index``), except those having an index if ``join`` is not ``None``:

.. ipython:: python
v
s.str.cat([u, v, ['A', 'B', 'C', 'D']], join='outer', na_rep='-')
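A self-contained variant of the mixed case might look like the sketch below. It assumes the definitions of ``s``, ``u`` and ``v`` used in this section. The plain list of letters is wrapped in ``np.array`` as a precaution, on the assumption that some later pandas releases no longer accept plain lists nested inside ``others``.

```python
import numpy as np
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])  # assumed definition of s
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

# Index-less element: must have the same length as s and is paired by position.
letters = np.array(['A', 'B', 'C', 'D'])

# u and v carry an index, so they are aligned and may differ in length.
mixed = s.str.cat([u, v, letters], join='outer', na_rep='-')
print(mixed.to_dict())
```

Only ``u`` and ``v`` take part in the alignment. ``letters`` is attached to the labels of ``s`` first, so it contributes ``'-'`` at the labels ``-1`` and ``4`` that ``s`` does not have.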
If using ``join='right'`` on a list of ``others`` that contains different indexes,
the union of these indexes will be used as the basis for the final concatenation:

.. ipython:: python
x = pd.Series([1, 2, 3, 4], index=['A', 'B', 'C', 'D'])
s.str.cat([['A', 'B', 'C', 'D'], s, s.values, x.index])
u.loc[[3]]
v.loc[[-1, 0]]
s.str.cat([u.loc[[3]], v.loc[[-1, 0]]], join='right', na_rep='-')
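The ``join='right'`` behavior described above can be checked with a short sketch, again assuming the section's definitions of ``s``, ``u`` and ``v``. The result is compared through ``to_dict()`` because the order of the resulting labels is not something this example relies on.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])  # assumed definition of s
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

u_part = u.loc[[3]]      # labels: 3
v_part = v.loc[[-1, 0]]  # labels: -1, 0

# The labels of the result are the union of the labels in u_part and v_part.
right = s.str.cat([u_part, v_part], join='right', na_rep='-')
print(right.to_dict())
```

Labels ``1`` and ``2`` of ``s`` are dropped because neither element of ``others`` contains them.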
Indexing with ``.str``
----------------------
18 changes: 18 additions & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -308,6 +308,24 @@ The :func:`DataFrame.assign` now accepts dependent keyword arguments for python

df.assign(A=df.A+1, C= lambda df: df.A* -1)

.. _whatsnew_0230.enhancements.str_cat_align:

``Series.str.cat`` has gained the ``join`` kwarg
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, :meth:`Series.str.cat` did not -- in contrast to most of ``pandas`` -- align :class:`Series` on their index before concatenation (see :issue:`18657`).
The method has now gained a keyword ``join`` to control the manner of alignment; see the examples below and :ref:`here <text.concatenate>`.

In v0.23, ``join`` defaults to ``None`` (meaning no alignment), but this default will change to ``'left'`` in a future version of pandas.

.. ipython:: python

s = pd.Series(['a', 'b', 'c', 'd'])
t = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
s.str.cat(t)
s.str.cat(t, join='left', na_rep='-')

Furthermore, :meth:`Series.str.cat` now works for ``CategoricalIndex`` as well (previously raised a ``ValueError``; see :issue:`20842`).

.. _whatsnew_0230.enhancements.astype_category:
