MINOR: [Docs][Python] Document type aliasing in pa.field/pa.schema (apache#44512)

### Rationale for this change

PyArrow supports a set of type aliases (e.g., "string" is an alias for `pa.string()`), and these aliases are resolved in calls to `pa.field` and `pa.schema`. Prior to this change, they weren't documented.

Note: I didn't think we wanted to deprecate these but if any reviewers want to discuss that let me know. The R package doesn't support a similar aliasing mechanism.

### What changes are included in this PR?

Updates to docs. One regression test.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Better docs.

Authored-by: Bryce Mecum <[email protected]>
Signed-off-by: Bryce Mecum <[email protected]>
amoeba authored Oct 23, 2024
1 parent 8eccbfe commit 2bbd67d
Showing 2 changed files with 33 additions and 2 deletions.
7 changes: 7 additions & 0 deletions python/pyarrow/tests/test_types.py
@@ -1153,6 +1153,13 @@ def test_field_basic():
pa.field('foo', None)


def test_field_datatype_alias():
    f = pa.field('foo', 'string')

    assert f.name == 'foo'
    assert f.type is pa.string()


def test_field_equals():
    meta1 = {b'foo': b'bar'}
    meta2 = {b'bizz': b'bazz'}
28 changes: 26 additions & 2 deletions python/pyarrow/types.pxi
@@ -3713,8 +3713,8 @@ def field(name, type=None, nullable=None, metadata=None):
Name of the field.
Alternatively, you can also pass an object that implements the Arrow
PyCapsule Protocol for schemas (has an ``__arrow_c_schema__`` method).
-type : pyarrow.DataType
-Arrow datatype of the field.
+type : pyarrow.DataType or str
+Arrow datatype of the field or a string matching one.
nullable : bool, default True
Whether the field's values are nullable.
metadata : dict, default None
@@ -3746,6 +3746,11 @@ def field(name, type=None, nullable=None, metadata=None):
>>> pa.struct([field])
StructType(struct<key: int32>)

A str can also be passed for the type parameter:

>>> pa.field('key', 'int32')
pyarrow.Field<key: int32>
"""
if hasattr(name, "__arrow_c_schema__"):
if type is not None:
@@ -5717,6 +5722,25 @@ def schema(fields, metadata=None):
some_int: int32
some_string: string

DataTypes can also be passed as strings. The following is equivalent to the
above example:

>>> pa.schema([
...     pa.field('some_int', "int32"),
...     pa.field('some_string', "string")
... ])
some_int: int32
some_string: string

Or more concisely:

>>> pa.schema([
...     ('some_int', "int32"),
...     ('some_string', "string")
... ])
some_int: int32
some_string: string

Returns
-------
schema : pyarrow.Schema