pyarrow: Check compatibility of pyarrow.array and pyarrow.table with numeric and timestamp types #2864
Install pyarrow as an optional dependency, and ensure that pyarrow.array objects with int64 and timestamp types can be read by pygmt.info.
Ensure that pyarrow.table objects can be passed into pygmt functions like blockm, info, nearneighbor, project, triangulate and xyz2grd.
pygmt/tests/test_info.py
         pytest.param(
             getattr(pa, "table", None),
             "vector memory",
             marks=td.skip_if_no(package="pyarrow"),
         ),
     ],
 )
 def test_info_2d_array(array_func, expected_memory):
     """
-    Make sure info works on 2-D numpy.ndarray inputs.
+    Make sure info works on 2-D numpy.ndarray and pyarrow.table inputs.
     """
-    table = np.loadtxt(POINTS_DATA)
+    table = array_func(pd.read_csv(POINTS_DATA, sep=" ", header=None))
     output = info(data=table)
-    expected_output = (
-        "<matrix memory>: N = 20 <11.5309/61.7074> <-2.9289/7.8648> <0.1412/0.9338>\n"
-    )
+    expected_output = f"<{expected_memory}>: N = 20 <11.5309/61.7074> <-2.9289/7.8648> <0.1412/0.9338>\n"
Note that pyarrow.table goes through put_vector rather than put_matrix, because pyarrow.table is more akin to pandas.DataFrame (which allows for columns with different dtypes) than a numpy.array object (which only allows for one dtype).
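The distinction in the comment above can be illustrated with a small duck-typing sketch. This is not PyGMT's actual dispatch code: `memory_layout` is a hypothetical helper, and it only assumes that column-oriented containers expose `columns` (pandas.DataFrame) or `column_names` (pyarrow.Table), while a numpy array exposes a single `dtype` and `ndim`.

```python
def memory_layout(data):
    """
    Hypothetical helper: guess whether tabular input should be treated as
    a set of column vectors ("vector") or one homogeneous 2-D block ("matrix").
    """
    # Column-oriented containers can hold a different dtype per column,
    # so each column would have to be passed separately.
    if hasattr(data, "column_names") or hasattr(data, "columns"):
        return "vector"
    # A 2-D array with a single dtype can be passed as one contiguous block.
    if getattr(data, "ndim", None) == 2 and hasattr(data, "dtype"):
        return "matrix"
    raise TypeError(f"Unsupported input type: {type(data).__name__}")
```

Under these assumptions, a 2-D numpy.ndarray would be classified as "matrix", while a pandas.DataFrame or pyarrow.table would be classified as "vector", which matches the `<matrix memory>` and `<vector memory>` labels in the test above.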
Done by casting the pyarrow.TimestampScalar to a string first, before letting pd.to_datetime do the formatting. Have added a new parametrized unit test to test_solar_set_terminator_datetime with pyarrow and pd.Timestamp array_func to test this.
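A minimal sketch of the string-cast approach described above, not the code merged in this PR. The helper name `format_terminator_datetime` and the output format string are assumptions for illustration; the sketch also assumes that `str()` of the input (a datetime.datetime, pd.Timestamp, or pyarrow.TimestampScalar) yields a string that `pd.to_datetime` can parse.

```python
import pandas as pd


def format_terminator_datetime(value):
    """
    Hypothetical helper: normalise a datetime-like value to an ISO-style
    string by casting to str first, then letting pandas parse and format it.
    """
    # Casting to str avoids relying on pd.to_datetime understanding the
    # input type directly (e.g. a pyarrow.TimestampScalar).
    timestamp = pd.to_datetime(str(value))
    # The format string below is an assumed target format.
    return timestamp.strftime("%Y-%m-%dT%H:%M:%S.%f")
```

With pyarrow installed, a call such as `format_terminator_datetime(pa.scalar(datetime.datetime(2021, 1, 1, 12, 30)))` would be expected to follow the same path, since only the string form of the scalar is used.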
CodSpeed Performance Report
Merging #2864 will degrade performances by 6.62%
Description of proposed changes
Check that pyarrow.array and pyarrow.table objects of various types (uint/int/timestamp) can be passed to PyGMT modules/functions.
TODO:
- pyarrow.array by adding parametrized tests to pygmt.info
- pyarrow.table to unit tests that already have the @pytest.mark.parametrize("array_func", ...)
- pyarrow.array objects can be used in place of numpy.array
- solar's terminator_datetime parameter to accept a pyarrow.TimestampScalar input
Addresses #2800 (comment), part of #2800.
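The array_func parametrization mentioned in the TODO list relies on pyarrow being optional. A rough sketch of that idea, using only the standard library: the `ARRAY_FUNCS` list and `available_array_funcs` helper are hypothetical names, and filtering out missing constructors here stands in for the skip marks used in the real tests.

```python
try:
    import pyarrow as pa
except ImportError:
    pa = None  # pyarrow is an optional dependency

# getattr with a default of None tolerates pa being None, so this list can
# be built at import time whether or not pyarrow is installed.
ARRAY_FUNCS = [
    ("list", list),
    ("pyarrow.table", getattr(pa, "table", None)),
]


def available_array_funcs(candidates=ARRAY_FUNCS):
    """Return only the (name, constructor) pairs whose constructor exists."""
    return [(name, func) for name, func in candidates if func is not None]
```

In a pytest suite the same `getattr(pa, "table", None)` expression can be placed inside `pytest.param(...)` together with a skip mark, as in the diff above, so that the pyarrow case is skipped rather than silently dropped.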
Reminders
- Run make format and make check to make sure the code follows the style guide.
- doc/api/index.rst.
Slash Commands
You can write slash commands (/command) in the first line of a comment to perform specific operations. Supported slash commands are:
- /format: automatically format and lint the code
- /test-gmt-dev: run full tests on the latest GMT development version