Add a tutorial for working with table inputs in PyGMT #2722

Merged
merged 28 commits into from
Dec 13, 2023
5a3e78c
Add a tutorial for working with table inputs in PyGMT
seisman Oct 8, 2023
b8d656f
fix
seisman Oct 8, 2023
0410ddc
Apply suggestions from code review
seisman Oct 8, 2023
0e3e5f9
Merge branch 'main' into tutorial/working-with-tables
seisman Oct 10, 2023
f778830
Rename the tutorial
seisman Oct 10, 2023
a3350fc
Move geopandas.DataFrame before x/y/z arrays
seisman Oct 10, 2023
89a62ed
Merge branch 'main' into tutorial/working-with-tables
seisman Oct 19, 2023
c9c2593
Updates
seisman Oct 19, 2023
3b9ccf3
Fix
seisman Oct 19, 2023
0bcd2ca
Apply suggestions from code review
seisman Oct 24, 2023
8ea2b1f
Apply suggestions from code review
seisman Oct 24, 2023
63c0cd4
Minor updates
seisman Oct 26, 2023
ff6e4ff
Merge branch 'main' into tutorial/working-with-tables
seisman Oct 28, 2023
0578774
Update examples/get_started/04_table_inputs.py
seisman Nov 2, 2023
727ee2c
Merge branch 'main' into tutorial/working-with-tables
seisman Nov 4, 2023
7db8cca
Minor updates
seisman Nov 4, 2023
fa9ebda
Apply suggestions from code review
seisman Nov 4, 2023
3dcef5e
Formatting
seisman Nov 5, 2023
a2daa33
Apply suggestions from code review
seisman Nov 7, 2023
0b49647
Fix styling
seisman Nov 10, 2023
6d41e79
Merge branch 'main' into tutorial/working-with-tables
seisman Nov 18, 2023
45dfb66
Merge branch 'main' into tutorial/working-with-tables
seisman Nov 23, 2023
544ee1d
Merge branch 'main' into tutorial/working-with-tables
seisman Dec 11, 2023
75f34ad
Apply suggestions from code review
seisman Dec 12, 2023
5244549
Link to Python standard list type
seisman Dec 12, 2023
06f85bb
Apply suggestions from code review
seisman Dec 12, 2023
3a72413
Merge branch 'main' into tutorial/working-with-tables
seisman Dec 12, 2023
97c1a90
Remove hyperlink from the section heading to reduce line length
seisman Dec 12, 2023
148 changes: 148 additions & 0 deletions examples/get_started/04_table_inputs.py
@@ -0,0 +1,148 @@
"""
4. PyGMT I/O: Table inputs
==========================
Comment on lines +2 to +3
Member

I/O means Input/Output, but this tutorial is only on inputs 🙂 Will the 'Output' part be added as a separate page? Or do we want multiple parts like:

  • 4.1 PyGMT I/O: Table inputs
  • 4.2 PyGMT I/O: Table outputs
  • 4.3 PyGMT I/O: Grid inputs
  • 4.4 PyGMT I/O: Grid outputs

Member Author

Yes, that's my plan, but I'd prefer the following order:

4.1 PyGMT I/O: Table inputs
4.2 PyGMT I/O: Grid inputs
4.3 PyGMT I/O: Table outputs
4.4 PyGMT I/O: Grid outputs


Generally, PyGMT accepts two different types of data inputs: tables and grids.

- A table is a 2-D array of data, with *M* rows and *N* columns. Each column
  represents a different variable (e.g., *x*, *y* and *z*) and each row
  represents a different record.
- A grid is a 2-D array of data that is regularly spaced in the x and y
  directions.

In this tutorial, we'll focus on working with table inputs, and cover grids in
later tutorials.

PyGMT supports a variety of table input types that allow you to work with data
in a format that suits your needs. In this tutorial, we'll explore the
different table input types available in PyGMT and provide examples for each.
By understanding the different table input types, you can choose the one that
best fits your data and analysis needs, and work more efficiently with PyGMT.
"""
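
To make the two definitions in the introduction concrete, here is a minimal sketch using plain NumPy arrays. The variable names and values are made up for illustration and are not part of the tutorial: a table stores one record per row and one variable per column, while a grid stores values at regularly spaced x and y coordinates.

```python
import numpy as np

# A table: M = 3 records (rows) and N = 3 variables (columns: x, y, z).
table = np.array(
    [
        [1.0, 2.0, 10.0],
        [5.0, 4.0, 20.0],
        [8.0, 3.0, 30.0],
    ]
)
n_records, n_variables = table.shape
x_column = table[:, 0]  # all x values, one per record

# A grid: values sampled at regularly spaced x and y coordinates.
grid_x = np.arange(0.0, 3.0, 1.0)  # 0, 1, 2
grid_y = np.arange(0.0, 2.0, 1.0)  # 0, 1
xx, yy = np.meshgrid(grid_x, grid_y)
grid = xx + yy  # hypothetical values, shape (len(grid_y), len(grid_x))
```

The key difference is that a table carries its coordinates explicitly in columns, whereas a grid's coordinates are implied by its regular spacing.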

# %%
# ASCII table file
# ----------------
#
# Most PyGMT functions/methods that accept table input data have a ``data``
# parameter. The easiest way to provide table input data to PyGMT is by
# specifying the file name of an ASCII table (e.g., ``data="input_data.dat"``).
# This is useful when your data is stored in a separate text file.

import numpy as np
import pygmt

# Create an example file with 3 rows and 2 columns
data = np.array([[1.0, 2.0], [5.0, 4.0], [8.0, 3.0]])
np.savetxt("input_data.dat", data, fmt="%f")

# Pass the file name to the data parameter
fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)
fig.plot(data="input_data.dat", style="p0.2c", fill="blue")
fig.show()

# Now let's delete the example file
from pathlib import Path

Path("input_data.dat").unlink()

# %%
# Besides a plain string to a table file, the following variants are also
# accepted:
#
# - A :class:`pathlib.Path` object.
# - A full URL. PyGMT will download the file to the current directory first.
# - A file name prefixed with ``@`` (e.g., ``data="@input_data.dat"``), which
#   is a special syntax in GMT to indicate that the file is a remote file
#   hosted on the GMT data server.
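
As a minimal sketch of the first variant, the snippet below writes a table into a temporary directory and keeps its location as a `pathlib.Path` object. The plotting call is shown only as a comment so that the snippet runs without GMT installed; the use of a temporary directory is an illustrative assumption, not something this tutorial requires.

```python
import tempfile
from pathlib import Path

import numpy as np

table = np.array([[1.0, 2.0], [5.0, 4.0], [8.0, 3.0]])

with tempfile.TemporaryDirectory() as tmpdir:
    # Build the file location as a Path object instead of a plain string
    table_path = Path(tmpdir) / "input_data.dat"
    np.savetxt(table_path, table, fmt="%f")

    # The Path object could then be passed directly to the data parameter:
    # fig.plot(data=table_path, style="p0.2c", fill="blue")

    # Read the file back to confirm it holds the same 3 rows and 2 columns
    loaded = np.loadtxt(table_path)
```

Using a temporary directory also removes the need to delete the example file by hand afterwards.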

# %%
# 2-D array: list, numpy.ndarray, and pandas.DataFrame
# ----------------------------------------------------
#
# The ``data`` parameter also accepts a 2-D array, e.g.,
#
# - A list of lists
# - A :class:`numpy.ndarray` object with 2 dimensions
# - A :class:`pandas.DataFrame` object
#
# This is useful when you want to plot data that is already in memory.

import pandas as pd

fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)

# Pass a 2-D list to the 'data' parameter
fig.plot(data=[[1.0, 2.0], [3.0, 4.0]], style="c0.2c", fill="black")

# Pass a 2-D numpy array to the 'data' parameter
fig.plot(data=np.array([[4.0, 2.0], [6.0, 4.0]]), style="t0.2c", fill="red")

# Pass a pandas.DataFrame to the 'data' parameter
df = pd.DataFrame(np.array([[7.0, 3.0], [9.0, 2.0]]), columns=["x", "y"])
fig.plot(data=df, style="a0.5c", fill="blue")

fig.show()
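
The three in-memory forms accepted by ``data`` describe the same kind of object. The sketch below uses only NumPy and pandas with hypothetical values, and checks that a list of lists, a 2-D array, and a DataFrame all hold an identical table with 2 rows and 2 columns.

```python
import numpy as np
import pandas as pd

rows = [[1.0, 2.0], [3.0, 4.0]]  # list of lists: one inner list per record
array = np.array(rows)  # 2-D numpy array with shape (2, 2)
df = pd.DataFrame(array, columns=["x", "y"])  # same values with named columns

# All three forms describe the same table.
same_values = np.array_equal(array, df.to_numpy())
```

Which form to use is mostly a matter of where the data already lives; the column names of a DataFrame are a convenience for you and are not needed to describe the table itself.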

# %%
# geopandas.GeoDataFrame
# ----------------------
#
# If you're working with geospatial data, you can read your data as a
# :class:`geopandas.GeoDataFrame` object and pass it to the ``data``
# parameter. This is useful if your data is stored in a geospatial data format
# (e.g., GeoJSON) that GMT and PyGMT do not support natively.

import geopandas as gpd

# Example GeoDataFrame
gdf = gpd.GeoDataFrame(
    {
        "geometry": gpd.points_from_xy([1, 2, 3], [2, 3, 4]),
        "value": [10, 20, 30],
    }
)

# Use the GeoDataFrame to specify the data
fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)
fig.plot(data=gdf, style="c0.2c", fill="purple")
fig.show()

# %%
# Scalar values or 1-D arrays
# ---------------------------
#
# In addition to the ``data`` parameter, some PyGMT functions/methods also
# provide individual parameters (e.g., ``x`` and ``y`` for data coordinates)
# which allow you to specify the data. These parameters accept individual
# scalar values or 1-D arrays (lists or 1-D numpy arrays). This is useful if
# you want to plot a single data point or already have 1-D arrays of data in
# memory.

fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)

# Pass scalar values to plot a single data point
fig.plot(x=1.0, y=2.0, style="a0.2c", fill="blue")

# Pass 1-D lists to plot multiple data points
fig.plot(x=[5.0, 5.0, 5.0], y=[2.0, 3.0, 4.0], style="t0.2c", fill="green")

# Pass 1-D numpy arrays to plot multiple data points
fig.plot(
    x=np.array([8.0, 8.0, 8.0]), y=np.array([2.0, 3.0, 4.0]), style="c0.2c", fill="red"
)

fig.show()
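
If you already have 1-D arrays but prefer to work with a single 2-D table, one option is to stack them as columns first. This is a sketch using `numpy.column_stack`, not something the tutorial itself prescribes; the values are the ones from the example above, and the plotting call is left as a comment so the snippet runs without GMT installed.

```python
import numpy as np

x = np.array([8.0, 8.0, 8.0])
y = np.array([2.0, 3.0, 4.0])

# Stack the 1-D arrays as columns: each row becomes one (x, y) record.
table = np.column_stack([x, y])

# The result could be passed as data=table instead of x=x, y=y, e.g.
# fig.plot(data=table, style="c0.2c", fill="red")
```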

# %%
# Conclusion
# ----------
#
# In PyGMT, you have the flexibility to provide data in various table input
# types, including file names, 2-D arrays (2-D list, :class:`numpy.ndarray`,
# :class:`pandas.DataFrame`), scalar values or a series of 1-D arrays, and
# :class:`geopandas.GeoDataFrame`. Choose the input type that best suits your
# data source and analysis requirements.