**BREAKING** pygmt.grdcut: Refactor to store output in virtualfiles for grids #3115

Open
wants to merge 98 commits into base: main
bcf43f0
Wrap GMT's standard data type GMT_IMAGE for images
seisman Mar 18, 2024
a052a1a
Initial implementation of to_dataarray method for _GMT_IMAGE class
weiji14 Mar 20, 2024
59d523c
pygmt.grdcut: Support both grid and image output
seisman Apr 16, 2024
56a6d65
Merge branch 'main' into datatypes/gmtimage
seisman Apr 17, 2024
3315324
Merge branch 'main' into gmtimage
seisman Apr 19, 2024
cea3374
Fix
seisman Apr 19, 2024
80d9837
Refactor
seisman Apr 19, 2024
22fba56
fix
seisman Apr 19, 2024
f71e79c
Merge branch 'main' into datatypes/gmtimage
weiji14 Jun 18, 2024
4cce4a2
Small typo fixes and add output type-hint for to_dataarray
weiji14 Jun 18, 2024
e02b650
Fix mypy error using np.array([0, 1, 2]) instead of np.arange
weiji14 Jun 18, 2024
f3d4b1f
Parse name and data_attrs from grid/image header
weiji14 Jun 18, 2024
4390136
Transpose array to (band, y, x) order and add doctest for to_dataarray
weiji14 Jun 20, 2024
5f25669
Set registration and gtype from header
weiji14 Jun 20, 2024
a3c6c14
Print basic shape and padding info in _GMT_IMAGE doctest
weiji14 Jun 20, 2024
5888e10
Only set Conventions = CF-1.7 attribute for NetCDF grid type
weiji14 Jun 20, 2024
798e658
Merge branch 'main' into datatypes/gmtimage
weiji14 Jun 20, 2024
3dbf2f2
Remove rioxarray import
weiji14 Jun 20, 2024
3a24ebd
Apply suggestions from code review
seisman Jun 20, 2024
4eee7e6
Merge branch 'main' into gmtimage
seisman Jun 20, 2024
5e390d4
Address reviewer's comments
seisman Jun 20, 2024
003383d
Fix GMT_OUT
seisman Jun 21, 2024
606ac7e
Merge branch 'main' into gmtimage
seisman Jun 21, 2024
c6cdcc8
Merge branch 'main' into gmtimage
seisman Jul 7, 2024
377941a
Revert changes for _GMT_IMAGE
seisman Jul 7, 2024
20617f5
Use rioxarray.open_rasterio for loading images
seisman Jul 7, 2024
a998718
Check if rioxarray is installed
seisman Jul 7, 2024
86cab44
Improve grdcut
seisman Jul 7, 2024
6031bab
Fix typos in grdcut
seisman Jul 7, 2024
eb0af2d
Add tests for grdcut images
seisman Jul 7, 2024
7f6ca7d
Fix one failing test
seisman Jul 7, 2024
21b194a
Fix open_rasterio
seisman Jul 7, 2024
e7eaf5c
Fix open_rasterio
seisman Jul 7, 2024
e3c8569
Make sure the image is loaded
seisman Jul 7, 2024
1c8312c
Update pygmt/clib/session.py
seisman Jul 7, 2024
3913430
Use rioxarray.open_rasterio in a context manager
seisman Jul 8, 2024
812a225
Merge branch 'main' into gmtimage
seisman Jul 8, 2024
90bd29e
Merge remote-tracking branch 'origin/gmtimage' into gmtimage
seisman Jul 8, 2024
ab77187
Fix mypy errors
seisman Jul 8, 2024
6f3e474
Move grdcut image tests to a separate test file
seisman Jul 8, 2024
5b07dd9
Fix copy & paste errors
seisman Jul 8, 2024
31272ab
Run codspeed benchmark for test_grdcut_image_dataarray
seisman Jul 8, 2024
6b860bf
Merge branch 'main' into datatypes/gmtimage
seisman Jul 27, 2024
5a09329
Merge branch 'main' into gmtimage
seisman Aug 5, 2024
279595b
Add the raster_kind function to determine the raster kind
seisman Aug 5, 2024
7def4b5
Simplify the grdcut function
seisman Aug 5, 2024
be175d8
Merge branch 'main' into gmtimage
seisman Sep 19, 2024
0bf9368
Merge branch 'main' into datatypes/gmtimage
seisman Sep 19, 2024
7d437be
Use enum for grid ids
seisman Sep 19, 2024
268e34e
Fix the band. Starting from 1
seisman Sep 19, 2024
86765e1
Refactor the tests for images
seisman Sep 19, 2024
86f3ffa
In np.reshape, a is a position-only parameter
seisman Sep 20, 2024
cc28247
Improve tests
seisman Sep 20, 2024
1e2c973
Fix one failing doctest due to xarray changes
seisman Sep 20, 2024
734dc28
The np.reshape's newshape parameter is deprecated
seisman Sep 20, 2024
919dc00
Define grid IDs using IntEnum instead of Enum
seisman Sep 20, 2024
b1eacf1
Pass the new shape as a positional parameter
seisman Sep 20, 2024
aa4fdc9
Fix failing tests
seisman Sep 20, 2024
c87a3ec
One more fix
seisman Sep 20, 2024
a20d8a2
One more fix
seisman Sep 20, 2024
926427b
Simplify a doctest
seisman Sep 20, 2024
c73328e
Improve the tests
seisman Sep 20, 2024
2825eae
Merge branch 'datatypes/gmtimage' into gmtimage
seisman Sep 20, 2024
bf9275c
Remove the workaround for images
seisman Sep 20, 2024
fb97daa
Convert ctypes array to numpy array using np.ctypeslib.as_array
seisman Sep 20, 2024
15b8d53
Fix the incorrect value due to floating number conversion in sphinter…
seisman Sep 20, 2024
8433e78
Merge branch 'ctypesarray' into datatypes/gmtimage
seisman Sep 20, 2024
3e3a6f3
Update the to_dataarray method to match the codes in GMT_GRID
seisman Sep 20, 2024
12ef40a
image data should has uint8 dtype
seisman Sep 20, 2024
f64fbb8
Further improve the tests
seisman Sep 21, 2024
e9cb0a5
Merge branch 'datatypes/gmtimage' into gmtimage
seisman Sep 21, 2024
4f2ae48
Merge branch 'main' into datatypes/gmtimage
seisman Sep 24, 2024
d49afed
Add a note that currently only 3-band images are supported
seisman Sep 24, 2024
a97d0b3
Apply suggestions from code review
seisman Sep 28, 2024
f70bec0
Merge branch 'main' into datatypes/gmtimage
seisman Sep 28, 2024
2fd13fb
Remove the old GMTGridID enums from pygmt/datatypes/header.py
seisman Sep 28, 2024
9972ba1
A minor fix
seisman Sep 28, 2024
ac6b7c3
Merge branch 'datatypes/gmtimage' into gmtimage
seisman Sep 28, 2024
7c32d41
Merge branch 'main' into gmtimage
seisman Sep 29, 2024
9ec00be
Let _raster_kind return grid by default
seisman Sep 29, 2024
f3a2f8e
Simplify the grdcut image tests
seisman Sep 29, 2024
3c12e2b
Add one more test for file in & file out
seisman Sep 29, 2024
f852b0d
Fix typos
seisman Sep 29, 2024
5f7683c
Merge branch 'main' into gmtimage
seisman Sep 30, 2024
bb1a0b0
Use the new load_blue_marble function
seisman Sep 30, 2024
584b5af
Drop the spatial_ref coord
seisman Sep 30, 2024
b19ba00
Merge branch 'main' into gmtimage
seisman Nov 22, 2024
cceb929
Merge branch 'main' into gmtimage
seisman Nov 25, 2024
fac51f1
Update _raster_kind
seisman Nov 25, 2024
517ee52
Revert "Update _raster_kind"
seisman Dec 9, 2024
d5ec308
Merge branch 'main' into gmtimage
seisman Dec 9, 2024
7ca58fb
Remove the _raster_kind function
seisman Dec 9, 2024
66580d4
Add the 'kind' parameter
seisman Dec 9, 2024
194ee90
Minor update
seisman Dec 9, 2024
473428c
Avoid keep the file open
seisman Dec 9, 2024
84b2ed4
Remove one unnecessary pytest.skipif marker
seisman Dec 9, 2024
7968096
Add it back because we still need rioxarray
seisman Dec 9, 2024
90d5210
Merge branch 'main' into gmtimage
seisman Dec 12, 2024
45 changes: 31 additions & 14 deletions pygmt/src/grdcut.py
@@ -2,23 +2,24 @@
 grdcut - Extract subregion from a grid.
 """

+from typing import Literal
+
 import xarray as xr
 from pygmt.clib import Session
+from pygmt.exceptions import GMTInvalidInput
 from pygmt.helpers import (
-    GMTTempFile,
     build_arg_list,
+    data_kind,
     fmt_docstring,
     kwargs_to_strings,
     use_alias,
 )
-from pygmt.io import load_dataarray

 __doctest_skip__ = ["grdcut"]


 @fmt_docstring
 @use_alias(
-    G="outgrid",
     R="region",
     J="projection",
     N="extend",
@@ -28,9 +29,11 @@
     f="coltypes",
 )
 @kwargs_to_strings(R="sequence")
-def grdcut(grid, **kwargs) -> xr.DataArray | None:
+def grdcut(
+    grid, kind: Literal["grid", "image"] = "grid", outgrid: str | None = None, **kwargs
+) -> xr.DataArray | None:
     r"""
-    Extract subregion from a grid.
+    Extract subregion from a grid or image.

     Produce a new ``outgrid`` file which is a subregion of ``grid``. The
     subregion is specified with ``region``; the specified range must not exceed
@@ -48,6 +51,10 @@
     Parameters
     ----------
     {grid}
+    kind
+        The raster data kind. Valid values are ``"grid"`` and ``"image"``. Only used
+        when ``grid`` is a file name, because it is hard to determine from a file name
+        whether the file is a grid or an image. The default is ``"grid"``.
     {outgrid}
     {projection}
     {region}
@@ -100,13 +107,23 @@
     >>> # 12° E to 15° E and a latitude range of 21° N to 24° N
     >>> new_grid = pygmt.grdcut(grid=grid, region=[12, 15, 21, 24])
     """
-    with GMTTempFile(suffix=".nc") as tmpfile:
-        with Session() as lib:
-            with lib.virtualfile_in(check_kind="raster", data=grid) as vingrd:
-                if (outgrid := kwargs.get("G")) is None:
-                    kwargs["G"] = outgrid = tmpfile.name  # output to tmpfile
-                lib.call_module(
-                    module="grdcut", args=build_arg_list(kwargs, infile=vingrd)
-                )
+    # Determine the output data kind based on the input data kind.
+    match inkind := data_kind(grid):
+        case "image" | "grid":
+            outkind = inkind
+        case "file":
+            outkind = kind
+        case _:
+            msg = f"Unsupported data type {type(grid)}."
+            raise GMTInvalidInput(msg)

Check warning on line 118 in pygmt/src/grdcut.py
Codecov / codecov/patch: pygmt/src/grdcut.py#L117-L118
Added lines #L117-L118 were not covered by tests.

Member Author:
Lines 117-118 should be caught by the test_grdcut_fails test.


-        return load_dataarray(outgrid) if outgrid == tmpfile.name else None
+    with Session() as lib:
+        with (
+            lib.virtualfile_in(check_kind="raster", data=grid) as vingrd,
+            lib.virtualfile_out(kind=outkind, fname=outgrid) as voutgrd,
+        ):
+            kwargs["G"] = voutgrd
+            lib.call_module(module="grdcut", args=build_arg_list(kwargs, infile=vingrd))
+            return lib.virtualfile_to_raster(
+                vfname=voutgrd, kind=outkind, outgrid=outgrid
+            )
88 changes: 88 additions & 0 deletions pygmt/tests/test_grdcut_image.py
@@ -0,0 +1,88 @@
+"""
+Test pygmt.grdcut on images.
+"""
+
+from pathlib import Path
+
+import numpy as np
+import pytest
+import xarray as xr
+from pygmt import grdcut
+from pygmt.datasets import load_blue_marble
+from pygmt.helpers import GMTTempFile
+
+try:
+    import rioxarray
+
+    _HAS_RIOXARRAY = True
+except ImportError:
+    _HAS_RIOXARRAY = False
+
+
+@pytest.fixture(scope="module", name="region")
+def fixture_region():
+    """
+    Set the data region.
+    """
+    return [-53, -49, -20, -17]
+
+
+@pytest.fixture(scope="module", name="expected_image")
+def fixture_expected_image():
+    """
+    Load the expected grdcut image result.
+    """
+    return xr.DataArray(
+        data=np.array(
+            [
+                [[90, 93, 95, 90], [91, 90, 91, 91], [91, 90, 89, 90]],
+                [[87, 88, 88, 89], [88, 87, 86, 85], [90, 90, 89, 88]],
+                [[48, 49, 49, 45], [49, 48, 47, 45], [48, 47, 48, 46]],
+            ],
+            dtype=np.uint8,
+        ),
+        coords={
+            "band": [1, 2, 3],
+            "x": [-52.5, -51.5, -50.5, -49.5],
+            "y": [-17.5, -18.5, -19.5],
+        },
+        dims=["band", "y", "x"],
+        attrs={
+            "scale_factor": 1.0,
+            "add_offset": 0.0,
+        },
+    )
+
+
+@pytest.mark.benchmark
+def test_grdcut_image_file(region, expected_image):
+    """
+    Test grdcut on an input image file.
+    """
+    result = grdcut("@earth_day_01d", region=region, kind="image")
+    xr.testing.assert_allclose(a=result, b=expected_image)
+
+
+@pytest.mark.benchmark
+@pytest.mark.skipif(not _HAS_RIOXARRAY, reason="rioxarray is not installed")
+def test_grdcut_image_dataarray(region, expected_image):
+    """
+    Test grdcut on an input xarray.DataArray object.
+    """
+    raster = load_blue_marble()
+    result = grdcut(raster, region=region, kind="image")
+    xr.testing.assert_allclose(a=result, b=expected_image)
+
+
+def test_grdcut_image_file_in_file_out(region, expected_image):
+    """
+    Test grdcut on an input image file with output to another image file.
+    """
+    with GMTTempFile(suffix=".tif") as tmp:
+        result = grdcut("@earth_day_01d", region=region, outgrid=tmp.name)
+        assert result is None
+        assert Path(tmp.name).stat().st_size > 0
+        if _HAS_RIOXARRAY:
+            with rioxarray.open_rasterio(tmp.name) as raster:
+                image = raster.load().drop_vars("spatial_ref")
+                xr.testing.assert_allclose(a=image, b=expected_image)