Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function to load Blue Marble dataset #2235

Merged
merged 36 commits into from
Sep 30, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
9bde7f7
initial commit for blue_marble dataset
willschlitzer Dec 7, 2022
d8c8759
update docstring
willschlitzer Dec 7, 2022
c6bd235
Merge branch 'main' into load-remote-dataset/blue-marble
weiji14 Feb 17, 2024
093fc26
Rename load_blue_marble.py to earth_daynight.py and update docs
weiji14 Feb 17, 2024
2f29458
Add load_blue_marble funtion to doc/api/index.rst
weiji14 Feb 17, 2024
1c99f21
Fix mypy errors in load_remote_dataset.py
weiji14 Feb 17, 2024
e839ef7
Add doctest and rename returned output from grid to image
weiji14 Feb 17, 2024
aa8a474
Use rioxarray.open_rasterio to open earth_day images
weiji14 Feb 17, 2024
cd4cab6
Refactor fixture_xr_image to use load_blue_marble function
weiji14 Feb 17, 2024
fc060ea
Only load GMTDataArray accessor info when engine!=rasterio
weiji14 Feb 18, 2024
c3017d5
Set tiled=True for spatial resolutions 05m and higher
weiji14 Feb 18, 2024
c3e2e66
Add unit test to load_blue_marble 01d
weiji14 Feb 18, 2024
a2a18a9
Skip test_datasets_load_earth_daynight if rioxarray not installed
weiji14 Feb 18, 2024
7c38754
Add region parameter for load_blue_marble
weiji14 Feb 18, 2024
5c65097
Merge branch 'main' into load-remote-dataset/blue-marble
weiji14 Mar 16, 2024
767d7c7
Allow grdcut to slice GeoTIFFs so load_blue_marble's region param works
weiji14 Mar 16, 2024
a8600d3
Lint
weiji14 Mar 16, 2024
971ec11
Merge branch 'main' into load-remote-dataset/blue-marble
weiji14 Jun 18, 2024
6eabbf4
Handle image kind in _load_remote_dataset function
weiji14 Jun 18, 2024
4ef65b0
Merge branch 'main' into load-remote-dataset/blue-marble
weiji14 Sep 28, 2024
54184fb
Update attrs (name, long_name, description) of blue_marble
weiji14 Sep 28, 2024
e6c2964
Add type hints to resolution and region parameters
weiji14 Sep 28, 2024
19a7755
Merge branch 'main' into load-remote-dataset/blue-marble
weiji14 Sep 29, 2024
01d1b80
Partially revert "Allow grdcut to slice GeoTIFFs so load_blue_marble'…
weiji14 Sep 29, 2024
0ca210a
Revert "Use rioxarray.open_rasterio to open earth_day images"
weiji14 Sep 29, 2024
28d668d
Use npt.assert_allclose in test_blue_marble_01m_default_registration
weiji14 Sep 29, 2024
a6a08e3
Plot xr_image in same fashion as @earth_day_01d using region="d"
weiji14 Sep 29, 2024
f13d12b
Delete comment about images not being supported
weiji14 Sep 29, 2024
471c6e8
Set OGC:CRS84 projection on blue_marble image
weiji14 Sep 30, 2024
e886408
Add note on registration/gtype
weiji14 Sep 30, 2024
5b6bb42
Rename earth_daynight to earth_day
weiji14 Sep 30, 2024
755275e
Remove test_blue_marble_01m_default_registration
weiji14 Sep 30, 2024
f7e381e
Remove mention of tiled images for 5 arc-min or higher
weiji14 Sep 30, 2024
9a5e36b
Apply suggestions from code review
weiji14 Sep 30, 2024
a0fa29b
Create kwdict with R and T keys once
weiji14 Sep 30, 2024
70f6041
Lint
weiji14 Sep 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/api/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -218,6 +218,7 @@ and store them in GMT's user data directory.
:toctree: generated

datasets.list_sample_data
datasets.load_blue_marble
datasets.load_earth_age
datasets.load_earth_free_air_anomaly
datasets.load_earth_geoid
Expand Down
1 change: 1 addition & 0 deletions pygmt/datasets/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
"""

from pygmt.datasets.earth_age import load_earth_age
from pygmt.datasets.earth_day import load_blue_marble
from pygmt.datasets.earth_free_air_anomaly import load_earth_free_air_anomaly
from pygmt.datasets.earth_geoid import load_earth_geoid
from pygmt.datasets.earth_magnetic_anomaly import load_earth_magnetic_anomaly
Expand Down
97 changes: 97 additions & 0 deletions pygmt/datasets/earth_day.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
"""
Function to download the NASA Blue Marble image datasets from the GMT data server, and
load as :class:`xarray.DataArray`.

The images are available in various resolutions.
"""

from collections.abc import Sequence
from typing import Literal

import xarray as xr
from pygmt.datasets.load_remote_dataset import _load_remote_dataset

__doctest_skip__ = ["load_blue_marble"]


def load_blue_marble(
resolution: Literal[
"01d",
"30m",
"20m",
"15m",
"10m",
"06m",
"05m",
"04m",
"03m",
"02m",
"01m",
"30s",
] = "01d",
region: Sequence[float] | str | None = None,
) -> xr.DataArray:
r"""
Load NASA Blue Marble images in various resolutions.

.. figure:: https://www.generic-mapping-tools.org/remote-datasets/_images/GMT_earth_daynight.jpg
:width: 80%
:align: center

Earth day/night dataset.

The images are downloaded to a user data directory (usually
``~/.gmt/server/earth/earth_day/``) the first time you invoke this function.
Afterwards, it will load the image from the data directory. So you'll need an
internet connection the first time around.

These images can also be accessed by passing in the file name
**@earth_day**\_\ *res* to any image processing function or plotting method. *res*
is the image resolution (see below).

Refer to :gmt-datasets:`earth-daynight.html` for more details about available
datasets, including version information and references.

Parameters
----------
resolution
The image resolution. The suffix ``d``, ``m``, and ``s`` stand for arc-degree,
arc-minute, and arc-second.

region
The subregion of the image to load, in the form of a sequence [*xmin*, *xmax*,
*ymin*, *ymax*].

Returns
-------
image
The NASA Blue Marble image. Coordinates are latitude and longitude in degrees.

Note
----
The registration and coordinate system type of the returned
:class:`xarray.DataArray` image can be accessed via the GMT accessors (i.e.,
``image.gmt.registration`` and ``image.gmt.gtype`` respectively). However, these
properties may be lost after specific image operations (such as slicing) and will
need to be manually set before passing the image to any PyGMT data processing or
plotting functions. Refer to :class:`pygmt.GMTDataArrayAccessor` for detailed
explanations and workarounds.

Examples
--------

>>> from pygmt.datasets import load_blue_marble
>>> # load the default image (pixel-registered 1 arc-degree image)
>>> image = load_blue_marble()
"""
image = _load_remote_dataset(
name="earth_day",
prefix="earth_day",
resolution=resolution,
region=region,
registration="pixel",
)
# If rioxarray is installed, set the coordinate reference system
if hasattr(image, "rio"):
image = image.rio.write_crs(input_crs="OGC:CRS84")
Comment on lines +94 to +96
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In GMT_IMAGE.to_dataarray(), perhaps we should parse the header->ProjRefPROJ4 and set the correct CRS to the 3-band xarray.DataArray. If done, then we probably don't need to set the CRS here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking of parsing the projection information from the header when you mentioned this ProjRefPROJ4 field at #3128 (comment). But ideally we'll need to handle PROJ4/WKT/EPSG:

# Referencing system string in PROJ.4 format
("ProjRefPROJ4", ctp.c_char_p),
# Referencing system string in WKT format
("ProjRefWKT", ctp.c_char_p),
# Referencing system EPSG code
("ProjRefEPSG", ctp.c_int),

Something to consider for a separate PR, because we might need to use pyproj for this.

return image
30 changes: 26 additions & 4 deletions pygmt/datasets/load_remote_dataset.py
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,25 @@ class GMTRemoteDataset(NamedTuple):
"01m": Resolution("01m", registrations=["gridline"], tiled=True),
},
),
"earth_day": GMTRemoteDataset(
description="NASA Day Images",
units=None,
extra_attributes={"long_name": "blue_marble", "horizontal_datum": "WGS84"},
resolutions={
"01d": Resolution("01d", registrations=["pixel"]),
"30m": Resolution("30m", registrations=["pixel"]),
"20m": Resolution("20m", registrations=["pixel"]),
"15m": Resolution("15m", registrations=["pixel"]),
"10m": Resolution("10m", registrations=["pixel"]),
"06m": Resolution("06m", registrations=["pixel"]),
"05m": Resolution("05m", registrations=["pixel"]),
"04m": Resolution("04m", registrations=["pixel"]),
"03m": Resolution("03m", registrations=["pixel"]),
"02m": Resolution("02m", registrations=["pixel"]),
"01m": Resolution("01m", registrations=["pixel"]),
"30s": Resolution("30s", registrations=["pixel"]),
},
),
"earth_faa": GMTRemoteDataset(
description="IGPP Earth free-air anomaly",
units="mGal",
Expand Down Expand Up @@ -409,15 +428,18 @@ def _load_remote_dataset(
f"'region' is required for {dataset.description} resolution '{resolution}'."
)

# Currently, only grids are supported. Will support images in the future.
kwdict = {"T": "g", "R": region} # region can be None
kind = "image" if name in {"earth_day"} else "grid"
kwdict = {
"R": region, # region can be None
"T": "i" if kind == "image" else "g",
}
with Session() as lib:
with lib.virtualfile_out(kind="grid") as voutgrd:
with lib.virtualfile_out(kind=kind) as voutgrd:
lib.call_module(
module="read",
args=[fname, voutgrd, *build_arg_list(kwdict)],
)
grid = lib.virtualfile_to_raster(outgrid=None, vfname=voutgrd)
grid = lib.virtualfile_to_raster(kind=kind, outgrid=None, vfname=voutgrd)

# Full path to the grid if not tiled grids.
source = which(fname, download="a") if not resinfo.tiled else None
Expand Down
41 changes: 41 additions & 0 deletions pygmt/tests/test_datasets_earth_day.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
"""
Test basic functionality for loading Blue Marble datasets.
"""

import numpy as np
import numpy.testing as npt
from pygmt.datasets import load_blue_marble


def test_blue_marble_01d():
"""
Test some properties of the Blue Marble 01d data.
"""
data = load_blue_marble(resolution="01d")
assert data.name == "z"
assert data.long_name == "blue_marble"
assert data.attrs["horizontal_datum"] == "WGS84"
assert data.attrs["description"] == "NASA Day Images"
assert data.shape == (3, 180, 360)
assert data.dtype == "uint8"
assert data.gmt.registration == 1
assert data.gmt.gtype == 1
npt.assert_allclose(data.y, np.arange(89.5, -90.5, -1))
npt.assert_allclose(data.x, np.arange(-179.5, 180.5, 1))
npt.assert_allclose(data.min(), 10, atol=1)
npt.assert_allclose(data.max(), 255, atol=1)


def test_blue_marble_01d_with_region():
"""
Test loading low-resolution Blue Marble with 'region'.
"""
data = load_blue_marble(resolution="01d", region=[-10, 10, -5, 5])
assert data.shape == (3, 10, 20)
assert data.dtype == "uint8"
assert data.gmt.registration == 1
assert data.gmt.gtype == 1
npt.assert_allclose(data.y, np.arange(4.5, -5.5, -1))
npt.assert_allclose(data.x, np.arange(-9.5, 10.5, 1))
npt.assert_allclose(data.min(), 10, atol=1)
npt.assert_allclose(data.max(), 77, atol=1)
12 changes: 5 additions & 7 deletions pygmt/tests/test_grdimage_image.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@
"""

import pytest
from pygmt import Figure, which
from pygmt import Figure
from pygmt.datasets import load_blue_marble

rioxarray = pytest.importorskip("rioxarray")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this line be removed?

Suggested change
rioxarray = pytest.importorskip("rioxarray")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, because we are still using tempfile_from_image to convert xarray.DataArray to a GeoTIFF before passing into grdimage. But see #3468.


Expand All @@ -14,12 +15,9 @@ def fixture_xr_image():
Load the image data from Blue Marble as an xarray.DataArray with shape {"band": 3,
"y": 180, "x": 360}.
"""
geotiff = which(fname="@earth_day_01d_p", download="c")
with rioxarray.open_rasterio(filename=geotiff) as rda:
if len(rda.band) == 3:
xr_image = rda.load()
assert xr_image.sizes == {"band": 3, "y": 180, "x": 360}
return xr_image
xr_image = load_blue_marble(resolution="01d")
assert xr_image.sizes == {"band": 3, "y": 180, "x": 360}
return xr_image


@pytest.mark.mpl_image_compare
Expand Down
Loading