Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt the SEVIRI native format reader in Satpy to support remote reading #2863

Merged
merged 22 commits into from
Aug 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
5401d28
Add `readers.utils.fromfile()` for remote reading
pkhalaj Jul 26, 2024
7142a68
Update `AUTHORS.md`
pkhalaj Jul 26, 2024
8051d50
Reformat `readers.utils` to make it PEP8 compliant
pkhalaj Jul 26, 2024
16f2642
Adapt `readers.seviri_l1b_native` for remote reading
pkhalaj Jul 26, 2024
1f10690
Update `test_seviri_l1b_native` with `_get_array`
pkhalaj Jul 26, 2024
7d516d8
Update mock.patch args in `test_seviri_l1b_native` to match changes i…
pkhalaj Jul 26, 2024
d01a519
Update `test_seviri_l1b_native` with tests for remote reading
pkhalaj Jul 28, 2024
0e5f3e7
Parametrize `test_read_physical_seviri_nat_file` to test zip files as…
pkhalaj Jul 29, 2024
15d17d6
Reformat `test_seviri_l1b_native` to make it PEP8 compliant
pkhalaj Jul 29, 2024
df4a6db
Address @mraspaud comments on PR #2863
pkhalaj Jul 29, 2024
43b846b
Simplify fixtures and parameters of `test_read_physical_seviri_nat_fi…
pkhalaj Jul 29, 2024
816a2c7
Use a header which leads to a smaller seviri nat file on disk
pkhalaj Jul 29, 2024
d4976c3
Make the returned path posix compliant in `compress_seviri_native_file`
pkhalaj Jul 29, 2024
db3ddb8
Resolve warnings raised as a result of `slow` & `order` being pytest …
pkhalaj Jul 29, 2024
fbd02b5
Resolve warnings raised as a failure in the orbit polynomial
pkhalaj Jul 29, 2024
f002b00
Fix the issue with the docstring in `amend_seviri_native_null_header()`
pkhalaj Jul 29, 2024
a7596d3
Update `fsspec` support column for `seviri_l1b_native` reader in the …
pkhalaj Jul 29, 2024
80f3925
Use dask `map_blocks()` in `_get_array()`
pkhalaj Jul 30, 2024
42dfc05
Replace `np.zeros` with `np.empty` in the meta parameter of the dask …
pkhalaj Aug 13, 2024
1c00de7
Making `dask_array` private by prefixing it with `_`
pkhalaj Aug 13, 2024
b58823c
Replace `np.empty` with `np.array` for performance (less memory footp…
pkhalaj Aug 13, 2024
d22d2b6
Change the first line of docstrings to imperative mood.
pkhalaj Aug 14, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion AUTHORS.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ The following people have made contributions to this project:
- [Jactry Zeng](https://github.com/jactry)
- [Johannes Johansson (JohannesSMHI)](https://github.com/JohannesSMHI)
- [Sauli Joro (sjoro)](https://github.com/sjoro)
- [Pouria Khalaj](https://github.com/pkhalaj)
mraspaud marked this conversation as resolved.
Show resolved Hide resolved
- [Janne Kotro (jkotro)](https://github.com/jkotro)
- [Ralph Kuehn (ralphk11)](https://github.com/ralphk11)
- [Panu Lahtinen (pnuu)](https://github.com/pnuu)
Expand Down Expand Up @@ -94,5 +95,4 @@ The following people have made contributions to this project:
- [Will Sharpe (wjsharpe)](https://github.com/wjsharpe)
- [Sara Hörnquist (shornqui)](https://github.com/shornqui)
- [Antonio Valentino](https://github.com/avalentino)
- [Pouria Khalaj](https://github.com/pkhalaj)
- [Clément (ludwigvonkoopa)](https://github.com/ludwigVonKoopa)
2 changes: 1 addition & 1 deletion satpy/etc/readers/seviri_l1b_native.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ reader:
description: >
Reader for EUMETSAT MSG SEVIRI Level 1b native format files.
status: Nominal
supports_fsspec: false
supports_fsspec: true
sensors: [seviri]
default_channels: [HRV, IR_016, IR_039, IR_087, IR_097, IR_108, IR_120, IR_134, VIS006, VIS008, WV_062, WV_073]
reader: !!python/name:satpy.readers.yaml_reader.GEOFlippableFileYAMLReader
Expand Down
68 changes: 43 additions & 25 deletions satpy/readers/seviri_l1b_native.py
Original file line number Diff line number Diff line change
Expand Up @@ -138,7 +138,7 @@
get_native_header,
native_trailer,
)
from satpy.readers.utils import reduce_mda
from satpy.readers.utils import fromfile, generic_open, reduce_mda
from satpy.utils import get_legacy_chunk_size

logger = logging.getLogger("native_msg")
Expand Down Expand Up @@ -193,10 +193,27 @@
# Available channels are known only after the header has been read
self.header_type = get_native_header(has_archive_header(self.filename))
self._read_header()
self.dask_array = da.from_array(self._get_memmap(), chunks=(CHUNK_SIZE,))
self._make_dask_array_with_map_blocks()
self._read_trailer()
self.image_boundaries = ImageBoundaries(self.header, self.trailer, self.mda)

def _make_dask_array_with_map_blocks(self):
"""Make the dask array using the ``da.map_blocks()`` functionality."""
dtype = self._get_data_dtype()
chunks = da.core.normalize_chunks(
"auto",
shape=(self.mda["number_of_lines"],),
djhoese marked this conversation as resolved.
Show resolved Hide resolved
dtype=dtype)
self._dask_array = da.map_blocks(
_get_array,
dtype=dtype,
chunks=chunks,
meta=np.array([], dtype=dtype),
# The following will be passed as keyword arguments to the `_get_array()` function.
filename=self.filename,
hdr_size=self.header_type.itemsize
)

@property
def _repeat_cycle_duration(self):
"""Get repeat cycle duration from the trailer."""
Expand Down Expand Up @@ -266,25 +283,17 @@
# each pixel is 10-bits -> one line of data has 25% more bytes
# than the number of columns suggest (10/8 = 1.25)
visir_rec = get_lrec(int(self.mda["number_of_columns"] * 1.25))
number_of_visir_channels = len(
[s for s in self.mda["channel_list"] if not s == "HRV"])
drec = [("visir", (visir_rec, number_of_visir_channels))]
drec = [("visir", (visir_rec, self._number_of_visir_channels()))]

if self.mda["available_channels"]["HRV"]:
hrv_rec = get_lrec(int(self.mda["hrv_number_of_columns"] * 1.25))
drec.append(("hrv", (hrv_rec, 3)))

return np.dtype(drec)

def _get_memmap(self):
"""Get the memory map for the SEVIRI data."""
with open(self.filename) as fp:
data_dtype = self._get_data_dtype()
hdr_size = self.header_type.itemsize

return np.memmap(fp, dtype=data_dtype,
shape=(self.mda["number_of_lines"],),
offset=hdr_size, mode="r")
def _number_of_visir_channels(self):
"""Return the number of visir channels, i.e. all channels excluding ``HRV``."""
return len([s for s in self.mda["channel_list"] if not s == "HRV"])

def _read_header(self):
"""Read the header info."""
Expand Down Expand Up @@ -387,9 +396,7 @@
data_size = (self._get_data_dtype().itemsize *
self.mda["number_of_lines"])

with open(self.filename) as fp:
fp.seek(hdr_size + data_size)
data = np.fromfile(fp, dtype=native_trailer, count=1)
data = fromfile(self.filename, dtype=native_trailer, count=1, offset=hdr_size + data_size)

self.trailer.update(recarray2dict(data))

Expand Down Expand Up @@ -587,10 +594,10 @@
# Check if there is only 1 channel in the list as a change
# is needed in the array assignment ie channel id is not present
if len(self.mda["channel_list"]) == 1:
raw = self.dask_array["visir"]["line_data"]
raw = self._dask_array["visir"]["line_data"]

Check warning on line 597 in satpy/readers/seviri_l1b_native.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/seviri_l1b_native.py#L597

Added line #L597 was not covered by tests
else:
i = self.mda["channel_list"].index(dataset_id["name"])
raw = self.dask_array["visir"]["line_data"][:, i, :]
raw = self._dask_array["visir"]["line_data"][:, i, :]
data = dec10216(raw.flatten())
data = data.reshape(shape)
return data
Expand All @@ -601,7 +608,7 @@

data_list = []
for i in range(3):
raw = self.dask_array["hrv"]["line_data"][:, i, :]
raw = self._dask_array["hrv"]["line_data"][:, i, :]

Check warning on line 611 in satpy/readers/seviri_l1b_native.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/seviri_l1b_native.py#L611

Added line #L611 was not covered by tests
data = dec10216(raw.flatten())
data = data.reshape(shape_layer)
data_list.append(data)
Expand Down Expand Up @@ -660,7 +667,7 @@

def _get_acq_time_hrv(self):
"""Get raw acquisition time for HRV channel."""
tline = self.dask_array["hrv"]["acq_time"]
tline = self._dask_array["hrv"]["acq_time"]

Check warning on line 670 in satpy/readers/seviri_l1b_native.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/seviri_l1b_native.py#L670

Added line #L670 was not covered by tests
tline0 = tline[:, 0]
tline1 = tline[:, 1]
tline2 = tline[:, 2]
Expand All @@ -672,9 +679,9 @@
# Check if there is only 1 channel in the list as a change
# is needed in the array assignment, i.e. channel id is not present
if len(self.mda["channel_list"]) == 1:
return self.dask_array["visir"]["acq_time"].compute()
return self._dask_array["visir"]["acq_time"].compute()

Check warning on line 682 in satpy/readers/seviri_l1b_native.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/seviri_l1b_native.py#L682

Added line #L682 was not covered by tests
i = self.mda["channel_list"].index(dataset_id["name"])
return self.dask_array["visir"]["acq_time"][:, i].compute()
return self._dask_array["visir"]["acq_time"][:, i].compute()

def _update_attrs(self, dataset, dataset_info):
"""Update dataset attributes."""
Expand Down Expand Up @@ -888,12 +895,23 @@

def has_archive_header(filename):
"""Check whether the file includes an ASCII archive header."""
with open(filename, mode="rb") as istream:
with generic_open(filename, mode="rb") as istream:
return istream.read(36) == ASCII_STARTSWITH


def read_header(filename):
"""Read SEVIRI L1.5 native header."""
dtype = get_native_header(has_archive_header(filename))
hdr = np.fromfile(filename, dtype=dtype, count=1)
hdr = fromfile(filename, dtype=dtype, count=1)
return recarray2dict(hdr)


def _get_array(filename=None, hdr_size=None, block_info=None):
"""Get the numpy array for the SEVIRI data."""
output_block_info = block_info[None]
data_dtype = output_block_info["dtype"]
return fromfile(
filename,
dtype=data_dtype,
offset=hdr_size + output_block_info["array-location"][0][0] * data_dtype.itemsize,
count=output_block_info["chunk-shape"][0])
37 changes: 29 additions & 8 deletions satpy/readers/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -98,8 +98,8 @@ def get_geostationary_angle_extent(geos_area):
h = float(h) / 1000 + req

# compute some constants
aeq = 1 - req**2 / (h ** 2)
ap_ = 1 - rp**2 / (h ** 2)
aeq = 1 - req ** 2 / (h ** 2)
ap_ = 1 - rp ** 2 / (h ** 2)

# generate points around the north hemisphere in satellite projection
# make it a bit smaller so that we stay inside the valid area
Expand Down Expand Up @@ -142,15 +142,15 @@ def _lonlat_from_geos_angle(x, y, geos_area):
b__ = (a / float(b)) ** 2

sd = np.sqrt((h__ * np.cos(x) * np.cos(y)) ** 2 -
(np.cos(y)**2 + b__ * np.sin(y)**2) *
(h__**2 - (float(a) / 1000)**2))
(np.cos(y) ** 2 + b__ * np.sin(y) ** 2) *
(h__ ** 2 - (float(a) / 1000) ** 2))
# sd = 0

sn = (h__ * np.cos(x) * np.cos(y) - sd) / (np.cos(y)**2 + b__ * np.sin(y)**2)
sn = (h__ * np.cos(x) * np.cos(y) - sd) / (np.cos(y) ** 2 + b__ * np.sin(y) ** 2)
s1 = h__ - sn * np.cos(x) * np.cos(y)
s2 = sn * np.sin(x) * np.cos(y)
s3 = -sn * np.sin(y)
sxy = np.sqrt(s1**2 + s2**2)
sxy = np.sqrt(s1 ** 2 + s2 ** 2)

lons = np.rad2deg(np.arctan2(s2, s1)) + lon_0
lats = np.rad2deg(-np.arctan2(b__ * s3, sxy))
Expand Down Expand Up @@ -256,7 +256,7 @@ def _unzip_with_pbzip(filename, tmpfilepath, fdn):
if n_thr:
runner = [pbzip,
"-dc",
"-p"+str(n_thr),
"-p" + str(n_thr),
filename]
else:
runner = [pbzip,
Expand Down Expand Up @@ -361,6 +361,27 @@ def generic_open(filename, *args, **kwargs):
fp.close()


def fromfile(filename, dtype, count=1, offset=0):
"""Read the numpy array from a (remote or local) file using a buffer.

Note:
This function relies on the :func:`generic_open` context manager to read a file remotely.

Args:
filename: Either the name of the file to read or a :class:`satpy.readers.FSFile` object.
dtype: The data type of the numpy array
count (Optional, default ``1``): Number of items to read
offset (Optional, default ``0``): Starting point for reading the buffer from

Returns:
The content of the filename as a numpy array with the given data type.
"""
with generic_open(filename, mode="rb") as istream:
istream.seek(offset)
content = np.frombuffer(istream.read(dtype.itemsize * count), dtype=dtype, count=count)
return content


def bbox(img):
"""Find the bounding box around nonzero elements in the given array.

Expand Down Expand Up @@ -395,7 +416,7 @@ def get_earth_radius(lon, lat, a, b):
latlong = pyproj.CRS.from_dict({"proj": "latlong", "a": a, "b": b, "units": "m"})
transformer = pyproj.Transformer.from_crs(latlong, geocent)
x, y, z = transformer.transform(lon, lat, 0.0)
return np.sqrt(x**2 + y**2 + z**2)
return np.sqrt(x ** 2 + y ** 2 + z ** 2)


def reduce_mda(mda, max_size=100):
Expand Down
Loading
Loading