Check variable with Custom Sensor Metadata reader (#62)
* Fix when reading np.int64 integers

* Update changelog

* Fix no station within selected dist with new pygeogrids version

* Derive station name from fname in header format

* Check variable with custom sensor metadata reader

* Update changelog
wpreimes authored Jan 21, 2023
1 parent dd0e034 commit d9b889e
Showing 5 changed files with 104 additions and 23 deletions.
10 changes: 8 additions & 2 deletions CHANGELOG.rst
@@ -2,11 +2,17 @@
Changelog
=========

Unreleased
==========
Unreleased changes in master branch
===================================

-

Version 1.3.2
=============

- Fix bug where station names in metadata can be different between Header and CEOP format.
- Custom Sensor Metadata reader now also checks the measured variable.

Version 1.3.1
=============

101 changes: 87 additions & 14 deletions docs/examples/custom_meta.ipynb
@@ -33,12 +33,13 @@
"\n",
"## Set metadata reader\n",
"\n",
"Then we set up the metadata reader. Here we use one of the predefined readers, but you can (and usually have to) also write your own reader as long as it inherits from the abstract class `ismn.custom.CustomMetaReader` and implements a function `read_metadata` which uses the information from previously loaded metadata for a station to find the matching entries in the provided data, and either returns a `ismn.meta.MetaData` object or a dictionary of metadata variables and the according values. Normally you use either the station latitude, longitude and sometimes also the sensor depth information; maybe even the station name."
"Then we set up the metadata reader. Here we use one of the predefined readers, but you can (and usually have to) also write your own reader as long as it inherits from the abstract class `ismn.custom.CustomMetaReader` and implements a function `read_metadata` which uses the information from previously loaded metadata for a station to find the matching entries in the provided data, and either returns a `ismn.meta.MetaData` object or a dictionary of metadata variables and the according values. Normally you use either the station latitude, longitude and sometimes also the sensor depth information; maybe even the station name.\n",
"We also assign a fill value for one of the 2 VOD variables, which is used for stations / sensors for which no counterpart is found in the csv file."
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "b7a5944e-0632-4095-a0c0-90f44d2ae4e8",
"metadata": {},
"outputs": [],
@@ -51,12 +52,12 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 11,
"id": "ddf2b498-8601-4f89-9d37-51eba7a12efc",
"metadata": {},
"outputs": [],
"source": [
"my_meta_reader = CustomStationMetadataCsv('vod.csv')"
"my_meta_reader = CustomStationMetadataCsv('vod.csv', fill_values={'vod_k': -9999})"
]
},
{
@@ -69,7 +70,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 14,
"id": "eb5786c4-c0f9-49de-990a-5db18793f987",
"metadata": {},
"outputs": [
@@ -86,16 +87,16 @@
"name": "stderr",
"output_type": "stream",
"text": [
"Files Processed: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 4.97it/s]"
"Files Processed: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 11.62it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Metadata generation finished after 0 Seconds.\n",
"Metadata and Log stored in /tmp/tmp8omp4vpr\n",
"Found existing ismn metadata in /tmp/tmp8omp4vpr/Data_seperate_files_20170810_20180809.csv.\n"
"Metadata and Log stored in /tmp/tmp3zk16t_7\n",
"Found existing ismn metadata in /tmp/tmp3zk16t_7/Data_seperate_files_20170810_20180809.csv.\n"
]
},
{
@@ -121,7 +122,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 17,
"id": "7f293998-a3f0-4ac2-a673-d55e03229f3f",
"metadata": {},
"outputs": [
@@ -134,7 +135,7 @@
"])"
]
},
"execution_count": 5,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
@@ -148,25 +149,89 @@
"id": "2a7bbf34-ffd4-42cd-832d-9c3fe473ae06",
"metadata": {},
"source": [
"But not for other stations (in this case pandas automatically assigns NaN to the variable)."
"But not for other stations. "
]
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 20,
"id": "8af19018-884a-4aad-9984-a0a52227eae8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MetaData([\n",
" MetaVar([vod_k, nan, None]),\n",
" MetaVar([vod_k, -9999.0, None]),\n",
" MetaVar([vod_x, nan, None])\n",
"])"
]
},
"execution_count": 13,
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds['COSMOS']['ARM-1'][0].metadata[['vod_k', 'vod_x']]"
]
},
{
"cell_type": "markdown",
"id": "53565bc8-f74c-4a4d-aa05-842fb5778584",
"metadata": {},
"source": [
"The station wide variable is also available for sensors at the station (here we simply pick the first available sensor at the station, with index 0)."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "d7630de7-bbb1-4d73-a664-9dc9eba8404f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MetaData([\n",
" MetaVar([vod_k, 0.64922965, None]),\n",
" MetaVar([vod_x, 0.39021793, None])\n",
"])"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds['FR_Aqui']['fraye'][0].metadata[['vod_k', 'vod_x']]"
]
},
{
"cell_type": "markdown",
"id": "56ebd643-83c8-4ea2-b1b2-9a052fff5e99",
"metadata": {},
"source": [
"For stations where no VOD was assigned, the fill value (or np.NaN if no fill value is provided) is used (here we simply pick the first available sensor at the station, with index 0)."
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "e4ae3da5-3c82-48c4-9c60-e899a329bad3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MetaData([\n",
" MetaVar([vod_k, -9999.0, None]),\n",
" MetaVar([vod_x, nan, None])\n",
"])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
@@ -441,6 +506,14 @@
"data, meta = ds.read(ids, return_meta=True)\n",
"meta"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4664e11a-7c3e-4574-9a92-d1ca0bccafae",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
5 changes: 3 additions & 2 deletions src/ismn/custom.py
@@ -192,7 +192,7 @@ class CustomSensorMetadataCsv(CustomStationMetadataCsv):
In this case that the metadata must be stored in a csv file with the
following structure:
network;station;instrument;depth_from;depth_to;<var1>;<var1>_depth_from;<var1>_depth_to;<var2> ...
network;station;instrument;variable;depth_from;depth_to;<var1>;<var1>_depth_from;<var1>_depth_to;<var2> ...
where <var1> etc. are the names of the custom metadata variables that are
transferred into the python metadata
@@ -218,11 +218,12 @@ def read_metadata(self, meta: MetaData):
cond = (self.df['network'] == meta['network'].val) & \
(self.df['station'] == meta['station'].val) & \
(self.df['instrument'] == meta['instrument'].val) & \
(self.df['variable'] == meta['variable'].val) & \
(self.df['depth_from'] == meta['instrument'].depth[0]) & \
(self.df['depth_to'] == meta['instrument'].depth[1])

df = self.df[cond].set_index(
['network', 'station', 'instrument', 'depth_from', 'depth_to'])
['network', 'station', 'instrument', 'variable', 'depth_from', 'depth_to'])

# drop potential duplicates, keep first
df = df[~df.index.duplicated(keep='first')]
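To illustrate why the extra `variable` condition matters, here is a minimal standalone pandas sketch of the same matching pattern. It does not use `ismn` itself: the `match_rows` helper, the sample table, and the second row (the same instrument and depth reporting a `soil_temperature` variable) are assumptions made up for illustration. Without the `variable` comparison both rows would match, and `keep='first'` would silently pick the wrong one.

```python
import pandas as pd

# Hypothetical table in the column layout described in the docstring above.
# The second row (same instrument and depth, different variable) is invented
# purely to show the ambiguity that the 'variable' condition resolves.
df = pd.DataFrame({
    "network": ["FR_Aqui", "FR_Aqui"],
    "station": ["fraye", "fraye"],
    "instrument": ["ThetaProbe-ML2X", "ThetaProbe-ML2X"],
    "variable": ["soil_moisture", "soil_temperature"],
    "depth_from": [0.05, 0.05],
    "depth_to": [0.05, 0.05],
    "myvar1": ["lorem", "ipsum"],
})

INDEX_COLS = ["network", "station", "instrument",
              "variable", "depth_from", "depth_to"]


def match_rows(table, network, station, instrument, variable,
               depth_from, depth_to):
    """Hypothetical helper: select the rows describing exactly one sensor."""
    cond = (
        (table["network"] == network)
        & (table["station"] == station)
        & (table["instrument"] == instrument)
        & (table["variable"] == variable)      # the condition added here
        & (table["depth_from"] == depth_from)
        & (table["depth_to"] == depth_to)
    )
    out = table[cond].set_index(INDEX_COLS)
    # drop potential duplicates, keep first
    return out[~out.index.duplicated(keep="first")]


matched = match_rows(df, "FR_Aqui", "fraye", "ThetaProbe-ML2X",
                     "soil_temperature", 0.05, 0.05)
print(matched["myvar1"].tolist())
```

Putting `variable` into the index as well keeps the duplicate check consistent with the selection: two rows count as duplicates only if they describe the same variable of the same sensor.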
1 change: 1 addition & 0 deletions tests/test_custom_meta.py
@@ -66,6 +66,7 @@ def test_build_custom_metadata_sensor():
assert ds['COSMOS'][0][0].metadata['myvar3'].val == 'unknown'

assert ds['FR_Aqui']['fraye'][0].metadata['myvar1'].val == 'lorem'
assert ds['FR_Aqui']['fraye'][0].metadata['variable'].val == 'soil_moisture'
assert ds['FR_Aqui']['fraye'][0].metadata['myvar1'].depth == Depth(0, 1)
assert ds['FR_Aqui']['fraye'][0].metadata['myvar2'].val == 1.1
assert ds['FR_Aqui']['fraye'][0].metadata['myvar2'].depth is None
10 changes: 5 additions & 5 deletions tests/test_data/custom_metadata/custom_sensormeta.csv
@@ -1,5 +1,5 @@
myvar1;myvar1_depth_from;myvar1_depth_to;myvar2;myvar3;myvar3_depth_to;network;station;instrument;depth_from;depth_to
lorem;0;1;1.1;2022-01-01;1;FR_Aqui;fraye;ThetaProbe-ML2X;0.05;0.05
NOT_USED;0.1;0.2;1.2;2022-01-02;2;FR_Aqui;fraye;ThetaProbe-ML2X;0.1;0.1
NOT_USED;0.1;1.1;1.3;2022-01-03;3;FR_Aqui;fraye;ThetaProbe-ML2X;99;9999
NOT_USED;0.1;1.1;1.4;2022-01-04;4;FR_Aqui;grandcal;ThetaProbe-ML2X;0.05;0.05
myvar1;myvar1_depth_from;myvar1_depth_to;myvar2;myvar3;myvar3_depth_to;network;station;instrument;variable;depth_from;depth_to
lorem;0;1;1.1;2022-01-01;1;FR_Aqui;fraye;ThetaProbe-ML2X;soil_moisture;0.05;0.05
NOT_USED;0.1;0.2;1.2;2022-01-02;2;FR_Aqui;fraye;ThetaProbe-ML2X;soil_moisture;0.1;0.1
NOT_USED;0.1;1.1;1.3;2022-01-03;3;FR_Aqui;fraye;ThetaProbe-ML2X;soil_moisture;99;9999
NOT_USED;0.1;1.1;1.4;2022-01-04;4;FR_Aqui;grandcal;ThetaProbe-ML2X;soil_moisture;0.05;0.05
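As a rough sketch of how a semicolon-separated file in this layout can be read and queried, the snippet below parses two of the rows above with pandas and falls back to fill values (or NaN) when no row matches, as described in the notebook text. The `lookup` helper and its `fill_values` argument are hypothetical names for this illustration and are not the `ismn` implementation.

```python
import io

import numpy as np
import pandas as pd

# Two rows copied from the updated test file, including the new 'variable' column.
CSV_TEXT = """\
myvar1;myvar1_depth_from;myvar1_depth_to;myvar2;myvar3;myvar3_depth_to;network;station;instrument;variable;depth_from;depth_to
lorem;0;1;1.1;2022-01-01;1;FR_Aqui;fraye;ThetaProbe-ML2X;soil_moisture;0.05;0.05
NOT_USED;0.1;1.1;1.4;2022-01-04;4;FR_Aqui;grandcal;ThetaProbe-ML2X;soil_moisture;0.05;0.05
"""

table = pd.read_csv(io.StringIO(CSV_TEXT), sep=";")


def lookup(table, key, variables, fill_values=None):
    """Hypothetical helper: return custom values for one sensor.

    `key` maps column names to the values identifying the sensor. If no row
    matches, each variable gets its fill value, or NaN when none is given.
    """
    fill_values = fill_values or {}
    cond = np.ones(len(table), dtype=bool)
    for col, val in key.items():
        cond &= (table[col] == val).to_numpy()
    rows = table[cond]
    if rows.empty:
        return {v: fill_values.get(v, np.nan) for v in variables}
    first = rows.iloc[0]  # keep the first match, as in the reader above
    return {v: first[v] for v in variables}


sensor = {
    "network": "FR_Aqui", "station": "fraye",
    "instrument": "ThetaProbe-ML2X", "variable": "soil_moisture",
    "depth_from": 0.05, "depth_to": 0.05,
}
found = lookup(table, sensor, ["myvar1", "myvar2"])

# A sensor that is not listed in the file falls back to the fill values.
other = dict(sensor, network="COSMOS", station="ARM-1")
missing = lookup(table, other, ["myvar1", "myvar2"],
                 fill_values={"myvar1": "unknown"})
print(found, missing)
```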
