Fix geometry issues #532
The GADM issue means that some regions may have a non-standard GADM_ID, e.g. for India (part of
A visual check gives the following picture (which actually doesn't clarify the issue): regions with non-standard GADM_IDs are highlighted in violet.
It appears that non-standard GADM_IDs cause trouble when they are interpreted as country names (e.g. 'not found') and propagate into the cleaned OSM datasets. That is currently the case for India, while for China the non-standard GADM_IDs seem to remain in country_shape and gadm_shape only, without causing any harm downstream. However, such a difference may be a random effect of a particular workflow realisation.
A picture for non-standard GADM_IDs for China looks as follows
It looks like a fix is needed in The most straightforward solution would be to filter by GID_0 (aka country name) values with something like
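A filter of that kind could look roughly like the sketch below. It is not the code from this PR: it works on plain GeoJSON-like feature dictionaries instead of the GeoDataFrame used in the workflow, the helper name `filter_by_gid0` is hypothetical, and the sample records (including the assumption that a `Z02.28_1` region carries `GID_0 == "Z02"`) are made up for illustration.

```python
def filter_by_gid0(features, country_code):
    """Keep only the features whose GID_0 property equals the requested code."""
    return [
        feature
        for feature in features
        if feature.get("properties", {}).get("GID_0") == country_code
    ]


# Hypothetical sample records; "Z02" stands in for a non-standard GID_0 value.
features = [
    {"type": "Feature", "properties": {"GID_0": "CHN", "GID_1": "CHN.1_1"}, "geometry": None},
    {"type": "Feature", "properties": {"GID_0": "Z02", "GID_1": "Z02.28_1"}, "geometry": None},
    {"type": "Feature", "properties": {"GID_0": "CHN", "GID_1": "CHN.2_1"}, "geometry": None},
]

standard = filter_by_gid0(features, "CHN")
print([feature["properties"]["GID_1"] for feature in standard])
```

Note that such a filter drops the regions with non-standard codes altogether, so whether to drop or relabel them remains a design decision.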
for more information, see https://pre-commit.ci
In the current implementation the onshore shape consists of several shapes, each corresponding to a special administrative area (the ones denoted by the non-standard GADM_IDs)
That leads to some trouble along the workflow. In particular,
That issue is caused by the fact that
It could be a good idea to unify all the polygons when generating onshore_shape. The problem is to merge all the geometries simultaneously while keeping other features (like It is worth investigating whether a geometry consisting of several polygons would lead to any problems along the workflow apart from the one mentioned in Apart from that, it may be worth testing the workflow on countries which consist of several areas (e.g. Malaysia)
It looks like an onshore_shape composed of several geometries doesn't lead to other problems along the workflow apart from the known one in However, it appears that for some reason geometries were duplicated in offshore_shape. Currently offshore_shape consists of five identical geometries, which needs to be investigated.
The issue with duplicated offshore geometries was fixed for India. However, during tests with
There are some weird issues during renewable profiles generation:
That is quite weird and most likely linked to some environment issues. #538 attempts to fix it
Update: the CI run currently shows one of the errors listed above, namely:
INFO:snakemake.logging:
[Thu Dec 15 17:07:34 2022]
INFO:snakemake.logging:[Thu Dec 15 17:07:34 2022]
rule build_renewable_profiles:
input: networks/base.nc, resources/natura.tiff, data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif, data/gebco/GEBCO_2021_TID.nc, resources/shapes/country_shapes.geojson, resources/shapes/offshore_shapes.geojson, data/hydro_capacities.csv, data/eia_hydro_annual_generation.csv, resources/powerplants.csv, resources/bus_regions/regions_onshore.geojson, cutouts/africa-2013-era5-tutorial.nc
output: resources/renewable_profiles/profile_hydro.nc
log: logs/build_renewable_profile_hydro.log
jobid: 20
benchmark: benchmarks/build_renewable_profiles_hydro
reason: Missing output files: resources/renewable_profiles/profile_hydro.nc; Input files updated by another job: networks/base.nc, resources/shapes/offshore_shapes.geojson, data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif, resources/powerplants.csv, data/hydro_capacities.csv, cutouts/africa-2013-era5-tutorial.nc, resources/shapes/country_shapes.geojson, data/gebco/GEBCO_2021_TID.nc, resources/bus_regions/regions_onshore.geojson, resources/natura.tiff
wildcards: technology=hydro
threads: 2
resources: tmpdir=/tmp, mem_mb=20000, mem_mib=19074
INFO:snakemake.logging:rule build_renewable_profiles:
input: networks/base.nc, resources/natura.tiff, data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif, data/gebco/GEBCO_2021_TID.nc, resources/shapes/country_shapes.geojson, resources/shapes/offshore_shapes.geojson, data/hydro_capacities.csv, data/eia_hydro_annual_generation.csv, resources/powerplants.csv, resources/bus_regions/regions_onshore.geojson, cutouts/africa-2013-era5-tutorial.nc
output: resources/renewable_profiles/profile_hydro.nc
log: logs/build_renewable_profile_hydro.log
jobid: 20
benchmark: benchmarks/build_renewable_profiles_hydro
reason: Missing output files: resources/renewable_profiles/profile_hydro.nc; Input files updated by another job: networks/base.nc, resources/shapes/offshore_shapes.geojson, data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif, resources/powerplants.csv, data/hydro_capacities.csv, cutouts/africa-2013-era5-tutorial.nc, resources/shapes/country_shapes.geojson, data/gebco/GEBCO_2021_TID.nc, resources/bus_regions/regions_onshore.geojson, resources/natura.tiff
wildcards: technology=hydro
threads: 2
resources: tmpdir=/tmp, mem_mb=20000, mem_mib=19074
INFO:snakemake.logging:
INFO:__main__:Hydro normalization mode hydro_capacities
Determine upstream basins per plant: 0it [00:00, ?it/s]
Determine upstream basins per plant: 3it [00:00, 201.72it/s]
Traceback (most recent call last):
File "/home/runner/work/pypsa-earth/pypsa-earth/.snakemake/scripts/tmpndxdiqdd.build_renewable_profiles.py", line 455, in <module>
inflow = correction_factor * func(capacity_factor=True, **resource)
File "/usr/share/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/atlite/convert.py", line 805, in hydro
matrix = cutout.indicatormatrix(basins.shapes)
File "/usr/share/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/atlite/cutout.py", line 536, in indicatormatrix
return compute_indicatormatrix(self.grid, shapes, self.crs, shapes_crs)
File "/usr/share/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/atlite/gis.py", line 148, in compute_indicatormatrix
if o.intersects(d):
AttributeError: 'numpy.int64' object has no attribute 'intersects'
[Thu Dec 15 17:07:40 2022]
INFO:snakemake.logging:[Thu Dec 15 17:07:40 2022]
Error in rule build_renewable_profiles:
jobid: 20
input: networks/base.nc, resources/natura.tiff, data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif, data/gebco/GEBCO_2021_TID.nc, resources/shapes/country_shapes.geojson, resources/shapes/offshore_shapes.geojson, data/hydro_capacities.csv, data/eia_hydro_annual_generation.csv, resources/powerplants.csv, resources/bus_regions/regions_onshore.geojson, cutouts/africa-2013-era5-tutorial.nc
output: resources/renewable_profiles/profile_hydro.nc
log: logs/build_renewable_profile_hydro.log (check log file(s) for error details)
ERROR:snakemake.logging:Error in rule build_renewable_profiles:
jobid: 20
input: networks/base.nc, resources/natura.tiff, data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif, data/gebco/GEBCO_2021_TID.nc, resources/shapes/country_shapes.geojson, resources/shapes/offshore_shapes.geojson, data/hydro_capacities.csv, data/eia_hydro_annual_generation.csv, resources/powerplants.csv, resources/bus_regions/regions_onshore.geojson, cutouts/africa-2013-era5-tutorial.nc
output: resources/renewable_profiles/profile_hydro.nc
log: logs/build_renewable_profile_hydro.log (check log file(s) for error details)
RuleException:
CalledProcessError in file /home/runner/work/pypsa-earth/pypsa-earth/Snakefile, line 325:
Command 'set -euo pipefail; /usr/share/miniconda3/envs/pypsa-earth/bin/python3.10 /home/runner/work/pypsa-earth/pypsa-earth/.snakemake/scripts/tmpndxdiqdd.build_renewable_profiles.py' returned non-zero exit status 1.
File "/home/runner/work/pypsa-earth/pypsa-earth/Snakefile", line 325, in __rule_build_renewable_profiles
File "/usr/share/miniconda3/envs/pypsa-earth/lib/python3.10/concurrent/futures/thread.py", line 58, in run
ERROR:snakemake.logging:RuleException:
CalledProcessError in file /home/runner/work/pypsa-earth/pypsa-earth/Snakefile, line 325:
Command 'set -euo pipefail; /usr/share/miniconda3/envs/pypsa-earth/bin/python3.10 /home/runner/work/pypsa-earth/pypsa-earth/.snakemake/scripts/tmpndxdiqdd.build_renewable_profiles.py' returned non-zero exit status 1.
File "/home/runner/work/pypsa-earth/pypsa-earth/Snakefile", line 325, in __rule_build_renewable_profiles
File "/usr/share/miniconda3/envs/pypsa-earth/lib/python3.10/concurrent/futures/thread.py", line 58, in run
Shutting down, this might take some time.
WARNING:snakemake.logging:Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
ERROR:snakemake.logging:Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-12-15T165417.221370.snakemake.log
WARNING:snakemake.logging:Complete log: .snakemake/log/2022-12-15T165417.221370.snakemake.log
Error: Process completed with exit code 1.
@davide-f, thank you so much for the review and the discussion! My feeling is that we are close to finalising the first bug-fixing step, keeping advanced features for the second step. The only point left is replacing a If you don't have major objections regarding the introduced changes, I'll move the changes into a clean PR
@ekatef May we close this PR since we worked in the others?
Closed to be finalised in a cleaned version
Testing the workflow on data from around the world has shown that the underlying geographical shapes can sometimes be problematic, namely:
The GID_0 feature in the gadm file may sometimes have values which do not exactly correspond to the country name. As a result, the GADM_ID in gadm_shapes.geojson may have unexpected values like "Z02.28_1" or "Z03.28_1" (taken from data for China), which leads to zeros for the population value and "name": null in the country_shapes.geojson file. This problem can be reproduced by running build_shapes for China or for India and causes numerous troubles along the workflow.
Sometimes the country area is a multiply connected region (an area with holes), which is treated by shapely as invalid geometry. In particular, that is the case for the Dahagram–Angarpota Bangladeshi enclave in India.
The Voronoi partition may produce empty polygons which are not filtered by dropna(), are written as none geometry into "regions_onshore.geojson" and cause troubles that propagate along the workflow. In particular, that is the case for Kazakhstan shapes (OSM data on 4.12.2022, 10 clusters).
Changes proposed in this Pull Request
This pull request is intended to fix such region-specific issues.
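To illustrate the empty-polygon issue described above, here is a minimal sketch of a filter that removes features with a null or empty geometry from a GeoJSON FeatureCollection. It is an illustration only, not the fix implemented in this PR: the helper name `drop_empty_geometries` and the sample region names are hypothetical, and the sketch operates on the parsed GeoJSON rather than on the GeoDataFrame inside the workflow.

```python
import json


def drop_empty_geometries(feature_collection):
    """Return a copy of a FeatureCollection without null or empty geometries."""

    def has_geometry(feature):
        geometry = feature.get("geometry")
        if geometry is None:
            return False
        # Empty geometries are serialised with an empty "coordinates" list
        # (or an empty "geometries" list for a GeometryCollection).
        return bool(geometry.get("coordinates") or geometry.get("geometries"))

    kept = [f for f in feature_collection["features"] if has_geometry(f)]
    return {**feature_collection, "features": kept}


# Hypothetical content standing in for a file such as regions_onshore.geojson.
raw = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {"name": "region_a"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}},
    {"type": "Feature", "properties": {"name": "region_b"}, "geometry": null},
    {"type": "Feature", "properties": {"name": "region_c"},
     "geometry": {"type": "Polygon", "coordinates": []}}
  ]
}
""")

cleaned = drop_empty_geometries(raw)
print([feature["properties"]["name"] for feature in cleaned["features"]])
```

The input collection is left unchanged, so the dropped regions can still be inspected or logged before the cleaned version is written out.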
Checklist
envs/environment.yaml and envs/environment.docs.yaml.
config.default.yaml and config.tutorial.yaml.
test/ (note tests are changing the config.tutorial.yaml)
doc/configtables/*.csv and line references are adjusted in doc/configuration.rst and doc/tutorial.rst.
doc/release_notes.rst is amended in the format of previous release notes, including reference to the requested PR.