Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JOSS Review] Code and examples #30

Closed
nmstreethran opened this issue Feb 20, 2024 · 4 comments
Closed

[JOSS Review] Code and examples #30

nmstreethran opened this issue Feb 20, 2024 · 4 comments

Comments

@nmstreethran
Copy link

@AlexanderJuestel here are my comments on the code and examples.

openjournals/joss-reviews#6275

Code

I recommend linting processing.py for some suggestions on improving the code style. Here are three suggested changes:

Additionally, these two snippets seem to be identical. Is it possible to define it as a function for use in these two places to avoid code repetition?

# Converting Shapely Polygon to GeoDataFrame
if isinstance(mask_gdf, Polygon):
mask_gdf = gpd.GeoDataFrame(geometry=[mask_gdf],
crs=hd_gdf.crs)
# Checking that the hd_gdf is of type GeoDataFrame
if not isinstance(hd_gdf, gpd.GeoDataFrame):
raise TypeError('The heat demand gdf must be provided as GeoDataFrame')
# Checking that the mask_gdf is of type GeoDataFrame
if not isinstance(mask_gdf, gpd.GeoDataFrame):
raise TypeError('The mask gdf must be provided as GeoDataFrame')
# Checking that the Heat Demand Data Column is provided as string
if not isinstance(hd_data_column, str):
raise TypeError('The heat demand data column must be provided as string')
# Checking that the HD Data Column is in the HD GeoDataFrame
if not hd_data_column in hd_gdf:
raise ValueError('%s is not a column in the GeoDataFrame' % hd_data_column)
# Reprojecting Data if necessary
if mask_gdf.crs != hd_gdf.crs:
hd_gdf = hd_gdf.to_crs(mask_gdf.crs)
# Exploding MultiPolygons
if any(shapely.get_type_id(hd_gdf.geometry) == 6):
hd_gdf = hd_gdf.explode(index_parts=True).reset_index(drop=True)
# Assigning area to Polygons
if all(shapely.get_type_id(hd_gdf.geometry) == 3):
# Assigning area of original geometries to GeoDataFrame
hd_gdf['area'] = hd_gdf.area
# Assigning lengths to LineStrings
elif all(shapely.get_type_id(hd_gdf.geometry) == 1):
# Assigning length of original geometries to GeoDataFrame
hd_gdf['length'] = hd_gdf.length

# Converting Shapely Polygon to GeoDataFrame
if isinstance(mask_gdf, Polygon):
mask_gdf = gpd.GeoDataFrame(geometry=[mask_gdf],
crs=hd_gdf.crs)
# Checking that the hd_gdf is of type GeoDataFrame
if not isinstance(hd_gdf, gpd.GeoDataFrame):
raise TypeError('The heat demand gdf must be provided as GeoDataFrame')
# Checking that the mask_gdf is of type GeoDataFrame
if not isinstance(mask_gdf, gpd.GeoDataFrame):
raise TypeError('The mask gdf must be provided as GeoDataFrame')
# Checking that the Heat Demand Data Column is provided as string
if not isinstance(hd_data_column, str):
raise TypeError('The heat demand data column must be provided as string')
# Checking that the HD Data Column is in the HD GeoDataFrame
if not hd_data_column in hd_gdf:
raise ValueError('%s is not a column in the GeoDataFrame' % hd_data_column)
# Reprojecting Data if necessary
if mask_gdf.crs != hd_gdf.crs:
hd_gdf = hd_gdf.to_crs(mask_gdf.crs)
# Exploding MultiPolygons
if any(shapely.get_type_id(hd_gdf.geometry) == 6):
hd_gdf = hd_gdf.explode(index_parts=True).reset_index(drop=True)
# Assigning area to Polygons
if all(shapely.get_type_id(hd_gdf.geometry) == 3):
# Assigning area of original geometries to GeoDataFrame
hd_gdf['area'] = hd_gdf.area
# Assigning lengths to LineStrings
elif all(shapely.get_type_id(hd_gdf.geometry) == 1):
# Assigning length of original geometries to GeoDataFrame
hd_gdf['length'] = hd_gdf.length

Example notebooks

To import the library from the cloned repository, using from pyheatdemand import processing was sufficient. Using sys was not necessary (also did not work for me; I'm not using Windows though). In the example notebooks, the sys path uses both pyhd and pyheatdemand; all instances of pyhd should be fixed.

The data paths used in the notebooks are inconsistent, e.g. ../../../data/, ../data/, etc. I noticed that many of the datasets used are available in the test folder. Could you make use of these in the example notebooks? This will allow potential users to execute the notebooks themselves and reproduce the results.

In some notebooks (for example, 02_Processing_Data_Type_I_Raster.ipynb), the data was reprojected to epsg:3034. However, subsequent code uses to_crs('EPSG:3034'), which is redundant and can be removed.

The semicolon in show(); should be removed.

In the notebook 14_Refining_Polygon_Mask.ipynb, under 'Refining the mask', there's a cell with a TypeError output that should be deleted.

Some other suggestions for improvement are to include axis labels / titles / legends in as many plots as possible (where appropriate) as there are a lot of images in the notebooks and this will help with some context. For example, the histograms could use titles or axis labels, and the heat demand colour bars could use a legend title.

@AlexanderJuestel
Copy link
Owner

AlexanderJuestel commented Feb 20, 2024

Code

All requested Code changes were implemented in 50110d9

I have also slightly adapted the tests.

@AlexanderJuestel
Copy link
Owner

AlexanderJuestel commented Feb 20, 2024

Example notebooks

To import the library from the cloned repository, using from pyheatdemand import processing was sufficient. Using sys was not necessary (also did not work for me; I'm not using Windows though). In the example notebooks, the sys path uses both pyhd and pyheatdemand; all instances of pyhd should be fixed.

Imports were fixed in 543ae78

The data paths used in the notebooks are inconsistent, e.g. ../../../data/, ../data/, etc. I noticed that many of the datasets used are available in the test folder. Could you make use of these in the example notebooks? This will allow potential users to execute the notebooks themselves and reproduce the results.

All paths were edited and data was added to the test folder in 1954775 and 4b29bca

In some notebooks (for example, 02_Processing_Data_Type_I_Raster.ipynb), the data was reprojected to epsg:3034. However, subsequent code uses to_crs('EPSG:3034'), which is redundant and can be removed.

The respective code was removed in 1954775

The semicolon in show(); should be removed.

The semicolons were removed in 253b74d

In the notebook 14_Refining_Polygon_Mask.ipynb, under 'Refining the mask', there's a cell with a TypeError output that should be deleted.

The notebook was adapted in ee3c30c

Some other suggestions for improvement are to include axis labels / titles / legends in as many plots as possible (where appropriate) as there are a lot of images in the notebooks and this will help with some context. For example, the histograms could use titles or axis labels, and the heat demand colour bars could use a legend title.

All plots were edited in 1954775

@AlexanderJuestel
Copy link
Owner

@nmstreethran I think I am done with your comments here. Feel free to have a look and thanks for bringing up some important points.

@nmstreethran
Copy link
Author

@AlexanderJuestel I've had a quick look and everything looks good. Many thanks for addressing my comments!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants