[JOSS Review] Code and examples #30

nmstreethran · 2024-02-20T05:36:08Z

@AlexanderJuestel here are my comments on the code and examples.

Code

I recommend linting processing.py for some suggestions on improving the code style. Here are three suggested changes:

change the following to shapes = list(zip(gdf_hd['geometry'], gdf_hd['HD']))

pyheatdemand/pyheatdemand/processing.py

Line 782 in 73f41fd

shapes = [(geom, value) for geom, value in zip(gdf_hd['geometry'], gdf_hd['HD'])]
bare except should be avoided if possible

pyheatdemand/pyheatdemand/processing.py

Lines 963 to 964 in 73f41fd

except:

gdf = None
these should use if a not in b instead of if not a in b

pyheatdemand/pyheatdemand/processing.py

Line 513 in 73f41fd

if not hd_data_column in hd_gdf:

pyheatdemand/pyheatdemand/processing.py

Line 656 in 73f41fd

if not hd_data_column in hd_gdf:

pyheatdemand/pyheatdemand/processing.py

Line 1372 in 73f41fd

if not hd_data_column in gdf_buildings:

pyheatdemand/pyheatdemand/processing.py

Line 1441 in 73f41fd

if not hd_data_column in gdf_buildings:

Additionally, these two snippets seem to be identical. Is it possible to define it as a function for use in these two places to avoid code repetition?

pyheatdemand/pyheatdemand/processing.py

Lines 495 to 532 in 73f41fd

    
           # Converting Shapely Polygon to GeoDataFrame 
        
           if isinstance(mask_gdf, Polygon): 
        
               mask_gdf = gpd.GeoDataFrame(geometry=[mask_gdf], 
        
                                           crs=hd_gdf.crs) 
        
           # Checking that the hd_gdf is of type GeoDataFrame 
        
           if not isinstance(hd_gdf, gpd.GeoDataFrame): 
        
               raise TypeError('The heat demand gdf must be provided as GeoDataFrame') 
        
           # Checking that the mask_gdf is of type GeoDataFrame 
        
           if not isinstance(mask_gdf, gpd.GeoDataFrame): 
        
               raise TypeError('The mask gdf must be provided as GeoDataFrame') 
        
           # Checking that the Heat Demand Data Column is provided as string 
        
           if not isinstance(hd_data_column, str): 
        
               raise TypeError('The heat demand data column must be provided as string') 
        
           # Checking that the HD Data Column is in the HD GeoDataFrame 
        
           if not hd_data_column in hd_gdf: 
        
               raise ValueError('%s is not a column in the GeoDataFrame' % hd_data_column) 
        
           # Reprojecting Data if necessary 
        
           if mask_gdf.crs != hd_gdf.crs: 
        
               hd_gdf = hd_gdf.to_crs(mask_gdf.crs) 
        
           # Exploding MultiPolygons 
        
           if any(shapely.get_type_id(hd_gdf.geometry) == 6): 
        
               hd_gdf = hd_gdf.explode(index_parts=True).reset_index(drop=True) 
        
           # Assigning area to Polygons 
        
           if all(shapely.get_type_id(hd_gdf.geometry) == 3): 
        
               # Assigning area of original geometries to GeoDataFrame 
        
               hd_gdf['area'] = hd_gdf.area 
        
           # Assigning lengths to LineStrings 
        
           elif all(shapely.get_type_id(hd_gdf.geometry) == 1): 
        
               # Assigning length of original geometries to GeoDataFrame 
        
               hd_gdf['length'] = hd_gdf.length

pyheatdemand/pyheatdemand/processing.py

Lines 638 to 675 in 73f41fd

    
           # Converting Shapely Polygon to GeoDataFrame 
        
           if isinstance(mask_gdf, Polygon): 
        
               mask_gdf = gpd.GeoDataFrame(geometry=[mask_gdf], 
        
                                           crs=hd_gdf.crs) 
        
           # Checking that the hd_gdf is of type GeoDataFrame 
        
           if not isinstance(hd_gdf, gpd.GeoDataFrame): 
        
               raise TypeError('The heat demand gdf must be provided as GeoDataFrame') 
        
           # Checking that the mask_gdf is of type GeoDataFrame 
        
           if not isinstance(mask_gdf, gpd.GeoDataFrame): 
        
               raise TypeError('The mask gdf must be provided as GeoDataFrame') 
        
           # Checking that the Heat Demand Data Column is provided as string 
        
           if not isinstance(hd_data_column, str): 
        
               raise TypeError('The heat demand data column must be provided as string') 
        
           # Checking that the HD Data Column is in the HD GeoDataFrame 
        
           if not hd_data_column in hd_gdf: 
        
               raise ValueError('%s is not a column in the GeoDataFrame' % hd_data_column) 
        
           # Reprojecting Data if necessary 
        
           if mask_gdf.crs != hd_gdf.crs: 
        
               hd_gdf = hd_gdf.to_crs(mask_gdf.crs) 
        
           # Exploding MultiPolygons 
        
           if any(shapely.get_type_id(hd_gdf.geometry) == 6): 
        
               hd_gdf = hd_gdf.explode(index_parts=True).reset_index(drop=True) 
        
           # Assigning area to Polygons 
        
           if all(shapely.get_type_id(hd_gdf.geometry) == 3): 
        
               # Assigning area of original geometries to GeoDataFrame 
        
               hd_gdf['area'] = hd_gdf.area 
        
           # Assigning lengths to LineStrings 
        
           elif all(shapely.get_type_id(hd_gdf.geometry) == 1): 
        
               # Assigning length of original geometries to GeoDataFrame 
        
               hd_gdf['length'] = hd_gdf.length

Example notebooks

To import the library from the cloned repository, using from pyheatdemand import processing was sufficient. Using sys was not necessary (also did not work for me; I'm not using Windows though). In the example notebooks, the sys path uses both pyhd and pyheatdemand; all instances of pyhd should be fixed.

The data paths used in the notebooks are inconsistent, e.g. ../../../data/, ../data/, etc. I noticed that many of the datasets used are available in the test folder. Could you make use of these in the example notebooks? This will allow potential users to execute the notebooks themselves and reproduce the results.

In some notebooks (for example, 02_Processing_Data_Type_I_Raster.ipynb), the data was reprojected to epsg:3034. However, subsequent code uses to_crs('EPSG:3034'), which is redundant and can be removed.

The semicolon in show(); should be removed.

In the notebook 14_Refining_Polygon_Mask.ipynb, under 'Refining the mask', there's a cell with a TypeError output that should be deleted.

Some other suggestions for improvement are to include axis labels / titles / legends in as many plots as possible (where appropriate) as there are a lot of images in the notebooks and this will help with some context. For example, the histograms could use titles or axis labels, and the heat demand colour bars could use a legend title.

The text was updated successfully, but these errors were encountered:

AlexanderJuestel · 2024-02-20T11:07:07Z

Code

All requested Code changes were implemented in 50110d9

I have also slightly adapted the tests.

AlexanderJuestel · 2024-02-20T14:19:02Z

Example notebooks

To import the library from the cloned repository, using from pyheatdemand import processing was sufficient. Using sys was not necessary (also did not work for me; I'm not using Windows though). In the example notebooks, the sys path uses both pyhd and pyheatdemand; all instances of pyhd should be fixed.

Imports were fixed in 543ae78

The data paths used in the notebooks are inconsistent, e.g. ../../../data/, ../data/, etc. I noticed that many of the datasets used are available in the test folder. Could you make use of these in the example notebooks? This will allow potential users to execute the notebooks themselves and reproduce the results.

All paths were edited and data was added to the test folder in 1954775 and 4b29bca

In some notebooks (for example, 02_Processing_Data_Type_I_Raster.ipynb), the data was reprojected to epsg:3034. However, subsequent code uses to_crs('EPSG:3034'), which is redundant and can be removed.

The respective code was removed in 1954775

The semicolon in show(); should be removed.

The semicolons were removed in 253b74d

In the notebook 14_Refining_Polygon_Mask.ipynb, under 'Refining the mask', there's a cell with a TypeError output that should be deleted.

The notebook was adapted in ee3c30c

Some other suggestions for improvement are to include axis labels / titles / legends in as many plots as possible (where appropriate) as there are a lot of images in the notebooks and this will help with some context. For example, the histograms could use titles or axis labels, and the heat demand colour bars could use a legend title.

All plots were edited in 1954775

AlexanderJuestel · 2024-02-21T13:32:37Z

@nmstreethran I think I am done with your comments here. Feel free to have a look and thanks for bringing up some important points.

nmstreethran · 2024-02-22T00:43:56Z

@AlexanderJuestel I've had a quick look and everything looks good. Many thanks for addressing my comments!

nmstreethran mentioned this issue Feb 20, 2024

[REVIEW]: PyHD - PyHeatDemand - Processing Tool for Heat Demand Data openjournals/joss-reviews#6275

Closed

nmstreethran closed this as completed Feb 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[JOSS Review] Code and examples #30

[JOSS Review] Code and examples #30

nmstreethran commented Feb 20, 2024

AlexanderJuestel commented Feb 20, 2024 •

edited

Loading

Code

AlexanderJuestel commented Feb 20, 2024 •

edited

Loading

Example notebooks

AlexanderJuestel commented Feb 21, 2024

nmstreethran commented Feb 22, 2024

[JOSS Review] Code and examples #30

[JOSS Review] Code and examples #30

Comments

nmstreethran commented Feb 20, 2024

Code

Example notebooks

AlexanderJuestel commented Feb 20, 2024 • edited Loading

Code

AlexanderJuestel commented Feb 20, 2024 • edited Loading

Example notebooks

AlexanderJuestel commented Feb 21, 2024

nmstreethran commented Feb 22, 2024

AlexanderJuestel commented Feb 20, 2024 •

edited

Loading

AlexanderJuestel commented Feb 20, 2024 •

edited

Loading