Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate code, write tests, write docstrings #19

Closed
barneydobson opened this issue Jan 18, 2024 · 9 comments
Closed

Migrate code, write tests, write docstrings #19

barneydobson opened this issue Jan 18, 2024 · 9 comments
Assignees
Labels
documentation Improvements or additions to documentation enhancements feature Adding a new functionality, small or large

Comments

@barneydobson
Copy link
Collaborator

Need to do..

@barneydobson barneydobson added documentation Improvements or additions to documentation feature Adding a new functionality, small or large enhancements priority! labels Jan 18, 2024
@barneydobson
Copy link
Collaborator Author

barneydobson commented Jan 19, 2024

Currently have done terrible job at creating sensible modules, propose to migrate code into these modules:

  • downloaders.py - as is
  • graph_operations.py - every registered graphfcn that takes a graph and returns a graph
  • geospatial_operations.py - all the complicated polygon and raster stuff goes here
  • hydraulic_design.py - currently just the guts pipe_by_pipe go in here - tbh could just go in graph_operations.py, but then it becomes a busy module
  • swmm_text_tools.py - all the SWMM file writing tools go here - not sure how good Jinja2 is but maybe this isn't needed
  • swmmanywhere.py - main script to read the config file and cycle the graph operations

@barneydobson
Copy link
Collaborator Author

While @dalonsoa and @cheginit try and get a big picture, I will work on geospatial_operations.py (because these are functions that are not really related to the overall software architecture) and swmm_text_tools.py (because it's such a mess that there's not much to be done and we'll probably replace it with Jinja2 ( #16 )

@barneydobson barneydobson self-assigned this Jan 19, 2024
@barneydobson barneydobson mentioned this issue Jan 22, 2024
4 tasks
@cheginit
Copy link
Collaborator

For modules and structuring the codes, I recommend making them a bit more categorical. For example, something like the following:

  • prepare_data: This module contains functionalities for retrieving raw data from various sources
  • pre_processing: This module processes the raw input data into formats that the BUSN generator algorithm and SWMM case generator require
  • generate_busn: This module generates BUSNs as nx.DiGraph objects
  • post_processing: This module generates SWMM cases based on the generated BUSNs and other input data
  • data_analysis: Sensitivity analysis and other data analysis code goes into this module.
  • utils: All other misc functionalities go here. If there are many of them we can have more specific utility modules, e.g., utils_geo, utils_graph, and so on.

A thematic structure like this makes it easier to navigate the codebase.

@cheginit
Copy link
Collaborator

For managing config files, there are two main options: in-house reader or existing config management libraries. If the config file is simple, we can develop a reader module with suitable validator functionalities. If there are many input files, I recommend using existing Python libraries. A popular choice is MogaConf. A more general solution is pydantic that provides powerful schema generation and validators. So, it boils down to the complexity of the config files.

@dalonsoa
Copy link
Collaborator

I very much recommend using existing Python tooling for configuration based on standard formats (yaml, toml, etc). Otherwise, things can become very complicated very easily. Another option, more lightweight than the ones suggested by @cheginit (which are indeed pretty good), would be schema.

@barneydobson
Copy link
Collaborator Author

barneydobson commented Jan 25, 2024

For modules and structuring the codes, I recommend making them a bit more categorical. For example, something like the following:

  • prepare_data: This module contains functionalities for retrieving raw data from various sources
  • pre_processing: This module processes the raw input data into formats that the BUSN generator algorithm and SWMM case generator require
  • generate_busn: This module generates BUSNs as nx.DiGraph objects
  • post_processing: This module generates SWMM cases based on the generated BUSNs and other input data
  • data_analysis: Sensitivity analysis and other data analysis code goes into this module.
  • utils: All other misc functionalities go here. If there are many of them we can have more specific utility modules, e.g., utils_geo, utils_graph, and so on.

A thematic structure like this makes it easier to navigate the codebase.

OK I'm easy on naming conventions.
As I see it the mapping to what I have now in both this and the old repository is as follows:

  • downloaders -> prepare_data
  • pre_processing -> equivalent mainly to the operations in download function (besides the downloads themselves) in the old repo, and I guess a few more where I do any other data tidying on the fly.
  • swmm_text_tools -> post_processing
  • experimenter a function in the old repo -> data_analysis - I don't want to touch this yet until we make some other decisions e.g., about config file Configuration file #10 since that will determine some big picture choices about how we would 'do' sensitivity analysis
  • swmmanywhere (old repo) -> generate_busn - though I don't like the name generate_busn since I have every intention of expanding this to cover foul networks down the line. Perhaps just generate_network ?
  • geospatial_analysis -> utils_geo
  • graph_functions -> utils_graph

If we're happy with this I will first make a PR to rename modules.

Implicit within this for me is the idea that there will be a set of registered graph functions in utils_graph whose order is defined in the config file, which is read and iterated over in the generate_network module. Are you both happy with this?

Then, perhaps do we need to finalise discussion of config file (#10 ) before commencing on utils_graph and generate_network (I think it makes sense that these two modules are under the same PR since they are quite tightly linked)?

I would probably work on pre_processing after these to fit the requirements of the polished functions in utils_graph. Then finally the data_analysis once everything else is done.

Also @dalonsoa you mentioned for the graph functions (i.e., a function that takes a graph and gives a graph), I can register it, @register_graphfcn for example. Is there any more I need to know about that at this stage - is it similar to what you did in WSIMOD nodes? Possibly this too is linked in with the config file discussion

@dalonsoa
Copy link
Collaborator

The purpose is similar, but the implementation is not.

In that case I used __init_subclass__ in the parent class to automatically register subclasses in a registry that you can use somewhere else (and do some validation and raise warnings/errors if needed). In this case, you use the decorator you register a function - any function you want - in a registry that contain functions that can be applied to the graph sequentially. Again, you can include in this decorator function some validation aspects (eg. the decorated function has some particular signature).

@cheginit
Copy link
Collaborator

Using generate_network is fine by me.

I have a self-imposed limit that when a module becomes larger than 1000 LOC, I tend to break it into more modules. So, if generate_network becomes large, you can break it into two modules, generate_busn, and generate_bssn (below-ground sanitary sewer network), or generate_foul_network 😄

@barneydobson barneydobson mentioned this issue Jan 26, 2024
13 tasks
@barneydobson barneydobson mentioned this issue Feb 5, 2024
4 tasks
@barneydobson
Copy link
Collaborator Author

Closed by #83

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancements feature Adding a new functionality, small or large
Projects
None yet
Development

No branches or pull requests

3 participants