Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numerical integration of the soil module #191

Merged
merged 51 commits into from
Mar 28, 2023

Conversation

jacobcook1995
Copy link
Collaborator

Description

This pull request adds functionality to integrate the soil model (or at least the simple dummy model that currently exists). This involves a function to set up and run the integration process, and a function that combines the existing functions into a form suitable for integration using scipy's solve_ivp function.

One thing that definitely needs thought is where DataArrays get used and where plain numpy arrays are used. I basically just adopted the structure that worked that was the smallest change from what I had previously. This might be an area where it's worth being more systematic as I am guessing this has performance implications (at least if data is being copied/altered).

This pull request adds dotmap as a dependency. I added it because I needed a container for mocking integration output for one of the tests. I considered using a dataclass rather than a DotMap for but it seemed like it would end up being significantly more longwinded. I am happy to try and implement this if adding dotmap as a dependancy is a problem.

Type of change

  • New feature (non-breaking change which adds functionality)
  • Optimization (back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)

Key checklist

  • Make sure you've run the pre-commit checks: $ pre-commit run -a
  • All tests pass: $ poetry run pytest

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

Copy link
Collaborator

@dalonsoa dalonsoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good starting point, and you have definitely got the idea of wrapping the functionality to be used by the solver. I've just made a suggestion to make the code more generalisable down the line.

virtual_rainforest/core/base_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/carbon.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/carbon.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/carbon.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@davidorme davidorme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good - and I think this is the only way to go. Some minor comments. I don't think we need DotMap but it isn't the most expensive thing in the world.

Oh wait. I've just got why it is being used - because you're mocking, you need to pass something like the real output of solve_ivp forward to integrate_soil_model. I have to admit that for a simple integration test, I'd just run solve_ivp rather than mocking it. If we do mock it, then we can import the actual expected return class, which is basically just scipys own implementation of a dict with dot attribute access.

from scipy.integrate._ivp.ivp import OdeResult
# or avoiding the private special subclass
from scipy.optimize import OptimizeResult

and then we just define the expected output as:

OptimizeResult(
                success=True,
                y=np.array(
                    [
                        [5.000e-02, 3.210e-02],
                        [2.000e-02, 1.921e-02],
                        [4.500e00, 4.509e00],
                        [5.000e-01, 5.000e-01],
                        [3.210e-02, 5.000e-02],
                        [1.921e-02, 2.000e-02],
                        [4.509e00, 4.500e00],
                        [5.000e-01, 5.000e-01],
                    ]))

tests/models/soil/test_soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some minor suggestions in places.

virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved

# Update carbon pools (attributes and data object)
# n.b. this also updates the data object automatically
self.carbon.update_soil_carbon_pools(self.data, carbon_pool_updates)
self.replace_soil_pools(updated_carbon_pools)

# Finally increment timing
self.next_update += self.update_interval

def cleanup(self) -> None:
"""Placeholder function for soil model cleanup."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I know this isn't part of this PR, but I've only just noticed it.)

Do all models have a cleanup() method? What's it needed for? You don't normally have to worry about manually cleaning up resources with Python.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure tbh, we decided on that set of core functions for the standard model api a while back. We settled on this as one of the key functions but I'm not sure what its roll was envisioned as being. It was originally intended as being the opposite of spinup (I think we initially called it tear_down) but not sure what that entails.

In a similar vein, the setup function is a bit redundant now that from_config is defined

virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@dalonsoa dalonsoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a few suggestions - formatting will be wrong - to make the code more concise. I'm not an enthusiast of using locals, as you do in line 72, but I cannot think on a better, scalable method at the moment.

virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
@jacobcook1995
Copy link
Collaborator Author

I do agree that using locals() is a bit weird, but I couldn't come up with an alternative (apart from vars() which is either worse or much the same)

@jacobcook1995
Copy link
Collaborator Author

jacobcook1995 commented Mar 22, 2023

@dalonsoa I've removed the use of locals by passing the pool order as a dictionary rather than a list (so that values can be assigned to it).

I'm not 100% whether this is a reasonable thing to do in python, as don't really know how memory allocation works under the hood. Relatedly I'm also unsure whether

delta_pools_ordered = {
    str(name): np.array([])
    for name in self.data.data.keys()
    if str(name).startswith("soil_c_pool_")
}

or

delta_pools_ordered = {
    str(name): np.zeros(no_cells)
    for name in self.data.data.keys()
    if str(name).startswith("soil_c_pool_")
}

will work out being more efficient when memory is reallocated, e.g. when this step is carried out

delta_pools_ordered["soil_c_pool_lmwc"] = -lmwc_to_maom
delta_pools_ordered["soil_c_pool_maom"] = lmwc_to_maom

(if this isn't premature optimisation).

Copy link
Collaborator

@dalonsoa dalonsoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a small suggestion and a few questions, but otherwise it looks good!

@@ -47,8 +54,8 @@ class SoilModel(BaseModel):
model_name = "soil"
"""An internal name used to register the model and schema"""
required_init_vars = (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A probably out of place question, but just worth the check here, as it has come in a conversation with @vgro:

  • Are these variables the ones that need to be present in the data object to be able to initialise this class, i.e. to run the __init__ method?
  • If so, are these variables (these arrays) loaded from files when creating the data object?
  • As the soil_schema.json do not indicate that those files should be present in the input config, how is the user informed that they should be present - keeping aside trying to run the thing and getting an error when trying to initialise these model?
  • How should they be included in the input file

Copy link
Collaborator Author

@jacobcook1995 jacobcook1995 Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my case these don't have to be present in the data object to initialise the class as my class doesn't have any attributes. I've just put all the things that need to be in data for the functions used by the SoilModel to work in here. This does cause initialisation to fail when the data object technically contains everything needed to initialise the class, which might not be ideal behaviour.

I don't have any data for the soil variables as of yet, so they are not at present loaded, but they should be loaded when the data object is created.

On the latter 2 questions, I haven't really given them much thought, and as far as I know there isn't any functionality in place to handle them as of yet.

Comment on lines +179 to +180
self.data["soil_c_pool_lmwc"] = new_pools["soil_c_pool_lmwc"]
self.data["soil_c_pool_maom"] = new_pools["soil_c_pool_maom"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what I understand here, and based on what required_init_vars says, the value of these pools over time is not kept in the data object - or kept at all - right? In other words, you are no interested into tracking how these evolve over time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would definitely be interested in tracking the value of these pools over time. However, I couldn't see an easy way to add a time dimension to the existing data object so decided to just overwrite it for this first attempt at integrating the model

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacobcook1995 Let me know if you want a chat about adding that time dimension! But I agree that it is fine to park that for now.

virtual_rainforest/models/soil/carbon.py Outdated Show resolved Hide resolved
Comment on lines +66 to +67
delta_pools_ordered["soil_c_pool_lmwc"] = -lmwc_to_maom
delta_pools_ordered["soil_c_pool_maom"] = lmwc_to_maom
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assigning these to the dictionary keys seems a bit overkilling with just 2 pools, but I think it will make sense once there are more pools involved. In that sense, using a dictionary to ensure the order is the same than the one used in the inputs seems like the right move.

As you are totally replacing the existing, dummy array assigned to each dictionary entry, using during creation np.array([]) makes sense as you are just using as place holder.

Copy link
Collaborator

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a couple of small queries/suggestions, but otherwise LGTM!

tests/models/soil/test_soil_model.py Outdated Show resolved Hide resolved
virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
@@ -6,7 +6,7 @@

import numpy as np
import pytest
from dotmap import DotMap # type: ignore
from scipy.optimize import OptimizeResult # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you ignoring the type here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell scipy.optimize is not typed even though other scipy modules are typed. I could have missed them but also couldn't track down type stubs


log_check(caplog, expected_log)


def test_order_independance(dummy_carbon_data, soil_model_fixture):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine as is, but it could be part of a parametrized test, with an order argument that determines what order the data is in.

virtual_rainforest/models/soil/soil_model.py Outdated Show resolved Hide resolved
@alexdewar alexdewar self-requested a review March 22, 2023 14:24
Copy link
Collaborator

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant to approve before...

Copy link
Collaborator

@davidorme davidorme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jacobcook1995 jacobcook1995 merged commit 7ce9b32 into develop Mar 28, 2023
@jacobcook1995 jacobcook1995 deleted the feature/integrate_soil_model branch March 28, 2023 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants