
159 interpolation bug #160

Merged: barneydobson merged 9 commits into main from 159-interpolation-bug on May 23, 2024

Conversation

@barneydobson (Collaborator) commented May 13, 2024

Description

This PR fixes a bug in interpolation identified in #76 and includes some general performance improvements.

  • pbias should compare means in case the two variables are not the same length (e.g., pipe diameter sizes).
  • Aggregation now occurs in align_by_shape rather than median_coef_by_shape; there is no point interpolating a load of values that you are about to aggregate, so aggregate first.
  • Interpolation is now applied per sub_id rather than to the whole results df (see the sketch below) <- this was the bug (and a dumb one...).
  • Added and tested outlet_pbias_diameter in metrics.

Fixes #159
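
As an illustration of the per-sub_id interpolation change (a sketch only; the results, sub_id, date and value names are placeholders rather than the package's actual columns):

```python
import pandas as pd

def interpolate_by_sub_id(results: pd.DataFrame, value_col: str = "value") -> pd.DataFrame:
    """Interpolate missing values within each sub_id group separately.

    Interpolating the whole results df in one go lets values bleed across
    subcatchments; grouping first keeps each series independent.
    """
    results = results.sort_values(["sub_id", "date"]).copy()
    results[value_col] = results.groupby("sub_id")[value_col].transform(
        lambda s: s.interpolate(method="linear")
    )
    return results
```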

Dobson added 2 commits May 13, 2024 14:58
- Use mean for pbias in case different size
- Add outlet_pbias_diameter
- Update demo_config to use all available metrics
fix interpolation
@barneydobson linked an issue on May 13, 2024 that may be closed by this pull request
@barneydobson requested review from cheginit and dalonsoa and removed the request for cheginit on May 13, 2024 14:14
@cheginit (Collaborator)

I am not sure I understand the logic behind computing p-bias for two variables with different lengths.

@dalonsoa (Collaborator) left a comment

Looks good to me.

As usual, you're changing or adding code, but there are no tests associated with those changes. In the end, you will end up with a lot of untested code, which is no good, and fixing that will take ages. It is much better to create tests as you code.

Also, you often put together in the same PR changes that are totally unrelated. I understand why that happens, but again that is not good practice. PRs should address conceptually related aspects of the code.

@barneydobson (Collaborator, Author)

> I am not sure I understand the logic behind computing p-bias for two variables with different lengths.

@cheginit For a distribution, e.g., of pipe diameters, so that I can see whether the diameters produced are over- or under-estimates in general.

@barneydobson (Collaborator, Author)

> Looks good to me.
>
> As usual, you're changing or adding code, but there are no tests associated with those changes. In the end, you will end up with a lot of untested code, which is no good, and fixing that will take ages. It is much better to create tests as you code.
>
> Also, you often put together in the same PR changes that are totally unrelated. I understand why that happens, but again that is not good practice. PRs should address conceptually related aspects of the code.

@dalonsoa thanks for pointing this out. I think the functions are included in coverage since they are called by the metrics, but I see I hadn't tested them separately (which is probably how I missed this bug); I will add tests for them.

@dalonsoa (Collaborator)

If you change sum to mean in a piece of code and the tests still pass without you also updating them, then they are not very good tests: they are not correctly capturing and assessing that the functionality of the function they test is correct. The same applies to the other bits that you changed. Coverage is a very misleading quantity: it is very easy to have tests that give you 100% coverage but totally fail to assess the validity of the code. It can serve as a guide, but you need to review each test on its own.

@barneydobson (Collaborator, Author)

> If you change sum to mean in a piece of code and the tests still pass without you also updating them, then they are not very good tests: they are not correctly capturing and assessing that the functionality of the function they test is correct. The same applies to the other bits that you changed. Coverage is a very misleading quantity: it is very easy to have tests that give you 100% coverage but totally fail to assess the validity of the code. It can serve as a guide, but you need to review each test on its own.

But because one is divided by the other, sum and mean give the same output; the only difference is that the mean also works when the series are different lengths (I have just added a test for that).
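
For the record, the check is roughly this (a sketch only; the mean_based_pbias helper and the numbers are illustrative, not the package's actual function or test):

```python
import numpy as np

def mean_based_pbias(sim, obs):
    """Hypothetical helper: (mean(sim) - mean(obs)) / mean(obs)."""
    return (np.mean(sim) - np.mean(obs)) / np.mean(obs)

def test_matches_sum_form_for_equal_lengths():
    sim = np.array([1.5, 2.5, 3.5])
    obs = np.array([1.0, 2.0, 3.0])
    # With equal lengths the 1/n factors cancel, so the sum-based and
    # mean-based forms give the same ratio.
    sum_form = (sim.sum() - obs.sum()) / obs.sum()
    assert np.isclose(mean_based_pbias(sim, obs), sum_form)

def test_works_for_different_lengths():
    sim = np.full(4, 2.0)  # mean 2.0
    obs = np.full(7, 4.0)  # mean 4.0
    assert np.isclose(mean_based_pbias(sim, obs), -0.5)
```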

@cheginit (Collaborator)

Ok, I think I figured out the source of confusion. As far as I know, the definition of p-bias is $\dfrac{\sum(y_\mathrm{sim} - y_\mathrm{obs})}{\sum y_\mathrm{obs}} \times 100$ (not $\sum y_\mathrm{sim} - \sum y_\mathrm{obs}$) which is why the lengths of the variables must be the same.

@barneydobson (Collaborator, Author) commented May 14, 2024

Yep, you're correct there. My adjustment gives the same value (minus the ×100) for two timeseries, but also gives a similarly helpful number for two distributions.

I guess it would be sensible to name it something different to avoid confusion: relative_difference or something?

Edit: it looks like "normalized mean bias error" or "relative error" can mean this.
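
Written out, the quantity I am describing (minus the ×100 factor of the p-bias above) is

$$\frac{\bar{y}_\mathrm{sim} - \bar{y}_\mathrm{obs}}{\bar{y}_\mathrm{obs}}, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i,$$

which reduces to the standard p-bias (divided by 100) whenever the two series have the same length.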

@cheginit (Collaborator)

By distribution, do you mean spatial distribution? For example, for comparing the spatial distribution of pipe sizes? If that is the case, you need to use metrics like spatial autocorrelation (for which you can use the pysal package with Queen or Rook contiguity).
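
Roughly along these lines (a sketch only, not a prescription; the GeoDataFrame, file name and "diameter" column are hypothetical placeholders):

```python
import geopandas as gpd
from libpysal.weights import Queen
from esda.moran import Moran

# Hypothetical polygon layer of subcatchments with a "diameter" attribute.
subs = gpd.read_file("subcatchments.geojson")

w = Queen.from_dataframe(subs)   # Queen-contiguity spatial weights
w.transform = "r"                # row-standardise the weights
mi = Moran(subs["diameter"], w)  # Moran's I for the diameter attribute

print(mi.I, mi.p_sim)            # statistic and permutation p-value
```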

@barneydobson (Collaborator, Author)

Sorry, no, I mean a probability distribution or similar; probably an ambiguous word to use here.

E.g., if I have 5 x 300 mm pipes and 3 x 500 mm pipes, then the relative error is (300 - 500)/500. Its magnitude should, I think, be directly proportional to the KS statistic, but it retains directionality (which is helpful for understanding, e.g., whether you are under- or over-estimating pipe sizes).
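
Spelling out that arithmetic (reading the 300 mm pipes as simulated and the 500 mm pipes as observed):

$$\frac{\bar{y}_\mathrm{sim} - \bar{y}_\mathrm{obs}}{\bar{y}_\mathrm{obs}} = \frac{300 - 500}{500} = -0.4,$$

with the negative sign indicating that pipe sizes are being underestimated.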

@cheginit (Collaborator)

I see. I think this particular example doesn't really provide a measure of over or underestimation. In pipe network design, comparing discharge capacity and the spatial distribution of pipe geometric properties offers a more granular measure of under or overestimation. A simple measure for pipe sizes can be the number of pipes for each pipe diameter, i.e., categorical value counts.
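
A minimal sketch of that value-counts comparison (the diameters here are made-up illustrative numbers):

```python
import pandas as pd

# Hypothetical pipe diameters in mm for the synthesised and real networks.
sim_diam = pd.Series([300, 300, 300, 300, 300, 500, 500])
obs_diam = pd.Series([300, 300, 500, 500, 500, 600])

# Number of pipes per diameter category, aligned so that diameters missing
# from one network show up as zero counts.
counts = (
    pd.concat(
        {"sim": sim_diam.value_counts(), "obs": obs_diam.value_counts()},
        axis=1,
    )
    .fillna(0)
    .astype(int)
    .sort_index()
)
print(counts)
```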

Dobson added 3 commits May 16, 2024 21:22
remove nearest... can't reproduce the issue it was trying to fix
@barneydobson merged commit a9f20b7 into main on May 23, 2024
4 checks passed
@barneydobson deleted the 159-interpolation-bug branch on May 23, 2024 at 10:43

Successfully merging this pull request may close these issues: grid and subcatchment interpolation bug (#159).