-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement KDE bootstrapped error estimates and more #689
Conversation
I think it looks mostly good! Back in the day we made for example Param sets hasable....so one can quite easily detect if any changes were made. So maybe there we could think of a more global mechanism of stashing....maybe? |
@philippeller Yes, the stashing is a bit of a use-at-your-own-risk thing. All changes in previous stages after the first call to |
Or, better, we run a check on the pipeline after instantiation to look for configuration errors and produce an error if the stage before the KDE stage has a parameter that is free while stashing is enabled. Errors are better than warnings, we still have tons of warnings in PISA that just get ignored... (actually an issue on its own, too many warnings) |
I think cleanest would be to implement stashing in general at the pipeline level. So you would stay: stash at stage X, and then the pipeline will not run anything up to stage X, except when a parameter has changed. What do you think? It probably should then keep an internal copy of the |
That sounds like a good idea! Though, I'm wondering how exactly we'd go about it to make it safe in general. The weights are often manipulated in place in a pipeline, so as you said it would have to store an internal copy of the weights and re-apply them. But what if there is more than just "weights" that need to be applied? I guess we can say that we store everything that's in |
we could make the user specify what keys to be cached! So this way would be explicit. E.g. in the cfg specify: At stage X cache keys x, and y |
We could do that in the |
Hey, @philippeller! Sorry I was too busy with other stuff, now coming back to this. Could we please find a compromise here? I understand that it would be nice to implement caching globally, but right now it's only used for the KDE stage and for a very specific reason. I see the point that it could lead to unexpected behavior if used inadvertently, and I'm prepared to make changes that make it harder to mis-use such as calling the caching flag Could we please just pass the salt this one time? |
sure, i do not want to be an unreasonable pain. Can you create an issue about our discussion, and that we should fix it soon? Then we will merge the PR as is for now |
Ok, just did that. Will merge the PR now |
Thank you! |
This PR adds the option to estimate errors on KDE'd histograms using the
bootstrap_kde
class from the KDE module that we already used. It's of course very slow and not recommended to be used in a fit, but it is very useful for fitting hyper surfaces with statistically sound errors. Other changes are:aeff.weight
stage:aeff.weight
stageutils.kde
stage:utils.hist
stage:black
to make it nicer.