[TASK] Constraints for Samplers #2351

Open
10 tasks
sprapp-inl opened this issue Aug 13, 2024 · 1 comment
Labels
priority_minor task This tag should be used for any new capability, improvement or enhancement

Comments

@sprapp-inl

sprapp-inl commented Aug 13, 2024


Issue Description

Is your feature request related to a problem? Please describe.

At the moment, it appears to be impossible to constrain the input space from a sampler (Monte Carlo, for example) using a constraint function like the ones possible in optimizers.

Describe the solution you'd like

Add optional fields for specifying constraint functions (<Constraint>) or implicit constraint functions (<ImplicitConstraint>) in Samplers, wherever constraints are applicable and the user needs them. These would operate in the same way as the constraints that optimizers apply when sampling their input space.
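To make the request concrete, here is a minimal sketch of one way a sampler could honor a constraint function: plain rejection sampling. It is not RAVEN code and does not use the RAVEN API; the names `sample_with_constraint`, `draw`, and `sum_below_S`, the uniform distributions, and the value of `S` are all assumptions chosen for illustration.

```python
import random

def sample_with_constraint(draw, constraint, n_samples, max_tries=10_000, rng=None):
    """Draw points with `draw(rng)` and keep only those that satisfy `constraint`.

    Hypothetical helper: a stand-in for what a constraint-aware sampler
    might do internally, not an existing RAVEN function.
    """
    rng = rng or random.Random()
    accepted = []
    tries = 0
    while len(accepted) < n_samples:
        if tries >= max_tries:
            raise RuntimeError("constraint rejected too many samples")
        tries += 1
        point = draw(rng)
        if constraint(point):
            accepted.append(point)
    return accepted

S = 1.5  # assumed upper bound on x + y + z

def draw(rng):
    # Assumed distributions: each variable is uniform on [0, 1].
    return {"x": rng.uniform(0, 1), "y": rng.uniform(0, 1), "z": rng.uniform(0, 1)}

def sum_below_S(point):
    # Constraint in the spirit of an optimizer constraint: True means "accept".
    return point["x"] + point["y"] + point["z"] < S

population = sample_with_constraint(draw, sum_below_S, n_samples=20, rng=random.Random(0))
```

Rejection sampling keeps the accepted points distributed according to the original distributions restricted to the feasible region, at the cost of discarded draws when the feasible region is small.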

Describe alternatives you've considered

So far, the only alternatives considered are variable sampling functions or a Custom Sampling strategy, but both require the user to provide much more data that RAVEN could otherwise generate automatically. A partial workaround is included below.


Example

As an example of the problem, a Genetic Algorithm optimizer initializes its first generation by sampling the input space of all variables with a sampler that is defined outside the optimizer. As a result, any constraints defined in the Genetic Algorithm optimizer are ignored for that generation, and part of the initial population may fall outside the constraints, reducing diversity.

An example where 3 independent variables are constrained by their sum (the sum does not need to be sampled in RAVEN so a sampling function would be inapplicable):

  • Suppose you have a set of variables (genes) x, y, and z that have continuous distributions A, B, and C.
  • Suppose you want to apply a constraint such that the sum of x, y, and z is less than a constant value S.
  • The distributions A, B, and C allow the sum to exceed S if the input space is not constrained (i.e. there exist possible values of x, y, and z whose sum is greater than S).
  • A solution to this is to take x, y, and z and add a fourth variable w that is sampled from a distribution that has maximum value S and minimum value (max(x) + max(y)).
  • Then, you can calculate z in the driven code (or RAVEN function) as z = w - (x + y).
  • If this method is applied, the initial sample of the Genetic Algorithm will always be within the input space.
  • The drawback of this approach is that the domain of w must be reduced so that w is always greater than (x + y), which reduces the domain of the sum of x, y, and z from [0, S] to [max(x) + max(y), S].
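The workaround in the list above can be sketched in plain Python as follows. This is an illustration only, not RAVEN input or RAVEN code: the uniform distributions, their bounds, the value of `S`, and the helper name `sample_workaround` are assumptions.

```python
import random

# Assumed setup: x and y are uniform on [0, 1], so max(x) = max(y) = 1.
X_MAX, Y_MAX = 1.0, 1.0
S = 3.0  # assumed upper bound on x + y + z

def sample_workaround(rng):
    """Sample x, y, and the auxiliary variable w, then derive z = w - (x + y)."""
    x = rng.uniform(0.0, X_MAX)
    y = rng.uniform(0.0, Y_MAX)
    # w stands for the sum x + y + z; its lower bound is max(x) + max(y)
    # so that the derived z can never be negative.
    w = rng.uniform(X_MAX + Y_MAX, S)
    z = w - (x + y)
    return {"x": x, "y": y, "z": z, "w": w}

rng = random.Random(0)
initial_population = [sample_workaround(rng) for _ in range(50)]
```

Every point satisfies the sum constraint by construction, but the sum itself is only ever drawn from [max(x) + max(y), S] (here [2, 3]), which is the loss of coverage the last bullet describes.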

This constraint is trivial to apply when sampling from within the optimizer; however, the initial generation does not use the defined constraints, even when they are provided.

I do not know who would be best suited to be assigned to this, so I'm mentioning the following according to the contributing guidelines: @wangcj05 @PaulTalbot-INL @mandd


For Change Control Board: Issue Review

This review should occur before any development is performed as a response to this issue.

  • 1. Is it tagged with a type: defect or task?
  • 2. Is it tagged with a priority: critical, normal or minor?
  • 3. If it will impact requirements or requirements tests, is it tagged with requirements?
  • 4. If it is a defect, can it cause wrong results for users? If so an email needs to be sent to the users.
  • 5. Is a rationale provided? (Such as explaining why the improvement is needed or why current code is wrong.)

For Change Control Board: Issue Closure

This review should occur when the issue is imminently going to be closed.

  • 1. If the issue is a defect, is the defect fixed?
  • 2. If the issue is a defect, is the defect tested for in the regression test system? (If not explain why not.)
  • 3. If the issue can impact users, has an email to the users group been written (the email should specify if the defect impacts stable or master)?
  • 4. If the issue is a defect, does it impact the latest release branch? If yes, is there any issue tagged with release (create if needed)?
  • 5. If the issue is being closed without a pull request, has an explanation of why it is being closed been provided?
sprapp-inl added the priority_minor and task labels on Aug 13, 2024
@PaulTalbot-INL
Collaborator

Hm, this is a good point. While I don't generally think most samplers can make good use of constraints when being used for statistical workflows, certainly if a sampler is used to prepopulate an optimizer, that sampler should follow the same constraints that we impose on the optimization itself. Good point!
