New approach to TRAFO #246

Open
mb706 opened this issue Aug 16, 2019 · 1 comment

Comments

@mb706
Contributor

mb706 commented Aug 16, 2019

During an insightful conversation with Franz, Raphael and Jakob we arrived at the problem that our $trafo is missing the information about its image, i.e. the parameter set that it maps to. This information would be useful e.g. for the "autotuner" (i.e. a tuning wrapper for a Learner), because the autotuner would like to know which parameters the user should not set (because the tuner is setting them). My idea for a solution is mostly an extension of my suggestion in #215. To avoid confusion with the old $trafo slot, I am going to use different slot names for the new things I introduce, although one of these could well be named $trafo.

I think this is a cool design, for whatever that may be worth to you ;-)

The plan:

  • Remove the current ps$trafo slot.
  • ParamSet gets a method ps$transform(x, context = list(), terminal = TRUE) that takes a named list x that is a valid parameter configuration according to the ParamSet, and returns a named list with transformed parameter values. (Ignore context and terminal for now).
  • ParamSet gets a method ps$image(terminal = TRUE) that returns a ParamSet. This is the ParamSet that all values of ps$transform() will conform to. (In fact, ParamSet checks that the return value of transform() conforms to this image and throws an error if it doesn't).
  • The ps$get_values() method of ParamSet is extended with the parameters transformed = TRUE and context = list(). ps$get_values(transformed = FALSE) behaves just as ps$get_values() does currently. If ps$get_values(transformed = TRUE, context = ctx) is called, it returns the same as ps$get_values(transformed = FALSE) %>% ps$transform(context = ctx). We could argue about renaming the function to get_transformed_values() or something similar, or about having both functions and removing the transformed parameter.
  • ParamSet gets a method ps$add_trafo(trafo, new_ps). trafo is a function(x, param_set, context), new_ps is a ParamSet. What it does is that it "transmutes" ps into the ParamSet given in the new_ps argument. Given that ps does not have a "trafo" yet, the trafo function then takes inputs according to new_ps and gives outputs according to ps, i.e. the old ps becomes its image. Some examples:
    # given:
    # * ps (ParamSet) that DOES NOT HAVE A TRAFO
    # * new_ps (ParamSet)
    # * trafo (function(x, param_set, context))
    # * x (named list)
    ps_old = ps$clone(deep = TRUE)  # keep the old ps for comparison
    ps$add_trafo(trafo, new_ps)
    
    # from the outside, ps now looks like new_ps
    all.equal(ps$params, new_ps$params)  # TRUE
    
    # the `ps$transform()` function calls the `trafo` function
    all.equal(trafo(x = x, param_set = ps, context = list()),
      ps$transform(x = x, context = list()))  # TRUE
    
    # the `ps$image()` is just the "old" ParamSet
    all.equal(ps$image()$params, ps_old$params)  # TRUE
  • What if a ps already has a "trafo" and another one is added? It just stacks! In that case, the trafo that was added later is called first, then the earlier trafo is called, etc. Think of the different ParamSets as a linked list, connected by "trafo"-functions, the image of each is the preimage of the next. This is where the "terminal" comes in: We can choose to apply all transformations of a ParamSet in a row, or just one transformation to go one "step" ahead. Similarly, we can get the "terminal" image, i.e. of the last image, or just the image of one transformation step. In code:
    # given:
    # * ps_one, ps_two, ps_three (ParamSet that DO NOT HAVE A TRAFO)
    # * trafo_one_two, trafo_two_three (function(x, param_set, context))
    # * x, y, z (named lists)
    ps = ps_three$clone(deep = TRUE)
    ps$add_trafo(trafo_two_three, ps_two$clone(deep = TRUE))
    ps$add_trafo(trafo_one_two, ps_one$clone(deep = TRUE))
    
    # from the outside, ps now looks like ps_one
    all.equal(ps$params, ps_one$params)  # TRUE
    
    # images: ps_three is the "terminal" one, but ps_two is the "next" one
    all.equal(ps$image()$params, ps_three$params)  # TRUE
    all.equal(ps$image(terminal = FALSE)$params, ps_two$params)  # TRUE
    
    # can go along the linked list to reach terminal
    all.equal(ps$image(terminal = FALSE)$image(terminal = FALSE)$params,
      ps_three$params)  # TRUE
    
    # ps_three does not have a "trafo", btw, so its image is just itself
    all.equal(ps_three$image()$params, ps_three$params)  # TRUE
    
    # trafos: ps$transform() calls trafo_one_two, then trafo_two_three
    # but only if "terminal" is TRUE
    all.equal(ps$transform(x = x, context = list()),
      trafo_one_two(x = x, param_set = ps, context = list()) %>%
        trafo_two_three(param_set = ps$
          image(terminal = FALSE), context = list()))  # TRUE
    all.equal(ps$transform(x = x, context = list(), terminal = FALSE),
      trafo_one_two(x = x, param_set = ps, context = list()))  # TRUE
    
    # we could also go along the linked list here:
    all.equal(ps$transform(x = x, context = list()),
      ps$transform(x = x, context = list(), terminal = FALSE) %>%
        ps$image(terminal = FALSE)$
          transform(context = list(), terminal = FALSE))  # TRUE; the piped value is `x` here
    
    # ps_three does not have a "trafo", so its `$transform()` is the identity
    all.equal(ps_three$transform(x = x, context = list()), x)  # TRUE
  • What about the context? It can optionally be given to the ps$transform() function as an argument, and it will be passed on to the trafo function given to ps$add_trafo(). It can contain information about how the transformation is to be performed. It could, for example, contain information about a task (number of features, number of samples), and the trafo could then make use of this information to transform a parameter value. This will work together with a convention that each Learner will always call ps$get_values(transformed = TRUE, context = list(task = task)). Now what happens is the following:
    • The learner is created with a vanilla ParamSet, so ps$get_values(transformed = TRUE, [...]) when called in the Learner's $train() function just gives the parameter values as given by the user.
    • If the user wants to add a transformation to the ParamSet, they call learner$param_set$add_trafo(....). This changes how the ParamSet looks to the user from the outside. For example, maybe the new ParamSet contains a mtry_pexp parameter, while the Learner's original ParamSet only had an mtry parameter.
    • When the Learner now calls ps$get_values(transformed = TRUE, [...]), the result will be conforming to the ParamSet that the Learner was created with (because get_values in this case gives a value conforming to the $image).
    • Because context = list(task = task) is given to the $get_values(), and hence to the trafo() function, the transformation can depend on properties of the task. It could, for example, do
      x$mtry = context$task$nfeat ^ x$mtry_pexp
    • There may be other contexts, for example inside a prediction-aggregating PipeOp. These PipeOps can call get_values with a different context argument. How they call get_values should be documented, so the user can choose to use $add_trafo() in a way that makes use of all information available. I am not sure yet whether it is possible to build this behaviour into the (Learner, PipeOp, ...) class in some way to make it consistent, e.g. for all Learners.
    • context is basically what I called env in #215 (Parameter transformations inside ParamSet) / #225 (Ps Transformations inside ParamSet).
  • It should be noted that this is a transformation that can be both performed at the learner-side or at the tuner-side. I.e. if I have a Learner with parameters that I want to tune over, but with transformed values (say tune_ps, and trafo tune_trafo), I can do either of the following:
    1. Transformation happens in Learner
    learner$param_set$add_trafo(tune_trafo, tune_ps)
    tune_learner(lrn = learner, ps = tune_ps, [...])
    2. Transformation happens in the tuner
    total_tune_ps = learner$param_set$clone(deep = TRUE)
    total_tune_ps$add_trafo(tune_trafo, tune_ps)
    tune_learner(lrn = learner, ps = total_tune_ps, [...])
    Either of these could make sense in their own right. (i) is relevant if transformation should be task-dependent, (ii) is relevant if the tuning result parameters should be in a form that is naturally understandable to someone familiar with the learner.
  • I am thinking about whether there should be a ps$add_trafo(trafo, preimage_ps, image_ps) function, so that we can add a transformation just on a subset of the ParamSet. E.g. if the ParamSet has the parameters mtry, n.tree and we just want to add a trafo for mtry, we could do
    # the trafo must return the modified list, not just the assigned value
    ps$add_trafo(function(x, ...) { x$mtry = round(exp(x$mtry)); x },
      preimage_ps = ParamSet$new(list(ParamDbl$new("mtry", 0, 10))),
      image_ps = ParamSet$new(list(ParamInt$new("mtry", 0, Inf))))
    And the trafo function would only be called with the "mtry" part of the input parameter value. This would make subsetting easy. But that is a story for a different time :-)
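
To make the linked-list picture concrete, here is a minimal base-R sketch of the stacking, `terminal` and `context` semantics described above. It is a toy model only: `new_ps()`, `add_trafo()`, `ps_image()` and `ps_transform()` are hypothetical free functions on plain lists, not the paradox API, and a "ParamSet" is reduced to a vector of parameter ids, so the conformity check only compares names.

```r
# Toy stand-in for the proposal; NOT the paradox API.
# A "ps" is a plain list: the ids it exposes, an optional trafo,
# and the next ps in the chain (its one-step image).
new_ps = function(ids) list(ids = ids, trafo = NULL, nxt = NULL)

# The result looks like `preimage` from the outside; the old `ps` becomes its image.
add_trafo = function(ps, trafo, preimage) {
  preimage$trafo = trafo
  preimage$nxt = ps
  preimage
}

# One-step image (terminal = FALSE) or the end of the chain (terminal = TRUE).
ps_image = function(ps, terminal = TRUE) {
  if (is.null(ps$nxt)) return(ps)
  if (!terminal) return(ps$nxt)
  ps_image(ps$nxt, terminal = TRUE)
}

# Apply one trafo, or all trafos in a row; `context` is passed to every trafo.
ps_transform = function(ps, x, context = list(), terminal = TRUE) {
  stopifnot(setequal(names(x), ps$ids))
  if (is.null(ps$trafo)) return(x)  # no trafo: identity
  y = ps$trafo(x = x, param_set = ps, context = context)
  stopifnot(setequal(names(y), ps$nxt$ids))  # result must conform to the image
  if (!terminal) return(y)
  ps_transform(ps$nxt, y, context = context, terminal = TRUE)
}

# Example chain: mtry_pexp_pct -> mtry_pexp -> mtry (parameter names are made up)
trafo_two_three = function(x, param_set, context) {
  list(mtry = round(context$task$nfeat ^ x$mtry_pexp))
}
trafo_one_two = function(x, param_set, context) {
  list(mtry_pexp = x$mtry_pexp_pct / 100)
}

ps = new_ps("mtry")                                        # what the Learner understands
ps = add_trafo(ps, trafo_two_three, new_ps("mtry_pexp"))
ps = add_trafo(ps, trafo_one_two, new_ps("mtry_pexp_pct"))  # added last, called first

ctx = list(task = list(nfeat = 16))
x = list(mtry_pexp_pct = 50)
one_step = ps_transform(ps, x, context = ctx, terminal = FALSE)  # list(mtry_pexp = 0.5)
all_steps = ps_transform(ps, x, context = ctx)                   # list(mtry = 4)
```

A real implementation would check the values against the image ParamSet (types, bounds) rather than only comparing names, as the plan above says for `ps$image()`.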
@jakob-r
Member

jakob-r commented Aug 19, 2019

  • 👍 for get_transformed_values() because it is clearer; a transformed argument might often lead to the trafo being ignored. Maybe even get_domain_value() and get_value(), because then you intuitively get the right one to evaluate on.
  • What if a ParamSet has 100 parameters and you just want to transform one? Then we will have to specify two nearly identical ParamSets? We will soon want to have a helper for that. (Just a thought) -- okay, that was your last bullet point. Why not just:
ps$add_trafo(function(x, ...) { x$mtry = round(exp(x$mtry)); x },
  preimage_ps = ParamSet$new(list(ParamDbl$new("mtry", 0, 10))),
  image_ps_id = "mtry")

As mtry is already in ps?
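
One possible reading of this suggestion, as a base-R sketch: the trafo only ever sees the parameters named by the ids, and everything else is passed through untouched. `apply_partial_trafo()` is a hypothetical helper written for illustration, not an existing paradox function, and it works on plain named lists rather than ParamSets.

```r
# Hypothetical helper, not part of paradox: apply `trafo` only to the
# parameters named in `ids` and pass all other values through unchanged.
apply_partial_trafo = function(x, trafo, ids, ...) {
  stopifnot(all(ids %in% names(x)))
  changed = trafo(x[ids], ...)       # trafo sees only the selected part
  c(x[setdiff(names(x), ids)], changed)
}

x = list(mtry = 1.5, n.tree = 500)
res = apply_partial_trafo(
  x,
  function(x, ...) { x$mtry = round(exp(x$mtry)); x },
  ids = "mtry"
)
# res$n.tree is still 500; res$mtry is round(exp(1.5))
```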
