Separated out functionalities from initPopulation #36

Closed
wants to merge 1 commit into from

aahaselgrove
Contributor

initPopulation was a heavily overloaded parameter, with a variety of
types corresponding to a range of possible behaviours. This commit:

  • Removes the option to pass a vector representing the search space.
    This is specific and unclear, and the behaviour can easily be
    represented by one of the other options.
  • Restricts initPopulation to be a vector of individuals (where an
    individual is a single member of the population)
  • Adds a new parameter creation to represent a function to create an
    individual.

Default behaviour has not been changed, but is now represented by a
creation function.

@coveralls

coveralls commented Sep 18, 2019

Coverage Status

Coverage decreased (-0.4%) to 77.897% when pulling afb674b on aahaselgrove:master into 6dbab80 on wildart:master.

Owner

@wildart wildart left a comment

I am not convinced that introducing a new creation parameter provides a better design. Now you have to reconcile type differences between initPopulation values and values generated by creation.

src/Evolutionary.jl
mutation::Function = ((r,m)->r),
smutation::Function = (s->s),
termination::Function = (x->false),
creation::Function = (n -> rand(n)),
Owner

Now the creation function could generate a vector whose type differs from that of initPopulation if only the initPopulation parameter is passed.

Contributor Author

My intent here is that either a creation function or an initPopulation vector is passed, and if both are passed, the creation function is not called. Unfortunately, Julia functions are not parameterised by argument or return types, so we cannot require creation (or, for that matter, any other function used) to accept or return compatible types.
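
The either/or intent described in this comment can be sketched as follows. This is only an illustration: the helper name `initial_population`, its argument order, and its keyword names are hypothetical and are not taken from the PR or from the package.

```julia
# Hypothetical helper, not part of Evolutionary.jl.
# Use the supplied individuals if given; otherwise call `creation` once per individual.
function initial_population(N::Int, populationSize::Int;
                            initPopulation::Union{Nothing,AbstractVector} = nothing,
                            creation::Function = (n -> rand(n)))
    if initPopulation !== nothing
        # `creation` is never called on this branch, so its return type is irrelevant.
        return collect(initPopulation)
    end
    return [creation(N) for _ in 1:populationSize]
end

# Only a creation function (the default): five random individuals of length 3.
pop_created = initial_population(3, 5)

# Both passed: this creation function would fail if called, but it is ignored.
pop_given = initial_population(3, 2;
                               initPopulation = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
                               creation = n -> error("creation should not be called"))
```

Because the creation function is skipped whenever individuals are supplied, nothing in this sketch compares the two types with each other.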

@aahaselgrove
Contributor Author

The reason I created this PR is that I was primarily working towards my second PR #37, which addresses my issue #35, and I was having difficulty accommodating the existing initialisation system. Specifically, for populations whose individuals aren't Vector{T} (e.g. object individuals or higher-dimensional individuals), Matrix{T} is not appropriate for initialising a group of individuals. Hence, I changed it to Vector{I}, which could be Vector{Vector{T}}, Vector{Object} or Vector{Array{T, N}} accordingly.

With regard to adding a creation function, I added this because it felt weird to me for initPopulation to be either a vector or a function. Also, having separate parameters means that, in future, additional functionality could easily be implemented for passing partial initial populations: for an initPopulation of length less than N, the creation function could be used to generate the remaining individuals.
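
To make the Matrix{T} versus Vector{I} distinction concrete, here is a small sketch. The `Route` type is made up for illustration and does not appear in the PR.

```julia
# Illustrative only: `Route` is a made-up individual type.
struct Route
    stops::Vector{Int}
end

# A Matrix{T} can hold a population only when each individual is a column of T values.
matrix_population = rand(3, 4)                      # 4 individuals, each of length 3

# A Vector{I} holds one individual per element, whatever I is.
vector_individuals = [matrix_population[:, i] for i in axes(matrix_population, 2)]
array_individuals  = [rand(2, 2) for _ in 1:4]      # Vector{Matrix{Float64}}
object_individuals = [Route([1, 2, 3]), Route([3, 1, 2])]  # Vector{Route}
```

The last two populations have no natural Matrix{T} representation, which is the limitation this comment describes.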

@wildart
Owner

wildart commented Oct 10, 2019

> Specifically, for populations whose individuals aren't Vector{T} (e.g. object individuals or higher-dimensional individuals), Matrix{T} is not appropriate for initialising a group of individuals. Hence, I changed it to Vector{I}, which could be Vector{Vector{T}}, Vector{Object} or Vector{Array{T, N}} accordingly.

The Individual type does not specify any type parameter on Vector, so essentially an individual can be a vector of anything: numbers, arrays, objects, matrices, anything. The Matrix type is used to supply a predefined population.

I like a separate parameter for a creation function, so I think it would be better to use the following initialization procedure for the population:

If no creation function is specified
   if initPopulation is a Matrix
      convert the initial population Matrix to Vector{Vector}, which is the internal population data structure
   else if the element type of initPopulation is a primitive type
      generate a Vector of random weights and multiply it by initPopulation
   else
      error & advise to use a creation function
else // creation function provided
   if initPopulation is set
      warn that this parameter will be disregarded
   initialize the population with elements generated by the creation function

In the above initialization procedure, there is no need to match types between the initial population parameter and the creation function output.
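
The procedure above could be written in Julia roughly as follows. This is a sketch under stated assumptions, not code from the package: the name `init_population` is hypothetical, "primitive type" is read as `<:Real`, and the creation function is assumed to take no arguments.

```julia
# Hypothetical sketch of the procedure above; names and signatures are illustrative.
function init_population(initPopulation, populationSize::Int;
                         creation::Union{Nothing,Function} = nothing)
    if creation === nothing
        if initPopulation isa AbstractMatrix
            # Each column becomes one individual (a Vector of Vectors).
            return [initPopulation[:, i] for i in axes(initPopulation, 2)]
        elseif initPopulation isa AbstractVector{<:Real}
            # Assumption: "primitive" means real numbers. Scale by random weights.
            return [initPopulation .* rand(length(initPopulation)) for _ in 1:populationSize]
        else
            throw(ArgumentError("cannot initialise from $(typeof(initPopulation)); pass a creation function"))
        end
    else
        if initPopulation !== nothing
            @warn "initPopulation is disregarded because a creation function was provided"
        end
        # Assumption: `creation` takes no arguments and returns one individual.
        return [creation() for _ in 1:populationSize]
    end
end

from_matrix   = init_population([1.0 2.0; 3.0 4.0], 2)
from_vector   = init_population([10.0, 20.0], 3)
from_creation = init_population(nothing, 4; creation = () -> rand(2))
```

Each branch builds the population from exactly one source, so the two sources are never mixed.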

@aahaselgrove
Contributor Author

aahaselgrove commented Oct 12, 2019

> In the above initialization procedure, there is no need to match types between the initial population parameter and the creation function output.

I don't quite follow. In this PR, there is no need to match types between initial population and creation function either. The creation function is never called if a value is provided for initial population.

@wildart
Owner

wildart commented Oct 14, 2019

Clearly, we have a misunderstanding. Let's go back to the beginning. What do you think the initial population parameter should be?

@aahaselgrove
Contributor Author

aahaselgrove commented Oct 15, 2019

For my project, I am working off aahaselgrove/Evolutionary.jl@0cd421d, in which the function signature for ga is as follows:

# Genetic Algorithms
# ==================
# objfun: Objective fitness function
# individual: Sample data structure representing an individual
# initPopulation: Initial population values as matrix
# populationSize: Size of the population
# crossoverRate: The fraction of the population at the next generation, not including elite children,
# that is created by the crossover function.
# mutationRate: Probability of chromosome to be mutated
# ɛ: Positive integer specifies how many individuals in the current generation
# are guaranteed to survive to the next generation.
# Floating number specifies fraction of population.
#
function ga(objfun::Function, individual::T;
            initPopulation::Union{Nothing, Vector{T}} = nothing,
            lowerBounds::Union{Nothing, Vector{T}} = nothing,
            upperBounds::Union{Nothing, Vector{T}} = nothing,
            populationSize::Int = 50,
            crossoverRate::Float64 = 0.8,
            mutationRate::Float64 = 0.1,
            ɛ::Real = 0,
            creation::Function = (dims -> rand(eltype(T), dims)),
            selection::Function = ((x, n) -> 1:n),
            crossover::Function = ((x, y) -> (y, x)),
            mutation::Function = (x -> x),
            iterations::Integer = 100*prod(size(individual)),
            tol = 0.0,
            tolitr = 10,
            verbose = false,
            debug = false,
            interim = false) where {T}

Hence, default behaviour is for the initial population to be generated by the default creation function (dims -> rand(eltype(T), dims)), where dims = size(individual).
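
As an illustration of that default, the snippet below applies the same creation function outside of `ga`. The sample individual and the population size are arbitrary choices made for this sketch.

```julia
# Sketch of the default creation behaviour, reproduced outside of `ga`.
individual = zeros(Float64, 2, 3)          # sample individual: a 2×3 matrix
T = typeof(individual)                     # Matrix{Float64}
creation = dims -> rand(eltype(T), dims)   # same default as in the signature
dims = size(individual)                    # (2, 3)

populationSize = 5
population = [creation(dims) for _ in 1:populationSize]
```

Every generated individual has the same shape and element type as the sample individual.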

@wildart
Owner

wildart commented Oct 15, 2019

The creation function (dims -> rand(eltype(T), dims)) will only work if T is a primitive type. Either you need to specialize the type parameter as where {T <: Real} or come up with a different initialization procedure.

Can you start by putting together some kind of general initialization procedure for the population from the initial parameters for GA, and then try to reuse it in other methods?

Can you update only GA at this point? I want to try specifying a different API for the ES function.

@wildart
Owner

wildart commented Oct 15, 2019

I like that in your ga(objfun::Function, individual::T), you pass individual as a positional parameter. That makes it easy to perform multiple dispatch on other forms of individuals. Maybe you can rewrite your current PR in that way.

@wildart
Owner

wildart commented Oct 15, 2019

Here is my take on initialization:

  • Add a population parameter that will hold a working population
    function es(objfun::Function, population::Vector{T};
  • Dispatch on various population initialization parameters

    Evolutionary.jl/src/es.jl

    Lines 109 to 125 in d420050

    # Spawn population from one individual
    function es(objfun::Function, individual::Vector{T}; μ::Integer=1, kwargs...) where {T<:Real}
        N = length(individual)
        population = [individual .* rand(T, N) for i in 1:μ]
        return es(objfun, population; μ=μ, kwargs...)
    end
    # Spawn population from matrix of individuals
    function es(objfun::Function, population::Matrix{T}; kwargs...) where {T<:Real}
        μ = size(population, 2)
        return es(objfun, [population[:,i] for i in axes(population, 2)]; μ=μ, kwargs...)
    end
    # Spawn population using creation function and individual size
    function es(objfun::Function, N::Int; creation=(n)->rand(n), μ::Integer=1, kwargs...)
        return es(objfun, [creation(N) for i in 1:μ]; μ=μ, kwargs...)
    end

In that way, population initialization is moved out of the main algorithm, and the new interface would allow users to provide special initialization functions for custom population creation procedures.
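
The dispatch pattern can be tried in isolation with a stand-in for the core method. In the sketch below, `es_sketch` is a hypothetical name and its core method is a stub that only reports the population it receives. The three wrappers follow the snippet quoted above.

```julia
# Stub "core" method standing in for the real algorithm (an assumption for
# illustration): it only reports the population it was given.
function es_sketch(objfun::Function, population::Vector{<:AbstractVector}; μ::Integer = 1)
    return (population = population, μ = μ)
end

# Spawn a population from one individual.
function es_sketch(objfun::Function, individual::Vector{T}; μ::Integer = 1, kwargs...) where {T<:Real}
    population = [individual .* rand(T, length(individual)) for _ in 1:μ]
    return es_sketch(objfun, population; μ = μ, kwargs...)
end

# Spawn a population from the columns of a matrix.
function es_sketch(objfun::Function, population::Matrix{T}; kwargs...) where {T<:Real}
    return es_sketch(objfun, [population[:, i] for i in axes(population, 2)];
                     μ = size(population, 2), kwargs...)
end

# Spawn a population from a creation function and an individual size.
function es_sketch(objfun::Function, N::Int; creation = n -> rand(n), μ::Integer = 1, kwargs...)
    return es_sketch(objfun, [creation(N) for _ in 1:μ]; μ = μ, kwargs...)
end

sphere(x) = sum(abs2, x)
a = es_sketch(sphere, [1.0, 2.0]; μ = 3)               # from one individual
b = es_sketch(sphere, [1.0 2.0 3.0; 4.0 5.0 6.0])      # from matrix columns
c = es_sketch(sphere, 4; μ = 2)                        # from a creation function
```

All three calls end up in the same core method with a vector of individuals, so the core never needs to inspect how the population was specified.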

@aahaselgrove
Contributor Author

aahaselgrove commented Oct 16, 2019

> Here is my take on initialization:

I like the multiple dispatch approach for population initialization - it feels appropriate to the language. I still have problems with the initialisation options you provide for the reasons I outlined above.

> The creation function (dims -> rand(eltype(T), dims)) will only work if T is a primitive type. Either you need to specialize the type parameter as where {T <: Real} or come up with a different initialization procedure.

I believe the existing method of creation also works only if T is primitive. I don't think it's feasible to be generic enough to cover all kinds of population, and it is reasonable to require the user to perform initialisation for more complex types. However, I could certainly see it being appropriate to throw an error if the user tries to run ga with a more complex individual without defining a creation function or passing an initial population.
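
One way such an error could be raised is with a fallback method. This is a sketch only: `default_creation` and `Gene` are hypothetical names, and restricting the working case to floating-point arrays is an assumption made for this example.

```julia
# Hypothetical guard; `default_creation` is not part of the package.
# Assumption: only floating-point arrays get a default creation function.
default_creation(individual::AbstractArray{<:AbstractFloat}) =
    () -> rand(eltype(individual), size(individual))

# Fallback for every other kind of individual: fail with an explanatory error.
default_creation(individual) =
    throw(ArgumentError("no default creation for $(typeof(individual)); " *
                        "pass a creation function or an initial population"))

make = default_creation(zeros(3))   # returns a zero-argument closure
sample = make()                     # a random vector of length 3

struct Gene                         # made-up custom individual type
    code::String
end

failed = try
    default_creation(Gene("ACGT"))
    false
catch err
    err isa ArgumentError
end
```

The user then sees a clear message up front instead of a failure somewhere inside `rand`.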

@aahaselgrove
Contributor Author

Also, can lowerBounds and upperBounds be removed? They don't do anything at the moment.

@wildart
Owner

wildart commented Oct 16, 2019

> Also, can lowerBounds and upperBounds be removed? They don't do anything at the moment.

Really? I thought they were functional. Anyway, leave them for future development.

> I believe the existing method of creation also works only if T is primitive.

Surely, the default creation parameter is not going to work for custom types, but with an appropriate creation function, initialization will work.

> it is reasonable to require the user to perform initialization for more complex types

That's why the Matrix or Vector{Vector} population parameters were provided: to initialize the whole population in advance.

@aahaselgrove
Contributor Author

I think I've hit an effective compromise now. Initialisation can still be completed by passing a vector representing the search space or a matrix, but this is no longer part of the core algorithm; these options are now helper functions for common initialisation patterns. The core algorithm is to first use any individuals passed in with the initPopulation vector, and create any remaining individuals with the creation function (if populationSize > length(initPopulation)).
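
The "use what is given, create the rest" rule can be sketched as below. The helper name `fill_population` is hypothetical, and the creation function is assumed to take no arguments.

```julia
# Hypothetical helper illustrating the top-up rule; not the PR's actual code.
function fill_population(initPopulation::Vector{I}, populationSize::Int,
                         creation::Function) where {I}
    population = copy(initPopulation)
    # Create individuals only while the population is short of populationSize.
    # If more individuals than populationSize were supplied, all are kept as-is.
    while length(population) < populationSize
        push!(population, creation())
    end
    return population
end

seed = [[0.0, 0.0], [1.0, 1.0]]
filled = fill_population(seed, 5, () -> rand(2))
```

Here the two seeded individuals come first and three more are generated, for five in total. The `seed` vector itself is left unchanged.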

@wildart
Owner

wildart commented Oct 20, 2019

> Initialisation can still be completed by passing a vector representing the search space or a matrix, but this is no longer part of the core algorithm

I think it's a good thing to move initialization out of the core algorithm. Do not stop halfway: move all initialization into a separate function.

> The core algorithm is to first use any individuals passed in with the initPopulation vector, and create any remaining individuals with the creation function (if populationSize > length(initPopulation)).

Mixing two types of population initialization isn't a very nice thing. It could create ambiguities down the line.

@aahaselgrove
Contributor Author

> Mixing two types of population initialization isn't a very nice thing. It could create ambiguities down the line.

Do you have any examples of how this would be an issue? MATLAB's ga function works in exactly this manner: it allows the user to specify as many or as few individuals as they want in the initial population, and then creates additional individuals as required with the creation function, for a total of populationSize individuals.

@wildart
Owner

wildart commented Oct 21, 2019

> Do you have any examples of how this would be an issue?

Well, you can have population values that are type-incompatible with the individuals produced by the creation function. I looked at the MATLAB ga parameter description, and you were right: they allow such a scenario. I guess it's all right to have such behavior.

Nevertheless, could you move any initialization code out of the ga main function? I plan to refactor the code to have clear functional separation of the various parts of the algorithms. That will allow work to start on a concurrent implementation.

Do not bother to change the ES code. I've already started working on it.

@aahaselgrove
Contributor Author

> Well, you can have population values that are type-incompatible with the individuals produced by the creation function.

Population values could also be incompatible with objfun, crossover or mutation.
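
As a small illustration of the kind of mismatch under discussion, a concretely typed population rejects an individual that cannot be converted to its element type. The `bad_creation` function is made up for this sketch.

```julia
# A typed container rejects an individual that cannot be converted
# to its element type.
typed_population = [[0.0, 1.0]]          # Vector{Vector{Float64}}
bad_creation = () -> "not a vector"      # made-up creation function with the wrong return type

rejected = try
    push!(typed_population, bad_creation())
    false
catch err
    err isa MethodError
end
```

A mismatched objfun, crossover or mutation function would likewise surface only as a run-time error when it is first called.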

> Nevertheless, could you move any initialization code out of the ga main function?

I'll take a look at this.

`initPopulation` was a heavily overloaded parameter, with a variety of
types corresponding to a range of possible behaviours. This commit:
- Moves the option to pass a vector representing the search space or a
matrix into separate wrapper functions.
- Restricts `population` to be a vector of individuals (where an
individual is a single member of the population)
- Adds a new parameter `creation` to represent a function to create an
individual.

Default behaviour has not been changed, but is now represented by a
creation function.
@wildart
Owner

wildart commented May 2, 2020

Implemented in the new API; see Docs/Dev/Population.

@wildart wildart closed this May 2, 2020
3 participants