Major change for creating individuals and population #43

Open
wants to merge 55 commits into
base: master

Conversation

tpdsantos

Based on the approach of the ga package in Python, each gene has its own structure, making it easier to create individuals and populations.

Another main advantage is that it is now very easy to create individuals with different types of genes, whether they are binary, integer or floating-point genes. All the crossover, mutation and selection functions were modified, and some dispatches created, to accommodate this new approach. Added simple examples and some basic usage to the README file.

Also added a prototype for parallelization of the population using the DistributedArrays package. It works well but needs more intensive testing with different types of populations.
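
To make the gene-per-structure idea concrete, here is a minimal sketch of how a mixed-type individual could look. It is not the code from this branch: only the names `AbstractGene`, `IntegerGene` and `FloatGene` appear in this thread, while `BinaryGene`, the field layout and the `mutate!` methods are assumptions made for illustration.

```julia
using Random

# Hypothetical gene types. The fields (value, lb, ub) are assumptions.
abstract type AbstractGene end

mutable struct BinaryGene <: AbstractGene
    value::Bool
end

mutable struct IntegerGene <: AbstractGene
    value::Int
    lb::Int
    ub::Int
end

mutable struct FloatGene <: AbstractGene
    value::Float64
    lb::Float64
    ub::Float64
end

# One mutate! method per gene type; multiple dispatch selects the right one.
mutate!(rng::AbstractRNG, g::BinaryGene) = (g.value = !g.value; g)
mutate!(rng::AbstractRNG, g::IntegerGene) = (g.value = rand(rng, g.lb:g.ub); g)
mutate!(rng::AbstractRNG, g::FloatGene) = (g.value = g.lb + rand(rng) * (g.ub - g.lb); g)

# An individual is a vector of genes of mixed types.
individual = AbstractGene[BinaryGene(true), IntegerGene(3, 0, 10), FloatGene(0.5, -1.0, 1.0)]

rng = MersenneTwister(42)
foreach(g -> mutate!(rng, g), individual)
```

Because the vector's element type is abstract, each `mutate!` call is resolved at run time. That is convenient for mixed individuals but may cost some performance compared with a concretely typed vector.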

tpdsantos and others added 24 commits February 19, 2020 18:19
Some parts of the code are now obsolete and the genetic algorithm does not work with different types of chromosome; making it work is my goal from now on.
The individual, instead of being a vector of values, is a vector of AbstractGene. Each entry of the vector is a gene, of any type supported (binary, integer and float). Now the individual can have different types of genes.
Already did some testing regarding mutations of any gene, and they're working fine. Now the issue is the crossover and selection
functions, since there's one per Genetic Algorithm run. Need to think more about what's best and most efficient.
Started debugging the entire code, created a new file just for the structures and some other global code.
Have to test the selection function
Changed the way the fitting tolerance is calculated, since it was giving some weird results, now it's simpler.
Replaced the infinite while loop with a for loop to be able to parallelize it in the future.
The FloatGene has already been tested and works fine. Now I have to start building parallel code.
…ADME to add the functions and behaviours created.
Finally fixed the problem of the parent population also being updated during crossovers. I forgot that,
for structures, a `copy` is not enough; a `deepcopy` is needed.
Finally finished the first prototype for parallelizing the Genetic Algorithm using the DistributedArrays package.
It works quite well, but now it needs to be modified to be able to use piping for communication with external programs.
Now communication through FIFOs works using parallel computation.
For non-parallel computation it does not work because the pipes must be launched in separate processes.
The previous prototype was quite slow due to using remote channels and the pipe reading not being made
in the process it was running. Now the remote channels are gone and the pipe reading and the objective
function are determined in the same process, which is much more efficient and faster.
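
The `copy` versus `deepcopy` issue mentioned in the commit log can be reproduced in a few lines. This is a sketch with a hypothetical `Gene` struct, not the types from this branch: `copy` on a vector creates a new vector that still points at the same mutable objects, so modifying a child's gene also modifies the parent's.

```julia
# Hypothetical stand-in for a mutable gene type.
mutable struct Gene
    value::Int
end

parents = [Gene(1), Gene(2)]

# Shallow copy: a new vector, but it holds the same Gene objects.
shallow = copy(parents)
shallow[1].value = 99      # also changes parents[1]

# Deep copy: the Gene objects themselves are duplicated.
deep = deepcopy(parents)
deep[2].value = -1         # parents[2] is left untouched

println(parents[1].value)  # changed through the shallow copy
println(parents[2].value)  # unchanged, protected by the deep copy
```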
@stephenll

For the Elitism line 157 in ga.jl, could you utilize the full_fitness rather than recalculate to populate fitness? This could help if your objective function is expensive.

I've seen in other GA algorithms, the fitness is kept with the individual and only recalculated if it is from a mutation or crossover.
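
A minimal sketch of that caching idea, assuming a hypothetical `Evaluated` wrapper that is not part of this package: the fitness is stored next to the genes and reset to `nothing` whenever a mutation or crossover changes them, so the objective function runs only when needed.

```julia
# Hypothetical wrapper keeping an individual's genes together with its cached fitness.
mutable struct Evaluated{T}
    genes::Vector{T}
    fitness::Union{Nothing,Float64}   # nothing means "needs (re)evaluation"
end

Evaluated(genes::Vector{T}) where {T} = Evaluated{T}(genes, nothing)

# Counter used only to show how often the objective is actually called.
evaluations = Ref(0)
function objective(x)
    evaluations[] += 1
    return sum(abs2, x)
end

# Evaluate only when no cached value is available.
function fitness!(ind::Evaluated, f)
    if ind.fitness === nothing
        ind.fitness = f(ind.genes)
    end
    return ind.fitness
end

# Call after any mutation or crossover that modifies the genes.
invalidate!(ind::Evaluated) = (ind.fitness = nothing; ind)

ind = Evaluated([1.0, 2.0])
fitness!(ind, objective)   # evaluates the objective
fitness!(ind, objective)   # cache hit, no evaluation
ind.genes[1] = 3.0         # simulated mutation
invalidate!(ind)
fitness!(ind, objective)   # evaluates again
```

With this layout an elitism step can carry the best individuals over together with their stored fitness instead of calling the objective again.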

@tpdsantos
Author

You're absolutely right. I was already aware of that and fixed it after opening this pull request; now the objective function is used only for the full_fitness variable.

When not using external programs, the GA code works well with both computers connected. Now I have to figure out how to
run it properly using external programs.
@wildart
Owner

wildart commented Mar 18, 2020

@tpdsantos Thanks for the effort. However, can we make these changes gradually? I will not be able to review the PR in this form.
Some of the things that I noticed:

  • Why do you need to create a wrapper type for mutations, etc.?
  • Why was the input type for mutating functions changed to BitVector? I understand the change from Vector{Bool}, but why the rest?

I deliberately tried to be open-ended with the individual type, so any kind of structure can be accepted. Introducing a supertype, e.g. AbstractGene, seems like a big constraint. Either way, you would write specific mutations and recombinations for the special individual type, but they would remain in functional form. This PR would require any additional individual type to be integrated into the package.

If you need to attach any specific information to an individual, e.g. IntegerGene name value, you can create a wrapper function around your type to expose an individual structure to the already available individual modification functions, similar to what you do with the Selection type wrapper.

@tpdsantos
Author

@wildart thanks for the reply. I understand that there are many changes. My main reason was just to add this branch to your project to ease the creation of individual types, since I thought it was confusing and the documentation was lacking.

Following your points:

  • I created the wrapper for mutations because the available mutations were for specific types of genes, and if the individual has different types of genes, the mutations must be different as well. I was basically trying not to mess with the ga function. Regarding the wrapper around the IntegerGene, I wanted to give the user the ability to easily change the mutation functions BEFORE entering the ga function. My main goal was to compile all that I could before running the algorithm itself and not evaluate the mutation symbol in each iteration.
  • I actually only changed the Vector{Bool} types for BitVector

Anyway, I understand that it can't be done now. I'm still working on this branch, since I want to add more functionality for parallelism, working out-of-the-box with external programs and easily integrating the code with clusters. If I have time I'll try to make it even more general.

What do you think the branch needs to be accepted?

@wildart
Owner

wildart commented Mar 18, 2020

> I wanted to give the user the ability to easily change the mutation functions BEFORE entering the ga function. My main goal was to compile all that I could before running the algorithm itself and not evaluate the mutation symbol in each iteration.

The mutation is a parameter of ga. You can overload ga to accept mutation symbols instead of function names, and select an appropriate function to pass to the original version of ga.
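
A rough sketch of that suggestion follows. Everything in it is a hypothetical stand-in: `run_ga` is a placeholder for the function-based entry point and does not reproduce the real `ga` signature, and `MUTATIONS`, `flip_all` and `identity_mutation` exist only for this example. The point is that the symbol is resolved to a function once, before the algorithm starts, rather than in every iteration.

```julia
# Hypothetical mutation functions operating on a whole individual.
flip_all(x) = .!x
identity_mutation(x) = x

# Lookup table from symbols to mutation functions (an assumption for this sketch).
const MUTATIONS = Dict{Symbol,Function}(:flip => flip_all, :none => identity_mutation)

# Stand-in for the original function-based entry point; it only applies the mutation once.
run_ga(individual; mutation::Function) = mutation(individual)

# Symbol-accepting method: resolve the symbol once, then forward to the original method.
function run_ga(individual, mutation::Symbol)
    f = get(MUTATIONS, mutation) do
        throw(ArgumentError("unknown mutation: $mutation"))
    end
    return run_ga(individual; mutation = f)
end

result = run_ga(BitVector([true, false]), :flip)
```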

> I'm still working on this branch, since I want to add more functionalities for parallelism, working out-of-the-box with external programs and easily integrate the code with clusters.

I understand that parallelism is long overdue for this package. But one thing should be addressed before implementing it. In #36, I outlined that the individual initialization must be removed from the main algorithm, as it runs only once before the main algorithm loop starts. That would make it possible to decompose the main algorithm into components suitable for parallelization.

I think that specific parallel devices must be created to run parts of the algorithm in single-core, multi-core and multi-threading environments. I haven't given this much thought yet. I opened a new issue to discuss the parallelization approach. See #45.

tpdsantos and others added 4 commits March 3, 2021 01:21
Before this update, boundary checking was performed AFTER the new variables were saved in the gene vector, which made it much more difficult to get new values inside the boundaries.
In this update, boundary checking is done BEFORE new values are saved, drastically increasing the probability of getting new values inside the boundaries.
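
One way to check bounds before saving is sketched below. The commit message does not say how out-of-bounds candidates are handled, so the retry loop, the `BoundedGene` type and the `sigma` and `max_tries` parameters are all assumptions: a candidate is drawn, tested against the bounds, and written to the gene only if it passes.

```julia
using Random

# Hypothetical bounded floating-point gene.
mutable struct BoundedGene
    value::Float64
    lb::Float64
    ub::Float64
end

# Draw candidates and store one only after it has passed the bounds check.
function mutate!(rng::AbstractRNG, g::BoundedGene; sigma = 1.0, max_tries = 100)
    for _ in 1:max_tries
        candidate = g.value + sigma * randn(rng)
        if g.lb <= candidate <= g.ub
            g.value = candidate   # saved only when inside the bounds
            return g
        end
    end
    return g                      # no valid candidate found: keep the old value
end

rng = MersenneTwister(1)
gene = BoundedGene(0.0, -0.5, 0.5)
history = [mutate!(rng, gene).value for _ in 1:1000]
```

Since an invalid candidate is never stored, the gene cannot leave its bounds, at the cost of possibly keeping the old value when every try fails.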
@wildart wildart force-pushed the master branch 14 times, most recently from 71d4d34 to b925a2c Compare October 30, 2021 18:25
@wildart wildart force-pushed the master branch 4 times, most recently from f8f9fc4 to cc7ffe2 Compare December 10, 2021 20:29
@wildart wildart force-pushed the master branch 2 times, most recently from 12d8cee to b0f5477 Compare December 20, 2021 02:47
@wildart wildart force-pushed the master branch 2 times, most recently from dd6579c to c81f2c9 Compare December 29, 2021 01:18
@wildart wildart force-pushed the master branch 3 times, most recently from 091f38a to cf3f2fb Compare March 19, 2022 22:33
3 participants