Major change for creating individuals and population #43

tpdsantos · 2020-03-08T10:33:18Z

Based on the approach of the ga package in Python, each gene has its own structure, making it easier to create individuals and populations.

Another main advantage is that it is now very easy to create individuals with different types of genes, whether binary, integer or floating-point. All the crossover, mutation and selection functions were modified, and some dispatches created, to accommodate this new approach. Added simple examples and some basic usage to the README file.

Also added a prototype for parallelization of the population using the DistributedArrays package. It works well but needs more intensive testing with different types of populations.

Some parts of the code are now obsolete and the genetic algorithm does not work with different types of chromosome. That's my goal from now on.

The individual, instead of being a vector of values, is a vector of AbstractGene. Each entry of the vector is a gene, of any type supported (binary, integer and float). Now the individual can have different types of genes.
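A minimal sketch of this layout follows. Only `AbstractGene`, `IntegerGene` and `FloatGene` are names taken from this PR; `BinaryGene`, the field names and `mutate!` are assumptions made for illustration, and the definitions in the branch may differ.

```julia
using Random

# Hypothetical gene hierarchy; field names are illustrative assumptions.
abstract type AbstractGene end

mutable struct BinaryGene <: AbstractGene
    value::Bool
end

mutable struct IntegerGene <: AbstractGene
    value::Int
    lb::Int
    ub::Int
end

mutable struct FloatGene <: AbstractGene
    value::Float64
    lb::Float64
    ub::Float64
end

# One (hypothetical) mutation method per gene type, selected by multiple dispatch.
mutate!(g::BinaryGene, rng::AbstractRNG) = (g.value = !g.value; g)
mutate!(g::IntegerGene, rng::AbstractRNG) = (g.value = rand(rng, g.lb:g.ub); g)
mutate!(g::FloatGene, rng::AbstractRNG) = (g.value = g.lb + (g.ub - g.lb) * rand(rng); g)

# An individual is a vector of genes, possibly of mixed types.
individual = AbstractGene[BinaryGene(true), IntegerGene(3, 0, 10), FloatGene(0.5, -1.0, 1.0)]

rng = MersenneTwister(1)
foreach(g -> mutate!(g, rng), individual)
```

Because each gene carries its own type, a single loop over the individual can apply the right mutation to every entry without the caller inspecting the gene kind.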

Already did some testing regarding mutations of any gene, and they're working fine. Now the issue is the crossover and selection functions, since there's one per Genetic Algorithm run. Need to think more about what's best and most efficient.

Started debugging the entire code and created a new file just for the structures and some other global code. Have to test the selection function.

Changed the way the fitting tolerance is calculated, since it was giving some weird results; now it's simpler. Replaced the infinite while loop with a for loop so that it can be parallelized in the future.

The FloatGene has already been tested and works fine. Now I have to start building parallel code.

Finally fixed the problem of the parent population also being updated during crossovers. I forgot that, for structures, a `copy` is not enough; a `deepcopy` is needed.
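The difference can be seen in a small sketch. The `Gene` type here is hypothetical and stands in for any mutable structure stored in a population vector.

```julia
# Hypothetical mutable gene type, used only for this illustration.
mutable struct Gene
    value::Float64
end

parents = [Gene(1.0), Gene(2.0)]

shallow = copy(parents)      # new vector, but it holds the same Gene objects
deep    = deepcopy(parents)  # new vector holding new Gene objects

shallow[1].value = 10.0      # also changes parents[1]
deep[2].value    = 20.0      # leaves parents[2] untouched

println(parents[1].value)    # 10.0
println(parents[2].value)    # 2.0
```

`copy` duplicates only the outer vector, so offspring built from it still alias the parents' genes; `deepcopy` recursively duplicates the contents as well.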

Finally finished the first prototype for parallelizing the Genetic Algorithm using the DistributedArrays package. It works quite well, but now it needs to be modified to be able to use piping for communication with external programs.

Now communication through FIFOs works using parallel computation. For non-parallel computation it does not work because the pipes must be launched in separate processes.

The previous prototype was quite slow because it used remote channels and the pipe was not read in the process where the objective function was running. Now the remote channels are gone, and the pipe reading and the objective function evaluation happen in the same process, which is much faster.

stephenll · 2020-03-15T19:46:13Z

For the elitism step at line 157 in ga.jl, could you use full_fitness rather than recalculating to populate fitness? This could help if your objective function is expensive.

In other GA implementations I've seen, the fitness is kept with the individual and only recalculated if the individual comes from a mutation or crossover.
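A rough sketch of that caching idea is shown below. The `Evaluated` wrapper, `fitness!` and `mutate!` are hypothetical names for illustration and are not part of the package or of this PR.

```julia
# Hypothetical wrapper that stores a genome together with its cached fitness.
mutable struct Evaluated{T}
    genome::Vector{T}
    fitness::Union{Nothing,Float64}   # `nothing` means "needs (re)evaluation"
end

Evaluated(genome::Vector{T}) where {T} = Evaluated{T}(genome, nothing)

# Evaluate lazily: call the objective only when no cached value exists.
function fitness!(ind::Evaluated, objective)
    if ind.fitness === nothing
        ind.fitness = objective(ind.genome)
    end
    return ind.fitness
end

# Any change to the genome invalidates the cached value.
function mutate!(ind::Evaluated, i::Int, newvalue)
    ind.genome[i] = newvalue
    ind.fitness = nothing
    return ind
end

# Count objective calls to show that the cache is used.
ncalls = Ref(0)
objective(x) = (ncalls[] += 1; sum(abs2, x))

ind = Evaluated([1.0, 2.0])
fitness!(ind, objective)   # evaluates the objective
fitness!(ind, objective)   # served from the cache
mutate!(ind, 1, 3.0)
fitness!(ind, objective)   # re-evaluated after the mutation
```

Elite individuals copied unchanged into the next generation would keep their cached value, so an expensive objective runs only for new or modified individuals.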

tpdsantos · 2020-03-15T19:51:06Z

You're absolutely right. I was already aware of that and fixed it after opening this pull request; now the objective function is used only for the full_fitness variable.

When not using external programs, the GA code works well with both computers connected. Now I have to figure out how to run it properly using external programs.

wildart · 2020-03-18T17:34:23Z

@tpdsantos Thanks for the effort. However, can we make these changes gradually? I will not be able to review the PR in this form.
Some of the things that I noticed:

Why do you need to create a wrapper type for mutations, etc.?
Why the input type for mutating functions changed to BitVector? I understand change from Vector{Bool}, but why the rest?

I deliberately tried to be open-ended with the individual type, so any kind of structure can be accepted. Introducing a supertype, e.g. AbstractGene, seems like a big constraint. Either way, you would write specific mutations and recombinations for the special individual type, but they would remain in functional form. This PR would require any additional individual type to be integrated into the package.

If you need to attach any specific information to an individual, e.g. IntegerGene name value, you can create a wrapper function around your type to expose an individual structure to the already available individual modification functions, similar to what you do with the Selection type wrapper.

tpdsantos · 2020-03-18T17:58:31Z

@wildart thanks for the reply. I understand that there are many changes. My main reason was just to add this branch to your project to ease the creation of individual types, since I thought it was confusing and the documentation was lacking.

Following your points:

I created the wrapper for mutations because the available mutations were written for specific types of genes, and if the individual has different types of genes, the mutations must be different as well. I was basically trying not to mess with the ga function. Regarding the wrapper around the IntegerGene, I wanted to give the user the ability to easily change the mutation functions BEFORE entering the ga function. My main goal was to compile all that I could before running the algorithm itself and not evaluate the mutation symbol in each iteration.
I actually only changed the Vector{Bool} types to BitVector.

Anyway, I understand that it can't be done now. I'm still working on this branch, since I want to add more functionalities for parallelism, working out-of-the-box with external programs and easily integrate the code with clusters. If I have time I'll try to make it even more general.

What do you think the branch needs to be accepted?

wildart · 2020-03-18T19:14:36Z

> I wanted to give the user the ability to easily change the mutation functions BEFORE entering the ga function. My main goal was to compile all that I could before running the algorithm itself and not evaluate the mutation symbol in each iteration.

The mutation is a parameter to ga. You can overload ga to accept mutation symbols instead of function names, and select an appropriate function to pass to the original version of ga.
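One way to read this suggestion is sketched below. The `ga` defined here is only a stand-in with an assumed, simplified signature (the package's real `ga` takes different arguments), and `flipfirst`, `zerolast` and `MUTATIONS` are hypothetical names.

```julia
# Stand-in for the package's `ga`; the real signature differs and is not reproduced here.
function ga(objective, individual; mutation::Function = identity)
    candidate = mutation(copy(individual))
    return objective(candidate) < objective(individual) ? candidate : individual
end

# Hypothetical mutation functions.
flipfirst(x) = (x[1] = -x[1]; x)
zerolast(x)  = (x[end] = zero(eltype(x)); x)

# Hypothetical lookup table from symbols to mutation functions.
MUTATIONS = Dict{Symbol,Function}(:flipfirst => flipfirst, :zerolast => zerolast)

# Extra method: resolve the symbol once, then forward to the function-based method.
ga(objective, individual, mutation::Symbol) =
    ga(objective, individual; mutation = MUTATIONS[mutation])

best = ga(x -> sum(abs2, x), [1.0, 2.0], :zerolast)
```

The symbol is resolved a single time before the algorithm starts, so nothing is looked up per iteration, and the function-based method stays untouched.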

> I'm still working on this branch, since I want to add more functionalities for parallelism, working out-of-the-box with external programs and easily integrate the code with clusters.

I understand that parallelism is long overdue for this package. But one thing should be addressed before implementing it. In #36, I outlined that individual initialization must be removed from the main algorithm, as it runs only once before the main algorithm loop starts. That would make it possible to decompose the main algorithm into components suitable for parallelization.

I think that specific parallel devices must be created to run parts of the algorithm in single-core, multi-core and multi-threading environments. I haven't thought about this much yet. I opened a new issue to discuss the parallelization approach. See #45.

Before this update boundary checking was performed AFTER the new variables were saved in the gene vector, which made it much more difficult to have new values inside boundaries. In this update, boundary checking is made BEFORE new values are saved, drastically increasing the probability of having new values inside boundaries.
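The check-before-save order can be sketched as follows. This is one possible policy under stated assumptions, not the branch's actual code: `BoundedGene`, its fields, the Gaussian step and the `maxtries` fallback are all hypothetical.

```julia
using Random

# Hypothetical bounded gene; field names are assumptions for illustration.
mutable struct BoundedGene
    value::Float64
    lb::Float64
    ub::Float64
end

# Draw a candidate, check it against the bounds, and store it only if it is valid.
# After `maxtries` failed draws the old value is kept (an assumed fallback policy).
function mutate!(g::BoundedGene, rng::AbstractRNG; sigma = 1.0, maxtries = 10)
    for _ in 1:maxtries
        candidate = g.value + sigma * randn(rng)
        if g.lb <= candidate <= g.ub      # check BEFORE saving
            g.value = candidate
            return g
        end
    end
    return g                              # no valid candidate: value unchanged
end

g = BoundedGene(0.9, 0.0, 1.0)
mutate!(g, MersenneTwister(42))
```

Since an out-of-bounds candidate is never written to the gene, the stored value stays valid no matter how many draws fail.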

tpdsantos and others added 24 commits February 19, 2020 18:19

Started improving code writing.

592858a

Continued improving code

48ba05a

Continued improving code

192c283

continued improving code

50484fd

Continued improving code

1e5a3b2

Continued improving code

cee890d

Continued improving code

8991a18

Continued improving code

b9d23cb

Continued improving code

743ac27

Continued improving code

44dbce5

wrote documentation for every major function created. Also updated README to add the functions and behaviours created.

7beca54

Continued improving code

ea16587

Fixed bug regarding type inference in the crossover function created by the Crossover structure

aa51096

Created function to present results

eb55c7d

Continued improving code

0e01bcd

minor aesthetic changes

67ae404

Merge pull request #1 from tpdsantos/general

dcc379f

General

added inbounds in most of the low-level functions

0a6bb64

minor aesthetic changes

61cc3a3

started creating ways to communicate with external programs

678b498

added first prototype for communication with external programs

d20633d

introduced functionality for external programs in the ga function

6414627

Fixed piping communication

d96186e

Started messing around with clusters

268072e

tpdsantos and others added 4 commits March 3, 2021 01:21

fixed bug in mutate function

24eb81e

Merge pull request #4 from tpdsantos/bounds

13eeb3d

Bounds

minor changes

bd44dd9

wildart force-pushed the master branch 14 times, most recently from 71d4d34 to b925a2c Compare October 30, 2021 18:25

wildart force-pushed the master branch 4 times, most recently from f8f9fc4 to cc7ffe2 Compare December 10, 2021 20:29

wildart force-pushed the master branch 2 times, most recently from 12d8cee to b0f5477 Compare December 20, 2021 02:47

wildart force-pushed the master branch 2 times, most recently from dd6579c to c81f2c9 Compare December 29, 2021 01:18

wildart force-pushed the master branch 3 times, most recently from 091f38a to cf3f2fb Compare March 19, 2022 22:33

Create CITATION.cff

1ba9433
