Add different selection methods #28

Open
trevorstephens opened this issue Apr 27, 2017 · 14 comments
@trevorstephens
Owner

  • Roulette wheel selection
  • Others?
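
For reference, roulette wheel (fitness-proportionate) selection can be sketched as below. This is only an illustration, not gplearn code: the function name `roulette_wheel_select` is made up, and it assumes fitness values are non-negative with higher meaning better, so an error-based fitness would need to be transformed first.

```python
import random


def roulette_wheel_select(fitnesses, rng=random):
    """Return the index of one individual, picked with probability
    proportional to its fitness.

    Assumption: fitness values are non-negative and higher is better.
    """
    total = sum(fitnesses)
    if total <= 0:
        # No usable fitness signal: fall back to a uniform pick.
        return rng.randrange(len(fitnesses))
    # rng.random() is in [0, 1), so pick is in [0, total).
    pick = rng.random() * total
    running = 0.0
    for index, fitness in enumerate(fitnesses):
        running += fitness
        if pick < running:
            return index
    # Guard against floating-point round-off in the running sum.
    return len(fitnesses) - 1


if __name__ == "__main__":
    draws = [roulette_wheel_select([1.0, 3.0, 6.0]) for _ in range(10)]
    print(draws)
```

Individuals with zero fitness are never picked in this sketch, which is one reason tournament selection is often preferred when fitness values can be negative or badly scaled.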
@trevorstephens trevorstephens added this to the 0.3.0 milestone Apr 27, 2017
@trevorstephens trevorstephens modified the milestones: 0.3.0, 0.4.0 Nov 17, 2017
@echo66

echo66 commented Jul 3, 2018

Epsilon Lexicase

@trevorstephens
Owner Author

Be great if you could provide a bit more detail @echo66 ...

@trevorstephens trevorstephens modified the milestones: 0.4.0, 0.5.0 Mar 24, 2019
@trevorstephens trevorstephens modified the milestones: 0.4.1, 0.5.0 May 31, 2019
@hwulfmeyer

hwulfmeyer commented Jun 22, 2019

I have privately made my own additions to gplearn, which include ParetoGP and EPLEX. See also #33

I would be more than happy to build a PR from my work. It will need some time though since I am currently in the process of finishing up my thesis.

@trevorstephens
Owner Author

This would be most welcome @wulfihm 👍

@hwulfmeyer

hwulfmeyer commented Oct 9, 2019

Beginning this now. Not sure when I will be done.

@trevorstephens
Owner Author

No worries @wulfihm, take your time, no rush. I'll be interested to see what you come up with. Would abstracting out the selector as its own class be possible?

@hwulfmeyer

hwulfmeyer commented May 27, 2020

So, as is evident, I did not get around to doing this.
I do not know if I will ever have the motivation to do it, as writing it in the first place already took some time. Essentially, I implemented ParetoGP and Eplex in my own fork of gplearn (which I created for my bachelor thesis). That fork also includes other adaptations I made, which is why a clean PR from it is currently not possible. I am also not sure whether my implementations there would be programmatically satisfactory for you, as I did not write the code to be maintainable by other people. I didn't even expect anyone to read it in the first place.
See: https://github.com/wulfihm/gplearn_ba

Maybe I will get around to doing a clean PR for this, but maybe not, so in the meantime, if anyone wants to do it, feel free. And if you have any questions about my code above, I can answer them.

Eplex and ParetoGP are defined here: https://github.com/wulfihm/gplearn_ba/blob/master/gplearn/selection.py
While the ParetoFront is created here: https://github.com/wulfihm/gplearn_ba/blob/master/gplearn/genetic.py
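
As a rough illustration of what an epsilon-lexicase (EPLEX) selector does, here is a minimal sketch. It is not the code from the linked fork. It assumes a per-sample error matrix is available, and it uses one common variant in which epsilon is the median absolute deviation of the population's errors on each case. The function name `epsilon_lexicase_select` is made up for this example.

```python
import random

import numpy as np


def epsilon_lexicase_select(errors, rng=None):
    """Pick one parent index by epsilon-lexicase selection.

    errors: array of shape (n_individuals, n_cases), lower is better.
    """
    rng = rng or random.Random()
    errors = np.asarray(errors, dtype=float)
    n_individuals, n_cases = errors.shape

    # Assumed variant: epsilon per case is the median absolute deviation
    # of the whole population's errors on that case.
    medians = np.median(errors, axis=0)
    epsilons = np.median(np.abs(errors - medians), axis=0)

    candidates = list(range(n_individuals))
    cases = list(range(n_cases))
    rng.shuffle(cases)  # each selection event sees the cases in random order

    for case in cases:
        if len(candidates) == 1:
            break
        best = min(errors[i, case] for i in candidates)
        # Keep every candidate within epsilon of the best on this case.
        candidates = [i for i in candidates
                      if errors[i, case] <= best + epsilons[case]]

    # If several candidates survive all cases, pick one at random.
    return rng.choice(candidates)


if __name__ == "__main__":
    demo = np.abs(np.random.default_rng(0).normal(size=(6, 4)))
    print(epsilon_lexicase_select(demo, random.Random(0)))
```

Because no aggregate fitness is computed, an individual that is very good on a few hard samples can still be selected, which is the main difference from tournament selection on mean error.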

I tried doing NSGA2 but it really did not work at all.

I also implemented other things, such as geometric semantic crossover and mutation, simplification of gplearn solutions (not finished), another complexity measure I named 'Kommenda' (after M. Kommenda et al.), and the R2 score for regression.
I also changed the math operators to be more "precise".

Another note: everything above I implemented only with regression in mind. I completely ignored the SymbolicTransformer.

If you are interested at all here is my bachelor thesis:
https://www.researchgate.net/publication/335842681_Genetic_Programming_for_Automotive_Modeling_Applications

@trevorstephens
Owner Author

Really appreciate you sharing this @wulfihm. If someone wants to take up the torch, it'd be very cool to see these added. Otherwise, maybe I'll take a few rainy weekend days this winter to play with your code 😄

@MilesCranmer

+1 for this!
@wulfihm do you have any docs on how to use those techniques in your code? Or even a jupyter notebook with a simple example maybe? Anything helps.

I'm trying to switch to GPLearn from Eureqa lately and I'm also very interested in this. For context, I have a recent paper on converting neural networks into analytic equations to discover new physical laws: https://arxiv.org/abs/2006.11287.

We use the following Pareto front technique, where we look for the sharpest drop in log-error over length. It seems to work pretty well on a range of noisy datasets, as an alternative to jointly optimizing loss and length. But I'd also be interested in trying out these others.
[Screenshot not preserved]
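
One possible reading of the "sharpest drop in log-error over length" criterion, sketched in Python. The function name `sharpest_drop` and the `(length, error)` tuple layout are assumptions made for this example, and errors are assumed to be strictly positive so that the logarithm is defined.

```python
import math


def sharpest_drop(front):
    """front: list of (length, error) pairs on the Pareto front.

    Returns the pair reached by the steepest decrease in log(error)
    per unit of added length, or None if the front has fewer than
    two distinct lengths.
    """
    ordered = sorted(front)  # sort by length, then error
    best_point, best_score = None, -math.inf
    for (len_a, err_a), (len_b, err_b) in zip(ordered, ordered[1:]):
        if len_b == len_a:
            continue  # no change in length, slope undefined
        # Negative slope of log-error with respect to length.
        score = -(math.log(err_b) - math.log(err_a)) / (len_b - len_a)
        if score > best_score:
            best_point, best_score = (len_b, err_b), score
    return best_point


if __name__ == "__main__":
    front = [(1, 1.0), (3, 0.9), (5, 0.01), (9, 0.009)]
    print(sharpest_drop(front))
```

In the demo front, going from length 3 to length 5 cuts the error by almost two orders of magnitude, while the later step to length 9 barely helps, so the length-5 expression is the one picked.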

@hwulfmeyer

hwulfmeyer commented Jul 26, 2020

@MilesCranmer What exactly do you mean by "how to use these techniques"? Programmatically or theoretically? :D

@MilesCranmer

I mean programmatically - i.e., how I can configure those methods for GPlearn's .fit() loop for a particular problem if I were to use your fork.

@hwulfmeyer

hwulfmeyer commented Jul 26, 2020

I added additional hyperparameters/options:

complexity => 'kommenda' (for the kommenda complexity)
selection => 'eplex'
paretogp => 'True' or 'False'
paretogp_lengths => (a, b)

ParetoGP works by selecting the first parent randomly from the Pareto front (the archive). The second parent is selected from the normal population via the selection mechanism (which can be anything, e.g. tournament or eplex). See: https://doi.org/10.1007/0-387-23254-0_17
paretogp_lengths limits the size of the solutions in the archive: since there is no penalty parameter anymore, individuals could otherwise grow arbitrarily large. paretogp_lengths = (5, 250) seems large enough to me. Keep the lower limit above 3 or 4, or else it may cause issues.
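
The parent selection described above can be sketched roughly as follows. This is a simplified illustration, not the fork's code: individuals are reduced to `(length, error)` tuples, the helper names `pareto_front`, `tournament` and `select_parents` are made up, and plain tournament selection stands in for the second selector.

```python
import random


def pareto_front(population, min_len, max_len):
    """Non-dominated (length, error) pairs whose length is within bounds.

    The bounds play the role of paretogp_lengths in this sketch.
    """
    eligible = [p for p in population if min_len <= p[0] <= max_len]
    front = []
    for cand in eligible:
        dominated = any(
            o[0] <= cand[0] and o[1] <= cand[1]
            and (o[0] < cand[0] or o[1] < cand[1])
            for o in eligible
        )
        if not dominated:
            front.append(cand)
    return front


def tournament(population, size, rng):
    """Lowest-error individual among `size` random contenders."""
    return min(rng.sample(population, size), key=lambda p: p[1])


def select_parents(population, archive, rng, tournament_size=3):
    first = rng.choice(archive)  # random member of the archive
    second = tournament(population, tournament_size, rng)  # normal selector
    return first, second


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [(3, 0.9), (6, 0.5), (8, 0.5), (10, 0.1), (300, 0.01), (12, 0.4)]
    archive = pareto_front(pop, 5, 250)
    print(archive, select_parents(pop, archive, rng))
```

Note how the length bounds exclude both the very short `(3, 0.9)` and the very long `(300, 0.01)` individual from the archive, even though the latter has the lowest error.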

I used the code here: https://github.com/wulfihm/ba_code/blob/master/main.py, which works via command line arguments.

The elitism_size option could be interesting to you if you do not use ParetoGP. With the original gplearn your population can get worse, since it does not retain the previous generation, i.e. the next generation replaces the old one. Elitism is only in effect if ParetoGP is disabled.
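
The idea behind elitism can be sketched as follows. How the fork actually fills the remaining slots is not stated above, so this is only one simple possibility. The function name `next_generation_with_elitism` and the `(program, error)` tuple layout are assumptions made for this example.

```python
def next_generation_with_elitism(parents, offspring, elitism_size):
    """Build the next generation while keeping the best parents.

    parents, offspring: lists of (program, error) pairs, lower error is better.
    The best `elitism_size` parents are copied over unchanged; the remaining
    slots are filled with offspring so the population size stays constant.
    """
    elites = sorted(parents, key=lambda ind: ind[1])[:elitism_size]
    n_slots = len(parents) - len(elites)
    return elites + offspring[:n_slots]


if __name__ == "__main__":
    parents = [("a", 0.2), ("b", 0.5), ("c", 0.9)]
    offspring = [("d", 0.7), ("e", 0.6), ("f", 0.8)]
    print(next_generation_with_elitism(parents, offspring, elitism_size=1))
```

With `elitism_size=0` the old generation is replaced entirely, which is the behaviour described for the original gplearn. With any positive value, the best error in the population can never get worse from one generation to the next.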

@MilesCranmer

That's awesome! I'm really looking forward to trying it out this week.

Thanks for putting this online and offering assistance in configuring it.

Cheers,
Miles
