Re factor/implement first tournament strategies #1275
Happy to change the prefix etc...
Also make minor docstring amendments. Also made some notes regarding Downing.
I believe the logic was slightly faulty for Tullock: the first 11 moves were correct, but past that point we should only be considering the 10 previous moves.
As noted by @id428 we were not giving a "true" fresh start so I've implemented that. I also added the two final defections (when the game length is known).
This is a complete rewrite of the Downing strategy, based on the description in Downing's 1975 paper. That description is not sufficiently clear, so I've had to make some further assumptions, which I've clearly documented. Note: there was documentation claiming that the implementation in the original tournament had a bug. I believe this was a mistake caused by misinterpreting one online set of slides that commented on a mistake in the implementation. It was not a bug, and it was discussed at some length in Axelrod's account of the original tournament: the strategy was implemented to act a particular way in the first two rounds, which had the result of making it a king maker. This was just a particular interpretation of the overall decision rule described in Downing's 1975 paper.
Note that this was not the specific error that @id428 pointed out, but having reviewed the papers and source code I found this one minor inaccuracy. @id428 made a point that the strategy should cooperate twice after its round of retaliations, but I do not see this in any of the descriptions of the strategy. Once the strategy has finished retaliating, all the texts indicate that it cooperates again but stays ready to retaliate.
I've hopefully kept my commits quite modular and in some cases added text describing my approach to the commit message. For example, Downing required me to sift through Downing's 1975 paper: eccd7e2.
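For reference, the Tullock fix described above can be sketched as follows. This is a minimal standalone illustration, not the code in this PR: the function name is hypothetical, and the "cooperate 10% less often than the opponent did" rule is taken from Axelrod's published description of the strategy rather than from this conversation. The point it shows is that, after the opening 11 cooperations, only the opponent's 10 most recent moves feed into the decision.

```python
C, D = "C", "D"


def tullock_cooperation_probability(opponent_history, opening_rounds=11, window=10):
    """Probability of cooperating on the next move (hypothetical helper).

    Assumption: the rule is "cooperate for the first 11 moves, then cooperate
    10% less often than the opponent did over the preceding 10 moves".
    """
    if len(opponent_history) < opening_rounds:
        # Still inside the unconditional opening cooperations.
        return 1.0
    # Only the 10 previous moves matter, not the whole history.
    recent = opponent_history[-window:]
    cooperation_rate = recent.count(C) / window
    # Clamp at zero so the result stays a valid probability.
    return max(0.0, cooperation_rate - 0.10)
```

With a history of ten cooperations followed by ten defections, the sketch returns 0.0, because the earlier cooperations fall outside the 10-move window.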
What do you think about having a classifier entry for "in Axelrod's first tournament" and "in Axelrod's second tournament"? Then we can also easily add lists of first and second tournament strategies alongside short_run_term_strategies for ease of use (and possibly a nice tutorial example, maybe even an advanced tutorial comparing to the Fortran implementations with fingerprints). We'd also need to add these classifiers to TFT.
return D
return C

class FirstByDowning(Player):
Since RevisedDowning is in the second tournament, can we also preserve that implementation and/or compare to the fortran implementation?
Sorry, I meant to write that in my PR: we need to consider what we do with RevisedDowning. As it's in the second tournament, is it worth waiting until we translate https://github.com/Axelrod-Python/TourExec/blob/v0.2.0/src/strategies/K59R.f, or should we implement RevisedDowning as a modification of the strategy in this PR?
The current RevisedDowning looks really similar to the Fortran code. If they are basically the same I'm in favor of leaving RevisedDowning (eliminating the revised bool) as the second tournament implementation. Maybe comparing the fingerprints will tell us if it's essentially correct or not?
Yeah good call.
I'm struggling to get axelrod_fortran to work on my current machine (I blame an OS update). Could you, or @meatballs if you get time, paste fingerprints for "k59r"?
Something like:
import axelrod as axl
import axelrod_fortran as axlf
downing = axlf.Player("k59r")
ashlock_fp = axl.AshlockFingerprint(strategy=downing)
data = ashlock_fp.fingerprint() # This will take a little while
p = ashlock_fp.plot()
p.savefig("k59r_ashlock_fingerprint.png")
transitive_fp = axl.TransitiveFingerprint(strategy=downing)
data = transitive_fp.fingerprint()
p = transitive_fp.plot()
p.savefig("k59r_transitive_fingerprint.png")
Here are the equivalents for RevisedDowning:
Ashlock: (fingerprint image not preserved)
Transitive: (fingerprint image not preserved)
Gosh, they're incredibly similar though. I'm happy, and looking at the history of RevisedDowning, you implemented it from the Fortran code and I suspect you did it right.
Let's keep it and then tweak it in a follow-up PR
Fine by me. 👍
Yeah, I really like this idea. Perhaps a classifier is not worth doing as it's not something dynamic (we will never implement another strategy from the first tournament); perhaps a hard-coded list is the way to go? Also, unrelated: I think we should just go ahead and remove the cheating strategies (I know I've been on the other side of this for a long time), if only to clear up the classifiers. But that's a discussion for another time...
Thanks @meatballs, I think I've addressed all those.
Hard-coded lists seem fine in this case.
Sure, let's (eventually) at least silo them sufficiently so that no one inadvertently uses them.
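As a rough illustration of the hard-coded-list option discussed above, here is a minimal sketch. The classes are empty stand-ins rather than the library's real players, and the list names and the `tournaments_for` helper are hypothetical; nothing here is meant to reflect what the library ends up exposing.

```python
class Player:
    """Stand-in base class for this sketch only."""


class TitForTat(Player):
    pass


class FirstByTullock(Player):
    pass


class FirstByDowning(Player):
    pass


class RevisedDowning(Player):
    pass


# Hypothetical hard-coded lists: membership is fixed by hand, so no
# classifier entry is needed to record which tournament a strategy was in.
first_tournament_strategies = [TitForTat, FirstByTullock, FirstByDowning]
second_tournament_strategies = [TitForTat, RevisedDowning]


def tournaments_for(strategy):
    """Return the names of the tournaments that list the given strategy class."""
    names = []
    if strategy in first_tournament_strategies:
        names.append("first")
    if strategy in second_tournament_strategies:
        names.append("second")
    return names
```

A strategy such as TitForTat can then appear in both lists without any change to its name or its classifier.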
@@ -2034,7 +2034,7 @@ def test_strategy(self):

class TestSeconodByDowning(TestPlayer):
Seconod -> Second
Are we sure we don't want to call this one RevisedDowning ? (Here is where the classifier would separate the naming concerns from the inclusion in the first or second tournament...)
Yup, that'll be similar to TitForTat, Grudger etc... I'll change it.
a6471e8 renames this and moves it to its own file so that the description at the top of axelrod_second.py is still accurate.
Following a meeting with @drvinceknight today going through the paper Downing (1975) I have made a few comments on the implementation of Downing. Please let me know if something does not make sense 👍
axelrod/strategies/axelrod_first.py
Outdated
S, P(C_o | C_s) and the conditional probability that O will choose C
following D by S, P(C_o, D_s)."

Throughout the paper the strategy (S) assumes that the opponent (D) is
I believe you meant (O)
axelrod/strategies/axelrod_first.py
Outdated
EV_TOT = #CC(EV_CC) + #CD(EV_CD) + #DC(EV_DC) + #DD(EV_DD)

I.E. The player aims to maximise the expected value of being in each state
After our conversation today I feel (not sure) that this might need to be rewritten. #CC is not a state but the number of times the strategy S cooperated twice...
Yeah I agree, thanks @Nikoleta-v3 I'll work on this tomorrow (long day!).
axelrod/strategies/axelrod_first.py
Outdated
Then the opponent's first cooperation counts as a cooperation in response to
the non existent cooperation of round 0. The total number of cooperations in
response to a cooperation is 1. We need to take in to account that extra
phantom cooperation to estimate the probability alpha=P(C|C) as 1 / 1 = 1.
I have one question: why don't we start with alpha=P(C|C) = 0.5, as stated above?
Because that doesn't necessarily always imply 2 defections as an opening which is one of the "stronger" points made in the various literature. (But I'm guessing here.)
I'll investigate more and get back to you (and add to the docstring as well) 👍
6c46483 adds to the docstring on this topic and also on the other points you raised. Let me know what you think.
The explanation is really good now! One minor comment would be to change P(C | C) to P(C_o | C_s) to be consistent.
Yup good call.
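To make the "phantom cooperation" bookkeeping in this thread concrete, here is a small standalone sketch. It is not the implementation in this PR: the function name is hypothetical, and the 0.5 fallback for a conditional probability with no observations yet is an assumption borrowed from the initial value mentioned in the question above. It estimates alpha = P(C_o | C_s) and beta = P(C_o | D_s) by pairing each opponent move with the player's previous move, and treats the opponent's first move as a reply to a cooperation in a nonexistent round 0.

```python
C, D = "C", "D"


def estimate_conditional_cooperation(own_moves, opponent_moves):
    """Return (alpha, beta) = (P(C_o | C_s), P(C_o | D_s)) estimates.

    Hypothetical helper: each opponent move is treated as a response to the
    player's move in the previous round.
    """
    # Phantom round 0: pretend the player cooperated before the match began,
    # so the opponent's first move counts as a response to a cooperation.
    previous_own = [C] + list(own_moves[:-1])
    pairs = list(zip(previous_own, opponent_moves))

    responses_to_c = [reply for own, reply in pairs if own == C]
    responses_to_d = [reply for own, reply in pairs if own == D]

    # Assumption: fall back to 0.5 when there is nothing to estimate from.
    alpha = responses_to_c.count(C) / len(responses_to_c) if responses_to_c else 0.5
    beta = responses_to_d.count(C) / len(responses_to_d) if responses_to_d else 0.5
    return alpha, beta
```

If the player opens with a defection and the opponent opens with a cooperation, the only observation is one cooperation in response to the phantom cooperation, so alpha is estimated as 1 / 1 = 1, matching the docstring excerpt quoted in the review.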
b9b00a2 adds a tutorial that just reproduces the first tournament (or fails to). I thought I might as well do this after your suggestion @marcharper, let me know what you think. FYI, I'm currently running some code to iterate through random seeds to see if we can get the same results as Axelrod reported (no luck so far but it did give me the couple of examples I use in the tutorial). Here is the code I'm using to do that:
axelrod/strategies/axelrod_first.py
Outdated
playing C and the second D etc...
In this case the author uses an argument based on the sequence of plays by
the player (S) so #CC denotes the number of times the player plays C twice
in a row. This is then used to |
to...
@Nikoleta-v3 could you confirm you're happy with this now when you get a moment?
Looks good to me @drvinceknight!
@marcharper @meatballs when either of you has time, I believe this is good to go now.
LGTM. @meatballs want to take a final look since there were some post-approval changes?
#1273 noted a number of potential implementation errors in the first tournament strategies.
This is a first draft of addressing these so we can discuss particular implementations.
I believe that a number of errors were a result of confusion between first tournament and second tournament code so I've renamed all the strategies: FirstBy<author> and SecondBy<author>. (I'm open to other suggestions.)
In some places I've added a number of things to the docstrings to make explicit the assumptions made when descriptions are not clear.
Closes #1273