Add Tranquilizer (K67R) #1126
Conversation
We added tranquiliser to the import so that it is known to the Axelrod library.
Add to axelrod second, fix test_tranquiliserii
No longer needed after an upstream change.
Need to add more tests
Add more tests
…e names), added to axelrod_second and test_axelrod_second and deleted standalone tranquiliser.py and test_tranquiliser.py files
Signed-off-by: Mansour Hakem <[email protected]>
Here is the fingerprint for the Fortran strategy: https://github.com/Axelrod-Python/Axelrod-fingerprint#k67r
Great work @MHakem: some initial comments on style. 👍
axelrod/strategies/axelrod_second.py
Has a variable 'NK' which increases each time a move is played whilst in state FD = 2. It has an initial value of 1.
Has a variable 'AD' with an initial value of 5
Has a variable 'NO' with an initial value of 0
When FD = 0:
Currently your docs are failing on the automated testing: https://travis-ci.org/Axelrod-Python/Axelrod/jobs/267887570
This is because of how sphinx formats things. The following will fix it:
Has a variable 'NO' with an initial value of 0

The strategy follows the following algorithm::

    When FD = 0:

        If the opponent's last move (JA) was Cooperate, increase the value of C by 1
        If Score (K) < 1.75 * Move Number (M), play opponent's last move
        If (1.75 * M) <= K < (2.25 * M):

            Calculate Probability P:
            P = 0.25 + C/M - 0.25*S + (K - L)/100 + 4/M
            Where L is the opponent's score so far
            If Random (R) <= P:

                Cooperate

            Else:

                Defect

        If K >= (2.25 * M):

            Calculate probability P:
            P = 0.95 - (AD + NO - 5)/15 + 1/M**2 - J/4
            Where J is the opponent's last move

            If Random (R) <= P:

                Cooperate

            Else:

                Set FD = 1
                Defect

    When FD = 1:

        Set FD = 2
        Set the variable 'AD':
        AD = ((AD * AK) + 3 - (3 * J) + (2 * JA) - (JA * J)) / (AK + 1)
        Where JA is the strategy's last move and J is the opponent's last move (C = 0, D = 1)
        Increase the value of AK by 1
        Cooperate

    When FD = 2:

        Set FD = 0
        Set the variable 'NO':
        NO = ((NO * NK) + 3 - (3 * J) + (2 * JA) - (JA * J)) / (NK + 1)
        Where JA is the strategy's last move and J is the opponent's last move (C = 0, D = 1)
        Increase the value of NK by 1
        Cooperate

Tranquilizer came in 27th place in Axelrod's second tournament.
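As a quick illustration of the algorithm above, the probability used in the middle-score case (1.75 * M <= K < 2.25 * M) can be sketched as a plain function. This is illustrative only; the argument names are hypothetical and are not the PR's code.

```python
def fd0_probability(c, m, s, k, l):
    """P = 0.25 + C/M - 0.25*S + (K - L)/100 + 4/M.

    c: opponent cooperation count, m: move number,
    s: consecutive opponent defections,
    k: own score so far, l: opponent's score so far.
    """
    return 0.25 + c / m - 0.25 * s + (k - l) / 100 + 4 / m
```

Note that the value is not clipped to [0, 1]; early in a match the 4/M term alone can push it above 1, which simply means the strategy is certain to cooperate.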
axelrod/strategies/axelrod_second.py
Names:

- Craig Feathers: [Axelrod1980]_
No need for this: Craig is the name of the author and you've got it in the above docstring.
axelrod/strategies/axelrod_second.py
Has a variable, 'FD' which can be 0, 1 or 2. It has an initial value of 0
Has a variable 'S', which counts the consecutive number of times the opponent has played D (i.e. it is reset to 0 if the opponent plays C). It has an initial value of 0.
Can you reduce these to 80 characters please (PEP8).
axelrod/strategies/axelrod_second.py
if self.FD == 2:
    self.FD = 0
    self.ratio_FD2 = ((self.ratio_FD2 * self.ratio_FD2_count + 3 - 3 * self.dict[opponent.history[-1]]) + 2 * self.dict[
This (and similar lines) are breaking PEP8 (limit to 80 characters) in quite a few places. This is mainly because this is such a mouthful.
Try and fix it by modifying things so that you create the summation in multiple steps perhaps.
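Building the summation in named steps, as suggested, keeps each line under 80 characters and makes the formula easier to review. A minimal hypothetical sketch (names are illustrative, not the PR's actual attributes):

```python
def update_ratio(ratio, count, j, ja):
    """Return (ratio*count + 3 - 3*J + 2*JA - JA*J) / (count + 1),
    where j and ja are the opponent's and own last moves as 0 (C) or 1 (D)."""
    numerator = ratio * count
    numerator += 3 - 3 * j + 2 * ja - ja * j
    return numerator / (count + 1)
```

Each intermediate line stays short, and the running `numerator` makes the grouping of the brackets explicit.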
axelrod/strategies/axelrod_second.py
self.dict = {C: 0, D: 1}

def update_stateFD(self, opponent):  # Calculates the ratioFD values and P values, as well as sets the states of FD at the start of each turn
Take the comment you have at the end of this and put it in a docstring:
def update(self, opponent):
"""
Calculates the ...
"""
@@ -5,6 +5,8 @@
import axelrod
from .test_player import TestPlayer

from axelrod.interaction_utils import compute_final_score
I don't believe you are using this in your tests (so you can delete the line).
Remove unnecessary code. Signed-off-by: Mansour Hakem <[email protected]>
Thanks for styling suggestions @drvinceknight.
axelrod/strategies/axelrod_second.py
    self.ratio_FD2_count += 1
elif self.FD == 1:
    self.FD = 2
    self.ratio_FD1 = ((self.ratio_FD1 * self.ratio_FD1_count)
There's a syntax error here. You have opened two brackets and only closed one.
It's probably always a good idea to run your tests locally (even if you only run the strategy specific ones) to make sure small typos like this don't get in :) (They get picked up by the automated test runners anyway so it's not a big deal but it'll help you get feedback faster :)) 👍
Signed-off-by: Mansour Hakem <[email protected]>
Although the fingerprints are quite similar, they seem a little off on the right side. Do we have a standard in this case for "strategies are equivalent"? I think we need an official policy for the Fortran translations.
The fingerprint is just there as a quick indication that this is worth spending time to review. We do have an official policy for the Fortran strategies which is just the same policy as for other strategies: we need to review the actual code and ensure the logic is correct. @MHakem (a student working with me) has spent quite a while working with this and reverse engineering the strategy so I'm pretty certain it's going to be if not correct - which it seems like it's not quite - very close to correct and just needs one of us to go through with a fine tooth comb. :)
axelrod/strategies/axelrod_second.py
Prisoner's Dilemma" paper: The rule normally cooperates but
is ready to defect if the other player defects too often.
Thus the rule tends to cooperate for the first dozen or two moves
if the other player is cooperating, but then it throws in a
The docs are still breaking, sphinx is super sensitive with blank space :)
Here you need to remove the space before the if.
axelrod/strategies/axelrod_second.py
- Has a variable, 'FD' which can be 0, 1 or 2. It has an initial value of 0
- Has a variable 'S', which counts the consecutive number of
times the opponent has played D (i.e. it is reset to 0 if the opponent
In all of this, when you bullet overhangs it needs to start at the same place:
- Has a variable 'S', which counts the consecutive number of
times the opponent has played D (i.e. it is reset to 0 if the opponent
axelrod/strategies/axelrod_second.py
- Has a variable 'AD' with an initial value of 5
- Has a variable 'NO with an initial value of 0

Has a variable 'NO with an initial value of 0
This needs to be a bullet (so remove the blank line in between this and the previous bullet).
Dictionaries have been implemented to allow for the calculation of the AD and NO values, which both require the numerical values of C and D. current_score = K. I'll investigate the reason behind the disparity with regards to the fingerprints.
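The action-to-number mapping described here can be sketched as follows. This is a minimal illustration; plain strings stand in for the library's C and D action objects, and the variable names are hypothetical.

```python
C, D = "C", "D"  # stand-ins for the library's action objects
action_value = {C: 0, D: 1}  # mirrors self.dict = {C: 0, D: 1} in the PR

# The AD/NO formulae need J and JA as numbers:
j = action_value[D]   # opponent's last move as 0/1
ja = action_value[C]  # own last move as 0/1
payoff_term = 3 - 3 * j + 2 * ja - ja * j  # numerator contribution
```

Keeping the mapping in one dictionary means the formulae can index it directly with the recorded history, rather than branching on the action at every use.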
Nice work, @MHakem!!!
There are some supposed match outcomes for tranquilizer here, may be worth testing (but they may not be valid).
Sorry for taking a while to get to this @MHakem, my plan is to go over things in detail tomorrow morning (at a conference and have some down time tomorrow). 👍
I've definitely found one clear error @MHakem. Of course, there might be others but it's definitely a step in the correct direction.
axelrod/strategies/axelrod_second.py
    return self.history[-1]
else:
    if self.score == "good":
        self.num_turns_after_good_defection = 1
This (the change of state from 0 to 1) took me a while to find. Could it (it might not be worth it so just a suggestion) be moved to inside the update_state method? That way anything relating to updating the state will be in there.
EDIT: I just realized that the state can only update given R <= P, which is not known within the update_state method. Would it be worth setting the value of R and checking within that method?
Ah yes good point. Let's leave it as it is for now. Let's aim to get the code to be right first :) 👍
axelrod/strategies/axelrod_second.py
else:
    Tranquilizer.update_state(self, opponent)
    if opponent.history[-1] == D:
        self.consecutive_defections += 1
Can we call this opponent_consecutive_defections.
axelrod/strategies/axelrod_second.py
if self.consecutive_defections == 0:
    return C
else:
    return self.history[-1]
I don't see where this line comes from in the description at #1103 and/or in the Fortran code. I believe this is causing an error as it allows for the strategy to defect three times in a row.
Here is a particular match against Bully (http://axelrod.readthedocs.io/en/stable/_modules/axelrod/strategies/titfortat.html#Bully) that shows difference of behaviour.
Here is what the Fortran strategy does (note that this happens across all seeds so there's no variation here):
>>> import axelrod as axl
>>> import axelrod_fortran as axlf
>>> fortran_player = axlf.Player("k67r")
>>> bully = axl.Bully()
>>> for seed in range(10):
... axl.seed(seed)
... match = axl.Match((fortran_player, bully), turns=8)
... results = match.play()
... print(list(zip(*results))[0]) # Printing only the actions of `k67r`
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
Note how its last 3 turns are always two Ds followed by a C.
Here is your implementation:
>>> axl_player = axl.Tranquilizer()
>>> bully = axl.Bully()
>>> for seed in range(10):
... axl.seed(seed)
... match = axl.Match((axl_player, bully), turns=8)
... results = match.play()
... print(list(zip(*results))[0])
(C, D, D, C, C, D, D, D)
(C, D, D, C, C, D, D, D)
(C, D, D, C, C, D, D, D)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, D)
(C, D, D, C, C, D, D, D)
(C, D, D, C, C, D, D, C)
(C, D, D, C, C, D, D, D)
(C, D, D, C, C, D, D, C)
We see here that the strategy sometimes defects 3 times in a row (explicitly said to not happen).
"But as long as TRANQUILIZER is maintaining an average payoff of at least 2.25 points per move, it will never defect twice in succession". I may have made a calculation error, but I do believe that the average score per turn is lower than 2.25.
If R is <= P, then the program will return the value of K67R (which holds either 0 or 1). The only other times when the value of K67R is changed is when R >= P, the opponent's last move was a cooperation (i.e. consecutive_defections = 0) or if the score is lower than 1.75 per turn.
From lines 402 to 405, we consider the case of when R <= P as when this is true, it implies that the only possible process that could have actually changed the value of K67R was whether the opponent's last move was a cooperation - if it was, then K67R is changed to 0 and hence return C, if it wasn't then the value of K67R is unaltered and remains the same as the last turn so return self.history[-1].
Thanks for explaining that. It makes sense but as you can see it's not matching up with the results against Bully above. I'm not entirely sure where/why, I suggest you see if you can track it down and I'll do the same.
When this is fixed, let's add the following test for behaviour against Bully:
actions = [(C, D), (D, D), (D, C), (C, C), (C, D), (D, D), (D, C), (C, C)]
self.versus_test(axl.Bully(), expected_actions=actions, seed=0)
If helpful, I was able to identify the above incorrect result using the transitive fingerprints. Here is how they work:

>>> opponents = [s() for s in axl.basic_strategies]
>>> fortran_tf_v_short = axl.TransitiveFingerprint(fortran_player, opponents=opponents)
>>> fortran_tf_v_short.fingerprint()
>>> fortran_tf_v_short.plot(display_names=True);

>>> axl_tf_v_short = axl.TransitiveFingerprint(axl_player, opponents=opponents)
>>> axl_tf_v_short.fingerprint()
>>> axl_tf_v_short.plot(display_names=True);

>>> import matplotlib.pyplot as plt
>>> plt.imshow(fortran_tf_v_short.data - axl_tf_v_short.data)  # manually looking at the difference

(The names don't look great because I'm using a small strategy list but I was still able to make out that something wasn't right.)
@Nikoleta-v3 and I have just finished a big session reading through and debugging and we have found a couple of things wrong (in particular these fix the results against Bully but there are still other things not quite right). Our suggestion is to properly separate the stochastic state and the cooperative state updates.
Let me know if this is unclear.
axelrod/strategies/axelrod_second.py
two_turns_after_good_defection_ratio and the probability values,
as well as sets the value of num_turns_after_good_defection.
"""
self.current_score = compute_final_score(zip(self.history, opponent.history))
Let's remove this from this method and just do it in the strategy (see my comment there).
axelrod/strategies/axelrod_second.py
def strategy(self, opponent: Player) -> Action:

    current_score = compute_final_score(zip(self.history, opponent.history))
Change this to:

self.current_score = compute_final_score(zip(self.history, opponent.history))

and remove the same calculation in update_state (see my comment there).
axelrod/strategies/axelrod_second.py
self.score = "good"
elif (self.current_score[0] / ((len(self.history)) + 1)) >= 1.75:
    self.probability = (
        (.25 + (opponent.cooperations / ((len(self.history)) + 1)))
This actually needs to be opponent.cooperations + 1: the Fortran strategies are passed a dummy initial "cooperation" so the C in that code is in fact opponent.cooperations + 1.
axelrod/strategies/axelrod_second.py
if len(self.history) == 0:
    return C
else:
    Tranquilizer.update_state(self, opponent)
This should be:
self.update_state(opponent)
axelrod/strategies/axelrod_second.py
    return C
else:
    Tranquilizer.update_state(self, opponent)
    if opponent.history[-1] == D:
This is causing a problem with your current code. Your update_state method is also using (for the case when FD is 0) opponent_consecutive_defections, however because of the order in which you have done things this is using the count of consecutive defections from the previous round.
The way I suggest you fix this (and this will require a bit of work) is to leave this where it is but:
- Rename update_state to be update_cooperative_state and ONLY do the cases corresponding to FD==1 and FD==2 there.
- Move the code for the other state (the probabilistic one for the calculation of self.probability) to a method called update_stochastic_state and that can be called after you have updated self.opponent_consecutive_defections.
…ify probability equation and hence modify tests.
I'm suggesting quite a bit of a refactor here @MHakem. I suggest you add the extra test first and then commit that (checking that the tests pass). After you've done that make small piecemeal changes, checking things as you go and committing if you see fit.
Perhaps first though have a read through what I'm suggesting (which I think cleans up the code quite a bit) before you implement it :) Any questions: get in touch! :)
"two_turns_after_good_defection_ratio": 0,
"one_turn_after_good_defection_ratio_count": 1,
"two_turns_after_good_defection_ratio_count": 1})
Can you add:
opponent = axelrod.Bully()
actions = [(D, C), (D, D), (C, D), (C, C), (D, C), (D, D), (C, D), (C, C)]
self.versus_test(opponent, ...
@MHakem can you add this test please.
axelrod/strategies/axelrod_second.py
if len(self.history) == 0:
    return C
else:
There's no need for else, you can just unindent everything (the previous return ends the function there and then).
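The unindent suggested here looks roughly like this. A hypothetical minimal example (not the PR's actual method; strings stand in for the action objects):

```python
def first_action(history):
    # Early return on the first move...
    if len(history) == 0:
        return "C"
    # ...so no else is needed: this line is only reached otherwise.
    return history[-1]
```

The behaviour is identical, but removing the else drops one level of nesting, which also helps with the 80-character line limit discussed above.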
axelrod/strategies/axelrod_second.py
self.current_score = compute_final_score(zip(self.history, opponent.history))

if len(self.history) == 0:
Move this to before the self.current_score (basically on the first move: do nothing at all and just return a C).
axelrod/strategies/axelrod_second.py
    return C
else:
    self.update_cooperative_state(opponent)
    if opponent.history[-1] == D:
Let's move the update of consecutive opponent defections to inside update_cooperative_state.
axelrod/strategies/axelrod_second.py
if len(self.history) == 0:
    return C
else:
    self.update_cooperative_state(opponent)
Before this line, let's go for a:
if self.num_turns_after_good_defection in [1, 2]:
self.update_cooperative_state(opponent)
return C
This is a nice quick and easy to debug way to say that if we're in a cooperative state, update the cooperative state and then return a C. The whole rest of the code can then be cleaned up quite a lot to consider the case of when we're not in a cooperative state.
I think I may have misunderstood something here, but how did you come to reason that given self.num_turns_after_good_defection != 0, the response would be to return C? I found the response to be dependent on the action of the opponent on the last turn (due to the fact that the score may drop below 2.25 and hence allow for double defection).
We're both a bit wrong here.
These lines: https://github.com/Axelrod-Python/TourExec/blob/v0.3.0/src/strategies/k67r.f#L29:

545 K67R = 0
IF (ABS(FD - 1.5) .EQ. .5) GOTO 599

mean that if FD is 1 or 2 (which is what ABS(FD - 1.5) .EQ. .5 is equivalent to) then we go to 599 (which is just a return).
Where I am a bit wrong is that this needs to happen after we update the cooperative states. So: if after updating the cooperative states the state is in [1, 2] then return C:
self.update_cooperative_state(opponent)
if self.num_turns_after_good_defection in [1, 2]:
return C
This is essentially what lines 15 to 29 do in the Fortran code.
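The Fortran guard quoted above can be sanity-checked directly in Python. This tiny sketch (function name is illustrative) just confirms that ABS(FD - 1.5) == 0.5 holds exactly when FD is 1 or 2:

```python
def is_cooperative_state(fd):
    # Mirrors IF (ABS(FD - 1.5) .EQ. .5): for fd in {0, 1, 2}
    # the condition is true only for fd == 1 and fd == 2.
    return abs(fd - 1.5) == 0.5
```

This is why `self.num_turns_after_good_defection in [1, 2]` is a faithful Python translation of that line.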
axelrod/strategies/axelrod_second.py
    self.opponent_consecutive_defections += 1
else:
    self.opponent_consecutive_defections = 0
self.update_stochastic_state(opponent)
We can get rid of update_stochastic_state completely I think and just have the logic for the stochastic state directly here. The rest of the strategy then just becomes (please check carefully):
if (self.current_score[0] / ((len(self.history)) + 1)) >= 2.25:
probability = (
(.95 - (((self.one_turn_after_good_defection_ratio)
+ (self.two_turns_after_good_defection_ratio) - 5) / 15))
+ (1 / (((len(self.history))+1) ** 2))
- (self.dict[opponent.history[-1]] / 4)
)
if random.random() <= probability: # I have plans for changing this a bit but let's take that one step at a time
return C
self.num_turns_after_good_defection = 1
return D
if (self.current_score[0] / ((len(self.history)) + 1)) >= 1.75:
probability = (
(.25 + ((opponent.cooperations + 1) / ((len(self.history)) + 1)))
- (self.opponent_consecutive_defections * .25)
+ ((self.current_score[0]
- self.current_score[1]) / 100)
+ (4 / ((len(self.history)) + 1))
)
if random.random() <= probability:
return C
self.num_turns_after_good_defection = 1
return D
return opponent.history[-1]
Note that this in turn gets rid of a few of the internal variables (self.probability and self.score) so you should change the init method to reflect that.
A few minor tweaks that will require test changes.
Definitely looking good now!
axelrod/strategies/axelrod_second.py
self.two_turns_after_good_defection_ratio = 0
self.one_turn_after_good_defection_ratio_count = 1
self.two_turns_after_good_defection_ratio_count = 1
self.current_score = 0
Let's remove this (it doesn't need to be an attribute and can just be calculated on the fly).
axelrod/strategies/axelrod_second.py
self.dict = {C: 0, D: 1}

def update_cooperative_state(self, opponent):
This now updates more than the cooperative state (it also updates the consecutive defections) so let's go back to calling it:
def update_state
axelrod/strategies/axelrod_second.py
self.update_cooperative_state(opponent)
if self.num_turns_after_good_defection in [1, 2]:
    return C
Let's move self.current_score = compute_final_score(zip(self.history, opponent.history)) to here. I.e. move it to after the code block that returns C when we're in a cooperative state.
axelrod/strategies/axelrod_second.py
)
if random.random() <= probability:
    return C
self.num_turns_after_good_defection = 1
This should not be here. As you can see here: https://github.com/Axelrod-Python/TourExec/blob/v0.3.0/src/strategies/k67r.f#L38 we don't change state if this random test fails.
This will require unit tests to be fixed.
A further request for the change of the docstring.
axelrod/strategies/axelrod_second.py
one-quarter of the time.

- Has a variable, 'FD' which can be 0, 1 or 2. It has an initial value of 0
Can you remove all of this part of the docstring and replace it with something along the lines of:
The strategy starts by cooperating; this strategy has 3 states.
At the start of the strategy it updates its states:
- It counts the number of consecutive defections by the opponent.
- If it was in state 2 it moves to state 0 and calculates the following quantities [INCLUDE THE CALCULATIONS]
- If it was in state 1 it moves to state 2 and calculates the following quantities [INCLUDE THE CALCULATIONS]
If after this it is in state 1 or 2 then it cooperates.
If it is in state 0 it will potentially perform 1 of the 2 following stochastic tests:
1. If [CONDITION ON THE SCORE] then it calculates a value [DETAILS ABOUT CALCULATION OF probability] and will cooperate if a random sampled number is less than that value. If it does not cooperate then the strategy moves to state 1 and defects.
2. If [CONDITION ON THE SCORE] then it calculates a value [DETAILS ABOUT CALCULATION OF probability] and will cooperate if a random sampled number is less than that value. If not, it defects.
If none of the above holds the player simply plays tit for tat.
Getting there @MHakem.
Mainly style changes from me at this point as well as that request for the extra test against Bully (see comment with the code).
"""
Submitted to Axelrod's second tournament by Craig Feathers
Let's start with this:
Description given in Axelrod's "More Effective Choice in the
Prisoner's Dilemma" paper: The rule normally cooperates but
is ready to defect if the other player defects too often.
Thus the rule tends to cooperate for the first dozen or two moves
if the other player is cooperating, but then it throws in a
defection. If the other player continues to cooperate, then defections
become more frequent. But as long as Tranquilizer is maintaining an
average payoff of at least 2.25 points per move, it will never defect
twice in succession and it will not defect more than
one-quarter of the time.
axelrod/strategies/axelrod_second.py
"""
Submitted to Axelrod's second tournament by Craig Feathers

This strategy is based on the reverse engineering of the
This strategy
-> This implementation
axelrod/strategies/axelrod_second.py
- It counts the number of consecutive defections by the opponent.
- If it was in state 2 it moves to state 0 and calculates the
following quantities two_turns_after_good_defection_ratio and
These need two spaces:

- If it was ...
  following...
  two_turns_...

also though include the formulae for two_turns_after_good_defection_ratio and two_turns_after_good_defection_ratio_count.
axelrod/strategies/axelrod_second.py
Outdated
following quantities two_turns_after_good_defection_ratio and
two_turns_after_good_defection_ratio_count.
- If it was in state 1 it moves to state 2 and calculates the
following quantities one_turn_after_good_defection_ratio and
Needs to have two extra spaces (as above). Also as above, include the formulae for the two quantities.
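For reference, the two ratios are maintained as count-weighted running averages (the exact payoff expression folded in lives in the strategy code, so this helper only illustrates the generic shape of the update, not the library's formula):

```python
def update_running_average(average, count, new_value):
    """Fold new_value into an average built from `count` prior observations."""
    return (average * count + new_value) / (count + 1)
```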
axelrod/strategies/axelrod_second.py
Outdated
If none of the above holds the player simply plays tit for tat.

The strategy follows the following algorithm::
Remove all of this algorithm (it's just a repetition of above).
axelrod/strategies/axelrod_second.py
Outdated
def __init__(self):
    super().__init__()
    self.num_turns_after_good_defection = 0 # equal to FD variable in Fortran code
... in original Fortran code
axelrod/strategies/axelrod_second.py
Outdated
def __init__(self):
    super().__init__()
    self.num_turns_after_good_defection = 0 # equal to FD variable in Fortran code
    self.opponent_consecutive_defections = 0
Can you add a similar inline comment to all of these please?
# equal to ...
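A sketch of what the fully-commented attributes might look like. Only the Fortran pairings mentioned in this thread (FD, AD, NO, NK) and the initial values quoted in the diff are used; this is an illustration, not the library's actual code.

```python
class TranquilizerAttributes:
    """Illustrative container for the strategy's bookkeeping attributes."""
    def __init__(self):
        self.num_turns_after_good_defection = 0              # FD in the Fortran code
        self.opponent_consecutive_defections = 0             # opponent's defection streak
        self.one_turn_after_good_defection_ratio = 5         # AD in the Fortran code
        self.two_turns_after_good_defection_ratio = 0        # NO in the Fortran code
        self.one_turn_after_good_defection_ratio_count = 1
        self.two_turns_after_good_defection_ratio_count = 1  # NK in the Fortran code
```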
- current_score[1]) / 100)
+ (4 / ((len(self.history)) + 1))
)
if random.random() <= probability:
A note for whoever reviews this after me: we could be tempted to use random_choice but probability is not really a well-defined probability (it can be greater than 1). I suggest we leave it as it is for the sake of simplicity of the code.
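To illustrate why the raw comparison is still safe: random.random() draws from [0, 1), so a computed value of 1 or more simply makes the branch certain, and a negative value makes it impossible. A minimal sketch (the helper name is made up):

```python
import random

def branch_taken(probability):
    # Mirrors the comparison used in the strategy; always True once
    # probability reaches 1, since random.random() lies in [0, 1).
    return random.random() <= probability

random.seed(0)
assert all(branch_taken(1.5) for _ in range(1000))       # certain when > 1
assert not any(branch_taken(-0.1) for _ in range(1000))  # impossible when < 0
```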
opponent = axelrod.Defector()
actions = [(C, D)] + [(D, D)] * 20
self.versus_test(opponent, expected_actions=actions, attrs={"num_turns_after_good_defection": 0,
We need to fix the PEP8 on all of these. Two options, I have a slight preference for option 2:

Option 1:

    opponent = axelrod.Defector()
    actions = [(C, D)] + [(D, D)] * 20
    self.versus_test(opponent, expected_actions=actions,
                     attrs={"num_turns_after_good_defection": 0,
                            "one_turn_after_good_defection_ratio": 5,
                            "two_turns_after_good_defection_ratio": 0,
                            "one_turn_after_good_defection_ratio_count": 1,
                            "two_turns_after_good_defection_ratio_count": 1})

Option 2:

    opponent = axelrod.Defector()
    actions = [(C, D)] + [(D, D)] * 20
    expected_attrs = {"num_turns_after_good_defection": 0,
                      "one_turn_after_good_defection_ratio": 5,
                      "two_turns_after_good_defection_ratio": 0,
                      "one_turn_after_good_defection_ratio_count": 1,
                      "two_turns_after_good_defection_ratio_count": 1}
    self.versus_test(opponent, expected_actions=actions, attrs=expected_attrs)
"two_turns_after_good_defection_ratio": 0,
"one_turn_after_good_defection_ratio_count": 1,
"two_turns_after_good_defection_ratio_count": 1})
@MHakem can you add this test please.
axelrod/strategies/axelrod_second.py
Outdated
    return C
self.num_turns_after_good_defection = 1
return D
elif (current_score[0] / ((len(self.history)) + 1)) >= 1.75:
This does not need to be an elif: just an if. Also include a blank line to separate things out a bit (just after the return D).
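As a generic illustration of this restructuring (the real branch bodies live in the strategy method, so the functions here are stand-ins): when every branch ends in a return, an elif can be flattened to an if without changing behaviour.

```python
def classify_elif(x):
    if x < 0:
        return "negative"
    elif x >= 10:   # chained, but the chaining adds nothing
        return "large"
    return "small"

def classify_if(x):
    if x < 0:
        return "negative"

    if x >= 10:     # equivalent: the earlier branch always returned
        return "large"
    return "small"

assert all(classify_elif(x) == classify_if(x) for x in range(-5, 15))
```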
For completeness, here are some more fingerprint differences which show agreement to within a percent. This, coupled with careful examination of the implementation, means that I'm now sure this is correctly implemented.
One final thing from me: could you modify the table here: https://github.com/Axelrod-Python/Axelrod/blob/master/docs/reference/overview_of_strategies.rst#axelrods-second-tournament
to note that the strategy is implemented (see how the other implemented strategies are noted).
I believe this is now an accurate implementation of the Fortran strategy, both based on the fingerprints and, more so, because @Nikoleta-v3 and I very carefully examined the original code.
axelrod/strategies/axelrod_second.py
Outdated
Fortran strategy K67R from Axelrod's second tournament.
Reversed engineered by: Owen Campbell, Will Guo and Mansour Hakem.

The strategy starts by cooperating this strategy has 3 states.
The strategy starts by cooperating and has 3 states.
axelrod/strategies/axelrod_second.py
Outdated
The strategy starts by cooperating this strategy has 3 states.

At the start of the strategy it updates it's states:
its rather than it's
axelrod/strategies/axelrod_second.py
Outdated
"""
Calculates the ratio values for the one_turn_after_good_defection_ratio,
two_turns_after_good_defection_ratio and the probability values,
as well as sets the value of num_turns_after_good_defection.
'and' instead of 'as well as'
axelrod/strategies/axelrod_second.py
Outdated
def strategy(self, opponent: Player) -> Action:

    if len(self.history) == 0:
might be neater as if not self.history
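The suggestion relies on empty sequences being falsy in Python, so the two spellings are interchangeable. A quick sketch with a hypothetical helper:

```python
def is_first_turn(history):
    # Same truth value as len(history) == 0: empty sequences are falsy.
    return not history

assert is_first_turn([]) is True
assert is_first_turn(["C", "D"]) is False
```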
Looking good! Just some minor stuff from me - mainly grammatical nit-picking!!
This implements the Tranquilizer strategy as referenced in issue #1103 (one of the strategies in Axelrod's second tournament).
Reverse engineering was a joint contribution with Will Guo.
This strategy is stochastic, so testing that the program is a correct implementation of the Fortran code is not immediate. However, as indicated by @drvinceknight, I have run the fingerprint for the Tranquilizer program:
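Because the strategy is stochastic, correctness checks like the fingerprint are statistical: play many seeded rounds and compare empirical frequencies against a reference. A self-contained sketch of that idea (not the Axelrod fingerprint code itself; the rule and tolerance here are illustrative):

```python
import random

def cooperation_rate(defect_probability, turns, seed):
    """Empirical cooperation rate of a simple stochastic rule under a fixed seed."""
    rng = random.Random(seed)
    plays = ["D" if rng.random() < defect_probability else "C"
             for _ in range(turns)]
    return plays.count("C") / turns

# With enough turns, the observed rate should sit close to the expected value.
rate = cooperation_rate(0.25, turns=10_000, seed=42)
assert abs(rate - 0.75) < 0.05
```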