
Re factor/implement first tournament strategies #1275

Merged · 31 commits · Dec 11, 2019
Changes from 1 commit

Commits
7520eec
Rename second tournament strategies to `SecondBy`
drvinceknight Nov 18, 2019
b5e8310
Modify name of Davis and Feld.
drvinceknight Nov 20, 2019
0bf356c
Add clone argument to Graaskamp.
drvinceknight Nov 20, 2019
8c59778
First first by Grofman.
drvinceknight Nov 20, 2019
d4e1552
Modify docstring and name of Joss.
drvinceknight Nov 20, 2019
33c903f
Fix Nydegger.
drvinceknight Nov 20, 2019
aa417a1
Rename Shubik but make notes for further investigation.
drvinceknight Nov 20, 2019
3dc60fb
Rename and fix Tullock.
drvinceknight Nov 20, 2019
75ee7c5
Rename Unnamed strategy.
drvinceknight Nov 21, 2019
b54c272
Rename Stein and Rapoport.
drvinceknight Nov 21, 2019
061e7f2
Add a note to Graaskamp.
drvinceknight Nov 21, 2019
0dde6e2
Revise Tideman and Cheruzzi.
drvinceknight Nov 21, 2019
eccd7e2
Re-implement Downing.
drvinceknight Nov 28, 2019
dc7a710
Modify Shubik.
drvinceknight Nov 30, 2019
659d997
Adjust tests.
drvinceknight Nov 30, 2019
ce1d6e6
Adjust doctests.
drvinceknight Nov 30, 2019
6e6a573
Make minor modifications suggested by @marcharper.
drvinceknight Dec 2, 2019
d8cee8d
Address comments from Owen.
drvinceknight Dec 2, 2019
f4e162b
Address comments from @marcharper (RevisedDowning)
drvinceknight Dec 3, 2019
ecbd9c0
Update docs.
drvinceknight Dec 3, 2019
0c31588
Add a list with all First strategies.
drvinceknight Dec 3, 2019
e6b9349
Fix a typo.
drvinceknight Dec 3, 2019
e7cf5ad
Fix failing test.
drvinceknight Dec 3, 2019
a6471e8
Rename SecondByDowning -> RevisedDowning.
drvinceknight Dec 5, 2019
6c46483
Add docstring about alpha=beta=1/2 in 1st 2 rounds.
drvinceknight Dec 5, 2019
e396646
Correct name of RevisedDowning in docs.
drvinceknight Dec 5, 2019
b9b00a2
Write tutorial using Axelrods first strategies.
drvinceknight Dec 5, 2019
49e8983
Run black on plotting script.
drvinceknight Dec 5, 2019
d24c304
Address @nikoleta-v3's comments.
drvinceknight Dec 5, 2019
468a829
Move test file to correct location.
drvinceknight Dec 5, 2019
36e82a2
Documentation modification.
drvinceknight Dec 5, 2019
4 changes: 2 additions & 2 deletions axelrod/strategies/_strategies.py
@@ -12,7 +12,7 @@
FirstByGrofman,
FirstByJoss,
FirstByNydegger,
RevisedDowning,
FirstByDowning,
FirstByShubik,
FirstBySteinAndRapoport,
FirstByTidemanAndChieruzzi,
@@ -397,7 +397,7 @@
Retaliate,
Retaliate2,
Retaliate3,
RevisedDowning,
FirstByDowning,
SecondByRichardHufford,
Ripoff,
RiskyQLearner,
208 changes: 140 additions & 68 deletions axelrod/strategies/axelrod_first.py
@@ -73,107 +73,181 @@ def strategy(self, opponent: Player) -> Action:
return D
return C

# TODO Split this in to ttwo strategies, it's not clear to me from the internet
# sources that the first implentation was buggy as opposed to just "poorly
# thought out". The flaw is actually clearly described in the paper's
# description: "Initially, they are both assumed to be .5, which amounts to the
# pessimistic assumption that the other player is not responsive"
# The revised version should be put in it's own module.
# I also do not understand where the decision rules come from.
# Need to read https://journals.sagepub.com/doi/10.1177/003755007500600402 to
# gain understanding of decision rule.
class RevisedDowning(Player):
class FirstByDowning(Player):
Member:
Since RevisedDowning is in the second tournament, can we also preserve that implementation and/or compare to the fortran implementation?

Member Author:
Sorry I meant to write that in my PR: we need to consider what we do with RevisedDowning. As it's in the second tournament is it worth waiting until we translate https://github.com/Axelrod-Python/TourExec/blob/v0.2.0/src/strategies/K59R.f or implement RevisedDowning as a modification of the strategy in this PR?

Member:
The current RevisedDowning looks really similar to the Fortran code. If they are basically the same I'm in favor of leaving RevisedDowning (eliminating the revised bool) as the second tournament implementation. Maybe comparing the fingerprints will tell us if it's essentially correct or not?

Member Author:
Yeah good call.

I'm struggling to get axelrod_fortran to work on my current machine (I blame an OS update), could you or @meatballs if you get time paste fingerprints for "k59r".

Something like:

import axelrod as axl
import axelrod_fortran as axlf

downing = axlf.Player("k59r")
ashlock_fp = axl.AshlockFingerprint(strategy=downing)
data = ashlock_fp.fingerprint()  # This will take a little while
p = ashlock_fp.plot()
p.savefig("k59r_ashlock_fingerprint.png")

transitive_fp = axl.TransitiveFingerprint(strategy=downing)
data = transitive_fp.fingerprint() 
p = transitive_fp.plot()
p.savefig("k59r_transitive_fingerprint.png")

Here are the equivalents for RevisedDowning:

Ashlock:

(image: downing_ashlock_fingerprint)

Transitive:

(image: downing_transitive_fingerprint)

@marcharper (Member), Dec 3, 2019:
They are really similar but not 100% identical

(image: transitive)

(image: ashlock)

Member Author:
Gosh they're incredibly similar though. I'm happy and, looking at the history of RevisedDowning, you implemented it from the Fortran code and I suspect you did it right.

Member:
Let's keep it and then tweak it in a follow-up PR

Member Author:
Fine by me. 👍

"""
Submitted to Axelrod's first tournament by Downing

The description written in [Axelrod1980]_ is:

> "This rule selects its choice to maximize its own long- term expected payoff on
> "This rule selects its choice to maximize its own longterm expected payoff on
> the assumption that the other rule cooperates with a fixed probability which
> depends only on whether the other player cooperated or defected on the previous
> move. These two probabilities estimates are con- tinuously updated as the game
> move. These two probabilities estimates are continuously updated as the game
> progresses. Initially, they are both assumed to be .5, which amounts to the
> pessimistic assumption that the other player is not responsive. This rule is
> based on an outcome maximization interpretation of human performances proposed
> by Downing (1975)."

This strategy attempts to estimate the next move of the opponent by estimating
the probability of cooperating given that they defected (:math:`p(C|D)`) or
cooperated on the previous round (:math:`p(C|C)`). These probabilities are
continuously updated during play and the strategy attempts to maximise the long
term play. Note that the initial values are :math:`p(C|C)=p(C|D)=.5`.
The Downing (1975) paper is "The Prisoner's Dilemma Game as a
Problem-Solving Phenomenon" [Downing1975]_ and this is used to implement the
strategy.

# TODO: This paragraph is not correct (see note above)
Downing is implemented as `RevisedDowning`. Apparently in the first tournament
the strategy was implemented incorrectly and defected on the first two rounds.
This can be controlled by setting `revised=True` to prevent the initial defections.
There are a number of specific points in this paper, on page 371:

This strategy came 10th in Axelrod's original tournament but would have won
if it had been implemented correctly.
> "[...] In these strategies, O's [the opponent's] response on trial N is in
some way dependent or contingent on S's [the subject's] response on trial N-
1. All varieties of these lag-one matching strategies can be defined by two
parameters: the conditional probability that O will choose C folloging C by
marcharper marked this conversation as resolved.
Show resolved Hide resolved
S, P(C_o | C_s) and the conditional probability that O will choose C
following D by S, P(C_o, D_s)."

Throughout the paper the strategy (S) assumes that the opponent (O) is
playing a reactive strategy defined by these two conditional probabilities.

Member:
I believe you meant (O)

The strategy aims to maximise the long run utility against such a strategy
and the mechanism for this is described in Appendix A (more on this later).

One final point from the main text is, on page 372:

> "For the various lag-one matching strategies of O, the maximizing
strategies of S will be 100% C, or 100% D, or for some strategies all S
strategies will be functionaly equivalent."
marcharper marked this conversation as resolved.
Show resolved Hide resolved

This implies that the strategy S will either always cooperate or always
defect (or be indifferent) depending on the opponent's defining
probabilities.

To understand the particular mechanism that describes the strategy S, we
refer to Appendix A of the paper on page 389.

The stated goal of the strategy is to maximize (using the notation of the
paper):

EV_TOT = #CC(EV_CC) + #CD(EV_CD) + #DC(EV_DC) + #DD(EV_DD)

I.e. the player aims to maximise the expected value of being in each state
weighted by the number of times we expect to be in that state.

@Nikoleta-v3 (Member), Dec 4, 2019:
After our conversation today I feel (not sure) that this might need to be re-written. #CC is not a state but the number of times the strategy S cooperated twice...

Member Author:
Yeah I agree, thanks @Nikoleta-v3 I'll work on this tomorrow (long day!).

On the second page of the appendix, figure 4 (page 390) supposedly
identifies an expression for EV_TOT, however it is not clear how some of the
steps are carried out. As a best guess, it seems like an asymptotic
argument is being used. Furthermore, a specific term is made to disappear in
the case of T - R = P - S (which is not the case for the standard
(R, P, S, T) = (3, 1, 0, 5)):

> "Where (t - r) = (p - s), EV_TOT will be a function of alpha, beta, t, r,
p, s and N are known and V which is unknown.

V is the total number of cooperations of the player S (this is noted earlier
in the abstract) and as such the final expression (with only V as unknown)
can be used to decide if V should indicate that S always cooperates or not.

Given the lack of usable details in this paper, the following interpretation
is used to implement this strategy:

1. On any given turn, the strategy will estimate alpha = P(C_o | C_s) and
beta = P(C_o | D_s).
2. The strategy will calculate the expected utility of always playing C or
always playing D against the estimated probabilities. This corresponds to:

a. In the case of the player always cooperating:

P_CC = alpha and P_CD = 1 - alpha

b. In the case of the player always defecting:

P_DC = beta and P_DD = 1 - beta


Using this we have:

E_C = alpha R + (1 - alpha) S
E_D = beta T + (1 - beta) P

Thus at every turn, the strategy will calculate those two values and
cooperate if E_C > E_D and will defect if E_C < E_D.

In the case of E_C = E_D, the player will alternate from their previous
move. This is based on a specific sentence from Axelrod's original paper:

> "Under certain circumstances, DOWNING will even determine that the best
> strategy is to alternate cooperation and defection."
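
For concreteness, a small worked check of this decision rule with the standard
payoffs (the alpha and beta values below are made up purely for illustration):

R, P, S, T = 3, 1, 0, 5
alpha, beta = 0.8, 0.3  # hypothetical estimates of P(C_o | C_s) and P(C_o | D_s)
E_C = alpha * R + (1 - alpha) * S  # 0.8 * 3 + 0.2 * 0 = 2.4
E_D = beta * T + (1 - beta) * P    # 0.3 * 5 + 0.7 * 1 = 2.2
# E_C > E_D, so on this turn the strategy would cooperate.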

One final important point is the early game behaviour of the strategy. It
has been noted that this strategy was implemented in a way that assumed that
alpha and beta were both 1/2:

> "Initially, they are both assumed to be .5, which amounts to the
> pessimistic assumption that the other player is not responsive."

Member:
Is this the "bug" that is later referred to? If so can we explicitly mention that here in the docstring?

Member Author:
So it wasn't actually a "bug", Axelrod discusses it at length in the paper so I'm relatively sure (I wouldn't bet my house on it) that the strategy was implemented as intended. However it seems that it was accepted afterwards that the intent was a mistake. I'll add that in future tournaments this strategy was implemented with a modified initial behaviour.

Thus, the player defects on the first two rounds. Note that
from the Axelrod publications alone there is nothing to indicate defections
on the first two rounds, although a defection in the opening round is clear.
However, there is a presentation available at
http://www.sci.brooklyn.cuny.edu/~sklar/teaching/f05/alife/notes/azhar-ipd-Oct19th.pdf
that clearly states that Downing defected in the first two rounds, thus this
is assumed to be the behaviour.

Note that the response to the first round allows us to estimate
beta = P(C_o | D_s), and we use the opening play of the opponent to
estimate alpha = P(C_o | C_s). This is an assumption with no clear
indication from the literature.

--
This strategy came 10th in Axelrod's original tournament.

Names:

- Revised Downing: [Axelrod1980]_
"""

name = "Revised Downing"
name = "First tournament by Downing"

classifier = {
"memory_depth": float("inf"),
"stochastic": False,
"makes_use_of": set(),
"makes_use_of": {"game"},
"long_run_time": False,
"inspects_source": False,
"manipulates_source": False,
"manipulates_state": False,
}

def __init__(self, revised: bool = True) -> None:
def __init__(self) -> None:
super().__init__()
self.revised = revised
self.good = 1.0
self.bad = 0.0
self.nice1 = 0
self.nice2 = 0
self.total_C = 0 # note the same as self.cooperations
self.total_D = 0 # note the same as self.defections
self.number_opponent_cooperations_in_response_to_C = 0
Member:
I think we track this in the state distributions in the history class, but perhaps you want to be more explicit here

Member Author:
I'd completely forgotten about our new shiny history class!

In this instance though I don't think we do track this:

>>> import axelrod as axl

>>> players = (axl.CyclerCCD(), axl.CyclerDC())
>>> match = axl.Match(players)
>>> interactions = match.play()
>>> players[0].history.state_distribution
Counter({(C, D): 67, (C, C): 67, (D, D): 33, (D, C): 33})

Unless I'm mistaken that's returning the per-turn state distribution, however here I need to count the responses to C.

I've looked through the other properties and don't see what we need in there but that could be something nice to add?

Member:
Gotcha, let's leave it as is for now and we can consider adding something to the history class later.
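
For illustration only (not library code, the helper name is hypothetical), counting an
opponent's cooperations in response to the player's previous move could be done
directly from the two histories:

import axelrod as axl
from collections import Counter

C, D = axl.Action.C, axl.Action.D

def count_responses(player_history, opponent_history):
    # Key each opponent reply by the player's move on the previous turn.
    counts = Counter()
    for own_previous, reply in zip(player_history[:-1], opponent_history[1:]):
        counts[(own_previous, reply)] += 1
    return counts

# count_responses([C, D, C], [C, C, D])[(C, C)] == 1: the opponent's round-2
# cooperation is counted as a response to the player's round-1 cooperation.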

self.number_opponent_cooperations_in_response_to_D = 0

def strategy(self, opponent: Player) -> Action:
round_number = len(self.history) + 1
# According to internet sources, the original implementation defected
# on the first two moves. Otherwise it wins (if this code is removed
# and the comment restored.
# http://www.sci.brooklyn.cuny.edu/~sklar/teaching/f05/alife/notes/azhar-ipd-Oct19th.pdf

if self.revised:
if round_number == 1:
return C
elif not self.revised:
if round_number <= 2:
return D

# Update various counts
if round_number > 2:
if self.history[-1] == D:
if opponent.history[-1] == C:
self.nice2 += 1
self.total_D += 1
self.bad = self.nice2 / self.total_D
else:
if opponent.history[-1] == C:
self.nice1 += 1
self.total_C += 1
self.good = self.nice1 / self.total_C
# Make a decision based on the accrued counts
c = 6.0 * self.good - 8.0 * self.bad - 2
alt = 4.0 * self.good - 5.0 * self.bad - 1
if c >= 0 and c >= alt:
move = C
elif (c >= 0 and c < alt) or (alt >= 0):
move = self.history[-1].flip()
else:
move = D
return move
if round_number == 1:
return D
if round_number == 2:
if opponent.history[-1] == C:
self.number_opponent_cooperations_in_response_to_C += 1
Member:
Seems like this is a response to a D since this strategy always defects on round 1.

Member Author:
Ah no that's not what's happening here but I understand the confusion so I'm happy to improve how this is written.

What is happening:

I assume that the player is using these first two turns to build up an early estimate of alpha (P(C|C) - probability the opponent cooperates in turn k + 1 if the player cooperated in turn k) and beta (P(C|D) - probability the opponent cooperates in turn k + 1 if the player defected in turn k).

I use the opening play of the opponent (k = 1) to estimate alpha - i.e. I assume the opponent plays their opening round as if the player had cooperated in round k = 0 (a round that does not exist).

I use the second play of the opponent (k=2) to estimate beta as the player defects in round k=1.

The player does also defect in round k=2.

Thus, if round_number == 2: and if opponent.history[-1] == C: then the opponent cooperated as their opening move and so self.number_opponent_cooperations_in_response_to_C += 1

Member:
I see, thanks!

return D


if self.history[-2] == C and opponent.history[-1] == C:
self.number_opponent_cooperations_in_response_to_C += 1
if self.history[-2] == D and opponent.history[-1] == C:
self.number_opponent_cooperations_in_response_to_D += 1

alpha = (self.number_opponent_cooperations_in_response_to_C /
(self.cooperations + 1)) # Adding 1 to count for opening move
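# The +1 treats the opponent's opening move as a response to a "phantom" cooperation in round 0 (see discussion below).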
Member:
I'm confused again here -- isn't the first move always a defection? Or is this a bug in the original implementation that we're preserving?

Member Author:
I understand the confusion, this can probably be improved.

The adjustment here is to account for the "phantom" non-existent round-0 cooperation by the player that is used to estimate alpha.

So for example, if the plays are:

[(D, C), (D, C)]

Then the opponent's first cooperation counts as a cooperation in response to the non-existent cooperation of round 0. The total number of cooperations in response to a cooperation is 1. We need to take into account that extra phantom cooperation to estimate the probability alpha=P(C|C) as 1 / 1 = 1.

This is all a best guess of course.

@marcharper (Member), Dec 3, 2019:
Ok, makes sense. Let's comment this well in the code.

beta = (self.number_opponent_cooperations_in_response_to_D /
Member:
Maybe add a comment saying that we're never dividing by zero here

(self.defections))
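# Note: self.defections is at least 2 by this point since the strategy defects
# on the first two rounds, so this never divides by zero.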

R, P, S, T = self.match_attributes["game"].RPST()
expected_value_of_cooperating = alpha * R + (1 - alpha) * S
expected_value_of_defecting = beta * T + (1 - beta) * P

if expected_value_of_cooperating > expected_value_of_defecting:
return C
if expected_value_of_cooperating < expected_value_of_defecting:
return D
return self.history[-1].flip()
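
As a quick, illustrative way to see the renamed strategy's behaviour (a sketch,
not part of the diff), one could run a short match against Tit For Tat:

import axelrod as axl

players = (axl.FirstByDowning(), axl.TitForTat())
match = axl.Match(players, turns=6)
print(match.play())  # FirstByDowning's first two plays are defections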


class FirstByFeld(Player):
@@ -278,8 +352,6 @@ class FirstByGraaskamp(Player):
so it plays Tit For Tat. If not it cooperates and randomly defects every 5
to 15 moves.

# TODO Compare this to Fortran code.

Note that there is no information about 'Analogy' available thus Step 5 is
a "best possible" interpretation of the description in the paper.

27 changes: 10 additions & 17 deletions axelrod/tests/strategies/test_axelrod_first.py
@@ -43,50 +43,43 @@ def test_strategy(self):
self.versus_test(opponent, expected_actions=actions)


class TestRevisedDowning(TestPlayer):
class TestFirstByDowning(TestPlayer):

name = "Revised Downing: True"
player = axelrod.RevisedDowning
name = "First tournament by Downing"
player = axelrod.FirstByDowning
expected_classifier = {
"memory_depth": float("inf"),
"stochastic": False,
"makes_use_of": set(),
"makes_use_of": {"game"},
"long_run_time": False,
"inspects_source": False,
"manipulates_source": False,
"manipulates_state": False,
}

def test_strategy(self):
actions = [(C, C), (C, C), (C, C)]
actions = [(D, C), (D, C), (C, C)]
self.versus_test(axelrod.Cooperator(), expected_actions=actions)

actions = [(C, D), (C, D), (D, D)]
actions = [(D, D), (D, D), (D, D)]
self.versus_test(axelrod.Defector(), expected_actions=actions)

opponent = axelrod.MockPlayer(actions=[D, C, C])
actions = [(C, D), (C, C), (C, C), (C, D)]
actions = [(D, D), (D, C), (D, C), (D, D)]
self.versus_test(opponent, expected_actions=actions)

opponent = axelrod.MockPlayer(actions=[D, D, C])
actions = [(C, D), (C, D), (D, C), (D, D)]
actions = [(D, D), (D, D), (D, C), (D, D)]
self.versus_test(opponent, expected_actions=actions)

opponent = axelrod.MockPlayer(actions=[C, C, D, D, C, C])
actions = [(C, C), (C, C), (C, D), (C, D), (D, C), (D, C), (D, C)]
actions = [(D, C), (D, C), (C, D), (D, D), (D, C), (D, C), (D, C)]
self.versus_test(opponent, expected_actions=actions)

opponent = axelrod.MockPlayer(actions=[C, C, C, C, D, D])
actions = [(C, C), (C, C), (C, C), (C, C), (C, D), (C, D), (C, C)]
actions = [(D, C), (D, C), (C, C), (D, C), (D, D), (C, D), (D, C)]
self.versus_test(opponent, expected_actions=actions)

def test_not_revised(self):
# Test not revised
player = self.player(revised=False)
opponent = axelrod.Cooperator()
match = axelrod.Match((player, opponent), turns=2)
self.assertEqual(match.play(), [(D, C), (D, C)])


class TestFristByFeld(TestPlayer):

1 change: 1 addition & 0 deletions docs/reference/bibliography.rst
@@ -25,6 +25,7 @@ documentation.
.. [Bendor1993] Bendor, Jonathan. "Uncertainty and the Evolution of Cooperation." The Journal of Conflict Resolution, 37(4), 709–734.
.. [Beaufils1997] Beaufils, B. and Delahaye, J. (1997). Our Meeting With Gradual: A Good Strategy For The Iterated Prisoner’s Dilemma. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.4041
.. [Berg2015] Berg, P. Van Den, & Weissing, F. J. (2015). The importance of mechanisms for the evolution of cooperation. Proceedings of the Royal Society B-Biological Sciences, 282.
.. [Downing1975] Downing, Leslie L. "The Prisoner's Dilemma game as a problem-solving phenomenon: An outcome maximization interpretation." Simulation & Games 6.4 (1975): 366-391.
.. [Eckhart2015] Eckhart Arnold (2016) CoopSim v0.9.9 beta 6. https://github.com/jecki/CoopSim/
.. [Frean1994] Frean, Marcus R. "The Prisoner's Dilemma without Synchrony." Proceedings: Biological Sciences, vol. 257, no. 1348, 1994, pp. 75–79. www.jstor.org/stable/50253.
.. [Harper2017] Harper, M., Knight, V., Jones, M., Koutsovoulos, G., Glynatsi, N. E., & Campbell, O. (2017) Reinforcement learning produces dominant strategies for the Iterated Prisoner’s Dilemma. PloS one. https://doi.org/10.1371/journal.pone.0188046
2 changes: 1 addition & 1 deletion docs/reference/overview_of_strategies.rst
@@ -25,7 +25,7 @@ An indication is given as to whether or not this strategy is implemented in the
"Grudger", "James W Friedman", ":class:`Grudger <axelrod.strategies.grudger.Grudger>`"
"Davis", "Morton Davis", ":class:`Davis <axelrod.strategies.axelrod_first.FirstByDavis>`"
"Graaskamp", "Jim Graaskamp", ":class:`Graaskamp <axelrod.strategies.axelrod_first.FirstByGraaskamp>`"
"Downing", "Leslie Downing", ":class:`RevisedDowning <axelrod.strategies.axelrod_first.RevisedDowning>`"
"FirstByDowning", "Leslie Downing", ":class:`RevisedDowning <axelrod.strategies.axelrod_first.FirstByDowning>`"
"Feld", "Scott Feld", ":class:`Feld <axelrod.strategies.axelrod_first.FirstByFeld>`"
"Joss", "Johann Joss", ":class:`Joss <axelrod.strategies.axelrod_first.FirstByJoss>`"
"Tullock", "Gordon Tullock", ":class:`Tullock <axelrod.strategies.axelrod_first.FirstByTullock>`"