-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dbs #976
Dbs #976
Conversation
The failed test for Py 3.6 was due to hypothesis, just restarted... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! On the first pass there are some formatting issues, please address those and I'll make another pass with the reference to check all the logic.
axelrod/strategies/dbs.py
Outdated
C, D = Actions.C, Actions.D | ||
|
||
def action_to_int(action): | ||
return (1 if action==C else 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion for readability:
if action == C:
return 1
return 0
Or use a dictionary:
d = {C: 1, D: 0}
return d[action]
axelrod/strategies/dbs.py
Outdated
class DBS(Player): | ||
""" | ||
Desired Belief Strategy as described in: | ||
Accident or Intention: That Is the Question (in the Noisy Iterated Prisoner's Dilemma) by Tsz-Chiu Au and Dana Nau from University of Maryland |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for including the reference. We have a citation file and a specific format we prefer, can you try to convert? We can help...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the full citation is:
T.-C. Au and D. S. Nau. Accident or intention: That is the question (in the iterated prisoner’s dilemma). In Proc. Int. Conf. Auton. Agents and Multiagent Syst. (AAMAS), pp. 561–568, 2006
I guess it becomes [Au2006] in the specific format ? I'll add it to the bibliography.rst file and change the description
axelrod/strategies/dbs.py
Outdated
Accident or Intention: That Is the Question (in the Noisy Iterated Prisoner's Dilemma) by Tsz-Chiu Au and Dana Nau from University of Maryland | ||
http://www.cs.utexas.edu/%7Echiu/papers/Au06NoisyIPD.pdf | ||
|
||
A strategy that learns the opponent's strategy, and use symbolic noise de- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we wrap lines at 80 and not split words with hyphens?
axelrod/strategies/dbs.py
Outdated
Default values for the parameters are the suggested values in the article | ||
When more noise you can try to diminish violation_threshold and rejection_threshold | ||
|
||
Parameters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see our formatting for other strategies for the parameter docs (and thanks for including them)
axelrod/strategies/dbs.py
Outdated
# by the opponent; else G[i]=0 | ||
# F[i] = 1 if cond j was True at turn i-1; else G[i]=0 | ||
# initial hypothesized policy is TitForTat | ||
self.history_by_cond[(C,C)]=([1],[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use PEP8's recommended formatting. For this line it would be:
self.history_by_cond[(C, C)] = ([1], [1])
(and similarly below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I'll review the code with PEP8's rules
axelrod/strategies/dbs.py
Outdated
return MoveGen((self.history[-1],opponent.history[-1]),self.Pi,depth_search_tree=self.tree_depth) | ||
|
||
|
||
# Policy as defined in the article, i.e. a set of (last_move,p) where p is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move the Policy class above the strategy?
axelrod/strategies/dbs.py
Outdated
# Policy as defined in the article, i.e. a set of (last_move,p) where p is | ||
# the probability to cooperate in the next move considering last move | ||
class Policy(dict): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we write a good docstring and move the comments just above into the docstring?
axelrod/strategies/dbs.py
Outdated
# the probability to cooperate in the next move considering last move | ||
class Policy(dict): | ||
|
||
def __init__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the default __init__
in this case so these two lines are unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have deleted the Policy class. Policies are now represented by a simple dictionary, and there is a create_policy methods to instantiate them.
axelrod/strategies/dbs.py
Outdated
@classmethod | ||
def prob_policy(cls,pCC,pCD,pDC,pDD): | ||
pol = cls() | ||
pol[(C,C)]=pCC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pol[(C, C) = pCC
axelrod/strategies/dbs.py
Outdated
return 1 | ||
|
||
# tree search function (minimax search procedure) | ||
def F(begin_node,policy,max_depth): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's give F
a good name. If it's called F
in the reference then let's say so in the doc string
This looks great @edouardArgenson, thanks for the contribution. As well as all of @marcharper's comments this is also failing on coverage (which checks that every line of code in the source files are hit during testing). Currently the following lines are not tested:
If we can assist with writing tests for those please let us know. 👍 |
…method create_policy that returns a dict
@drvinceknight ok I'll look for tests that cover those lines |
Hi @drvinceknight, I added a test that covers the following: But I have trouble writing a test for the last 2: |
@marcharper I've pushed some changes, the code should now respect PEP8 and the formatting issues you've raised |
No problemo :) This is the kind of thing that would work: import axelrod
+import unittest
from .test_player import TestPlayer
C, D = axelrod.Actions.C, axelrod.Actions.D
+class TestNode(unittest.TestCase):
+ """
+ Tests for the base class
+ """
+ node = axelrod.dbs.Node()
+
+ def test_get_siblings(self):
+ with self.assertRaises(NotImplementedError) as context:
+ self.node.get_siblings()
+
+ def test_is_stochastic(self):
+ with self.assertRaises(NotImplementedError) as context:
+ self.node.is_stochastic()
+
+
+ So there I'm suggesting testing the base abstract class directly. I've checked those lines and they bump the coverage up to 100% 👍 |
There's a failure on the documentation. This should be:
The I also noticed:
Could you use |
I made some changes on that and pushed it |
axelrod/strategies/dbs.py
Outdated
Desired Belief Strategy as described in [Au2006]_ | ||
http://www.cs.utexas.edu/%7Echiu/papers/Au06NoisyIPD.pdf | ||
|
||
A strategy that learns the opponent's strategy, and use symbolic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: 'uses' rather than 'use'
axelrod/strategies/dbs.py
Outdated
article. When noise increases you can try to diminish | ||
violation_threshold and rejection_threshold | ||
|
||
Parameters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring format for these parameters isn't quite correct. We use the numpy standard described at https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
e.g. The first parameter description should be:
discount_factor: float
used when computing discounted frequencies to learn opponent's strategy
Must be between 0 and 1. The default is 0.75
axelrod/strategies/dbs.py
Outdated
opposite_action = 1 | ||
k = 1 | ||
count = 0 | ||
# We iterates on the history, while we do not encounter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: 'iterate' rather than 'iterates'
axelrod/strategies/dbs.py
Outdated
k = 1 | ||
count = 0 | ||
# We iterates on the history, while we do not encounter | ||
# counter-exemples of r_plus, i.e. while we do not encounter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: 'example' rather than 'exemple'
axelrod/strategies/dbs.py
Outdated
== opposite_action | ||
and self.history_by_cond[r_plus[0]][1][1:][-k] == 1) | ||
): | ||
# We count every occurence of r_plus in history |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: 'occurrence' rather than 'occurence'
axelrod/strategies/dbs.py
Outdated
def should_demote(self, r_minus, violation_threshold=4): | ||
if(self.violation_counts[r_minus[0]] >= violation_threshold): | ||
return True | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: This could be done in one line with:
return(self.violation_counts[r_minus[0]] >= violation_threshold)
axelrod/strategies/dbs.py
Outdated
def is_stochastic(self): | ||
return False | ||
|
||
def get_value(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: this might be better with a dict mapping a tuple of action1 and action2 to a value:
values = {
(C, C): 3,
(C, D): 0,
(D, C): 5,
(D, D): 1
}
return values[(action1, action2)]
axelrod/strategies/dbs.py
Outdated
|
||
Parameters | ||
---------- | ||
discount_factor : float, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of optional
, can we put in the defaults from init into the docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The defaults from init are currently specified in the description of each parameters. I can change that, but here I followed the numpy standard suggested by meatballs https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
axelrod/strategies/dbs.py
Outdated
where p is the probability to cooperate after prev_move, | ||
where prev_move can be (C, C), (C, D), (D, C) or (D, D) | ||
""" | ||
pol = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine as is but you might consider:
pol = {(C, C): pCC, (C, D): pCD, (D, C): pDC, (D, D): pDD}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's cleaner
axelrod/strategies/dbs.py
Outdated
# The stochastic node value is the expected values of siblings | ||
node_value = ( | ||
begin_node.pC * minimax_tree_search( | ||
siblings[0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent should be <---- (4 spaces deeper than the line above), or:
node_value = (
begin_node.pC * minimax_tree_search(
siblings[0], policy, max_depth),
...
axelrod/strategies/dbs.py
Outdated
max_depth) | ||
) | ||
return node_value | ||
else: # determinist node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deterministic
axelrod/strategies/dbs.py
Outdated
return node_value | ||
else: # determinist node | ||
if begin_node.depth == max_depth: | ||
# this is an end node, we just return its outcome value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For comments please capitalize and use complete sentences when possible
@@ -0,0 +1,108 @@ | |||
"""Tests DBS strategy.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have tests for the tree search functions as well?
axelrod/strategies/dbs.py
Outdated
|
||
# default opponent's policy is TitForTat | ||
self.Rd = create_policy(1, 1, 0, 0) | ||
self.Rc = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the short variables can you explain with a comment what they are? I assume these are from the paper but we don't want anyone reading the code to have to read the paper to understand the algorithm.
axelrod/strategies/dbs.py
Outdated
self.violation_threshold = violation_threshold | ||
self.promotion_threshold = promotion_threshold | ||
self.tree_depth = tree_depth | ||
self.v = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment explaining the the variable v
axelrod/strategies/dbs.py
Outdated
self.history_by_cond[(D, D)] = ([0], [1]) | ||
|
||
def should_promote(self, r_plus, promotion_threshold=3): | ||
if r_plus[1] == C: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function needs a docstring
axelrod/strategies/dbs.py
Outdated
return False | ||
|
||
def should_demote(self, r_minus, violation_threshold=4): | ||
return (self.violation_counts[r_minus[0]] >= violation_threshold) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring
axelrod/strategies/dbs.py
Outdated
def should_demote(self, r_minus, violation_threshold=4): | ||
return (self.violation_counts[r_minus[0]] >= violation_threshold) | ||
|
||
def update_history_by_cond(self, opponent_history): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring
axelrod/strategies/dbs.py
Outdated
F.append(0) | ||
|
||
def compute_prob_rule(self, outcome, alpha): | ||
G = self.history_by_cond[outcome][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring
axelrod/strategies/dbs.py
Outdated
if (self.history_by_cond[r_plus[0]][1][1:][-k] == 1): | ||
count += 1 | ||
k += 1 | ||
if(count >= promotion_threshold): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after if
F = self.history_by_cond[outcome][1] | ||
discounted_g = 0 | ||
discounted_f = 0 | ||
alpha_k = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does alpha_k ever change? Seems redundant below if it is always equal to 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes alpha_k is iterated in the loop:
alpha_k = alpha * alpha_k
axelrod/strategies/dbs.py
Outdated
discounted_f = 0 | ||
alpha_k = 1 | ||
for g,f in zip(G[::-1], F[::-1]): | ||
discounted_g += alpha_k*g |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spaces around operators please
axelrod/strategies/dbs.py
Outdated
super().__init__() | ||
|
||
# default opponent's policy is TitForTat | ||
self.Rd = create_policy(1, 1, 0, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've called policies "four vectors" elsewhere (in strategies/memoryone.py), are they always dictionaries of size 4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can be of size 0, 1, 2, 3 or 4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question: maybe I should put the definition of create_policy inside the DBS class, as it is only used inside the class ?
axelrod/strategies/dbs.py
Outdated
return True | ||
|
||
|
||
class DeterministNode(Node): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename to DeterministicNode
Making good progress -- it needs more comments and more tests, and there are some minor style issues. |
axelrod/strategies/dbs.py
Outdated
violation_threshold and rejection_threshold | ||
|
||
Parameters | ||
---------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests are failing because we don't quite use numpy notation exactly for strategies. In this case we need to remove the heading. So not:
Parameters
----------------
but simply
Parameters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
Thanks, working on all that |
…e to DeterministicNode
…rmats comments and parameters
… and is_stochastic functions. Corrects syntax error when initializing history_by_cond in constructor
@edouardArgenson there is now a merge conflict in the |
merged conflict resolved ;) |
Hello,
I have written code for one of the desired new strategies, (DBS, DesiredBeliefStrategy).
ref: http://www.cs.utexas.edu/%7Echiu/papers/Au06NoisyIPD.pdf
I've followed the algorithm description in the article, and used the same variable names. When some details where not specified I made a few tests to ensure I had a good implementation.
I've checked that it work as expected in some situations, and that it performs good in noisy tournaments as it should.
I've written and run tests, but I don't really know if those are sufficient. Strategy tests, unit tests and integration tests are working, but I get warnings with strategy test.
This is my first contribution to an open source project so I don't really know if I need to check or detail more things, please let me know if I'm missing some points or anything