
Classifying Meta Players #343

Closed · marcharper opened this issue Oct 4, 2015 · 18 comments

@marcharper (Member)

I think that the MetaPlayers could be considered cheaters by the standard Axelrod rules because they effectively observe the outcomes of third party plays.

I propose a new classifier dimension, something like "observes_others" or "uses_outside_observations".
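For concreteness, a hedged sketch of how such a dimension might sit alongside the library's existing classifier keys (only the last key is the new proposal; the others mirror the style the library already uses):

```python
# Sketch only, not the library's actual code: a classifier dict
# extended with the proposed "observes_others" dimension.
classifier = {
    'memory_depth': float('inf'),
    'stochastic': True,
    'inspects_source': False,
    'manipulates_source': False,
    'manipulates_state': False,
    'observes_others': True,  # proposed: uses outcomes of third party plays
}
```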

@drvinceknight (Member)

Yeah, I'm happy with this. 👍

However, I have been thinking about cheaters etc. for a while (since you, @meatballs and I chatted about this). I suggest we don't change the is_cheater definition and keep it strictly to catching stuff that 'messes with code' (as it does currently).

Similarly, I'd suggest the persistent_memory strategies (the ones that use a persistent set of q values, a neural network, etc.) should not return True for is_cheater.

Perhaps we could have another function, obey_axelrod, that returns True if the strategy obeys the original Axelrod rules?
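A minimal sketch of what that could look like, assuming a player.classifier dict like the one sketched above (an illustration, not a final implementation):

```python
def obey_axelrod(player):
    """Return True if the player stays within the original Axelrod rules.

    Sketch only: assumes that 'messing with code' is captured by the
    three classifier keys below.
    """
    banned = ('inspects_source', 'manipulates_source', 'manipulates_state')
    return not any(player.classifier.get(key, False) for key in banned)
```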

@drvinceknight (Member)

(I mentioned this to @meatballs earlier, I think I'm coming around to what you were both saying, it's just taken me a little while :))

@marcharper (Member, Author)

I like obey_axelrod. Generally the literature has not well-explored the possibilities of outside observations being incorporated into a strategy.

What the Meta strategies do is a bit different though since it is dynamic. I would not consider a pre-trained neural network to be a cheater, but if it trains on the fly on the output of other games in the same tournament it's not as straightforward IMO.

@marcharper (Member, Author)

Actually I've thought about it, and maybe they are not cheaters by the traditional Axelrod rules, because they don't actually know which strategies are in a given tournament, just the pool from which they are drawn. It's a fine line I guess.

[Cross-ref #291.]

@drvinceknight (Member)

> I like obey_axelrod. Generally the literature has not well-explored the possibilities of outside observations being incorporated into a strategy.

Cool, obey_axelrod should be simple enough to implement, right?

> I would not consider a pre-trained neural network to be a cheater, but if it trains on the fly on the output of other games in the same tournament it's not as straightforward IMO.

I would argue that it is a type of cheater, as it observes_others. Ultimately I think the fact that this is subjective is why it's better to just define a cheater as something that messes with Python. (Having said that, do we have to rethink Darwin?)

> These strategies also come really close to "inspecting the source" -- while they don't actually read the opponent's source, they definitely use it for a simulation of sorts (against third party players). In other words, the Meta strategies sort of "access the source of opponents and third parties", or if you will, read/copy the opponent's "binary".

Do they though? They arguably just have a really good recall of how well other strategies would have done so far, no? I'm perhaps forgetting: do the third party players actually play (albeit in a simulated way), so as to modify the (simulated) opponent's behaviour, or do they simply say what they would have done if faced with the opponent's history? If the second, then they're just a clever way of keeping track of things...

As far as this issue is concerned, I suggest:

  • Going with the classification dimension you suggest
  • Implementing an obey_axelrod function
  • Putting in the documentation that a cheater, from the point of view of the library, is something that makes use of Python tricks (so the meta and persistent_memory strategies would not be considered cheaters)?

What do you reckon?

@marcharper (Member, Author)

> do they simply say what they would have done if faced with the opponent's history?

Yes, the Meta player does the following, for each member of the team:

team_member.strategy(opponent)

So they are really just passing the opponent's state to a lot of other strategies, not actually invoking the opponent's strategy. For most players (the non-cheaters), this is effectively no different than passing in the opponent's history.
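For illustration, a minimal sketch of that aggregation with a majority-vote rule (the names are illustrative rather than the library's exact API):

```python
def meta_vote(team, opponent):
    # Each team member sees only the opponent object (i.e. its history),
    # exactly as it would in an ordinary match; nothing is re-simulated.
    proposals = [member.strategy(opponent) for member in team]
    # Combine the answers, e.g. by majority vote (defect on ties).
    return 'C' if proposals.count('C') > proposals.count('D') else 'D'
```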

And while it certainly looks like the meta player is observing matches between other players, it could simply aggregate all their strategy methods into a single method (its own). That may require "inspecting" the source of known third party players, but not necessarily that of the actual opponent.

This would only really be cheating if the Meta player used the source of an opponent with a previously unknown strategy -- one for which the code is not publicly known. That isn't the case for any of our strategies, but it could be for a "black box" player; presumably, though, such a player would not be in the pool of strategies that the meta player can draw from. So I'm not sure that observes_others makes sense in this case, and I don't think it breaks the Axelrod rules.

As for obeys_axelrod: right now we basically have obeys_axelrod == not is_cheater. Note that the meta strategies already use is_cheater to filter their teams. So we could just rename is_cheater and avoid, within the library itself, the discussion of whether something cheats or not (which you've advocated in the past).

@drvinceknight (Member)

> And while it certainly looks like the meta player is observing matches between other players, it could simply aggregate all their strategy methods into a single method (its own). That may require "inspecting" the source of known third party players, but not necessarily that of the actual opponent.

I agree: the fact that Meta Hunter has a subteam is really more a case of a well-written piece of code than a necessity (the third party strategies are really tokens of some sort).

> So we could just rename is_cheater and avoid, within the library itself, the discussion of whether something cheats or not (which you've advocated in the past).

Hehe, lol, yeah, I think I'm actually coming full circle. My main (recent) thoughts were about the persistent memory strategies, in that I thought they would be considered cheaters by Axelrod's original rules, so I thought it best to differentiate cheating in terms of the following two things:

  • Clever use of Python.
  • Something that does not make use of a Python trick/shenanigan but would not have been considered OK in Axelrod's original tournament.

So perhaps obey_axelrod need only differ from not is_cheater when they (the persistent memory strategies) come in? Or (as I seem to be alone in thinking they're cheaters), perhaps in a much more objective way, we simply go with your suggestion of renaming is_cheater (I think I like this: we're making no decisions, just going along with Axelrod's original rules, which we should then write down in the docs)...

@marcharper (Member, Author)

I agree -- for the library's current purposes, obeys_axelrod seems good enough, and we can modify it later as needed. Clever use of Python is fine IMO, as long as it's within the rules.

Re: persistent memory, I think it's not against Axelrod's rules if it's static. In other words, carrying in some precomputed thing, such as an arbitrarily long string of C and D that one plays from, is OK, because it's no different from a piece of code that generates the same string. Even a (trained) neural network is just a (potentially very complicated) function. As long as the strategy "starts from the same place" against each opponent, I think it's fine.

Sharing state between rounds, or updating this string throughout the tournament, seems to be against the rules, as far as I understand them. That's what I understand persistent memory to be. It gets tricky when you start to think about preserving state between tournaments. How is that different from tweaking the strategy manually each time based on the last tournament's results?
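To illustrate the line being drawn here, a hedged sketch (all names are hypothetical):

```python
# Static: a fixed, precomputed sequence that resets for every opponent.
# It "starts from the same place" each match, so it seems within the rules.
class PrecomputedSequencePlayer:
    SEQUENCE = ['C', 'C', 'D', 'C']  # baked in before the tournament starts

    def __init__(self):
        self.turn = 0  # fresh state per match

    def strategy(self, opponent):
        # The opponent is ignored: the move comes from the fixed sequence,
        # no different from code that generates the same string.
        move = self.SEQUENCE[self.turn % len(self.SEQUENCE)]
        self.turn += 1
        return move

# Persistent: module-level state updated as the tournament runs, so later
# opponents face a different player than earlier ones -- the questionable case.
shared_q_values = {}
```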

@drvinceknight (Member)

Good point re persistent memory.

Not sure what you're saying about clever use of Python: you might just be saying what we've always said, which is to cheat away but that this needs to be classified properly (that's what I meant by 'clever use of Python', which I now realise is ambiguous).


@langner (Member) commented Oct 5, 2015

When I wrote MetaPlayers my intention was to save some hassle. I wanted to combine two or more strategies to get a better strategy, but did not want to copy the code around. That is all.

I find the discussion around cheating more or less orthogonal to what the library was originally meant to do - although that is not a bad thing. Cheaters do things that are unexpected or even unwanted, but only because they can. Another way of looking at this is that there are design flaws in the library. If reproducing the Axelrod tournament is the goal and code tricks are considered a distraction (and I think ultimately they are), there are simple ways to make them impossible.

@drvinceknight
Copy link
Member

> When I wrote MetaPlayers my intention was to save some hassle. I wanted to combine two or more strategies to get a better strategy, but did not want to copy the code around. That is all.

Yeah, once I got my head around that I have to say I thought it was really nicely written; I'm not sure they do warrant a new classification. We have classified MindReader with 'inspects_source': True ("Finds out what opponent will do") to capture the fact that it observes the future plays of its opponent.

> I find the discussion around cheating more or less orthogonal to what the library was originally meant to do - although that is not a bad thing. Cheaters do things that are unexpected or even unwanted, but only because they can. Another way of looking at this is that there are design flaws in the library. If reproducing the Axelrod tournament is the goal and code tricks are considered a distraction (and I think ultimately they are), there are simple ways to make them impossible.

I don't feel that it's necessarily been a big distraction. I think this ongoing discussion is just a bit of a hangover (perhaps on my part) from the move to classifiers #291 and the attempt to remove the classification responsibilities from the library (all strategies are just in axelrod.strategies although some hopefully helpful tools remain).

I like cheaters, they're fun. I would argue that the library does not have a design flaw at all with regards to that: they're welcome, they just have to be easy to put to one side. Some people have contributed to the library only through cheaters. I think they're a feature.

The discussion in this issue is probably no longer about the hunter strategies but about the proposed persistent_memory strategies. I think that in that particular case I've more or less (slowly) come around to the consensus, which is that they're not against Axelrod's rules (so it's my fault for taking a while :)).

As such I think there's not much of a problem here:

  1. Change is_cheater to obey_axelrod so that there's no ambiguity/subjectivity as to what that means (I think this is a nice outcome from the discussion).
  2. Clarify that in the documentation (ongoing Documentation revamp #299).
  3. Keep the current cheaters in and welcome more if they turn up (we had one at the sprint): they're fun (@langner and Jason's original back and forth with these was fun to watch). As long as they're put to one side they do no harm. Ultimately they meet the third goal of the library: https://github.com/Axelrod-Python/Axelrod#axelrod

As discussed in a very, very early PR to the library, I don't like the idea of 'banning' them through the use of a test or otherwise. It would only be a matter of time until someone did something cleverer than our test (perhaps?), so we'd still need to check through PRs carefully, meaning nothing is gained from that.

@drvinceknight (Member)

#346 is a first attempt at this :)

@marcharper (Member, Author)

I no longer think the meta players are cheaters by the Axelrod rules -- if they observed completely third party games then they would be, but as we discussed above, they are really just asking the advice of other players regarding the history of play so far (and not even really that, as @langner explained above). I'm happy with #346 and we can deal with persistent_memory once something actually lands in the library.

@drvinceknight (Member)

Cool, I suggest we can close this issue but will wait in case @langner would like to continue to chat :)

@langner (Member) commented Oct 5, 2015

I don't have anything more to add :) except maybe to clarify what I meant, with specifics. For example, if we had chosen to pass only the opponent's history to the strategy method, then most of the current "cheating" would not exist and we would not be having these conversations. The classifications recently added would also be simpler, I think. Still the library would reproduce the original tournament and we would be happy... that is the kind of design decision I was referring to. It's not that mind reading and bending aren't fun (hell, I wrote some of those cheaters), it's just that they're irrelevant to what the library is supposed to be.
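A hedged sketch of that alternative design, with a hypothetical signature:

```python
# Sketch of the design alternative: the strategy receives only a list of
# the opponent's past moves, so there is no opponent object whose source
# or state could be inspected or manipulated.
def tit_for_tat(opponent_history):
    return 'C' if not opponent_history else opponent_history[-1]
```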

@marcharper (Member, Author)

It is in scope now though, since the library is supposed to be the de facto library for IPD research, and people do study the case where the source code is available. Sorry to open this can of worms again!

@langner (Member) commented Oct 5, 2015

Hmmm... yes, good point. In that case I suppose it is. Can you point me to some interesting references?
