Classifying Meta Players #343
Yeah, I'm happy with this. 👍 However, I have been thinking about cheaters etc. for a while (since you, @meatballs and I chatted about this). I suggest we don't change the existing classifiers. Perhaps we could have another function that returns True if the strategy obeys the original Axelrod rules?
(I mentioned this to @meatballs earlier; I think I'm coming around to what you were both saying, it's just taken me a little while :))
I like that idea. What the Meta strategies do is a bit different though, since it is dynamic. I would not consider a pre-trained neural network to be a cheater, but if it trains on the fly on the output of other games in the same tournament, it's not as straightforward IMO.
Actually, I've thought about it, and maybe they are not cheaters by the traditional Axelrod rules, because they don't actually know which strategies are in a given tournament, just the pool from which they are drawn. It's a fine line I guess. [Cross-ref #291.]
Cool. I would argue that it is a type of cheater, as it effectively observes the outcomes of other matches.
Do they though? They arguably just have really good recall of how well other strategies would have done so far, no? I'm perhaps forgetting: do the third party players actually play (albeit in a simulated way), so as to modify the (simulated) opponent's behaviour, or do they simply say what they would have done if faced with the opponent's history? If the second, then they're just a clever way of keeping track of things... As far as this issue is concerned, I suggest:
What do you reckon?
Yes, the Meta player does the following, for each member of the team: it passes the opponent's state to that member and records what it would play.
So they are really just passing the opponent's state to a lot of other strategies, not actually invoking the opponent's strategy. For most players (the non-cheaters), this is effectively no different from passing in the opponent's history.

And while it certainly looks like the meta player is observing matches between other players, it could simply aggregate all their strategy methods into a single method (its own). That may require "inspecting" the source of known third party players, but not necessarily that of the actual opponent. This would only really be cheating if the Meta player used the source of an opponent with a previously unknown strategy -- one for which the code is not publicly known. That's not really the case for any of our strategies, but it could be for a "black box" player. Presumably, though, such a player would not be in the pool of strategies that the meta player can draw from. So I'm not sure that a cheating classification applies here.
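The mechanism described above can be sketched without touching the real library: if each "team member" is just a function from the opponent's history to a move, a meta player can poll its team and majority-vote without ever invoking the opponent's own strategy code. (All names here are illustrative, not the axelrod library's actual API.)

```python
# Hypothetical sketch: team members only ever see the opponent's public
# history, so the meta player never runs the opponent's own code.

def tit_for_tat(opp_history):
    # Cooperate first, then mirror the opponent's last move.
    return opp_history[-1] if opp_history else "C"

def always_defect(opp_history):
    return "D"

def meta_majority(team, opp_history):
    # Poll every team member on the same public information
    # (the opponent's history) and play the majority choice.
    votes = [member(opp_history) for member in team]
    return "C" if votes.count("C") >= votes.count("D") else "D"

team = [tit_for_tat, always_defect, tit_for_tat]
print(meta_majority(team, ["C", "D"]))  # prints D
```

This is the sense in which the team members are "advisors": the meta player's own strategy is just the aggregation of theirs, applied to information any player already has.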
I agree, the fact that Meta Hunter has a subteam is really more a case of a well written piece of code than a necessity (the third party strategies are really just tokens of some sort).
Hehe, lol, yeah I think I'm actually coming full circle. My main (recent) thoughts were about the persistent memory strategies: I thought they would be considered cheaters by Axelrod's original rules, so I thought it best to differentiate cheating in terms of the following two things:
So perhaps that's the distinction we need.
I agree -- for the library's current purposes, that seems like the right approach.

Re: persistent memory, I think it's not against Axelrod's rules if it's static. In other words, carrying in some precomputed thing, such as an arbitrarily long string of C and D that one plays from, is ok, because it's no different from a piece of code that generates the same string. Even a (trained) neural network is just a (potentially very complicated) function. As long as the strategy "starts from the same place" against each opponent, I think it's fine. Sharing state between rounds, or updating this string throughout the tournament, seems to be against the rules, as far as I understand them. That's what I understand persistent memory to be.

It gets tricky when you start to think about preserving state between tournaments. How is that different from tweaking the strategy manually each time based on the last tournament's results?
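The "static precomputed memory" point can be made concrete with a toy sketch (names invented for illustration, not library code): a strategy carrying a hard-coded string of moves is indistinguishable from one generating the same moves on demand, since both start from the same place against every opponent.

```python
# A strategy carrying a precomputed sequence of plays...
PRECOMPUTED = "CCDCCDCCD"

def from_table(turn):
    # Look up the move for this turn in a hard-coded string.
    return PRECOMPUTED[turn % len(PRECOMPUTED)]

# ...is no different from code generating the same sequence on the fly:
def from_rule(turn):
    # Defect every third turn, otherwise cooperate.
    return "D" if turn % 3 == 2 else "C"

# Both are plain functions of the turn number, with no shared or
# mutated state between rounds or tournaments.
assert all(from_table(t) == from_rule(t) for t in range(100))
```

The line is crossed only when the stored data is *updated* during or between tournaments, because then the strategy no longer starts from the same place each time.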
Good point re persistent memory. Not sure what you're saying about clever use of Python though.
When I wrote the MetaPlayers my intention was to save some hassle. I wanted to combine two or more strategies to get a better strategy, but did not want to copy the code around. That is all. I find the discussion around cheating more or less orthogonal to what the library was originally meant to do - although that is not a bad thing. Cheaters do things that are unexpected or even unwanted, but only because they can. Another way of looking at this is that there are design flaws in the library. If reproducing the Axelrod tournament is the goal and code tricks are considered a distraction (and I think ultimately they are), there are simple ways to make them impossible.
Yeah, when I got my head around that, I have to say I thought it was really nicely written. I'm not sure they do warrant a new classification: in a sense we have classified them already.
I don't feel that it's necessarily been a big distraction. I think this ongoing discussion is just a bit of a hangover (perhaps on my part) from the move to classifiers in #291 and the attempt to remove the classification responsibilities from the library (all strategies are just in the library; the classification sits alongside them).

I like cheaters, they're fun. I would argue that the library does not have a design flaw at all with regards to that: they're welcome, they just have to be easy to put to one side. Some people have contributed to the library only through cheaters. I think they're a feature.

The discussion in this issue is probably no longer about the Meta players themselves. As such I think there's not much of a problem here:
As discussed in a very, very early PR to the library, I don't like the idea of 'banning' them through the use of a test or otherwise. It would only be a matter of time until someone does something cleverer than our test (perhaps?), so we'd still need to check PRs carefully by hand anyway, and nothing would be gained from it.
#346 is a first attempt at this :)
I no longer think the meta players are cheaters by the Axelrod rules -- if they observed completely third party games then they would be, but as we discussed above, they are really just asking the advice of other players regarding the history of play so far (and not even really that, as @langner explained above). I'm happy with #346, and we can deal with anything else as it comes up.
Cool, I suggest we can close this issue, but will wait in case @langner would like to continue to chat :)
I don't have anything more to add :) except maybe to clarify what I meant with specifics. For example, if we had chosen to pass only the opponent's history to the strategy method, rather than the opponent itself, some of these tricks would not have been possible in the first place.
It is in scope now though, since the library is supposed to be the de facto library for IPD research, and people do study the case where the source code is available. Sorry to open this can of worms again!
Hmmm... yes, good point. In that case I suppose it is. Can you point me to some interesting references?
I think that the MetaPlayers could be considered cheaters by the standard Axelrod rules because they effectively observe the outcomes of third party plays.
I propose a new classifier dimension, something like "observes_others" or "uses_outside_observations".
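As a sketch of what that dimension could look like (the key name, dict layout, and filter function here are all illustrative guesses, not the axelrod library's actual classifier API):

```python
# Hypothetical classifier dicts; "observes_others" is the proposed new key.
strategies = {
    "TitForTat":    {"memory_depth": 1, "observes_others": False},
    "MetaMajority": {"memory_depth": float("inf"), "observes_others": True},
}

def obeys_axelrod_rules(classifier):
    # One possible rule: a strategy is "in bounds" unless it relies
    # on observations of other players' games.
    return not classifier["observes_others"]

# Putting the flagged strategies "to one side" is then a simple filter.
allowed = [name for name, c in strategies.items() if obeys_axelrod_rules(c)]
print(allowed)  # prints ['TitForTat']
```

A filter like this keeps the flagged strategies in the library while making them trivial to exclude from a rules-compliant tournament.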