google-deepmind · AussieSeaweed · Dec 7, 2024
diff --git a/docs/games.md b/docs/games.md
@@ -57,6 +57,7 @@ Status                                                                 | Game
 🟢                                                                      | Mean Field Game: linear-quadratic                                                              | n/a     | ❌              | ✅            | Players are uniformly distributed and are then incentivized to gather at the same point (The lower the distanbce wrt. the distribution mean position, the higher the reward). A mean-reverting term pushes the players towards the distribution, a gaussian noise term perturbs them. The players' actions alter their states linearly (alpha * a * dt) and the cost thereof is quadratic (K * a^2 * dt), hence the name. There exists an exact, closed form solution for the fully continuous version of this game. References: [Perrin & al. 2019](https://arxiv.org/abs/2007.03458).
 🟢                                                                      | Mean Field Game: predator prey                                                                 | n/a     | n/a            | n/a          | References: [Scaling up Mean Field Games with Online Mirror Descent](https://arxiv.org/abs/2103.00623), [Scalable Deep Reinforcement Learning Algorithms for Mean Field Games](https://arxiv.org/abs/2203.11973), [Learning in Mean Field Games: A Survey](https://arxiv.org/abs/2205.12944).
 🟢                                                                      | Mean Field Game: routing                                                                       | n/a     | ❌              | ✅            | Representative player chooses at each node where they go. They has an origin, a destination and a departure time and chooses their route to minimize their travel time. Time spent on each link is a function of the distribution of players on the link when the player reaches the link. References: [Cabannes et. al. '21, Solving N-player dynamic routing games with congestion: a mean field approach](https://arxiv.org/pdf/2110.11943.pdf).
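+🔶                                                                      | [m,n,k-game](https://en.wikipedia.org/wiki/M,n,k-game)                                         | 2       | ✅              | ✅            | Two players alternately place tokens on an m-by-n board; the first to form an unbroken line of k of their own tokens (horizontal, vertical, or diagonal) wins. If the board fills up first, the game is a draw.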
 🔶                                                                      | [Morpion Solitaire (4D)](https://en.wikipedia.org/wiki/Join_five)                              | 1       | ✅              | ✅            | A single player game where player aims to maximize lines drawn on a grid, under certain limitations.
 🟢                                                                      | Negotiation                                                                                    | 2       | ❌              | ❌            | Agents with different utilities must negotiate an allocation of resources. References: [Lewis et al. '17](https://arxiv.org/abs/1706.05125). [Cao et al. '18](https://arxiv.org/abs/1804.03980).
 🔶                                                                      | [Nim](https://en.wikipedia.org/wiki/Nim)                                                       | 2       | ✅              | ✅            | Two agents take objects from distinct piles trying to either avoid taking the last one or take it. Any positive number of objects can be taken on each turn given they all come from the same pile.

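The rule described in the new `docs/games.md` row can be illustrated with a small standalone check. This is only a sketch under stated assumptions, not the code added by this change: the `Cell` enum, the `Board` alias, and the helper names `RunFrom` and `HasKInARow` are hypothetical and exist only for this example. It scans four directions instead of eight, on the reasoning that a line found going left or up is the same line found from its other end.

```cpp
#include <cassert>
#include <vector>

// Hypothetical cell encoding for this sketch only.
enum class Cell { kEmpty, kCross, kNought };

using Board = std::vector<std::vector<Cell>>;

// Returns true if `who` owns k consecutive cells starting at (r, c) and
// stepping by (dr, dc). Stops at the first mismatch or board edge.
bool RunFrom(const Board& board, Cell who, int k, int r, int c, int dr,
             int dc) {
  const int rows = static_cast<int>(board.size());
  for (int i = 0; i < k; ++i, r += dr, c += dc) {
    if (r < 0 || r >= rows) return false;
    if (c < 0 || c >= static_cast<int>(board[r].size())) return false;
    if (board[r][c] != who) return false;
  }
  return true;
}

// Returns true if `who` has k in a row anywhere on the board. Only four
// directions are scanned: right, down, down-right, and down-left.
bool HasKInARow(const Board& board, Cell who, int k) {
  assert(k > 0);
  const int kDirs[4][2] = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};
  for (int r = 0; r < static_cast<int>(board.size()); ++r) {
    for (int c = 0; c < static_cast<int>(board[r].size()); ++c) {
      for (const auto& d : kDirs) {
        if (RunFrom(board, who, k, r, c, d[0], d[1])) return true;
      }
    }
  }
  return false;
}
```

A full-board scan like this costs on the order of m * n * k cell reads per call. Checking only the lines through the most recently placed token would be cheaper, at the price of slightly more bookkeeping.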
diff --git a/open_spiel/games/CMakeLists.txt b/open_spiel/games/CMakeLists.txt
@@ -120,6 +120,8 @@ set(GAME_SOURCES
   mfg/dynamic_routing.h
   mfg/garnet.cc
   mfg/garnet.h
+  mnk/mnk.cc
+  mnk/mnk.h
   morpion_solitaire/morpion_solitaire.cc
   morpion_solitaire/morpion_solitaire.h
   negotiation/negotiation.cc
@@ -509,6 +511,10 @@ add_executable(matrix_games_test matrix_games/matrix_games_test.cc ${OPEN_SPIEL_
                $<TARGET_OBJECTS:tests>)
 add_test(matrix_games_test matrix_games_test)
 
+add_executable(mnk_test mnk/mnk_test.cc ${OPEN_SPIEL_OBJECTS}
+               $<TARGET_OBJECTS:tests>)
+add_test(mnk_test mnk_test)
+
 add_executable(morpion_solitaire_test morpion_solitaire/morpion_solitaire_test.cc ${OPEN_SPIEL_OBJECTS}
         $<TARGET_OBJECTS:tests>)
 add_test(morpion_solitaire_test morpion_solitaire_test)

diff --git a/open_spiel/games/mnk/mnk.cc b/open_spiel/games/mnk/mnk.cc
@@ -0,0 +1,262 @@
+// Copyright 2019 DeepMind Technologies Limited
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//      http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "open_spiel/games/mnk/mnk.h"
+
+#include <algorithm>
+#include <memory>
+#include <utility>
+#include <vector>
+
+#include "open_spiel/spiel_utils.h"
+#include "open_spiel/utils/tensor_view.h"
+
+namespace open_spiel {
+namespace mnk {
+namespace {
+
+// Facts about the game.
+const GameType kGameType{
+    /*short_name=*/"mnk",
+    /*long_name=*/"m,n,k-game",
+    GameType::Dynamics::kSequential,
+    GameType::ChanceMode::kDeterministic,
+    GameType::Information::kPerfectInformation,
+    GameType::Utility::kZeroSum,
+    GameType::RewardModel::kTerminal,
+    /*max_num_players=*/2,
+    /*min_num_players=*/2,
+    /*provides_information_state_string=*/true,
+    /*provides_information_state_tensor=*/false,
+    /*provides_observation_string=*/true,
+    /*provides_observation_tensor=*/true,
+    /*parameter_specification=*/
+    {{"m", GameParameter(kDefaultNumCols)},
+     {"n", GameParameter(kDefaultNumRows)},
+     {"k", GameParameter(kDefaultNumInARow)}}
+};
+
+std::shared_ptr<const Game> Factory(const GameParameters& params) {
+  return std::shared_ptr<const Game>(new MNKGame(params));
+}
+
+REGISTER_SPIEL_GAME(kGameType, Factory);
+
+RegisterSingleTensorObserver single_tensor(kGameType.short_name);
+
+}  // namespace
+
+CellState PlayerToState(Player player) {
+  switch (player) {
+    case 0:
+      return CellState::kCross;
+    case 1:
+      return CellState::kNought;
+    default:
+      SpielFatalError(absl::StrCat("Invalid player id ", player));
+      return CellState::kEmpty;
+  }
+}
+
+std::string StateToString(CellState state) {
+  switch (state) {
+    case CellState::kEmpty:
+      return ".";
+    case CellState::kNought:
+      return "o";
+    case CellState::kCross:
+      return "x";
+    default:
+      SpielFatalError("Unknown state.");
+  }
+}
+
+bool BoardHasLine(const std::vector<std::vector<CellState>>& board,
+                  const Player player,
+                  int k,
+                  int r,
+                  int c,
+                  int dr,
+                  int dc) {
+  CellState state = PlayerToState(player);
+  int count = 0;
+
+  for (int i = 0;
+       i < k && 0 <= r && r < board.size() && 0 <= c && c < board[r].size();
+       ++i, r += dr, c += dc)
+    count += board[r][c] == state;
+
+  return count == k;
+}
+
+bool BoardHasLine(const std::vector<std::vector<CellState>>& board,
+                  const Player player,
+                  int k) {
+  for (int r = 0; r < board.size(); ++r)
+    for (int c = 0; c < board[r].size(); ++c)
+      for (int dr = -1; dr <= 1; ++dr)
+        for (int dc = -1; dc <= 1; ++dc)
+          if (dr || dc)
+            if (BoardHasLine(board, player, k, r, c, dr, dc))
+              return true;
+
+  return false;
+}
+
+void MNKState::DoApplyAction(Action move) {
+  auto [row, column] = ActionToCoordinates(move);
+  SPIEL_CHECK_EQ(board_[row][column], CellState::kEmpty);
+  board_[row][column] = PlayerToState(CurrentPlayer());
+  if (HasLine(current_player_)) {
+    outcome_ = current_player_;
+  }
+  current_player_ = 1 - current_player_;
+  num_moves_ += 1;
+}
+
+std::pair<int, int> MNKState::ActionToCoordinates(Action move) const {
+  return {move / NumCols(), move % NumCols()};
+}
+
+int MNKState::CoordinatesToAction(int row, int column) const {
+  return row * NumCols() + column;
+}
+
+int MNKState::NumRows() const {
+  return std::static_pointer_cast<const MNKGame>(game_)->NumRows();
+}
+
+int MNKState::NumCols() const {
+  return std::static_pointer_cast<const MNKGame>(game_)->NumCols();
+}
+
+int MNKState::NumCells() const {
+  return std::static_pointer_cast<const MNKGame>(game_)->NumCells();
+}
+
+int MNKState::NumInARow() const {
+  return std::static_pointer_cast<const MNKGame>(game_)->NumInARow();
+}
+
+std::vector<Action> MNKState::LegalActions() const {
+  if (IsTerminal())
+    return {};
+
+  // Can move in any empty cell.
+  std::vector<Action> moves;
+
+  for (int r = 0; r < board_.size(); ++r)
+    for (int c = 0; c < board_[r].size(); ++c)
+      if (board_[r][c] == CellState::kEmpty)
+        moves.push_back(CoordinatesToAction(r, c));
+
+  return moves;
+}
+
+std::string MNKState::ActionToString(Player player,
+                                     Action action_id) const {
+  return game_->ActionToString(player, action_id);
+}
+
+bool MNKState::HasLine(Player player) const {
+  return BoardHasLine(board_, player, NumInARow());
+}
+
+bool MNKState::IsFull() const { return num_moves_ == NumCells(); }
+
+MNKState::MNKState(std::shared_ptr<const Game> game) : State(game) {
+  board_.resize(NumRows());
+
+  for (int r = 0; r < board_.size(); ++r)
+    board_[r].resize(NumCols(), CellState::kEmpty);
+}
+
+std::string MNKState::ToString() const {
+  std::string str;
+  for (int r = 0; r < NumRows(); ++r) {
+    for (int c = 0; c < NumCols(); ++c) {
+      absl::StrAppend(&str, StateToString(BoardAt(r, c)));
+    }
+    if (r < (NumRows() - 1)) {
+      absl::StrAppend(&str, "\n");
+    }
+  }
+  return str;
+}
+
+bool MNKState::IsTerminal() const {
+  return outcome_ != kInvalidPlayer || IsFull();
+}
+
+std::vector<double> MNKState::Returns() const {
+  if (HasLine(Player{0})) {
+    return {1.0, -1.0};
+  } else if (HasLine(Player{1})) {
+    return {-1.0, 1.0};
+  } else {
+    return {0.0, 0.0};
+  }
+}
+
+std::string MNKState::InformationStateString(Player player) const {
+  SPIEL_CHECK_GE(player, 0);
+  SPIEL_CHECK_LT(player, num_players_);
+  return HistoryString();
+}
+
+std::string MNKState::ObservationString(Player player) const {
+  SPIEL_CHECK_GE(player, 0);
+  SPIEL_CHECK_LT(player, num_players_);
+  return ToString();
+}
+
+void MNKState::ObservationTensor(Player player,
+                                 absl::Span<float> values) const {
+  SPIEL_CHECK_GE(player, 0);
+  SPIEL_CHECK_LT(player, num_players_);
+  std::fill(values.begin(), values.end(), 0.0f);  // Reset stale entries.
+  for (int r = 0; r < NumRows(); ++r) {
+    for (int c = 0; c < NumCols(); ++c) {
+      int i = static_cast<int>(board_[r][c]);
+      int j = CoordinatesToAction(r, c);
+      values[i * NumCells() + j] = 1.0;
+    }
+  }
+}
+
+void MNKState::UndoAction(Player player, Action move) {
+  auto [r, c] = ActionToCoordinates(move);
+  board_[r][c] = CellState::kEmpty;
+  current_player_ = player;
+  outcome_ = kInvalidPlayer;
+  num_moves_ -= 1;
+  history_.pop_back();
+  --move_number_;
+}
+
+std::unique_ptr<State> MNKState::Clone() const {
+  return std::unique_ptr<State>(new MNKState(*this));
+}
+
+std::string MNKGame::ActionToString(Player player,
+                                    Action action_id) const {
+  return absl::StrCat(StateToString(PlayerToState(player)), "(",
+                      action_id / NumCols(), ",", action_id % NumCols(), ")");
+}
+
+MNKGame::MNKGame(const GameParameters& params)
+    : Game(kGameType, params) {}
+
+}  // namespace mnk
+}  // namespace open_spiel
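The index arithmetic in `ActionToCoordinates`, `CoordinatesToAction`, and `ObservationTensor` can be checked in isolation. The following is a minimal standalone sketch, assuming a row-major action encoding and one plane of `num_cells` floats per cell state, as the code above suggests. The function names `ToAction`, `ToCoordinates`, and `OneHotPlanes` are hypothetical, and cell states are plain integers here because the `CellState` enum lives in `mnk.h`, which is not part of this excerpt.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Row-major action encoding: action = row * num_cols + col.
int ToAction(int row, int col, int num_cols) { return row * num_cols + col; }

// Inverse of ToAction: recovers (row, col) from a flat action index.
std::pair<int, int> ToCoordinates(int action, int num_cols) {
  assert(num_cols > 0);
  return {action / num_cols, action % num_cols};
}

// One-hot observation: one plane of num_cells floats per cell state.
// `cells` holds state indices in [0, num_states) in row-major order.
std::vector<float> OneHotPlanes(const std::vector<int>& cells,
                                int num_states) {
  const int num_cells = static_cast<int>(cells.size());
  // The buffer starts zeroed, so only the "hot" entries need writing.
  std::vector<float> out(num_states * num_cells, 0.0f);
  for (int j = 0; j < num_cells; ++j) {
    assert(0 <= cells[j] && cells[j] < num_states);
    out[cells[j] * num_cells + j] = 1.0f;
  }
  return out;
}
```

With this layout, exactly one entry per cell is set, so the tensor always sums to the number of cells. That property only holds if the buffer is zeroed before writing, which matters when a caller reuses the same buffer across states.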