Consider a game with $N$ players, indexed by $i = 1, \dots, N$. Each player $i$ chooses an action $a^i$ from an action set $\mathcal{A}$. Each player incurs a cost $f^i(\underline{a})$ that depends on the full population profile of actions $\underline{a} = (a^1, \dots, a^N)$. The goal of each player is to minimize their cost by selecting an optimal action:

$$\hat{a}^i \in \arg\min_{a^i \in \mathcal{A}} f^i(a^i, \underline{a}^{-i}).$$
In the static setting, the state of each player is fixed and does not change over time. Therefore, actions are chosen once, and the population profile remains static.
- The main difference between the static and dynamic settings is that, in the dynamic setting, the state of each player evolves over time based on the actions taken. As a result, the population profile $\underline{a}$ must adapt to these state changes. This dynamic aspect will be explored in the next section.
- In this setup, we consider pure strategies, meaning that the actions chosen by each player are deterministic: no randomness is involved in any player's decision-making. This contrasts with mixed strategies, where players choose actions according to probability distributions.
A population profile $\hat{\underline{a}} = (\hat{a}^1, \dots, \hat{a}^N)$ is a Nash Equilibrium if, for every player $i$ and every alternative action $a^i \in \mathcal{A}$:

$$f^i(\hat{a}^i, \hat{\underline{a}}^{-i}) \le f^i(a^i, \hat{\underline{a}}^{-i}).$$

Here, $\hat{\underline{a}}^{-i}$ denotes the actions of all players other than $i$. In simple terms, at a Nash Equilibrium, no player can reduce their cost by changing their action while the actions of all other players remain the same.
Here are a few examples to illustrate Nash Equilibrium in different settings:
Each player chooses an action $a^i$ to minimize the distance to a fixed target position $a^*$:

$$f^i(\underline{a}) = \|a^i - a^*\|^2.$$

- Equilibrium: All players independently choose $a^i = a^*$.
- Explanation: Since there is no interaction between players, each player's cost is minimized by moving directly to the target position.
Players are attracted to the mean position of the group, aiming to minimize the distance between their action and the average action:

$$f^i(\underline{a}) = \|a^i - \bar{a}\|^2,$$

where $\bar{a} = \frac{1}{N} \sum_{j=1}^N a^j$ is the mean action.

- Equilibrium: Any population profile $\underline{a}$ in which all players choose the same action, $a^i = c$ for some constant $c$, is a Nash Equilibrium.
- Explanation: When all players act identically, the mean $\bar{a}$ equals their common action, minimizing the cost for everyone. The solution is not unique, however, since this works for any constant $c$.
Players aim to avoid choosing the same action as others, minimizing the number of players taking the same action:

$$f^i(\underline{a}) = \sum_{j \neq i} \mathbf{1}\{a^j = a^i\},$$

where $\mathbf{1}\{\cdot\}$ denotes the indicator function.

- Equilibrium: Players spread out across the action space $\mathcal{A}$ so that no two players choose the same action, provided the size of $\mathcal{A}$ allows it.
- Explanation: Each player minimizes their cost by choosing a unique action, avoiding overlap with others.
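A brute-force check makes this concrete. The sketch below (3 players and 3 actions are assumptions of this toy instance) enumerates all unilateral deviations to test whether a profile is a Nash Equilibrium:

```python
# Crowd-aversion game: 3 players, actions {0, 1, 2}.
# Cost of player i = number of OTHER players choosing the same action as i.
N, actions = 3, (0, 1, 2)

def cost(i, profile):
    return sum(1 for j in range(N) if j != i and profile[j] == profile[i])

def is_nash(profile):
    # Nash: no player can strictly reduce their cost by deviating alone.
    for i in range(N):
        for a in actions:
            deviation = profile[:i] + (a,) + profile[i + 1:]
            if cost(i, deviation) < cost(i, profile):
                return False
    return True

print(is_nash((0, 1, 2)), is_nash((0, 0, 1)))  # True False
```

Any profile of pairwise-distinct actions gives every player cost zero, and no deviation can do better, which is exactly the spreading-out equilibrium described above.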
In the classic rock-paper-scissors game, each player $i \in \{1, 2\}$ chooses an action $a^i \in \{\text{Rock}, \text{Paper}, \text{Scissors}\}$, and the payoff $f^i(\underline{a})$ against the opponent $j$ is:

- Rock beats Scissors: $f^i(\underline{a}) = 1$ if $a^i = \text{Rock}$ and $a^j = \text{Scissors}$.
- Scissors beats Paper: $f^i(\underline{a}) = 1$ if $a^i = \text{Scissors}$ and $a^j = \text{Paper}$.
- Paper beats Rock: $f^i(\underline{a}) = 1$ if $a^i = \text{Paper}$ and $a^j = \text{Rock}$.
- Otherwise, $f^i(\underline{a}) = 0$ in case of a draw, or $-1$ if the other player wins.
- Pure Strategy Solution: There is no Nash Equilibrium in pure strategies. For any choice of $a^i$, the other player $j$ can always switch their action to increase their own payoff $f^j(\underline{a})$.
- Explanation: The cyclical nature of rock-paper-scissors ensures that no single pair of actions maximizes payoffs for both players simultaneously in pure strategies. However, a mixed strategy Nash Equilibrium exists, in which each player chooses each action with equal probability $\frac{1}{3}$.
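The claimed mixed equilibrium can be verified numerically: against a uniformly mixing opponent, every pure action earns the same expected payoff, so no deviation helps. A small sketch using exact rational arithmetic:

```python
from fractions import Fraction

# Rock-paper-scissors payoff for the row player: +1 win, 0 draw, -1 loss.
ACTIONS = ("Rock", "Paper", "Scissors")
BEATS = {"Rock": "Scissors", "Paper": "Rock", "Scissors": "Paper"}

def payoff(a1, a2):
    if a1 == a2:
        return 0
    return 1 if BEATS[a1] == a2 else -1

# Opponent mixes uniformly (the candidate equilibrium strategy).
third = Fraction(1, 3)
expected = {a1: sum(third * payoff(a1, a2) for a2 in ACTIONS) for a1 in ACTIONS}
print(expected)  # every pure action has expected payoff 0 -> indifference
```

Because every pure action yields the same expected payoff, mixing with probability $\frac{1}{3}$ on each is itself a best response, confirming the mixed equilibrium.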
In the Prisoner's Dilemma, two players decide whether to cooperate ($C$) or defect ($D$). The payoffs are:

- If both cooperate: $f^i(\underline{a}) = 3$.
- If one defects while the other cooperates: the defector gets $f^i(\underline{a}) = 5$, and the cooperator gets $f^i(\underline{a}) = 1$.
- If both defect: $f^i(\underline{a}) = 2$.

|  | Cooperate ($C$) | Defect ($D$) |
|---|---|---|
| Cooperate ($C$) | (3, 3) | (1, 5) |
| Defect ($D$) | (5, 1) | (2, 2) |

- Nash Equilibrium: Both players choose $a^i = D$ (defect).
- Explanation: If one player cooperates, the other does better by defecting. Since defecting is always the better response regardless of the other player's choice, mutual defection is the Nash Equilibrium. It is not Pareto-optimal, however, as mutual cooperation results in a better outcome for both players.
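A quick exhaustive search over the four pure profiles confirms that mutual defection is the unique pure Nash Equilibrium. A small self-contained sketch:

```python
from itertools import product

# Prisoner's Dilemma payoffs (higher is better), indexed by (own, other).
payoff = {("C", "C"): 3, ("C", "D"): 1, ("D", "C"): 5, ("D", "D"): 2}

def is_nash(a1, a2):
    # Neither player can increase their payoff by a unilateral switch.
    p1_ok = all(payoff[(a1, a2)] >= payoff[(d, a2)] for d in "CD")
    p2_ok = all(payoff[(a2, a1)] >= payoff[(d, a1)] for d in "CD")
    return p1_ok and p2_ok

equilibria = [p for p in product("CD", repeat=2) if is_nash(*p)]
print(equilibria)  # only mutual defection survives: [('D', 'D')]
```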
In a mixed strategy setting, each player does not choose a deterministic action, but instead a probability distribution $\pi^i \in \mathcal{P}(\mathcal{A})$ over the action set. The population profile of strategies is then represented by:

$$\underline{\pi} = (\pi^1, \dots, \pi^N),$$

and a realization of actions from these strategies is denoted by:

$$\underline{a} = (a^1, \dots, a^N), \quad a^i \sim \pi^i.$$

Given the population profile of strategies, the actions are drawn independently, so a realization $\underline{a}$ occurs with probability $\prod_{i=1}^N \pi^i(a^i)$. The cost for player $i$ is the expected value over the random realization:

$$J^i(\underline{\pi}) = \mathbb{E}_{\underline{a} \sim \underline{\pi}}\bigl[f^i(\underline{a})\bigr].$$

Here:

- $\underline{\pi}$ governs the random realization of the population profile $\underline{a}$.
- $f^i(\underline{a})$ is the payoff function for player $i$ given the realized actions $\underline{a}$.
A mixed strategy profile $\hat{\underline{\pi}} = (\hat{\pi}^1, \dots, \hat{\pi}^N)$ is a Nash Equilibrium if, for every player $i$ and every alternative strategy $\pi^i \in \mathcal{P}(\mathcal{A})$:

$$J^i(\hat{\pi}^i, \hat{\underline{\pi}}^{-i}) \le J^i(\pi^i, \hat{\underline{\pi}}^{-i}),$$

where:

- $\hat{\underline{\pi}}^{-i}$ represents the strategies of all players except player $i$.
- Player $i$ cannot decrease their expected cost by unilaterally changing their mixed strategy $\pi^i$.
In a static mean field game, we analyze the interaction of a large number of players under the assumptions of homogeneity and anonymity.
- Homogeneity: All players share the same cost function.
- Anonymity: Each player's cost depends on their own action and the overall distribution of actions, not on the identity of other players.
The cost function for a representative player is defined as:

$$f : \mathcal{A} \times \mathcal{P}(\mathcal{A}) \to \mathbb{R}, \quad (a, \mu) \mapsto f(a, \mu),$$

where $a$ is the player's own action and $\mu$ is the distribution of the population's actions. For a finite population of $N$ players, the cost of player $i$ is:

$$f^i(\underline{a}) = f(a^i, \mu_{\underline{a}}),$$

where:

- $a^i$ is the action of player $i$.
- $\mu_{\underline{a}} = \frac{1}{N} \sum_{j=1}^N \delta_{a^j}$ is the empirical distribution of the population's actions.
- Note that $\mu_{\underline{a}}$ is different from the mean action $\frac{1}{N} \sum_{j=1}^N a^j$, which is a single value and does not capture the full distribution.
As the number of players $N \to \infty$, the influence of any single player on the empirical distribution $\mu_{\underline{a}}$ becomes negligible, and $\mu_{\underline{a}}$ can be replaced by a deterministic distribution of actions.
In the mean field setting, each player minimizes their expected cost given the population distribution $\pi'$:

$$J(\pi, \pi') = \mathbb{E}_{a \sim \pi}\bigl[f(a, \pi')\bigr],$$

where:

- $\pi \in \mathcal{P}(\mathcal{A})$ is the player's strategy.
- $\pi'$ is the population distribution of actions.
- The expectation $\mathbb{E}_{a \sim \pi}$ is taken over the player's action distribution $\pi$.

The goal is to find the strategy $\hat{\pi}$ that minimizes this cost:

$$\hat{\pi} \in \arg\min_{\pi \in \mathcal{P}(\mathcal{A})} J(\pi, \pi').$$
A Mean Field Nash Equilibrium is a pair $(\hat{\pi}, \pi')$ such that:

- The player minimizes their expected cost given the population distribution $\pi'$:

$$\hat{\pi} \in \arg\min_{\pi \in \mathcal{P}(\mathcal{A})} J(\pi, \pi'),$$

where the expected cost is:

$$J(\pi, \pi') = \mathbb{E}_{a \sim \pi}\bigl[f(a, \pi')\bigr].$$

- The player's optimal strategy $\hat{\pi}$ matches the population distribution $\pi'$:

$$\hat{\pi} = \pi'.$$

The Mean Field Nash Equilibrium can equivalently be characterized as a fixed point:

$$\hat{\pi} \in \arg\min_{\pi \in \mathcal{P}(\mathcal{A})} J(\pi, \hat{\pi}).$$

This fixed point formulation highlights that the equilibrium strategy is simultaneously a best response to the population distribution and the population distribution itself.
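The fixed-point characterization can be illustrated numerically on a two-action congestion game; the cost function $f(a, \mu) = \mu(a)$, the softmin temperature, and the damping factor below are assumptions of this toy sketch, not part of the general theory:

```python
import math

# Toy static mean field game on two actions {0, 1} with congestion cost
# f(a, mu) = mu[a]: the more mass on your action, the more you pay.
# We iterate a smoothed (softmin) best response with damping to locate
# the fixed point pi-hat = BR(pi-hat).
tau = 0.5  # smoothing temperature (assumption of this sketch)

def best_response(mu):
    # Softmin: actions with lower cost f(a, mu) = mu[a] get more mass.
    w = [math.exp(-mu[a] / tau) for a in (0, 1)]
    s = w[0] + w[1]
    return [w[0] / s, w[1] / s]

mu = [0.9, 0.1]  # start far from the symmetric equilibrium
for _ in range(100):
    br = best_response(mu)
    mu = [0.5 * m + 0.5 * b for m, b in zip(mu, br)]  # damped fixed-point step

print([round(x, 3) for x in mu])  # -> [0.5, 0.5], the symmetric equilibrium
```

By symmetry the uniform distribution is its own best response, and the damped iteration converges to it: the equilibrium strategy and the population distribution coincide.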
An $\epsilon$-Nash Equilibrium relaxes the equilibrium condition by a tolerance $\epsilon > 0$. A mixed strategy profile $\hat{\underline{\pi}} = (\hat{\pi}^1, \dots, \hat{\pi}^N)$ is an $\epsilon$-Nash Equilibrium if, for every player $i$ and every strategy $\pi^i$:

$$J(\hat{\pi}^i, \hat{\pi}^{-i}) \le J(\pi^i, \hat{\pi}^{-i}) + \epsilon,$$

where:

- $J(\pi^i, \hat{\pi}^{-i})$ is the expected cost for player $i$ when they use strategy $\pi^i$ and all other players use $\hat{\pi}^{-i}$.
- $\hat{\pi}^{-i} = (\hat{\pi}^1, \dots, \hat{\pi}^{i-1}, \hat{\pi}^{i+1}, \dots, \hat{\pi}^N)$ denotes the strategies of all other players.
- In an $\epsilon$-Nash Equilibrium, no player can unilaterally change their strategy to improve their cost by more than $\epsilon$.
- When $\epsilon = 0$, the $\epsilon$-Nash Equilibrium reduces to the standard Nash Equilibrium.
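Using the Prisoner's Dilemma payoffs from the example above, one can compute the smallest $\epsilon$ for which a given pure profile qualifies (phrased here in payoff-gain terms; this is a toy illustration, not part of the formal development):

```python
# Smallest epsilon for which a pure profile of the Prisoner's Dilemma is an
# epsilon-Nash Equilibrium: the largest payoff any one player could gain by
# a unilateral deviation (0 means an exact Nash Equilibrium).
payoff = {("C", "C"): 3, ("C", "D"): 1, ("D", "C"): 5, ("D", "D"): 2}

def min_epsilon(a1, a2):
    gain1 = max(payoff[(d, a2)] - payoff[(a1, a2)] for d in "CD")
    gain2 = max(payoff[(d, a1)] - payoff[(a2, a1)] for d in "CD")
    return max(gain1, gain2, 0)

print(min_epsilon("D", "D"))  # 0: mutual defection is an exact equilibrium
print(min_epsilon("C", "C"))  # 2: each player could gain 2 by defecting
```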
In mean field games with large populations, the $\epsilon$-Nash concept plays a central role:

- Approximation of Nash Equilibria: In practical applications, solving for an exact Nash Equilibrium may be infeasible, and an $\epsilon$-Nash Equilibrium provides a good approximation.
- Scalability: As the number of players $N \to \infty$, the achievable $\epsilon$ typically decreases, reflecting how individual deviations have diminishing effects in large populations.
The slide provides a proof sketch for an Approximate Nash Equilibrium (NE) in a mean field game setting. Here's a breakdown and explanation:
The goal is to compare the cost of the mean field equilibrium strategy $\hat{\pi}$, when played by all players in the $N$-player game, against the cost of an arbitrary unilateral deviation $\pi$. The cost difference

$$J^{N,i}(\hat{\pi}, \hat{\pi}^{-i}) - J^{N,i}(\pi, \hat{\pi}^{-i})$$

is decomposed into three parts:

- The first term relates to the difference between the $N$-player game cost and the mean field cost for $\hat{\pi}$.
- The second term captures the difference between the mean field costs for $\hat{\pi}$ and $\pi$.
- The third term accounts for the difference between the mean field cost and the $N$-player cost for $\pi$.

The second term is nonpositive by optimality of $\hat{\pi}$ in the mean field problem, while the first and third terms vanish as $N \to \infty$; together this shows that $\hat{\pi}$ is an $\epsilon$-Nash Equilibrium of the $N$-player game with $\epsilon \to 0$.
This repository contains a Python implementation of an SEIR (Susceptible–Exposed–Infected–Recovered) model extended to include behavioral dynamics. The behavioral component introduces an effort variable $b(t)$, the contact rate each individual chooses, and compares two solution concepts:

- Nash Equilibrium: Each individual optimizes their own cost in a decentralized manner (self-interest).
- Societal Optimum: A central planner (or the society as a whole) selects the effort trajectory $b(t)$ to minimize the total societal cost.
The model combines Pontryagin's Maximum Principle (via costate variables) with gradient-based iterative methods to solve for equilibrium strategies.
We start with an SEIR model:

$$\begin{aligned} S'(t) &= -\beta(t)\, S(t)\, I(t),\\ E'(t) &= \beta(t)\, S(t)\, I(t) - \alpha\, E(t),\\ I'(t) &= \alpha\, E(t) - \gamma\, I(t),\\ R'(t) &= \gamma\, I(t), \end{aligned}$$

where:

- $S(t)$ is the proportion of susceptible individuals.
- $E(t)$ is the proportion of exposed individuals (infected but not yet infectious).
- $I(t)$ is the proportion of infectious individuals.
- $R(t)$ is the proportion of recovered (or removed) individuals.
- $\beta(t)$ is the effective contact rate (transmission rate).
- $\alpha$ is the rate at which exposed individuals become infectious ($\frac{1}{\alpha}$ is the latent period).
- $\gamma$ is the recovery rate ($\frac{1}{\gamma}$ is the average infectious period).
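For intuition, the baseline SEIR dynamics can be simulated with a few lines of forward-Euler integration; the parameter values below are illustrative assumptions, not the repository's settings:

```python
# Forward-Euler simulation of the SEIR system above.
# Parameter values are illustrative assumptions, not the repository's.
alpha = 1 / 5.0    # exposed -> infectious rate (5-day latent period)
gamma = 1 / 10.0   # recovery rate (10-day infectious period)
beta0 = 0.25       # baseline contact rate, so R0 = beta0 / gamma = 2.5
dt, T = 0.1, 300.0

S, E, I, R = 0.99, 0.0, 0.01, 0.0
peak_I = I
for _ in range(int(T / dt)):
    dS = -beta0 * S * I
    dE = beta0 * S * I - alpha * E
    dI = alpha * E - gamma * I
    dR = gamma * I
    S, E, I, R = S + dt * dS, E + dt * dE, I + dt * dI, R + dt * dR
    peak_I = max(peak_I, I)

# With R0 > 1 the epidemic takes off and then burns out:
print(round(S, 3), round(peak_I, 3))
```

The repository itself integrates the (extended) system with SciPy's `odeint`; this Euler sketch only shows the structure of the dynamics.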
We assume a basic reproduction number $R_0 = \beta_0 / \gamma > 1$, so that the epidemic initially spreads.
We let individuals choose an effort $b(t)$ that plays the role of the contact rate in the dynamics: reducing $b(t)$ below the baseline $\beta_0$ lowers infection risk at the price of an effort cost. (Alternatively, you might set the contact rate through a different parametrization of the effort; the structure of the problem is unchanged.)
Cost Components:

- Effort cost: $\text{cost}_\text{effort}(b)$, the instantaneous cost of maintaining effort level $b$.
- Infection cost: A per-capita cost $r_I$ incurred upon infection.

In this example, we used a specific functional form for the effort cost (see the script for the exact expression), which increases as $b$ is reduced below the baseline $\beta_0$: more distancing effort is more costly.
Under the Nash scenario, each individual chooses their own effort $b(t)$, taking the epidemic trajectory as given. Two individual-level quantities are tracked:

- $P(t)$: Probability that a representative individual will eventually become infected (from the perspective of time $t$).
- $C(t)$: Accumulated cost to this individual (effort + infection cost).

To track these, we define:

$$\begin{aligned} P'(t) &= b(t)\, I(t)\, \bigl(1 - P(t)\bigr),\\ C'(t) &= \Bigl(\text{cost}_\text{effort}\bigl(b(t)\bigr) + \text{cost}_\text{infection}(I(t))\, b(t)\, I(t)\Bigr) \cdot \bigl(1 - P(t)\bigr). \end{aligned}$$
- $(1 - P(t))$ captures the fraction of individuals not yet infected by time $t$.
- If an individual is already infected ($P(t) = 1$), their incremental cost from infection does not increase further.
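As a sanity check on the $P$ dynamics, note that for a given exposure intensity $b(t)I(t)$ the ODE has the closed form $P(t) = 1 - \exp(-\int_0^t b\, I\, ds)$; a quick Euler integration reproduces it (the effort level and infectious curve below are made-up inputs, not the model's):

```python
import math

# Sanity check of P'(t) = b * I(t) * (1 - P(t)) against its closed form
# P(t) = 1 - exp(-integral of b*I), using a made-up exposure profile.
dt, T = 0.001, 10.0
b = 0.25                                 # constant effort (assumption)
I = lambda t: 0.1 * math.exp(-0.1 * t)   # toy infectious curve (assumption)

P, integral, t = 0.0, 0.0, 0.0
for _ in range(int(T / dt)):
    P += dt * b * I(t) * (1.0 - P)
    integral += dt * b * I(t)
    t += dt

closed_form = 1.0 - math.exp(-integral)
print(abs(P - closed_form))  # small discretization error only
```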
Because each individual only considers their own risk, the costate variables that arise from the Pontryagin Maximum Principle will reflect a self-interested perspective.
For each state variable $X \in \{S, E, I\}$, we introduce a costate $y_X(t)$ satisfying

$$y_X'(t) = -\frac{\partial H}{\partial X},$$

where $H$ is the Hamiltonian of the individual's control problem. We then solve these costate equations backwards in time, applying terminal conditions at the final time $T$.
In the societal scenario, a central planner chooses $b(t)$ to minimize the aggregate cost, i.e., the effort cost plus the infection cost accumulated over the whole population, subject to the same SEIR constraints. The difference from Nash is that the costate equations now reflect collective objectives (e.g., the total fraction of the population infected), rather than purely individual infection probabilities.
- Time Grid:
- Epidemiological Parameters:
- Cost: $ r_I = 300 $ (infection cost).
- Initial Conditions:
For numerical stability, we store $\log I(t)$ rather than $I(t)$ itself, since $I$ can become extremely small.
- Effort Cost: $\text{cost}_\text{effort}(b)$.
- Effort Cost Derivative: $\text{cost}_\text{effort}'(b)$, used in the gradient computations.
- Infection Cost: $\text{cost}_\text{infection}(I) = r_I$, a constant for each infected individual.
We define a function `beta_constant(time)` returning the constant baseline transmission rate $\beta_0$.
We expand the state to $(S, E, \log I, R, P, C)$:

$$\begin{aligned} S'(t) &= -b(t)\, S(t)\, I(t),\\ E'(t) &= b(t)\, S(t)\, I(t) - \alpha\, E(t),\\ (\log I)'(t) &= \frac{\alpha\, E(t)}{I(t)} - \gamma \quad \text{(if } I > 10^{-12}\text{)},\\ R'(t) &= \gamma\, I(t),\\ P'(t) &= b(t)\, I(t)\, \bigl(1 - P(t)\bigr),\\ C'(t) &= \Bigl[\text{cost}_\text{effort}\bigl(b(t)\bigr) + \text{cost}_\text{infection}(I(t))\, b(t)\, I(t)\Bigr] \bigl(1 - P(t)\bigr). \end{aligned}$$

These ODEs are numerically integrated forward in time using `odeint`.
The costates $(y_S, y_E, y_I)$ are integrated backwards in time. Here, the Hamiltonian $H$ for each individual combines the running cost with the costate-weighted dynamics of $(S, E, I)$. We define a function `deriv_costate_nash(...)` to compute the right-hand sides $y_X'(t) = -\partial H / \partial X$ for $X \in \{S, E, I\}$.
To find the function $b(t)$ minimizing the individual cost, we need the gradient of the cost with respect to $b(t)$. In code, we approximate:

$$\nabla_{b} J_\text{Nash}(t) \approx \text{cost}_\text{effort}'\bigl(b(t)\bigr)\, \bigl(1 - P(t)\bigr) - r_I\, I(t)\, \bigl(1 - P(t)\bigr) - I(t)\, y_I(t).$$
We perform an iterative procedure:

- Start with an initial guess $b_\text{init}(t)$.
- Solve forward the SEIR system to get $(S(t), E(t), I(t), P(t))$.
- Solve backward the costate system to get $(y_S(t), y_E(t), y_I(t))$.
- Compute the gradient of the cost with respect to $b(t)$.
- Update $b(t)$ via a gradient descent step,
  $$b_{\text{new}}(t) = b_{\text{old}}(t) - \eta\, \nabla_{b} J(t),$$
  and clip $b(t)$ to its admissible range.
- Repeat until convergence or the maximum number of iterations is reached.
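The iteration can be sketched as a generic projected gradient descent on a discretized control. In this sketch the forward and backward solves are collapsed into a toy quadratic objective with a known gradient, so `grad_J`, `b_target`, and the numeric settings are placeholders rather than the repository's functions:

```python
# Generic projected gradient descent on a discretized control b(t),
# mirroring the steps above. The stand-in objective is
# J(b) = 0.5 * sum((b - b_target)^2); in the actual script the gradient
# comes from the forward SEIR solve and the backward costate solve.
n_t = 100
b_target = [0.1] * n_t          # hypothetical optimal control
b = [0.25] * n_t                # initial guess: baseline contact rate
eta, b_min, b_max = 0.5, 0.0, 0.25

def grad_J(b):
    # Placeholder gradient of the toy objective (one value per time step).
    return [bi - ti for bi, ti in zip(b, b_target)]

for _ in range(200):
    g = grad_J(b)
    # Gradient step followed by clipping to the admissible range.
    b = [min(b_max, max(b_min, bi - eta * gi)) for bi, gi in zip(b, g)]
    if max(abs(gi) for gi in g) < 1e-8:  # convergence check
        break

print(max(abs(bi - ti) for bi, ti in zip(b, b_target)))  # close to zero
```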
For the societal optimum, we use a similar approach, but the costate equations and gradient reflect aggregate costs. In the provided code, we have a placeholder that is structurally similar to the Nash approach, but one would typically replace the costate system with the one derived from a societal-level objective.
- Baseline (No Effort): We solve the standard SEIR model with $\beta(t) = \beta_0$.
- Nash Equilibrium: We run the iterative scheme, printing iteration counts and the final cost.
- Societal Optimum: We run a second iterative scheme, starting from the Nash solution.
- Plots: We compare $S(t), E(t), I(t), R(t)$, and $b(t)$ under both solutions. We also plot the cost convergence over iterations.
- Python >= 3.7
- NumPy
- SciPy
- Matplotlib
All can typically be installed via `pip install numpy scipy matplotlib`.
- Clone or Download the repository.
- Install Dependencies if needed:
  `pip install numpy scipy matplotlib`
- Run the script:
  `python seir_nash_vs_societal.py`
- Output:
  - Two windows of figures should appear (if using an interactive environment):
    - SEIR compartment dynamics (S, E, I, R) + contact rate $b(t)$.
    - Cost convergence plots for Nash vs. Societal.
  - An image file `seir_nash_vs_societal.png` is saved with publication-quality resolution.
- R. Elie, E. Hubert, G. Turinici. "COVID-19 pandemic control: balancing detection policy and lockdown intervention under ICU sustainability." Mathematical Modelling of Natural Phenomena, Volume 15, 2020.