Example: Stochastic game with sequential moves
sGameSolver allows for stochastic games with different actions across states and players. An extreme case of differing actions across states are stochastic games with sequential moves. In such games, only one of the players can move in any given state, while the remaining players have only one action, namely do nothing.
A famous example of a stochastic game with sequential moves is the price competition game by Maskin and Tirole (1988).
There are two firms, each producing a homogeneous product at zero marginal cost. The two firms compete on price to maximize the net present value of profits given a common discount factor. Time runs in discrete periods and firms take turns to set their prices. In odd periods, firm 0 sets its price and firm 1’s price is locked in, and vice versa in even periods. Each period, firms face market demand \(d(p) = 2 - p\) which goes to the firm with the lowest price. In case of equal prices, demand is split evenly. Finally, firms choose prices from a grid \(P = \{0.0, 0.1, ..., 1.0, 1.1\}\).
A state \(s\) in this game is a tuple \(s = (i, p_{-i})\) consisting of the firm \(i\) next to set its price and the price \(p_{-i}\) currently locked in by the other firm.
Actions are state-dependent: Firm \(i\) currently moving can choose any price from the grid while the other firm \(-i\) has no choice but to stick to its price.
Finally, state transitions are given by price choices. If firm \(i\) in state \(s = (i, p_{-i})\) chooses price \(p_i\), the next state will be \(s' = (-i, p_i)\) with the other firm \(-i\) reacting to the price \(p_i\) of firm \(i\).
The game can be implemented in sGameSolver as follows.
Some preparation:
import numpy as np
import itertools
NUM_PLAYERS = 2
MARGINAL_COSTS = 0
PRICE_GRID = np.linspace(0, 1.1, 12)
def demand_fun(price: float) -> float:
return 2 - price
def profit_fun(prices: np.ndarray) -> np.ndarray:
# number of firms at minimum price, market shares and demand
num_players_at_min_price = (prices == prices.min()).sum()
market_shares = [1/num_players_at_min_price if price == prices.min() else 0 for price in prices]
demand = np.array([market_shares[player] * demand_fun(price) for player, price in enumerate(prices)])
return (prices - MARGINAL_COSTS) * demand
# state space [(player_to_move, competitor_prices)]
state_space: list[tuple[int, np.ndarray]] = []
for player in range(NUM_PLAYERS):
for other_prices in itertools.product(PRICE_GRID, repeat=NUM_PLAYERS - 1):
state_space.append((player, np.array(other_prices)))
def get_prices(state: tuple[int, np.ndarray], price: float) -> np.ndarray:
player, other_prices = state
return np.insert(other_prices, player, price)
def get_next_state(state: tuple[int, np.ndarray], price: float) -> np.ndarray:
player, other_prices = state
prices = get_prices(state, price)
next_player = (player + 1) % NUM_PLAYERS
next_other_prices = np.delete(prices, next_player)
return (next_player, next_other_prices)
Now we can define a game table or the corresponding arrays.
The full game table has 289 rows. We abbreviate it here.
state |
a_player0 |
a_player1 |
u_player0 |
u_player1 |
to_state |
|---|---|---|---|---|---|
delta |
0 |
0 |
|||
(0, 0) |
0 |
0 |
0 |
0 |
(1, 0) |
(0, 0) |
0.1 |
0 |
0 |
0 |
(1, 0.1) |
… |
… |
0 |
0 |
0 |
… |
(0, 0) |
1.1 |
0 |
0 |
0 |
(1, 1.1) |
(0, 0.1) |
0 |
0.1 |
0 |
0 |
(1, 0) |
(0, 0.1) |
0.1 |
0.1 |
0.95 |
0.95 |
(1, 0.1) |
… |
… |
0.1 |
0 |
0.19 |
… |
(0, 0.1) |
1.1 |
0.1 |
0 |
0.19 |
(1, 1.1) |
… |
… |
… |
… |
… |
… |
(0, 1.1) |
0 |
1.1 |
0 |
0 |
(1, 0) |
(0, 1.1) |
0.1 |
1.1 |
0.19 |
0 |
(1, 0.1) |
… |
… |
1.1 |
… |
… |
… |
(0, 1.1) |
1.1 |
1.1 |
0.495 |
0.495 |
(1, 1.1) |
(1, 0) |
0 |
0 |
0 |
0 |
(0, 0) |
(1, 0) |
0 |
0.1 |
0 |
0 |
(0, 0.1) |
… |
0 |
… |
0 |
0 |
… |
(1, 0) |
0 |
1.1 |
0 |
0 |
(0, 1.1) |
(1, 0.1) |
0.1 |
0 |
0 |
0 |
(0, 0) |
(1, 0.1) |
0.1 |
0.1 |
0.95 |
0.95 |
(0, 0.1) |
… |
0.1 |
… |
0.19 |
0 |
… |
(1, 0.1) |
0.1 |
1.1 |
0.19 |
0 |
(0, 1.1) |
… |
… |
… |
… |
… |
… |
(1, 1.1) |
1.1 |
0 |
0 |
0 |
(0, 0) |
(1, 1.1) |
1.1 |
0.1 |
0 |
0.19 |
(0, 0.1) |
… |
1.1 |
… |
… |
… |
… |
(1, 1.1) |
1.1 |
1.1 |
0.495 |
0.495 |
(0, 1.1) |
The game table can be generated in Python as follows.
import pandas as pd
common_discount_factor = 0.95
game_table = pd.DataFrame(data=[['delta'] + ['']*NUM_PLAYERS + [common_discount_factor]*NUM_PLAYERS + [np.nan]],
columns=['state', 'a_p0', 'a_p1', 'u_p0', 'u_p1', 'to_state'])
for state in state_space:
player, other_prices = state
for price in PRICE_GRID:
prices = get_prices(state, price)
payoffs = profit_fun(prices)
next_state = get_next_state(state, price)
new_row = pd.DataFrame(data=[[str(state)] + [str(p) for p in prices] + list(payoffs) + [str(next_state)]],
columns=['state', 'a_p0', 'a_p1', 'u_p0', 'u_p1', 'to_state'])
game_table = pd.concat([game_table, new_row], ignore_index=True)
As always, the game table can be imported by calling from_table():
import sgamesolver
# import Pandas DataFrame:
game = sgamesolver.SGame.from_table(game_table)
# or Excel file:
game = sgamesolver.SGame.from_table('path/to/table.xlsx')
def make_payoff_matrix(state: tuple[int, np.ndarray]) -> np.ndarray:
player, other_prices = state
# player can choose a price, other players have only one dummy action
a_dims = np.ones(NUM_PLAYERS, dtype=np.int32)
a_dims[player] = len(PRICE_GRID)
payoff_matrix = np.nan * np.ones((NUM_PLAYERS,) + tuple(a_dims))
for idx, price in enumerate(PRICE_GRID):
prices = get_prices(state, price)
payoffs = profit_fun(prices)
action_profile = np.zeros(NUM_PLAYERS, dtype=np.int32)
action_profile[player] = idx
for p in range(NUM_PLAYERS):
payoff_matrix[(p,) + tuple(action_profile)] = payoffs[p]
return payoff_matrix
def make_transition_matrix(state: tuple[int, np.ndarray]) -> np.ndarray:
player, other_prices = state
a_dims = np.ones(NUM_PLAYERS, dtype=np.int32)
a_dims[player] = len(PRICE_GRID)
transition_matrix = np.zeros(tuple(a_dims) + (len(state_space),))
for idx, price in enumerate(PRICE_GRID):
next_state = get_next_state(state, price)
transition_probs = [1 if s == next_state else 0 for s in state_space]
action_profile = np.zeros(NUM_PLAYERS, dtype=np.int32)
action_profile[player] = idx
for s in range(len(state_space)):
transition_matrix[tuple(action_profile) + (s,)] = transition_probs[s]
return transition_matrix
payoff_matrices = [make_payoff_matrix(state) for state in state_space]
transition_matrices = [make_transition_matrix(state) for state in state_space]
common_discount_factor = 0.95
The SGame class can then be initialized as usual:
import sgamesolver
game = sgamesolver.SGame(payoff_matrices=payoff_matrices,
transition_matrices=transition_matrices,
discount_factors=common_discount_factor)