IE675 Game Theory
Lecture Note Set 2
Wayne F. Bialas¹
Wednesday, January 19, 2005
2 TWO-PERSON GAMES
2.1 Two-Person Zero-Sum Games

2.1.1 Basic ideas
Definition 2.1. A game (in extensive form) is said to be zero-sum if and only if, at each terminal vertex, the payoff vector (p_1, ..., p_n) satisfies Σ_{i=1}^n p_i = 0.
Two-person zero-sum games in normal form. Here's an example:

    A = ⎡ −1  −3  −3  −2 ⎤
        ⎢  0   1  −2  −1 ⎥
        ⎣  2  −2   0   1 ⎦
The rows represent the strategies of Player 1. The columns represent the strategies of Player 2. The entry a_ij represents the payoff vector (a_ij, −a_ij). That is, if Player 1 chooses row i and Player 2 chooses column j, then Player 1 wins a_ij and Player 2 loses a_ij. If a_ij < 0, then Player 1 pays Player 2 |a_ij|.
Note 2.1. We are using the term strategy rather than action to describe the player’s options. The reasons for this will become evident in the next chapter when we use this formulation to analyze games in extensive form.
Note 2.2. Some authors (in particular, those in the field of control theory) prefer to represent the outcome of a game in terms of losses rather than profits. During the semester, we will use both conventions.
¹Department of Industrial Engineering, University at Buffalo, 301 Bell Hall, Buffalo, NY 14260-2050 USA; E-mail: [email protected]; Web: http://www.acsu.buffalo.edu/~bialas. Copyright © MMV Wayne F. Bialas. All Rights Reserved. Duplication of this work is prohibited without written permission. This document produced January 19, 2005 at 3:33 pm.
How should each player behave? Player 1, for example, might want to place a bound on his profits. Player 1 could ask "For each of my possible strategies, what is the least desirable thing that Player 2 could do to minimize my profits?" For each of Player 1's strategies i, compute

    α_i = min_j a_ij

and then choose the i which produces max_i α_i. Suppose this maximum is achieved for i = i*. In other words, Player 1 is guaranteed to get at least

    V̲(A) = min_j a_{i*j} ≥ min_j a_ij    for i = 1, ..., m

The value V̲(A) is called the gain-floor for the game A. In this case V̲(A) = −2 with i* ∈ {2, 3}.
Player 2 could perform a similar analysis and find the j* which yields

    V̄(A) = max_i a_{ij*} ≤ max_i a_ij    for j = 1, ..., n

The value V̄(A) is called the loss-ceiling for the game A. In this case V̄(A) = 0 with j* = 3.
Now, consider the joint strategies (i*, j*). We immediately get the following:
Theorem 2.1. For every (finite) matrix game A = [a_ij]:

1. The values V̲(A) and V̄(A) are unique.
2. There exists at least one security strategy for each player, given by (i*, j*).
3. min_j a_{i*j} = V̲(A) ≤ V̄(A) = max_i a_{ij*}

Proof: (1) and (2) are easy. To prove (3), note that for any k and ℓ,

    min_j a_kj ≤ a_kℓ ≤ max_i a_iℓ

and the result follows.
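The gain-floor and loss-ceiling computations above are easy to mechanize. Here is a minimal Python sketch (the helper names are ours, not from the notes), applied to the example matrix:

```python
A = [[-1, -3, -3, -2],
     [ 0,  1, -2, -1],
     [ 2, -2,  0,  1]]

def gain_floor(A):
    """V-floor(A) = max_i min_j a_ij, plus the rows i* that achieve it."""
    row_mins = [min(row) for row in A]
    v = max(row_mins)
    return v, [i for i, m in enumerate(row_mins) if m == v]

def loss_ceiling(A):
    """V-ceiling(A) = min_j max_i a_ij, plus the columns j* that achieve it."""
    col_maxs = [max(col) for col in zip(*A)]
    v = min(col_maxs)
    return v, [j for j, m in enumerate(col_maxs) if m == v]

v_floor, i_star = gain_floor(A)    # -2, achieved by rows 2 and 3 (0-indexed: 1, 2)
v_ceil, j_star = loss_ceiling(A)   # 0, achieved by column 3 (0-indexed: 2)
```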
2.1.2 Discussion
Let's examine the decision-making philosophy that underlies the choice of (i*, j*). For instance, Player 1 appears to be acting as if Player 2 is trying to do as much harm to him as possible. This seems reasonable since this is a zero-sum game: whatever Player 1 wins, Player 2 loses.
As we proceed through this presentation, note that this same reasoning is also used in the field of statistical decision theory where Player 1 is the statistician, and Player 2 is “nature.” Is it reasonable to assume that “nature” is a malevolent opponent?
2.1.3 Stability
Consider another example:

    A = ⎡ −4   0   1 ⎤
        ⎢  0   1  −3 ⎥
        ⎣ −1  −2  −1 ⎦

Player 1 should consider i* = 3 (V̲(A) = −2) and Player 2 should consider j* = 1 (V̄(A) = 0).
However, Player 2 can continue his analysis as follows:

• Player 2 will choose strategy 1.
• So Player 1 should choose strategy 2 rather than strategy 3.
• But Player 2 would predict that and then prefer strategy 3, and so on.
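This best-response cycling can be simulated directly. A small sketch (our own illustration, not from the notes), with strategies 0-indexed:

```python
A = [[-4,  0,  1],
     [ 0,  1, -3],
     [-1, -2, -1]]

def best_row(j):
    """Player 1's best response (row) to Player 2's column j."""
    return max(range(3), key=lambda i: A[i][j])

def best_col(i):
    """Player 2's best response (column) to Player 1's row i."""
    return min(range(3), key=lambda j: A[i][j])

j = 0                      # Player 2 starts at its security strategy j* = 1
history = []
for _ in range(6):
    i = best_row(j)        # Player 1 reacts to Player 2's choice
    j = best_col(i)        # Player 2 reacts to Player 1's choice
    history.append((i, j))
# The play cycles between (1, 2) and (0, 0): no pure pair is stable.
```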
Question 2.1. When do we have a stable choice of strategies?
The answer to the above question gives rise to some of the really important early results in game theory and mathematical programming.
We can see that if V̲(A) = V̄(A), then both players will settle on (i*, j*) with

    min_j a_{i*j} = V̲(A) = V̄(A) = max_i a_{ij*}

Theorem 2.2. If V̲(A) = V̄(A) then

1. A has a saddle point.
2. The saddle point corresponds to the security strategies for each player.
3. The value of the game is V = V̲(A) = V̄(A).
Question 2.2. Suppose V̲(A) < V̄(A). What can we do? Can we establish a "spy-proof" mechanism to implement a strategy?
Question 2.3. Is it ever sensible to use expected loss (or profit) as a performance criterion in determining strategies for "one-shot" (non-repeated) decision problems?
2.1.4 Developing Mixed Strategies
Consider the following matrix game:

    A = ⎡ 3  −1 ⎤
        ⎣ 0   1 ⎦

For Player 1, we have V̲(A) = 0 and i* = 2. For Player 2, we have V̄(A) = 1 and j* = 2. This game does not have a saddle point.

Let's try to create a "spy-proof" strategy. Let Player 1 randomize over his two pure strategies. That is, Player 1 will pick the vector of probabilities x = (x_1, x_2) where Σ_i x_i = 1 and x_i ≥ 0 for all i. He will then select strategy i with probability x_i.

Note 2.3. When we formalize this, we will call the probability vector x a mixed strategy.
To determine the "best" choice of x, Player 1 analyzes the problem as follows:

[Figure: Player 1's expected payoffs against each of Player 2's pure strategies, plotted as functions of x_1 from (x_1 = 0, x_2 = 1) to (x_1 = 1, x_2 = 0); the two lines cross at x_1 = 1/5, where the guaranteed payoff is 3/5.]
Player 2 might do the same thing using probability vector y = (y_1, y_2) where Σ_i y_i = 1 and y_i ≥ 0 for all i.

[Figure: Player 2's expected losses against each of Player 1's pure strategies, plotted as functions of y_1 from (y_1 = 0, y_2 = 1) to (y_1 = 1, y_2 = 0); the two lines cross at y_1 = 2/5, where the guaranteed loss is 3/5.]
If Player 1 adopts mixed strategy (x_1, x_2) and Player 2 adopts mixed strategy (y_1, y_2), we obtain an expected payoff of

    V = 3x_1y_1 + 0(1−x_1)y_1 − x_1(1−y_1) + (1−x_1)(1−y_1)
      = 5x_1y_1 − y_1 − 2x_1 + 1

Suppose Player 1 uses x_1* = 1/5. Then

    V = 5(1/5)y_1 − y_1 − 2(1/5) + 1 = 3/5

which doesn't depend on y! Similarly, suppose Player 2 uses y_1* = 2/5. Then

    V = 5x_1(2/5) − (2/5) − 2x_1 + 1 = 3/5

which doesn't depend on x!
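This independence is easy to confirm numerically. A quick sketch using the expected-payoff formula V = 5·x_1·y_1 − y_1 − 2·x_1 + 1 derived above:

```python
def E(x1, y1):
    """Expected payoff to Player 1 under mixed strategies (x1, 1-x1), (y1, 1-y1)."""
    return 5*x1*y1 - y1 - 2*x1 + 1

grid = [k / 10 for k in range(11)]
vals_fixed_x = [E(1/5, y1) for y1 in grid]   # Player 1 commits to x1* = 1/5
vals_fixed_y = [E(x1, 2/5) for x1 in grid]   # Player 2 commits to y1* = 2/5
# Every entry in both lists is 3/5 (up to floating-point rounding).
```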
Each player is solving a constrained optimization problem. For Player 1 the problem is

    max v
    s.t.  3x_1 + 0x_2 ≥ v
         −1x_1 + 1x_2 ≥ v
          x_1 + x_2 = 1
          x_i ≥ 0   ∀i

which can be illustrated as follows:

[Figure: the two payoff lines from the plot above, with the objective v pushed up to their lower envelope; the optimum sits at the intersection of the lines.]
This problem is equivalent to

    max_x min{ (3x_1 + 0x_2), (−x_1 + x_2) }

For Player 2 the problem is

    min v
    s.t.  3y_1 − 1y_2 ≤ v
          0y_1 + 1y_2 ≤ v
          y_1 + y_2 = 1
          y_j ≥ 0   ∀j

which is equivalent to

    min_y max{ (3y_1 − y_2), (0y_1 + y_2) }
We recognize these as dual linear programming problems.
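Player 1's problem can also be checked by brute force: substituting x_2 = 1 − x_1 reduces it to a one-dimensional search over the lower envelope min(3x_1, 1 − 2x_1). A sketch (the grid search is our own illustration, not the notes' method):

```python
def lower_envelope(x1):
    """Player 1's guaranteed payoff: the worst case over Player 2's two columns."""
    return min(3*x1 + 0*(1 - x1),    # Player 2 plays column 1
               -1*x1 + 1*(1 - x1))   # Player 2 plays column 2

grid = [k / 1000 for k in range(1001)]
best_x1 = max(grid, key=lower_envelope)
best_value = lower_envelope(best_x1)
# The maximum sits at x1 = 1/5 with value 3/5, matching the graphical solution.
```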
Question 2.4. We now have a way to compute a “spy-proof” mixed strategy for each player. Modify these two mathematical programming problems to produce the pure security strategy for each player.
In general, the players are solving the following pair of dual linear programming problems:

    max v
    s.t.  Σ_i a_ij x_i ≥ v   ∀j
          Σ_i x_i = 1
          x_i ≥ 0   ∀i

and

    min v
    s.t.  Σ_j a_ij y_j ≤ v   ∀i
          Σ_j y_j = 1
          y_j ≥ 0   ∀j
Note 2.4. Consider, once again, the example game

    A = ⎡ 3  −1 ⎤
        ⎣ 0   1 ⎦

If Player 1 (the maximizer) uses mixed strategy (x_1, (1−x_1)), and if Player 2 (the minimizer) uses mixed strategy (y_1, (1−y_1)), we get

    E(x, y) = 5x_1y_1 − y_1 − 2x_1 + 1

and letting x_1* = 1/5 and y_1* = 2/5, we get E(x*, y) = E(x, y*) = 3/5 for any x and y. These choices for x* and y* make the expected value independent of the opposing strategy. So, if Player 1 becomes a minimizer (or if Player 2 becomes a maximizer) the resulting mixed strategies would be the same!
Note 2.5. Consider the game

    A = ⎡ 1  3 ⎤
        ⎣ 4  2 ⎦

By "factoring" the expression for E(x, y), we can write

    E(x, y) = x_1y_1 + 3x_1(1−y_1) + 4(1−x_1)y_1 + 2(1−x_1)(1−y_1)
            = −4x_1y_1 + x_1 + 2y_1 + 2
            = −4(x_1y_1 − x_1/4 − y_1/2 + 1/8) + 2 + 1/2
            = −4(x_1 − 1/2)(y_1 − 1/4) + 5/2

It's now easy to see that x_1* = 1/2, y_1* = 1/4 and v = 5/2.
2.1.5 A more formal statement of the problem
Suppose we are given an m×n matrix game A ≡ [a_ij]. Each row of A is a pure strategy for Player 1. Each column of A is a pure strategy for Player 2. The value of a_ij is the payoff from Player 2 to Player 1 (it may be negative).

For Player 1 let

    V̲(A) = max_i min_j a_ij

For Player 2 let

    V̄(A) = min_j max_i a_ij

{Case 1} (Saddle Point Case where V̲(A) = V̄(A) = V)
Player 1 can assure himself of getting at least V from Player 2 by playing his maximin strategy.

{Case 2} (Mixed Strategy Case where V̲(A) < V̄(A))

Player 1 uses probability vector

    x = (x_1, ..., x_m),  Σ_i x_i = 1,  x_i ≥ 0

Player 2 uses probability vector

    y = (y_1, ..., y_n),  Σ_j y_j = 1,  y_j ≥ 0

If Player 1 uses x and Player 2 uses strategy j, the expected payoff is

    E(x, j) = Σ_i x_i a_ij = xA_j

where A_j is column j from matrix A. If Player 2 uses y and Player 1 uses strategy i, the expected payoff is

    E(i, y) = Σ_j a_ij y_j = A_i y^T

where A_i is row i from matrix A. Combined, if Player 1 uses x and Player 2 uses y, the expected payoff is

    E(x, y) = Σ_i Σ_j x_i a_ij y_j = xAy^T
The players are solving the following pair of dual linear programming problems:

    max v
    s.t.  Σ_i a_ij x_i ≥ v   ∀j
          Σ_i x_i = 1
          x_i ≥ 0   ∀i

and

    min v
    s.t.  Σ_j a_ij y_j ≤ v   ∀i
          Σ_j y_j = 1
          y_j ≥ 0   ∀j
The Minimax Theorem (von Neumann, 1928) states that there exist mixed strategies x* and y* for Players 1 and 2 which solve each of the above problems with equal objective function values.
2.1.6 Proof of the Minimax Theorem
Note 2.6. (From Başar and Olsder [2]) The theory of finite zero-sum games dates back to Borel in the early 1920's, whose work on the subject was later translated into English (Borel, 1953). Borel introduced the notion of a conflicting decision situation that involves more than one decision maker, and the concepts of pure and mixed strategies, but he did not really develop a complete theory of zero-sum games. Borel even conjectured that the Minimax Theorem was false.
It was von Neumann who first came up with a proof of the Minimax Theorem, and laid down the foundations of game theory as we know it today (von Neumann 1928, 1937).
We will provide two proofs of this important theorem. The first proof (Theorem 2.4) uses only the Separating Hyperplane Theorem. The second proof (Theorem 2.5) uses the similar, but more powerful, tool of duality from the theory of linear programming.
Our first, and direct, proof of the Minimax Theorem is based on the proof by von Neumann and Morgenstern [7]. It also appears in the book by Başar and Olsder [2].
It depends on the Separating Hyperplane Theorem:¹

Theorem 2.3. (From [1]) Separating Hyperplane Theorem. Let S and T be two non-empty, convex sets in R^n with no interior point in common. Then there exists a pair (p, c) with p ∈ R^n, p ≠ 0, and c ∈ R such that

    px ≥ c   ∀x ∈ S
    py ≤ c   ∀y ∈ T

i.e., there is a hyperplane H(p, c) = {x ∈ R^n | px = c} that separates S and T.

Proof: Define S − T = {x − y ∈ R^n | x ∈ S, y ∈ T}. S − T is convex. Moreover, 0 ∉ int(S − T): if 0 ∈ int(S − T), then there would be an x ∈ int(S) and y ∈ int(T) such that x − y = 0, or simply x = y, which would be a common interior point. Thus, we can "separate" 0 from S − T, i.e., there exist p ∈ R^n with p ≠ 0 and c ∈ R such that p·(x − y) ≥ c and p·0 ≤ c. But this implies that

    p·0 = 0 ≤ c ≤ p·(x − y)

which implies p·(x − y) ≥ 0. Hence, px ≥ py for all x ∈ S and for all y ∈ T. That is, there must be a c ∈ R such that

    py ≤ c ≤ px   ∀x ∈ S and ∀y ∈ T
A version of Theorem 2.3 also appears in a paper by Gale [5] and a text by Boot [3].
Theorem 2.3 can be used to produce the following corollary that we will use to prove the Minimax Theorem:
Corollary 2.1. Let A be an arbitrary m×n matrix. Then either

(i) there exists a nonzero vector x ∈ R^m, x ≥ 0, such that xA ≥ 0, or
(ii) there exists a nonzero vector y ∈ R^n, y ≥ 0, such that Ay^T ≤ 0.
Theorem 2.4. Minimax Theorem. Let A = [a_ij] be an m×n matrix of real numbers. Let Ξ_r denote the set of all r-dimensional probability vectors, that is,

    Ξ_r = {x ∈ R^r | Σ_{i=1}^r x_i = 1 and x_i ≥ 0}

¹I must thank Yong Bao for his help in finding several errors in a previous version of these notes.

We sometimes call Ξ_r the probability simplex. Let x ∈ Ξ_m and y ∈ Ξ_n. Define

    V̲_m(A) ≡ max_x min_y xAy^T
    V̄_m(A) ≡ min_y max_x xAy^T

Then V̲_m(A) = V̄_m(A).
Proof: First we will prove that

    V̲_m(A) ≤ V̄_m(A)    (1)

To do so, note that xAy^T, max_x xAy^T and min_y xAy^T are all continuous functions (of (x, y), of y, and of x, respectively). Any continuous, real-valued function on a compact set attains its extrema. Therefore, there exist x0 and y0 such that

    V̲_m(A) = min_y x0 A y^T
    V̄_m(A) = max_x x A y0^T

It is clear that

    V̲_m(A) ≤ x0 A y0^T ≤ V̄_m(A)    (2)

Thus relation (1) is true.
Now we will show that one of the following must be true:

    V̄_m(A) ≤ 0   or   V̲_m(A) ≥ 0    (3)

Corollary 2.1 provides that, for any matrix A, one of the two conditions (i) or (ii) in the corollary must be true. Suppose that condition (ii) is true. Then there exists y0 ∈ Ξ_n such that²

    A y0^T ≤ 0
    ⇒ x A y0^T ≤ 0   ∀x ∈ Ξ_m
    ⇒ max_x x A y0^T ≤ 0

Hence

    V̄_m(A) = min_y max_x xAy^T ≤ 0

²Corollary 2.1 says that there must exist such a y0 ∈ R^n. Why doesn't it make a difference when we use Ξ_n rather than R^n?

Alternatively, if (i) is true then we can similarly show that

    V̲_m(A) = max_x min_y xAy^T ≥ 0
Define the m×n matrix B = [b_ij] where b_ij = a_ij − c for all (i, j) and where c is a constant. Note that

    V̲_m(B) = V̲_m(A) − c   and   V̄_m(B) = V̄_m(A) − c

Since A was an arbitrary matrix, the previous results also hold for B. Hence either

    V̄_m(B) = V̄_m(A) − c ≤ 0   or   V̲_m(B) = V̲_m(A) − c ≥ 0

Thus, for any constant c, either

    V̄_m(A) ≤ c   or   V̲_m(A) ≥ c

Relation (1) guarantees that V̲_m(A) ≤ V̄_m(A). Therefore, there exists a Δ ≥ 0 such that

    V̲_m(A) + Δ = V̄_m(A).

Suppose Δ > 0. Choose c = V̲_m(A) + Δ/2. Then

    V̲_m(A) < c   and   V̄_m(A) > c

so neither of the two conditions above holds for this c. This contradicts our previous result. Hence Δ = 0 and V̲_m(A) = V̄_m(A).
2.1.7 The Minimax Theorem and duality
The next version of the Minimax Theorem uses duality and provides several fundamental links between game theory and the theory of linear programming.³

Theorem 2.5. Consider the matrix game A with mixed strategies x and y for Player 1 and Player 2, respectively. Then

³This theorem and proof is from my own notebook from a Game Theory course taught at Cornell in the summer of 1972. The course was taught by Professors William Lucas and Louis Billera. I believe, but I cannot be sure, that this particular proof is from Professor Billera.
1. minimax statement

    max_x min_y E(x, y) = min_y max_x E(x, y)

2. saddle point statement (mixed strategies) There exist x* and y* such that

    E(x, y*) ≤ E(x*, y*) ≤ E(x*, y)

for all x and y.

2a. saddle point statement (pure strategies) Let E(i, y) denote the expected value for the game if Player 1 uses pure strategy i and Player 2 uses mixed strategy y. Let E(x, j) denote the expected value for the game if Player 1 uses mixed strategy x and Player 2 uses pure strategy j. There exist x* and y* such that

    E(i, y*) ≤ E(x*, y*) ≤ E(x*, j)

for all i and j.

3. LP feasibility statement There exist x*, y*, and v′ = v″ such that

    Σ_i a_ij x_i* ≥ v′   ∀j
    Σ_i x_i* = 1
    x_i* ≥ 0   ∀i

and

    Σ_j a_ij y_j* ≤ v″   ∀i
    Σ_j y_j* = 1
    y_j* ≥ 0   ∀j

4. LP duality statement The objective function values are the same for the following two linear programming problems:

    max v
    s.t.  Σ_i a_ij x_i ≥ v   ∀j
          Σ_i x_i = 1
          x_i ≥ 0   ∀i

and

    min v
    s.t.  Σ_j a_ij y_j ≤ v   ∀i
          Σ_j y_j = 1
          y_j ≥ 0   ∀j
Proof: We will sketch the proof for the above results by showing that

    (4) ⇒ (3) ⇒ (2) ⇒ (1) ⇒ (3) ⇒ (4)

and

    (2) ⇔ (2a).

{(4) ⇒ (3)} (3) is just a special case of (4).

{(3) ⇒ (2)} Let 1_n denote a column vector of n ones. Then (3) implies that there exist x*, y*, and v′ = v″ such that

    x*A ≥ v′1_n^T
    ⇒ x*Ay^T ≥ v′(1_n^T y^T) = v′   ∀y

and

    Ay*^T ≤ v″1_m
    ⇒ xAy*^T ≤ v″(x 1_m) = v″   ∀x

Hence,

    E(x*, y) ≥ v′ = v″ ≥ E(x, y*)   ∀x, y

and, in particular, E(x*, y*) = v′ = v″.

{(2) ⇒ (2a)} (2a) is just a special case of (2), using the mixed strategies x with x_i = 1 and x_k = 0 for k ≠ i (and similarly for y).

{(2a) ⇒ (2)} Any mixed strategy x is a convex combination of the pure strategies (the vectors x with x_i = 1 and x_k = 0 for k ≠ i), and E(x, y*) = Σ_i x_i E(i, y*). Since E(i, y*) ≤ E(x*, y*) for every i, it follows that E(x, y*) ≤ E(x*, y*) for all x. The inequality E(x*, y) ≥ E(x*, y*) follows similarly.
{(2) ⇒ (1)}

• {Case ≥}

    E(x, y*) ≤ E(x*, y)   ∀x, y
    max_x E(x, y*) ≤ E(x*, y)   ∀y
    max_x E(x, y*) ≤ min_y E(x*, y)
    min_y max_x E(x, y) ≤ max_x E(x, y*) ≤ min_y E(x*, y) ≤ max_x min_y E(x, y)

• {Case ≤}

    min_y E(x, y) ≤ E(x, y)   ∀x, y
    max_x [min_y E(x, y)] ≤ max_x E(x, y)   ∀y
    max_x [min_y E(x, y)] ≤ min_y [max_x E(x, y)]
{(1) ⇒ (3)}

    max_x [min_y E(x, y)] = min_y [max_x E(x, y)]

Let f(x) = min_y E(x, y). Since f is continuous on the compact set Ξ_m, there exists x* at which f attains its maximum value. Hence

    min_y E(x*, y) = max_x [min_y E(x, y)]

{(3) ⇒ (4)} This is direct from the duality theorem of LP. (See Chapter 13 of Dantzig's text.)
Question 2.5. Can the LP problems in part (4) of Theorem 2.5 have alternate optimal solutions? If so, how does that affect the choice of (x*, y*)?⁴
2.2 Two-Person General-Sum Games

2.2.1 Basic ideas
Two-person general-sum games (sometimes called "bi-matrix games") can be represented by two m×n matrices A = [a_ij] and B = [b_ij], where a_ij is the "payoff" to Player 1 and b_ij is the "payoff" to Player 2. If A = −B then we get a two-person zero-sum game, A.
Note 2.7. These are non-cooperative games with no side payments.
Definition 2.2. The (pure) strategy (i*, j*) is a Nash equilibrium solution to the game (A, B) if

    a_{i*,j*} ≥ a_{i,j*}   ∀i
    b_{i*,j*} ≥ b_{i*,j}   ∀j
Note 2.8. If both players are placed on their respective Nash equilibrium strategies (i*, j*), then neither player can unilaterally move away from that strategy and improve his payoff.
⁴Thanks to Esra E. Aleisa for this question.
Question 2.6. Show that ifA=−B (zero-sum case), the above definition of a Nash solution corresponds to our previous definition of a saddle point.
Note 2.9. Not every game has a Nash solution using pure strategies.
Note 2.10. A Nash solution need not be the best solution, or even a reasonable solution, for a game. It's merely a stable solution against unilateral moves by a single player. For example, consider the game

    (A, B) = ⎡ (4,0)  (4,1) ⎤
             ⎣ (5,3)  (3,2) ⎦

This game has two Nash equilibrium strategies, with payoffs (4,1) and (5,3). Note that both players prefer (5,3) when compared with (4,1).
Question 2.7. What is the solution to the following simple modification of the above game:⁵

    (A, B) = ⎡ (4,0)  (4,1) ⎤
             ⎣ (4,2)  (3,2) ⎦
Example 2.1. (Prisoner's Dilemma) Two suspects in a crime have been picked up by police and placed in separate rooms. If both confess (C), each will be sentenced to 3 years in prison. If only one confesses, he will be set free and the other (who didn't confess (NC)) will be sent to prison for 4 years. If neither confesses, they will both go to prison for 1 year.

This game can be represented in strategic form, as follows:

             C           NC
    C     (−3,−3)     (0,−4)
    NC    (−4,0)      (−1,−1)

This game has one Nash equilibrium strategy, with payoffs (−3,−3). When compared with the other solutions, note that it represents one of the worst outcomes for both players.
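Pure-strategy Nash equilibria like these can be found by simple enumeration: cell (i, j) qualifies iff a_ij is maximal in column j of A and b_ij is maximal in row i of B. A minimal sketch (our own helper, 0-indexed):

```python
def pure_nash(A, B):
    """All pure-strategy Nash cells (i, j) of the bi-matrix game (A, B)."""
    m, n = len(A), len(A[0])
    return [(i, j)
            for i in range(m) for j in range(n)
            if all(A[i][j] >= A[k][j] for k in range(m))     # Player 1 can't improve
            and all(B[i][j] >= B[i][l] for l in range(n))]   # Player 2 can't improve

# Prisoner's Dilemma (strategy order: C, NC):
A = [[-3, 0], [-4, -1]]   # Player 1's payoffs
B = [[-3, -4], [0, -1]]   # Player 2's payoffs
equilibria = pure_nash(A, B)   # only (C, C), i.e. [(0, 0)]
```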
2.2.2 Properties of Nash strategies

⁵Thanks to Esra E. Aleisa for this question.
Definition 2.3. The pure strategy pair (i1, j1) weakly dominates (i2, j2) if and only if

    a_{i1,j1} ≥ a_{i2,j2}
    b_{i1,j1} ≥ b_{i2,j2}

and at least one of the above inequalities is strict.

Definition 2.4. The pure strategy pair (i1, j1) strongly dominates (i2, j2) if and only if

    a_{i1,j1} > a_{i2,j2}
    b_{i1,j1} > b_{i2,j2}

Definition 2.5. (Weiss [8]) The pure strategy pair (i, j) is inadmissible if there exists some strategy pair (i′, j′) that weakly dominates (i, j).

Definition 2.6. (Weiss [8]) The pure strategy pair (i, j) is admissible if it is not inadmissible.
Example 2.2. Consider again the game

    (A, B) = ⎡ (4,0)  (4,1) ⎤
             ⎣ (5,3)  (3,2) ⎦

with Nash equilibrium strategies yielding (4,1) and (5,3). Only (5,3) is admissible.
Note 2.11. If there exist multiple admissible Nash equilibria, then side-payments (with collusion) may yield a "better" solution for all players.
Definition 2.7. Two bi-matrix games (A, B) and (C, D) are strategically equivalent if there exist α1 > 0, α2 > 0 and scalars β1, β2 such that

    a_ij = α1 c_ij + β1   ∀i, j
    b_ij = α2 d_ij + β2   ∀i, j

Theorem 2.6. If bi-matrix games (A, B) and (C, D) are strategically equivalent and (i*, j*) is a Nash strategy for (A, B), then (i*, j*) is also a Nash strategy for (C, D).
Note 2.12. This was used to modify the original matrices for the Prisoner's Dilemma problem in Example 2.1.
2.2.3 Nash equilibria using mixed strategies
Sometimes the bi-matrix game(A, B)does not have a Nash strategy using pure strategies. As before, we can use mixed strategies for such games.
Definition 2.8. The (mixed) strategy (x*, y*) is a Nash equilibrium solution to the game (A, B) if

    x*Ay*^T ≥ xAy*^T   ∀x ∈ Ξ_m
    x*By*^T ≥ x*By^T   ∀y ∈ Ξ_n

where Ξ_r is the r-dimensional probability simplex.
Question 2.8. Consider the game

    (A, B) = ⎡ (−2,−4)  (0,−3) ⎤
             ⎣ (−3,0)   (1,−1) ⎦

Can we find mixed strategies (x*, y*) that provide a Nash solution as defined above?
Theorem 2.7. Every bi-matrix game has at least one Nash equilibrium solution in mixed strategies.
Proof: (This is the sketch provided by the text for Proposition 33.1; see Chapter 3 for a complete proof for N ≥ 2 players.) Consider the sets Ξ_m and Ξ_n consisting of the mixed strategies for Player 1 and Player 2, respectively. Note that Ξ_m × Ξ_n is non-empty, convex and compact. Since the expected payoff functions xAy^T and xBy^T are linear in (x, y), the result follows using Brouwer's fixed point theorem.
2.2.4 Finding Nash mixed strategies

Consider again the game

    (A, B) = ⎡ (−2,−4)  (0,−3) ⎤
             ⎣ (−3,0)   (1,−1) ⎦

For Player 1,

    xAy^T = −2x_1y_1 − 3(1−x_1)y_1 + (1−x_1)(1−y_1)
          = 2x_1y_1 − x_1 − 4y_1 + 1

For Player 2,

    xBy^T = −2x_1y_1 − 2x_1 + y_1 − 1
In order for (x*, y*) to be a Nash equilibrium, we must have

    x*Ay*^T ≥ xAy*^T   ∀x ∈ Ξ_m    (4)
    x*By*^T ≥ x*By^T   ∀y ∈ Ξ_n    (5)

For Player 1 this means that we want (x*, y*) so that for all 0 ≤ x_1 ≤ 1

    2x_1*y_1* − x_1* − 4y_1* + 1 ≥ 2x_1y_1* − x_1 − 4y_1* + 1
    2x_1*y_1* − x_1* ≥ 2x_1y_1* − x_1

Let's try y_1* = 1/2. We get

    2x_1*(1/2) − x_1* ≥ 2x_1(1/2) − x_1
    0 ≥ 0

Therefore, if y* = (1/2, 1/2) then any x* can be chosen and condition (4) will be satisfied.
Note that only condition (4) and Player 1's matrix A were used to get Player 2's strategy y*.
For Player 2 the same thing happens if we use x_1* = 1/2 and condition (5). That is, for all 0 ≤ y_1 ≤ 1

    −2x_1*y_1* − 2x_1* + y_1* − 1 ≥ −2x_1*y_1 − 2x_1* + y_1 − 1
    −2x_1*y_1* + y_1* ≥ −2x_1*y_1 + y_1

With x_1* = 1/2:

    −2(1/2)y_1* + y_1* ≥ −2(1/2)y_1 + y_1
    0 ≥ 0
How can we get the values of (x*, y*) that will work? One suggested approach (from Başar and Olsder [2]) uses the following:

Theorem 2.8. Any mixed Nash equilibrium solution (x*, y*) in the interior of Ξ_m × Ξ_n must satisfy

    Σ_{j=1}^n y_j*(a_ij − a_1j) = 0   ∀i ≠ 1    (6)
    Σ_{i=1}^m x_i*(b_ij − b_i1) = 0   ∀j ≠ 1    (7)
Proof: Recall that

    E(x, y) = xAy^T = Σ_{i=1}^m Σ_{j=1}^n x_i y_j a_ij = Σ_{j=1}^n Σ_{i=1}^m x_i y_j a_ij

Since x_1 = 1 − Σ_{i=2}^m x_i, we have

    xAy^T = Σ_{j=1}^n [ Σ_{i=2}^m x_i y_j a_ij + (1 − Σ_{i=2}^m x_i) y_j a_1j ]
          = Σ_{j=1}^n [ y_j a_1j + y_j Σ_{i=2}^m x_i (a_ij − a_1j) ]
          = Σ_{j=1}^n y_j a_1j + Σ_{i=2}^m x_i Σ_{j=1}^n y_j (a_ij − a_1j)

If (x*, y*) is an interior maximum (or minimum) then

    ∂/∂x_i (xAy^T) = Σ_{j=1}^n y_j (a_ij − a_1j) = 0   for i = 2, ..., m

which provides Equations (6). The derivation of Equations (7) is similar.
Note 2.13. In the proof we have the equation

    xAy^T = Σ_{j=1}^n y_j a_1j + Σ_{i=2}^m x_i Σ_{j=1}^n y_j (a_ij − a_1j)

Any Nash solution (x*, y*) in the interior of Ξ_m × Ξ_n has

    Σ_{j=1}^n y_j*(a_ij − a_1j) = 0   ∀i ≠ 1

So this choice of y* produces

    xAy^T = Σ_{j=1}^n y_j a_1j + Σ_{i=2}^m x_i [0]

making this expression independent of x.
Note 2.14. Equations (6) and (7) only provide necessary (not sufficient) conditions, and only characterize solutions on the interior of the probability simplex (i.e., where every component of x and y is strictly positive).
For our example, these equations produce

    y_1*(a_21 − a_11) + y_2*(a_22 − a_12) = 0
    x_1*(b_12 − b_11) + x_2*(b_22 − b_21) = 0

Since x_2* = 1 − x_1* and y_2* = 1 − y_1*, we get

    y_1*(−3 − (−2)) + (1 − y_1*)(1 − 0) = 0
    −y_1* + (1 − y_1*) = 0
    y_1* = 1/2

    x_1*(−3 − (−4)) + (1 − x_1*)(−1 − 0) = 0
    x_1* − (1 − x_1*) = 0
    x_1* = 1/2

But, in addition, one must check that x_1* = 1/2 and y_1* = 1/2 are actually Nash solutions.
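For 2×2 games, equations (6) and (7) can be solved directly. A sketch in exact arithmetic (the solver is our own illustration; as noted above, the resulting candidate must still be verified to be a Nash solution):

```python
from fractions import Fraction

def interior_candidate_2x2(A, B):
    """Solve (6) for y1* (using A) and (7) for x1* (using B) in a 2x2 game."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # (6): y1*(a21 - a11) + (1 - y1)*(a22 - a12) = 0
    y1 = Fraction(a22 - a12, (a22 - a12) - (a21 - a11))
    # (7): x1*(b12 - b11) + (1 - x1)*(b22 - b21) = 0
    x1 = Fraction(b22 - b21, (b22 - b21) - (b12 - b11))
    return x1, y1

A = [[-2, 0], [-3, 1]]
B = [[-4, -3], [0, -1]]
x1, y1 = interior_candidate_2x2(A, B)   # both come out to 1/2, as above
```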
2.2.5 The Lemke-Howson algorithm
Lemke and Howson [6] developed a quadratic programming technique for finding mixed Nash strategies for two-person general-sum games (A, B) in strategic form. Their method is based on the following fact, provided in their paper:

Let e_k denote a column vector of k ones, and let x and y be row vectors of dimension m and n, respectively. Let p and q denote scalars. We will also assume that A and B are matrices, each with m rows and n columns.

A mixed strategy is defined by a pair (x, y) such that

    x e_m = y e_n = 1,  and  x ≥ 0, y ≥ 0    (8)

with expected payoffs

    xAy^T  and  xBy^T.    (9)

A Nash equilibrium solution is a pair (x̄, ȳ) satisfying (8) such that for all (x, y) satisfying (8),

    xAȳ^T ≤ x̄Aȳ^T  and  x̄By^T ≤ x̄Bȳ^T.    (10)

But this implies that

    Aȳ^T ≤ (x̄Aȳ^T)e_m  and  x̄B ≤ (x̄Bȳ^T)e_n^T.    (11)

Conversely, suppose (11) holds for (x̄, ȳ) satisfying (8). Now choose an arbitrary (x, y) satisfying (8). Multiply the first expression in (11) on the left by x and the second expression in (11) on the right by y^T to get (10). Hence, (8) and (11) are, together, equivalent to (8) and (10).
This serves as the foundation for the proof of the following theorem:

Theorem 2.9. Any mixed strategy (x*, y*) for bi-matrix game (A, B) is a Nash equilibrium solution if and only if x*, y*, p* and q* solve problem (LH):

    (LH): max_{x,y,p,q} { xAy^T + xBy^T − p − q }
    s.t.  Ay^T ≤ p e_m
          B^T x^T ≤ q e_n
          Σ_{i=1}^m x_i = 1
          Σ_{j=1}^n y_j = 1
          x_i ≥ 0   ∀i
          y_j ≥ 0   ∀j

Proof: (⇒)
Every feasible solution (x, y, p, q) to problem (LH) must satisfy the constraints

    Ay^T ≤ p e_m
    xB ≤ q e_n^T.

Multiply both sides of the first constraint on the left by x and multiply the second constraint on the right by y^T. As a result, we see that a feasible (x, y, p, q) must satisfy

    xAy^T ≤ p
    xBy^T ≤ q.

Hence, for any feasible (x, y, p, q), the objective function must satisfy

    xAy^T + xBy^T − p − q ≤ 0.

Suppose (x*, y*) is any Nash solution for (A, B). Let

    p* = x*Ay*^T
    q* = x*By*^T.

Because of (10) and (11), this implies

    Ay*^T ≤ (x*Ay*^T)e_m = p* e_m
    x*B ≤ (x*By*^T)e_n^T = q* e_n^T.

So this choice of (x*, y*, p*, q*) is feasible, and results in an objective function value of zero. Hence it's an optimal solution to problem (LH).
(⇐)

Suppose (x̄, ȳ, p̄, q̄) solves problem (LH). From Theorem 2.7, there is at least one Nash solution (x*, y*). Using the above argument, (x*, y*) must be an optimal solution to (LH) with an objective function value of zero. Since (x̄, ȳ, p̄, q̄) is an optimal solution to (LH), we must then have

    x̄Aȳ^T + x̄Bȳ^T − p̄ − q̄ = 0    (12)

with (x̄, ȳ, p̄, q̄) satisfying the constraints

    Aȳ^T ≤ p̄ e_m    (13)
    x̄B ≤ q̄ e_n^T.    (14)

Now multiply (13) on the left by x̄ and multiply (14) on the right by ȳ^T to get

    x̄Aȳ^T ≤ p̄    (15)
    x̄Bȳ^T ≤ q̄.    (16)

Then (12), (15), and (16) together imply

    x̄Aȳ^T = p̄
    x̄Bȳ^T = q̄.

So (13) and (14) can now be rewritten as

    Aȳ^T ≤ (x̄Aȳ^T)e_m    (17)
    x̄B ≤ (x̄Bȳ^T)e_n^T.    (18)

Choose an arbitrary (x, y) ∈ Ξ_m × Ξ_n and, this time, multiply (17) on the left by x and multiply (18) on the right by y^T to get

    xAȳ^T ≤ x̄Aȳ^T    (19)
    x̄By^T ≤ x̄Bȳ^T    (20)

for all (x, y) ∈ Ξ_m × Ξ_n. Hence (x̄, ȳ) is a Nash equilibrium solution.
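As a numerical sanity check on Theorem 2.9 (our own illustration, not part of the proof), the mixed Nash pair found in Section 2.2.4 for the game of Question 2.8 should be feasible for (LH) with objective value zero:

```python
A = [[-2, 0], [-3, 1]]
B = [[-4, -3], [0, -1]]
x = [0.5, 0.5]   # x* found in Section 2.2.4
y = [0.5, 0.5]   # y* found in Section 2.2.4

def bilinear(x, M, y):
    """x M y^T for row vectors x and y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

p = bilinear(x, A, y)   # p* = x* A y*^T
q = bilinear(x, B, y)   # q* = x* B y*^T
objective = bilinear(x, A, y) + bilinear(x, B, y) - p - q   # zero at a Nash pair

# Feasibility: A y^T <= p*e_m and x B <= q*e_n^T hold componentwise.
Ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
xB = [sum(x[i] * B[i][j] for i in range(2)) for j in range(2)]
```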
2.3 BIBLIOGRAPHY
[1] Anon., The history of economic thought web site, Department of Economics, New School University (2003)
http://cepa.newschool.edu/het/home.htm
[2] T. Başar and G. Olsder, Dynamic noncooperative game theory, Academic Press (1982).
[3] John C. G. Boot, Quadratic programming, North-Holland, Amsterdam, (1964).
[4] G. Debreu, Separation theorems for convex sets, in T. C. Koopmans and A. F. Bausch, Selected topics in economics involving mathematical reasoning, SIAM Review 1 (1959) 79–148.
[5] D. Gale, The basic theorems of real linear equations, inequalities, linear programming and game theory, Naval Research Logistics Quarterly, Vol. 3 (1956) 193–200.
[6] C. E. Lemke and J. T. Howson, Jr., Equilibrium points of bimatrix games, SIAM Journal, Volume 12, Issue 2 (Jun., 1964), pp 413–423.
[7] J. von Neumann and O. Morgenstern, Theory of games and economic behavior, Princeton Univ. Press (1947).
[8] L. Weiss, Statistical decision theory, McGraw-Hill (1961).