New Complexity Results about Nash Equilibria

We provide a single reduction that demonstrates that in normal-form games: 1) it is NP-complete to determine whether Nash equilibria with certain natural properties exist (these results are similar to those obtained by Gilboa and Zemel [17]), 2) more significantly, the problems of maximizing certain properties of a Nash equilibrium are inapproximable (unless P = NP), and 3) it is #P-hard to count the Nash equilibria. We also show that determining whether a pure-strategy Bayes-Nash equilibrium exists in a Bayesian game is NP-complete, and that determining whether a pure-strategy Nash equilibrium exists in a Markov (stochastic) game is PSPACE-hard even if the game is unobserved (and that this remains NP-hard if the game has finite length). All of our hardness results hold even if there are only two players and the game is symmetric.


Introduction
Game theory provides a normative framework for analyzing strategic interactions. However, in order for anyone to play according to the solutions that it prescribes, these solutions must be computed. There are many different ways in which this can happen: a player can consciously solve the game (possibly with the help of a computer¹); some players can perhaps eyeball the game and find the solution by intuition, even without being aware of the general solution concept; and in some cases, the players can converge to the solution by following simple learning rules. In each case, some computational machinery (respectively, one player's conscious brain, a computer, one player's subconscious brain, or the system consisting of all players together) arrives at the solution using some procedure, or algorithm.
Some of the most basic computational problems in game theory concern the computation of Nash equilibria of a finite normal-form game. An example problem is to compute one Nash equilibrium; any equilibrium will do. What are good algorithms for solving such a problem? Certainly, we want the algorithm to always return a correct solution. Moreover, we are interested in how fast the algorithm returns a solution. Generally, as the size of the game (more generally, the problem instance) increases, so does the running time of the algorithm. Whether the algorithm is practical for solving larger instances depends on how rapidly its running time increases. An algorithm is generally considered efficient if its running time is at most a polynomial function of the size of the instance (game). There are certainly other properties that one may want the algorithm to have (for example, one may be interested in learning algorithms that are simple enough for people to use), but the algorithm should at least be correct and computationally efficient.
The same computational problem may admit both efficient and inefficient algorithms. The theory of computational complexity aims to analyze the inherent complexity of the problem itself: how fast is the fastest (correct) algorithm for a given problem? P is the class of problems that admit at least one efficient (polynomial-time) algorithm.² While many problems have been proved to be in P (generally by explicitly giving an algorithm and proving a bound on its running time), it is extremely rare that someone proves that a problem is not in P. Instead, to show that a problem is hard, computer scientists generally prove results of the form: "If this problem can be solved efficiently, then so can every member of the class X of problems." This is usually shown using a reduction from one problem to another (we will give more detail on reductions in Section 2). If this has been proven, the problem is said to be X-hard (and X-complete if, additionally, the problem has also been shown to lie in X). The strength of such a hardness result depends on the class X used. Usually, the class NP is used (we will describe NP in more detail in Section 2), and most problems of interest turn out to be either in P or NP-hard. NP contains P, and it is generally considered unlikely that P = NP. Exhibiting a polynomial-time algorithm for an NP-hard problem (thereby showing P = NP) would constitute a truly major upset: among other things, it would (at least in a theoretical sense, and possibly in a practical sense) break current approaches to cryptography, and it would allow a computer to find a proof of any theorem that has a proof of reasonable length.
The problem of finding just one Nash equilibrium of a finite normal-form game is one of the rare interesting problems that have neither been shown to be in P, nor shown to be NP-hard. Not too long ago, it was dubbed "a most fundamental computational problem whose complexity is wide open" and "together with factoring, [...] the most important concrete open question on the boundary of P today" [39]. A recent sequence of breakthrough papers [6,7,11,13] shows that the problem is PPAD-complete, even in the two-player case. (An earlier result shows that the problem is no easier if all utilities are required to be in {0, 1} [1].) This gives some evidence that the problem is indeed hard, although not nearly as much is known about the class PPAD as about NP. The best-known algorithm for finding a Nash equilibrium, the Lemke-Howson algorithm [28], has been shown to indeed have exponential running time on some instances (and is therefore not a polynomial-time algorithm) [45]. More recent algorithms for computing Nash equilibria have focused on guessing which of the players' pure strategies receive positive probability in the equilibrium: after this guess, only a simple linear feasibility problem needs to be solved [14,42,44]. These algorithms clearly require exponentially many guesses, and hence exponential time, on some instances, although they are often quite fast in practice.
The interest in the problem of computing a single Nash equilibrium has in large part been driven by the fact that it posed a challenge to complexity theorists. However, from the perspective of a game theorist, this is not always the relevant computational problem. One may, for example, be more interested in what the best equilibrium of the game is (for some definition of "best"), or whether a given pure strategy is played in any equilibrium, etc. Gilboa and Zemel [17] have demonstrated that many of these problems are in fact NP-hard. In Section 3, we continue this line of research by providing a single reduction that proves many results of this type. One important improvement over Gilboa and Zemel's results is that our reduction also shows inapproximability results: for example, not even an equilibrium that is approximately optimal can be found in polynomial time, unless P = NP.³ We also use the reduction to show that counting the number of Nash equilibria (or connected sets of Nash equilibria) is #P-hard.
We proceed to prove some additional results (not based on the main reduction). In Section 4, we consider Bayesian games and show that determining whether a pure-strategy Bayes-Nash equilibrium exists is NP-complete. Finally, in Section 5 we show that determining whether a pure-strategy Nash equilibrium exists in a Markov game is PSPACE-hard even if the game is unobserved, and that this remains NP-hard if the game has finite length. ("Unobserved" means that the players never receive any information about what happened earlier in the game.) All of the hardness results in this paper hold even if there are only two players and the game is symmetric. These results suggest that for sufficiently large games, we cannot expect the players to always play according to these solution concepts, whether they are naïve learning players or sophisticated game theorists armed with state-of-the-art computing equipment.

Brief review of reductions and complexity
A key concept in computational complexity theory is that of a reduction from one problem A to another problem B. Informally, a reduction maps every instance of computational problem A to a corresponding instance of computational problem B, in such a way that the answer to the former instance can be easily inferred from the answer to the latter instance. Moreover, we require that this mapping is itself easy to compute.
If such a reduction exists, then we know that, in a sense, problem A is computationally at most as hard to solve as problem B: if we had an efficient algorithm for problem B, then we could use the reduction together with this algorithm to solve problem A.
The most directly useful reductions are those that reduce a problem of interest to a problem for which we already have an efficient algorithm. However, another (backward) use of reductions is to reduce a problem that is known or conjectured to be hard to the problem of interest. Such a reduction tells us that we cannot hope to find an efficient algorithm for the problem of interest without (implicitly) also finding such an algorithm for the hard problem.
Certain problems have been shown to be hard for a large class of problems (such as NP). Problem A is hard for class X if any problem in X can be reduced to problem A. Thus, exhibiting an efficient algorithm for the hard problem entails exhibiting an efficient algorithm for every problem in the class. Once one problem A has been shown hard for a class, the task of proving that another problem B is hard for the same class generally becomes much easier: we can do so by reducing A to B. A problem is complete for a class if 1) it is hard for the class and 2) the problem is itself in the class.
The class for which problems are most often shown to be hard (or complete) is NP. NP is the class of all decision problems (problems that require a "yes" or "no" answer) such that if the answer to a problem instance is "yes", then there exists a polynomial-sized certificate for that instance that proves that the answer is "yes". More precisely, such a certificate can be used to check that the answer is "yes" in polynomial time. The most famous complete problem for NP is satisfiability (SAT). An instance of satisfiability is given by a Boolean formula in conjunctive normal form (CNF), that is, an "AND" of "ORs" of ground literals (Boolean variables and their negations). We are asked whether there exists some assignment of truth values to the variables such that the formula evaluates to true. For example, the formula $(x_1 \vee x_2) \wedge (-x_1) \wedge (x_1 \vee -x_2 \vee -x_3)$ is satisfiable by setting $x_1$ to false, $x_2$ to true, and $x_3$ to false. (This assignment is also a certificate for the instance, since it is easy to check that it makes the formula evaluate to true.) However, if we add a fourth clause $(x_1 \vee -x_2 \vee x_3)$, then the formula is no longer satisfiable. Satisfiability was the first problem shown to be NP-complete [10], but many other problems have been shown NP-complete since then (often by reducing satisfiability to them).
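The example formula can be checked mechanically. The following is a minimal brute-force sketch, not an efficient SAT procedure: it tries all $2^n$ assignments, which is exactly the exponential behavior that NP-hardness suggests is hard to avoid in general. The clause encoding (integer k for $x_k$, -k for $-x_k$) and the function name are illustrative choices, not notation from the paper.

```python
from itertools import product

# Assumed encoding: a clause is a list of nonzero integers,
# where k stands for x_k and -k stands for the negated literal -x_k.
def satisfying_assignments(clauses, num_vars):
    """Return all assignments (tuples of bools, index 0 is x_1) that satisfy every clause."""
    found = []
    for values in product([False, True], repeat=num_vars):
        # A clause is satisfied if at least one of its literals is true.
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            found.append(values)
    return found

# (x1 v x2) ^ (-x1) ^ (x1 v -x2 v -x3)
formula = [[1, 2], [-1], [1, -2, -3]]
# The same formula with the fourth clause (x1 v -x2 v x3) added.
extended = formula + [[1, -2, 3]]

print(satisfying_assignments(formula, 3))
print(satisfying_assignments(extended, 3))
```

Checking a single given assignment takes time linear in the formula size, which is what makes an assignment a valid certificate; only the search over all assignments is expensive.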
There are other classes of problems that are even larger⁴ than NP, and for which natural problems are sometimes shown to be hard, constituting even stronger evidence that there is no efficient algorithm for the problem. One of these classes is #P, the class of problems counting how many solutions a particular instance has. (It is required that solutions can be verified efficiently.) An example problem in #P is counting how many satisfying assignments a CNF formula has. (This problem is in fact #P-complete [50].) Another class is PSPACE, the class of problems that can be solved using only polynomial space.

The main reduction and its implications
In this section, we give our main reduction, which maps every instance of satisfiability (given by a formula in conjunctive normal form) to a finite symmetric two-player normal-form game. This reduction has no direct complexity implications for the problem of finding one (any) Nash equilibrium. However, it has significant implications for many related problems. Most significantly, it shows that, for many properties, deciding whether an equilibrium with that property exists is NP-hard. For example, it shows that deciding whether an equilibrium with social welfare at least k exists is NP-hard (hence it is also hard to find the social-welfare maximizing equilibrium, arguably a key problem in equilibrium selection). As another example, it shows that deciding whether a certain pure strategy occurs in the support of at least one Nash equilibrium is NP-hard. This has indirect implications for the problem of finding one Nash equilibrium: several recent algorithms for that problem operate by guessing the equilibrium supports and subsequently checking whether the guess is correct [14,42,44]. The result above implies that it is NP-hard to determine whether such an algorithm can safely restrict attention to guesses in which a particular pure strategy is included in the support.
These are not the first results of this nature; Gilboa and Zemel provide a number of NP-hardness results in the same spirit [17]. Our reduction demonstrates (sometimes stronger versions of) most of their hardness results, as well as some new ones. Significantly, for the problems that concern an optimization (e.g., maximizing social welfare), we show not only NP-hardness but also inapproximability: unless P = NP, there is no polynomial-time algorithm that always returns a Nash equilibrium that is close to obtaining the optimal value. We also use the reduction to show that counting the number of equilibria of a game is #P-hard. (One may argue that it is impossible to have a good overview of all the Nash equilibria of a game if one cannot even count them.)
For completeness, we review the following basic definitions.
Definition 1 In a normal-form game, we are given a set of players $A$, and for each player $i \in A$, a (pure) strategy set $\Sigma_i$ and a utility function $u_i : \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_{|A|} \to \mathbb{R}$. We will assume throughout that games have finite size.
Definition 2 A mixed strategy $\sigma_i$ for player $i$ is a probability distribution over $\Sigma_i$. A special case of a mixed strategy is a pure strategy, where all of the probability mass is on one element of $\Sigma_i$.
Definition 3 (Nash [36]) Given a normal-form game, a Nash equilibrium (NE) is a vector of mixed strategies, one for each player $i$, such that no player has an incentive to deviate from her mixed strategy given that the others do not deviate. That is, for any $i$ and any alternative mixed strategy $\sigma'_i$, we have $E[u_i(s_1, \ldots, s_i, \ldots, s_{|A|})] \geq E[u_i(s_1, \ldots, s'_i, \ldots, s_{|A|})]$, where each $s_j$ is drawn from $\sigma_j$, and $s'_i$ from $\sigma'_i$.
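For two players, the condition in Definition 3 can be checked directly. The sketch below relies on the standard fact that expected utility is linear in a player's own mixed strategy, so it suffices to test deviations to pure strategies. The matrix layout (`u1[i][j]` and `u2[i][j]` are the utilities when player 1 plays her i-th and player 2 his j-th pure strategy) and the function names are assumptions made for this illustration.

```python
from fractions import Fraction

def expected_utility(u, sigma1, sigma2):
    """Expected value of u[i][j] when i is drawn from sigma1 and j from sigma2."""
    return sum(p * q * u[i][j]
               for i, p in enumerate(sigma1)
               for j, q in enumerate(sigma2))

def is_nash_equilibrium(u1, u2, sigma1, sigma2):
    """Check that neither player gains by deviating to any pure strategy."""
    value1 = expected_utility(u1, sigma1, sigma2)
    value2 = expected_utility(u2, sigma1, sigma2)
    for i in range(len(sigma1)):
        pure = [Fraction(int(k == i)) for k in range(len(sigma1))]
        if expected_utility(u1, pure, sigma2) > value1:
            return False
    for j in range(len(sigma2)):
        pure = [Fraction(int(k == j)) for k in range(len(sigma2))]
        if expected_utility(u2, sigma1, pure) > value2:
            return False
    return True

# Matching pennies, used here only as a familiar test case.
half = Fraction(1, 2)
pennies_u1 = [[1, -1], [-1, 1]]
pennies_u2 = [[-1, 1], [1, -1]]
uniform = [half, half]
first_pure = [Fraction(1), Fraction(0)]

print(is_nash_equilibrium(pennies_u1, pennies_u2, uniform, uniform))
print(is_nash_equilibrium(pennies_u1, pennies_u2, first_pure, first_pure))
```

Exact rational arithmetic is used so that indifference between strategies, which is essential in mixed equilibria, is not broken by rounding.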
It is well known that every finite game has at least one Nash equilibrium [36]. We are now ready to present our reduction.⁵
Definition 4 Let $\phi$ be a Boolean formula in conjunctive normal form (representing a SAT instance). Let $V$ be its set of variables (with $|V| = n$), $L$ the set of corresponding literals (a positive and a negative one for each variable⁶), and $C$ its set of clauses. The function $v : L \to V$ gives the variable corresponding to a literal, e.g., $v(x_1) = v(-x_1) = x_1$. We define $G_\epsilon(\phi)$ to be the following finite symmetric 2-player game in normal form.
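Let $\Sigma = \Sigma_1 = \Sigma_2 = L \cup V \cup C \cup \{f\}$. Let the utility functions be
• $u_1(l_1, l_2) = u_2(l_2, l_1) = n - 1$ for all $l_1, l_2 \in L$ with $l_1 \neq -l_2$;
• $u_1(l, -l) = u_2(-l, l) = n - 4$ for all $l \in L$;
• $u_1(l, x) = u_2(x, l) = n - 4$ for all $l \in L$, $x \in \Sigma - L - \{f\}$;
• $u_1(v, l) = u_2(l, v) = n$ for all $v \in V$, $l \in L$ with $v(l) \neq v$;
• $u_1(v, l) = u_2(l, v) = 0$ for all $v \in V$, $l \in L$ with $v(l) = v$;
• $u_1(v, x) = u_2(x, v) = n - 4$ for all $v \in V$, $x \in \Sigma - L - \{f\}$;
• $u_1(c, l) = u_2(l, c) = n$ for all $c \in C$, $l \in L$ with $l \notin c$;
• $u_1(c, l) = u_2(l, c) = 0$ for all $c \in C$, $l \in L$ with $l \in c$;
• $u_1(c, x) = u_2(x, c) = n - 4$ for all $c \in C$, $x \in \Sigma - L - \{f\}$;
• $u_1(x, f) = u_2(f, x) = 0$ for all $x \in \Sigma - \{f\}$;
• $u_1(f, f) = u_2(f, f) = \epsilon$;
• $u_1(f, x) = u_2(x, f) = n - 1$ for all $x \in \Sigma - \{f\}$.
We will show in Theorem 1 that each satisfying assignment of $\phi$ corresponds to a Nash equilibrium of $G_\epsilon(\phi)$, and that there is one additional equilibrium. The following example illustrates this.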

Example 1
The following table shows the game $G_\epsilon(\phi)$ where $\phi = (x_1 \vee -x_2) \wedge (-x_1 \vee x_2)$.
[Table: payoff matrix of $G_\epsilon(\phi)$ over the strategies in $L \cup V \cup C \cup \{f\}$.]
⁵ ... version of this work. The reason is that the new reduction presented here implies inapproximability results that the original reduction does not.
⁶ Thus, if $x_i$ is a variable, $+x_i$ and $-x_i$ are literals. Often, the + is dropped from the positive literal (especially when writing CNF formulas), but it is helpful for distinguishing positive literals from variables.
The only two solutions to the SAT instance defined by $\phi$ are to either set both variables to true, or both to false. The only equilibria of the game $G_\epsilon(\phi)$ are those where: 1. both players randomize uniformly over $\{+x_1, +x_2\}$; 2. both players randomize uniformly over $\{-x_1, -x_2\}$; 3. both players play $f$.
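The three equilibria listed in the example can be verified numerically. The sketch below is an illustration under stated assumptions, not the paper's own construction: the payoffs are the ones used in the proof of Theorem 1 (n − 1, n − 4, n, 0 and ε), the payoff n − 4 between two strategies that are both variables or clauses is an assumption of this sketch, the formula is taken to be (x1 ∨ −x2) ∧ (−x1 ∨ x2), and ε = 1/2 is an arbitrary choice. The encoding (integers for literals, tagged tuples for variables and clauses) and all function names are hypothetical.

```python
from fractions import Fraction

def build_game(clauses, n, eps):
    """Return (strategies, u) where u(own, other) is a player's utility in the symmetric game."""
    literals = [sign * k for k in range(1, n + 1) for sign in (1, -1)]
    variables = [("var", k) for k in range(1, n + 1)]
    clause_ids = [("clause", i) for i in range(len(clauses))]
    strategies = literals + variables + clause_ids + ["f"]

    def u(own, other):
        if other == "f":
            return eps if own == "f" else 0
        if own == "f":
            return n - 1
        own_is_lit = isinstance(own, int)
        other_is_lit = isinstance(other, int)
        if own_is_lit:
            # Literal against a non-contradicting literal pays n - 1; otherwise n - 4.
            return n - 1 if other_is_lit and other != -own else n - 4
        if not other_is_lit:
            return n - 4  # assumption of this sketch: variable/clause vs. variable/clause
        if own[0] == "var":
            return 0 if abs(other) == own[1] else n
        return 0 if other in clauses[own[1]] else n  # own is a clause

    return strategies, u

def is_symmetric_equilibrium(strategies, u, mix):
    """Both players use `mix` (dict strategy -> probability); test all pure deviations."""
    def payoff_against_mix(own):
        return sum(p * u(own, other) for other, p in mix.items())
    value = sum(p * payoff_against_mix(own) for own, p in mix.items())
    return all(payoff_against_mix(s) <= value for s in strategies)

half = Fraction(1, 2)
clauses = [[1, -2], [-1, 2]]          # (x1 v -x2) ^ (-x1 v x2)
strategies, u = build_game(clauses, n=2, eps=half)

both_true = {1: half, 2: half}        # uniform over {+x1, +x2}
both_false = {-1: half, -2: half}     # uniform over {-x1, -x2}
always_f = {"f": Fraction(1)}
mismatch = {1: half, -2: half}        # x1 true, x2 false: does not satisfy the formula

for mix in (both_true, both_false, always_f, mismatch):
    print(is_symmetric_equilibrium(strategies, u, mix))
```

The last profile fails because the unsatisfied clause (−x1 ∨ x2) becomes a profitable deviation, which is the mechanism the general proof relies on. The check only confirms that the listed profiles are equilibria; it does not establish that there are no others.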
We are now ready to prove the result in general.
Theorem 1 If $(l_1, l_2, \ldots, l_n)$ (where $v(l_i) = x_i$) satisfies $\phi$, then there is a Nash equilibrium of $G_\epsilon(\phi)$ where both players play each $l_i$ with probability $\frac{1}{n}$, with expected utility $n - 1$ for each player. The only other Nash equilibrium is the one where both players play $f$, and receive expected utility $\epsilon$ each.
Proof: We first demonstrate that these combinations of mixed strategies indeed do constitute Nash equilibria. If $(l_1, l_2, \ldots, l_n)$ (where $v(l_i) = x_i$) satisfies $\phi$ and the other player plays each $l_i$ with probability $\frac{1}{n}$, playing one of these $l_i$ as well gives utility $n - 1$. On the other hand, playing the negation of one of these $l_i$ gives utility $\frac{1}{n}(n - 4) + \frac{n-1}{n}(n - 1) < n - 1$. Playing some variable $v$ gives utility $\frac{1}{n}(0) + \frac{n-1}{n}(n) = n - 1$ (since one of the $l_i$ that the other player sometimes plays has $v(l_i) = v$). Playing some clause $c$ gives utility at most $\frac{1}{n}(0) + \frac{n-1}{n}(n) = n - 1$ (since at least one of the $l_i$ that the other player sometimes plays occurs in clause $c$, since the $l_i$ satisfy $\phi$). Finally, playing $f$ gives utility $n - 1$. It follows that playing any one of the $l_i$ that the other player sometimes plays is an optimal response, and hence that both players playing each of these $l_i$ with probability $\frac{1}{n}$ is a Nash equilibrium. Clearly, both players playing $f$ is also a Nash equilibrium since playing anything else when the other plays $f$ gives utility 0.
Now we demonstrate that there are no other Nash equilibria. If the other player always plays $f$, the unique best response is to also play $f$ since playing anything else will give utility 0. Otherwise, given a mixed strategy for the other player, consider a player's expected utility given that the other player does not play $f$. (That is, the probability distribution over the other player's strategies is proportional to the probability distribution constituted by that player's mixed strategy, except $f$ occurs with probability 0.) If this expected utility is less than $n - 1$, the player is strictly better off playing $f$ (which gives utility $n - 1$ when the other player does not play $f$, and also performs better than the original strategy when the other player does play $f$). So this cannot occur in equilibrium.
As we pointed out, there are no Nash equilibria where one player always plays $f$ but the other does not, so suppose both players play $f$ with probability less than one. Consider the expected social welfare ($E[u_1 + u_2]$), given that neither player plays $f$. It is easily verified that there is no outcome with social welfare greater than $2n - 2$. Also, any outcome in which one player plays an element of $V$ or $C$ has social welfare at most $n - 4 + n < 2n - 2$. It follows that if either player ever plays an element of $V$ or $C$, the expected social welfare given that neither player plays $f$ is strictly below $2n - 2$. By linearity of expectation it follows that the expected utility of at least one player is strictly below $n - 1$ given that neither player plays $f$, and by the above reasoning, this player would be strictly better off playing $f$ instead of her randomization over strategies other than $f$. It follows that no element of $V$ or $C$ is ever played in a Nash equilibrium.
So, we can assume both players only put positive probability on strategies in $L \cup \{f\}$. Then, if the other player puts positive probability on $f$, playing $f$ is a strictly better response than any element of $L$ (since $f$ does at least as well against any strategy in $L$, and strictly better against $f$). It follows that the only equilibrium where $f$ is ever played is the one where both players always play $f$. Now we can assume that both players only put positive probability on elements of $L$. Suppose that for some $l \in L$, the probability that a given player plays either $l$ or $-l$ is less than $\frac{1}{n}$. Then the expected utility for the other player of playing $v(l)$ is strictly greater than $\frac{1}{n}(0) + \frac{n-1}{n}(n) = n - 1$, which exceeds what any element of $L$ can yield, and hence this cannot be a Nash equilibrium. So we can assume that for any $l \in L$, the probability that a given player plays either $l$ or $-l$ is precisely $\frac{1}{n}$. If there is an element of $L$ such that player 1 puts positive probability on it and player 2 on its negation, both players have expected utility less than $n - 1$ and would be better off switching to $f$. So, in a Nash equilibrium, if player 1 plays $l$ with some probability, player 2 must play $l$ with probability $\frac{1}{n}$, and thus player 1 must play $l$ with probability $\frac{1}{n}$. Thus we can assume that for each variable, exactly one of its corresponding literals is played with probability $\frac{1}{n}$ by both players. It follows that in any Nash equilibrium (besides the one where both players play $f$), literals that are sometimes played indeed correspond to an assignment to the variables.
All that is left to show is that if this assignment does not satisfy $\phi$, it does not correspond to a Nash equilibrium. Let $c \in C$ be a clause that is not satisfied by the assignment, that is, none of its literals are ever played. Then playing $c$ would give utility $n$, and both players would be better off playing this.
From Theorem 1, it follows that there exists a Nash equilibrium in $G_\epsilon(\phi)$ where each player gets utility $n - 1$ if and only if $\phi$ is satisfiable; otherwise, the only equilibrium is the one where both players play $f$ and each of them gets $\epsilon$. Suppose $n - 1 > \epsilon$. Then, any sensible definition of welfare optimization would prefer the first kind of equilibrium. Because determining whether $\phi$ is satisfiable is NP-hard, it follows that determining whether a "good" equilibrium exists is NP-hard for any such definition. Additionally, the first kind of equilibrium is, in various senses, an optimal outcome for the game, even if the players were to cooperate; hence, finding out whether such an optimal equilibrium exists is NP-hard. More significantly, given that $n - 1$ is significantly larger than $\epsilon$, there is no efficient algorithm that always returns an equilibrium that is "close" to optimal (assuming P ≠ NP): either an optimal equilibrium is found, or we have to settle for the equilibrium that gives each player $\epsilon$.
In the remainder of this section, we prove a variety of corollaries of Theorem 1 that illustrate these and other points. We start with corollaries that do not involve an optimization problem. All of these corollaries show NP-completeness of a problem, meaning that the problem is both NP-hard and in NP. Technically, only the NP-hardness part is a corollary of Theorem 1 in each case. Membership in NP follows because, for the case of two players, if an equilibrium with the desired property exists, then the supports in this equilibrium constitute a polynomial-length certificate. This is because given the supports, the remainder of the problem can be solved using linear programming (and linear programs can be solved in polynomial time [23]).
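One standard way to write the linear program for given supports is sketched below. The symbols $S_1 \subseteq \Sigma_1$ and $S_2 \subseteq \Sigma_2$ (the guessed supports), $p_s$ and $q_t$ (the players' probabilities), and $U_1, U_2$ (their equilibrium utilities) are introduced here for illustration and do not appear in the text above.

```latex
\begin{align*}
&\sum_{t \in S_2} q_t \, u_1(s, t) = U_1   && \text{for all } s \in S_1, \\
&\sum_{t \in S_2} q_t \, u_1(s, t) \le U_1 && \text{for all } s \in \Sigma_1 \setminus S_1, \\
&\sum_{s \in S_1} p_s \, u_2(s, t) = U_2   && \text{for all } t \in S_2, \\
&\sum_{s \in S_1} p_s \, u_2(s, t) \le U_2 && \text{for all } t \in \Sigma_2 \setminus S_2, \\
&\sum_{s \in S_1} p_s = 1, \quad \sum_{t \in S_2} q_t = 1, \quad p_s \ge 0, \quad q_t \ge 0 .
\end{align*}
```

Every constraint is linear in the unknowns $(p, q, U_1, U_2)$ because the opponent's support is fixed, and any feasible point is a Nash equilibrium whose supports are contained in $S_1$ and $S_2$. A property such as "social welfare at least $k$" can then be expressed as the additional linear constraint $U_1 + U_2 \ge k$.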
Corollary 1 Even in symmetric 2-player games, it is NP-complete to determine whether there exists a Pareto-optimal Nash equilibrium. (A distribution over outcomes is Pareto-optimal if there is no other distribution over outcomes such that every player has at least the same expected utility, and at least one player has strictly greater expected utility.)
Proof: For $\epsilon < 1$ and $n \geq 2$, any Nash equilibrium in $G_\epsilon(\phi)$ corresponding to a satisfying assignment is Pareto-optimal, whereas the Nash equilibrium that always exists is not Pareto-optimal. Thus, a Pareto-optimal Nash equilibrium exists if and only if $\phi$ is satisfiable.
Corollary 2 (Gilboa and Zemel [17]) Even in symmetric 2-player games, it is NP-complete to determine whether there is more than one Nash equilibrium.
Proof: For any $\phi$, $G_\epsilon(\phi)$ has additional Nash equilibria (besides the one that always exists) if and only if $\phi$ is satisfiable.
Corollary 3 (Gilboa and Zemel [17])⁷ Even in symmetric 2-player games, it is NP-complete to determine whether there is a Nash equilibrium where player 1 sometimes plays a given $x \in \Sigma_1$.
Proof: For any $\phi$, in $G_\epsilon(\phi)$, there is a Nash equilibrium where player 1 sometimes plays $+x_1$ if and only if there is a satisfying assignment to $\phi$ with $x_1$ set to true. But determining whether this is the case is NP-complete.
Corollary 4 (Gilboa and Zemel [17])⁸ Even in symmetric 2-player games, it is NP-complete to determine whether there is a Nash equilibrium where player 1 never plays a given $x \in \Sigma_1$.
Proof: For any $\phi$, in $G_\epsilon(\phi)$, there is a Nash equilibrium where player 1 never plays $f$ if and only if $\phi$ is satisfiable.
Definition 5 A strong Nash equilibrium [2] is a vector of mixed strategies for the players so that no nonempty subset of the players can change their strategies to make all players in the subset better off.
Corollary 5 Even in symmetric 2-player games, it is NP-complete to determine whether a strong Nash equilibrium exists.
Proof: For $\epsilon < 1$ and $n \geq 2$, any Nash equilibrium in $G_\epsilon(\phi)$ corresponding to a satisfying assignment is a strong Nash equilibrium, whereas the Nash equilibrium that always exists is not strong. Thus, a strong Nash equilibrium exists if and only if $\phi$ is satisfiable.
The next few corollaries concern optimization problems, such as maximizing social welfare, or maximizing the number of pure strategies in the supports of the equilibrium. For such problems, an important question is whether they can be approximately solved. For example, is it possible to find, in polynomial time, a Nash equilibrium that has at least half as great a social welfare as the social-welfare maximizing Nash equilibrium? Or, in a nonconstructive version of the same problem, can we, in polynomial time, find a number k such that there exists a Nash equilibrium with social welfare at least k, and there is no Nash equilibrium with social welfare greater than 2k? (The latter problem does not require constructing a Nash equilibrium, so it is conceivable that there is a polynomial-time algorithm for this problem even if it is hard to construct any Nash equilibrium.) We will not give approximation algorithms in this subsection; rather, we will derive certain inapproximability results from Theorem 1. In each case, we will show that even the nonconstructive problem is hard (and therefore the constructive problem is hard as well).
Before presenting our results, we first make one subtle technical point, namely that it is unreasonable to expect an approximation algorithm to work even when the game has some negative utilities in it. For suppose we had an algorithm that approximated (say) social welfare to some positive ratio, even when there are some negative utilities in the game. Then we can "boost" its results, as follows. Suppose the algorithm returns a social welfare of 2r on a game, and suppose this is less than the social welfare of the best Nash equilibrium. If we subtract r from all utilities in the game, the game remains the same for all strategic purposes (it has the same set of Nash equilibria). But now the result returned by the approximation algorithm on the original game corresponds to a social welfare of 0, which does not satisfy the approximation ratio. It follows that running the approximation algorithm on the transformed game must give a better result (which we can easily transform back to the original game).
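The boosting step can be written out for the two-player case. In the sketch below, $SW$ and $SW'$ denote social welfare in the original and in the shifted game, $\hat{\sigma}$ is the equilibrium whose welfare the algorithm returned, $\sigma^*$ is a welfare-maximizing equilibrium, and $\rho > 0$ is the approximation ratio; all of these symbols are introduced here for illustration.

```latex
\begin{align*}
SW'(\sigma) &= SW(\sigma) - 2r
  && \text{for every profile } \sigma \text{ (each of the two players loses } r\text{)}, \\
SW'(\hat{\sigma}) &= 2r - 2r = 0, \\
SW'(\sigma^*) &= SW(\sigma^*) - 2r > 0
  && \text{since } SW(\sigma^*) > 2r, \\
SW'(\hat{\sigma}) &= 0 < \rho \cdot SW'(\sigma^*)
  && \text{for every ratio } \rho > 0 .
\end{align*}
```

So on the shifted game the algorithm cannot return $\hat{\sigma}$'s welfare again: it must return a strictly positive value, which corresponds to a welfare strictly above $2r$ in the original game.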
For this reason, we require our hardness results to only use reductions to games where 0 is the lowest possible utility in the game. Strictly speaking, our main reduction does not have this property, as can be seen from Example 1. Nevertheless, $G_\epsilon(\phi)$ does have this property whenever $n \geq 4$. (We recall that $n$ is the number of variables in $\phi$.) Hence, our reduction does in fact suffice, because satisfiability remains an NP-hard problem even under the restriction $n \geq 4$.⁹ We are now ready to present the remaining corollaries.
Corollary 6 Unless P = NP, there does not exist a polynomial-time algorithm that approximates (to any positive ratio) the maximum social welfare obtained in a Nash equilibrium, even in symmetric 2-player games. (This holds even if the ratio is allowed to be a function of the size of the game.)
Proof: Suppose such an algorithm did exist. For any formula $\phi$ (with number of variables $n \geq 4$), consider the game $G_\epsilon(\phi)$ where $\epsilon$ is set so that $2\epsilon < r(2n - 2)$ (here, $r$ is the approximation ratio that the algorithm guarantees for games of the size of $G_\epsilon(\phi)$). If $\phi$ is satisfiable, then by Theorem 1, there exists an equilibrium with social welfare $2n - 2$, and thus the approximation algorithm should return a social welfare of at least $r(2n - 2) > 2\epsilon$. Otherwise, by Theorem 1, the only equilibrium has social welfare $2\epsilon$, and thus the approximation algorithm should return a social welfare of at most $2\epsilon$. Thus we can use the algorithm to solve arbitrary SAT instances.
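The decision rule implicit in this proof can be sketched as follows. The approximation algorithm itself is hypothetical (the corollary says it cannot exist unless P = NP), so the code only shows how its output would be interpreted. The particular choice of ε, which makes 2ε equal to half of r(2n − 2), is one convenient assumption; any ε with 2ε < r(2n − 2) works. Function names are illustrative.

```python
from fractions import Fraction

def choose_epsilon(r, n):
    """Pick eps with 2*eps < r*(2n - 2); here 2*eps is half of that bound (an arbitrary choice)."""
    return Fraction(r) * (2 * n - 2) / 4

def formula_is_satisfiable(returned_welfare, eps):
    """Interpret the welfare value reported by the (hypothetical) approximation algorithm.

    Satisfiable:   the value is at least r*(2n - 2), which exceeds 2*eps.
    Unsatisfiable: the only equilibrium has welfare 2*eps, so the value is at most 2*eps.
    """
    return returned_welfare > 2 * eps

r = Fraction(1, 10)   # example approximation ratio
n = 4                 # example number of variables
eps = choose_epsilon(r, n)
print(eps, 2 * eps < r * (2 * n - 2))
```

Because the two cases are separated by the threshold 2ε regardless of how small r is, the argument goes through even when the ratio shrinks with the size of the game.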
Corollary 7 Unless P = NP, there does not exist a polynomial-time algorithm that approximates (to any positive ratio) the maximum egalitarian social welfare obtained in a Nash equilibrium, even in symmetric 2-player games. (This holds even if the ratio is allowed to be a function of the size of the game. The egalitarian social welfare is the expected utility of the worse-off player.)
Proof: The proof is similar to that of Corollary 6.
Corollary 8 Unless P = NP, there does not exist a polynomial-time algorithm that approximates (to any positive ratio) the maximum utility for player 1 obtained in a Nash equilibrium, even in symmetric 2-player games. (This holds even if the ratio is allowed to be a function of the size of the game.)
Proof: The proof is similar to that of Corollary 6.
The next few corollaries use the notation o(x), which refers to functions that grow slower than linearly in x, and Ω(x), which refers to functions that grow at least as fast as linearly in x. The corollaries state that it is hard to maximize (even approximately) the number of pure strategies played with positive probability (respectively, for both players together, for the player with the smaller support, and for one player only) in a Nash equilibrium.
Corollary 9 Unless P = NP, there does not exist a polynomial-time algorithm that approximates (to any ratio 1/o(|Σ|)) the maximum number, in a Nash equilibrium, of pure strategies in the players' strategies' supports, even in symmetric 2-player games.
Proof: Suppose such an algorithm did exist. For any formula $\phi$, consider the game $G_\epsilon(\phi)$ where $\epsilon$ is set arbitrarily. If $\phi$ is not satisfiable, then by Theorem 1, the only equilibrium has only one pure strategy in each player's support, and thus the algorithm can return a number of strategies of at most 2. On the other hand, if $\phi$ is satisfiable, then by Theorem 1, there is an equilibrium where each player's support has size Ω(|Σ|). (This is assuming that $n$, the number of variables in $\phi$, is Ω(|Σ|). This is only true if the number of clauses in $\phi$ is at most linear in the number of variables, but it is known that SAT remains NP-hard under this restriction; for example, SAT is known to remain NP-hard even if each variable occurs in at most 3 clauses.) Because by assumption our algorithm has an approximation ratio of 1/o(|Σ|), this means that for large enough |Σ|, the algorithm must return a support size strictly greater than 2. Thus we can use the algorithm to solve arbitrary SAT instances (given that the instances are large enough to produce large enough |Σ|).
Corollary 10 Unless P = NP, there does not exist a polynomial-time algorithm that approximates (to any ratio 1/o(|Σ|)) the maximum number, in a Nash equilibrium, of pure strategies in the support of the player that uses fewer pure strategies than the other, even in symmetric 2-player games.
Proof: The proof is similar to that of Corollary 9.
Corollary 11 Unless P = NP, there does not exist a polynomial-time algorithm that approximates (to any ratio 1/o(|Σ|)) the maximum number, in a Nash equilibrium, of pure strategies in player 1's support, even in symmetric 2-player games.
Proof: The proof is similar to that of Corollary 9.
Versions of Corollaries 7 and 10 that do not mention inapproximability were proven by Gilboa and Zemel [17].
The final corollary goes beyond NP-hardness, to #P-hardness. Determining whether equilibria with certain properties exist is not always sufficient: sometimes, we are interested in characterizing all the equilibria of a game. One rather weak such characterization is the number of equilibria.¹⁰ We can use Theorem 1 to show that determining this number is #P-hard.
Corollary 12 Even in symmetric 2-player games, counting the number of Nash equilibria is #P-hard.
Proof: The number of Nash equilibria in our game G_ε(φ) is the number of satisfying assignments of φ, plus one. Counting the number of satisfying assignments to a CNF formula is #P-hard [50].
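To make the counting relationship in this proof concrete, the following is a minimal sketch of our own (not part of the original reduction) that counts the satisfying assignments of a small CNF formula by brute force and then adds one, which is the equilibrium count the proof attributes to the reduction game. The function names and the signed-integer clause encoding are illustrative assumptions, and the sketch does not construct the game itself.

```python
from itertools import product


def count_satisfying_assignments(num_vars, clauses):
    """Count satisfying assignments of a CNF formula by exhaustive search.

    Assumed encoding (illustrative): each clause is a list of nonzero
    integers, where v means "variable v is true" and -v means "variable v
    is false", with variables numbered 1..num_vars.
    """
    count = 0
    for bits in product([False, True], repeat=num_vars):
        # A clause is satisfied if at least one of its literals is true.
        if all(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            count += 1
    return count


def equilibria_in_reduction_game(num_vars, clauses):
    """Equilibrium count stated in the proof: #SAT solutions plus one."""
    return count_satisfying_assignments(num_vars, clauses) + 1
```

The exhaustive loop takes time exponential in the number of variables, which is consistent with the #P-hardness of the counting problem; it is meant only for checking tiny instances by hand.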
In a sense, the most interesting #P-hardness results are the ones where the corresponding existence problem (does there exist at least one solution?) and search problem (construct one solution, if one exists) are easy. This is the case, for example, for the problem of counting the perfect matchings in a bipartite graph [50]. For the problem of counting the Nash equilibria in a finite normal-form game, the corresponding existence problem is trivial (at least one Nash equilibrium always exists, so the answer is always "yes"), but the search problem is PPAD-complete.

Pure-strategy Nash equilibria in Bayesian games
Equilibria in pure strategies are particularly desirable because they avoid the uncomfortable requirement that players randomize over strategies among which they are indifferent. In normal-form games, it is easy to determine the existence of pure-strategy equilibria: one can simply check, for each combination of pure strategies, whether it constitutes a Nash equilibrium. This trivial algorithm runs in time that is polynomial in the size of the normal form. However, this approach is not computationally efficient in Bayesian games where the players have private information about their own preferences (this private information is known as the player's type). In such games, players can condition their actions on their types, resulting in a strategy space that is exponential in the number of types (whereas the natural representation of the Bayesian game is not exponential in the number of types).
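The exhaustive check for normal-form games described above can be sketched as follows. This is an illustrative sketch under our own assumptions: the game is given as a dictionary from pure-strategy profiles (tuples of action indices) to tuples of utilities, and the function name is hypothetical.

```python
from itertools import product


def pure_nash_equilibria(payoffs, num_actions):
    """Return all pure-strategy Nash equilibria of a normal-form game.

    payoffs: dict mapping each profile (tuple of action indices, one per
             player) to a tuple of utilities (one per player).
    num_actions: number of actions available to each player.
    """
    players = range(len(num_actions))
    equilibria = []
    for profile in product(*(range(m) for m in num_actions)):
        # The profile is an equilibrium if no player gains from any
        # unilateral deviation.
        stable = all(
            payoffs[profile][i]
            >= payoffs[profile[:i] + (a,) + profile[i + 1:]][i]
            for i in players
            for a in range(num_actions[i])
        )
        if stable:
            equilibria.append(profile)
    return equilibria
```

The number of profiles examined equals the number of entries of the normal form, so the running time is polynomial in the input size. Applied to the normal form induced by a Bayesian game, the same loop would range over exponentially many type-to-action mappings.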
In this section, we show that determining whether a pure-strategy Bayes-Nash equilibrium exists is in fact NP-complete even in symmetric two-player Bayesian games. (A mixed-strategy equilibrium always exists, although constructing one is PPAD-hard because normal-form games are a special case of Bayesian games.) First, we review the standard definitions of Bayesian games and Bayes-Nash equilibrium.

Definition 6 In a Bayesian game, we are given a set of players $A$; for each player $i$, a set of types $\Theta_i$; a commonly known prior distribution $\phi$ over $\Theta_1 \times \Theta_2 \times \cdots \times \Theta_{|A|}$; for each player $i$, a set of actions $\Sigma_i$; and for each player $i$, a utility function $u_i : \Theta_i \times \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_{|A|} \to \mathbb{R}$.
We emphasize again that we only consider finite games; in particular, we only consider finite type spaces.
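Definition 7 (Harsanyi [21]) Given a Bayesian game, a Bayes-Nash equilibrium (BNE) is a vector of probability distributions over actions, one distribution $\sigma_{i,\theta_i}$ (over $\Sigma_i$) for each pair $i$, $\theta_i \in \Theta_i$, such that no player has an incentive to deviate, for any of her types, given that the others do not deviate. That is, for any $i$, $\theta_i \in \Theta_i$, and any alternative probability distribution $\sigma'_{i,\theta_i}$ over $\Sigma_i$, we have
$$E_{\theta_{-i} \mid \theta_i}\left[ E\left[ u_i(\theta_i, s_{i,\theta_i}, s_{-i,\theta_{-i}}) \right] \right] \ge E_{\theta_{-i} \mid \theta_i}\left[ E\left[ u_i(\theta_i, s'_{i,\theta_i}, s_{-i,\theta_{-i}}) \right] \right],$$
where each $s_{j,\theta_j}$ is drawn from $\sigma_{j,\theta_j}$ (the inner expectations are over these draws), and $s'_{i,\theta_i}$ from $\sigma'_{i,\theta_i}$.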
A Bayesian game can be converted to a normal-form game as follows. For every player $i$, let every mapping $s_i : \Theta_i \to \Sigma_i$ be a pure strategy in the new normal-form game. Then, the utility function for the normal-form game is given by $u_i(s_1, \ldots, s_{|A|}) = E_{\theta_1, \ldots, \theta_{|A|}}[u_i(\theta_i, s_1(\theta_1), \ldots, s_{|A|}(\theta_{|A|}))]$. Assuming that no type receives 0 probability under the prior, the Nash equilibria of this normal-form game correspond exactly to the Bayes-Nash equilibria of the original game. However, the normal-form game is exponentially larger (player $i$ has $|\Sigma_i|^{|\Theta_i|}$ pure strategies in it), so this conversion is of little use for solving computational problems efficiently.
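The conversion just described can be sketched in a few lines. This is a minimal illustration under our own assumptions about the input encoding (the prior as a dictionary over type profiles, the utility as a Python callable); the function and parameter names are hypothetical.

```python
from itertools import product


def bayesian_to_normal_form(types, actions, prior, utility):
    """Build the induced normal-form game of a finite Bayesian game.

    types[i], actions[i]: lists of player i's types and actions.
    prior: dict mapping a type profile (tuple) to its probability.
    utility(i, theta_i, action_profile): utility of player i.

    Returns (strategies, payoff), where strategies[i] lists every mapping
    from player i's types to actions (as dicts), and payoff maps a tuple
    of strategy indices to the tuple of expected utilities.
    """
    n = len(types)
    # One pure strategy per mapping from types to actions:
    # len(actions[i]) ** len(types[i]) of them for player i.
    strategies = [
        [dict(zip(types[i], choice))
         for choice in product(actions[i], repeat=len(types[i]))]
        for i in range(n)
    ]
    payoff = {}
    for idx in product(*(range(len(s)) for s in strategies)):
        expected = [0.0] * n
        for theta, prob in prior.items():
            played = tuple(strategies[i][idx[i]][theta[i]] for i in range(n))
            for i in range(n):
                expected[i] += prob * utility(i, theta[i], played)
        payoff[idx] = tuple(expected)
    return strategies, payoff
```

Even with two actions, a player with 20 types already has $2^{20}$ pure strategies in the induced game, which illustrates why the conversion does not help computationally.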
We can now define the computational problem.
Definition 8 (PURE-STRATEGY-BNE) We are given a Bayesian game. We are asked whether there exists a BNE where every distribution $\sigma_{i,\theta_i}$ places all its mass on a single action.
To show our NP-hardness result, we will reduce from the NP-complete SET-COVER problem.
Theorem 2 PURE-STRATEGY-BNE is NP-complete, even in symmetric 2-player games where the priors over types are uniform.
Proof: To show membership in NP, we observe that, given an action for each type for each player, it is easy to verify whether these constitute a BNE: we merely need to check that for each player $i$, for each type $\theta_i$, the corresponding action maximizes $i$'s expected utility (with respect to $\theta_i$, given the (conditional) distribution over $-i$'s types and given $-i$'s strategy). This is done by computing the expected utility for $\theta_i$ for each possible action for $i$. (As an aside, we cannot simply examine every (pure) strategy for each player, since there are exponentially many pure strategies. Effectively, the above only examines the strategies that deviate for only a single type, and this is sufficient.)
To show NP-hardness, we reduce an arbitrary SET-COVER instance to the following PURE-STRATEGY-BNE instance. Let there be two players, with $\Theta = \Theta_1 = \Theta_2 = \{\theta_1, \ldots, \theta_k\}$. The priors over types are uniform. Furthermore, $\Sigma = \Sigma_1 = \Sigma_2 = \{S_1, S_2, \ldots, S_m, s_1, s_2, \ldots, s_n\}$. The utility functions we choose actually do not depend on the types, so we omit the type argument in their definitions. They are as follows:
• $u_1(S_i, S_j) = u_2(S_j, S_i) = 1$ for all $S_i$ and $S_j$;
• $u_1(S_i, s_j) = u_2(s_j, S_i) = 1$ for all $S_i$ and $s_j \notin S_i$;
• $u_1(S_i, s_j) = u_2(s_j, S_i) = 2$ for all $S_i$ and $s_j \in S_i$;
• $u_1(s_i, s_j) = u_2(s_j, s_i) = -3k$ for all $s_i$ and $s_j$;
• $u_1(s_j, S_i) = u_2(S_i, s_j) = 3$ for all $S_i$ and $s_j \notin S_i$;
• $u_1(s_j, S_i) = u_2(S_i, s_j) = -3k$ for all $S_i$ and $s_j \in S_i$.
We now show the two instances are equivalent. First suppose there exist $c_1, c_2, \ldots, c_k \in \{1, \ldots, m\}$ such that $\bigcup_{1 \le i \le k} S_{c_i} = S$. Suppose both players play as follows: when their type is $\theta_i$, they play $S_{c_i}$. We claim that this is a BNE. For suppose the other player employs this strategy. Then, because for any $s_j$, there is at least one $S_{c_i}$ such that $s_j \in S_{c_i}$, we have that the expected utility of playing $s_j$ is at most $\frac{1}{k}(-3k) + \frac{k-1}{k} \cdot 3 < 0$. It follows that playing any of the $S_j$ (which gives utility 1) is optimal. So there is a pure-strategy BNE.
On the other hand, suppose that there is a pure-strategy BNE. We first observe that in no pure-strategy BNE do both players play some element of $S$ for some type: for if the other player sometimes plays some $s_j$, the utility of playing some $s_i$ is at most $\frac{1}{k}(-3k) + \frac{k-1}{k} \cdot 3 < 0$, whereas playing some $S_i$ instead guarantees a utility of at least 1. So there is at least one player who never plays any element of $S$. Now suppose the other player sometimes plays some $s_j$. We know there is some $S_i$ such that $s_j \in S_i$. If the former player plays this $S_i$, this will give her a utility of at least $\frac{1}{k} \cdot 2 + \frac{k-1}{k} \cdot 1 = 1 + \frac{1}{k}$. Since she must do at least this well in the equilibrium, and she never plays elements of $S$, she must sometimes receive utility 2. It follows that there exist $S_a$ and $s_b \in S_a$ such that the former player sometimes plays $S_a$ and the latter sometimes plays $s_b$. But then, playing $s_b$ gives the latter player a utility of at most $\frac{1}{k}(-3k) + \frac{k-1}{k} \cdot 3 < 0$, and she would be better off playing some $S_i$ instead. This contradiction implies that no element of $S$ is ever played in any pure-strategy BNE.
Now, in our given pure-strategy equilibrium, consider the set of all the $S_i$ that are played by player 1 for some type. Clearly there can be at most $k$ such sets. We claim they cover $S$. For if they do not cover some element $s_j$, the expected utility of playing $s_j$ for player 2 is 3 (because player 1 never plays any element of $S$). But this means that player 2 (who never plays any element of $S$ either) is not playing optimally. This contradiction implies that there exists a set cover.
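As a sanity check on the reduction, the following sketch encodes the utility table of the proof and applies the single-type deviation test from the membership argument. It is an illustration under our own assumptions: actions are tagged tuples `("S", i)` for the subset with index `i` and `("s", e)` for the element `e`, a strategy is a list of $k$ actions (one per type), and all function names are hypothetical. Because the game is symmetric, one function giving a player's utility for her own action against the other's action serves both players.

```python
from fractions import Fraction


def build_utility(subsets, k):
    """Utility of a player for own action `a` against opponent action `b`."""
    def u(a, b):
        (kind_a, val_a), (kind_b, val_b) = a, b
        if kind_a == "S" and kind_b == "S":
            return 1
        if kind_a == "S" and kind_b == "s":
            return 2 if val_b in subsets[val_a] else 1
        if kind_a == "s" and kind_b == "s":
            return -3 * k
        # Own action is an element, opponent plays a subset.
        return -3 * k if val_a in subsets[val_b] else 3
    return u


def is_pure_bne(universe, subsets, k, strat1, strat2):
    """Check a pure strategy pair under uniform, independent types.

    Utilities do not depend on types, so every type of a player faces the
    same expected-utility problem against the opponent's k equally likely
    actions.
    """
    u = build_utility(subsets, k)
    actions = [("S", i) for i in range(len(subsets))]
    actions += [("s", e) for e in universe]

    def best_responding(own, other):
        def expected(a):
            return Fraction(sum(u(a, b) for b in other), k)
        best = max(expected(a) for a in actions)
        return all(expected(a) == best for a in own)

    return best_responding(strat1, strat2) and best_responding(strat2, strat1)
```

For example, with universe $\{1, 2, 3\}$, subsets $\{1,2\}, \{3\}, \{2,3\}$ and $k = 2$, the strategy that plays the first two subsets (a cover) passes the check for both players, while playing the first subset for both types does not, because the uncovered element 3 becomes a profitable deviation with utility 3.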

Pure-strategy Nash equilibria in stochastic (Markov) games
We now shift our attention from one-shot games to games with multiple stages. There has already been some research into the complexity of playing repeated and sequential games. For example, determining whether a particular automaton is a best response is NP-complete [3]; it is NP-complete to compute a best-response automaton when the automata under consideration are bounded [38]; the problem of whether a given player with imperfect recall can guarantee herself a given payoff using pure strategies is NP-complete [25]; and in general, best-responding to an arbitrary strategy can even be noncomputable [24,35]. In this section, we present a PSPACE-hardness result on the existence of a pure-strategy equilibrium.
Markov (or stochastic) games constitute an important type of multi-stage game. In such games, there is an underlying set of states, and the game shifts between these states from stage to stage [15,47,48]. At every stage, each player's payoff depends not only on the players' actions, but also on the state. Furthermore, the probability of transitioning to a given state is determined by the current state and the players' current actions. It should be noted that PSPACE-hardness results are known for alternating-move games such as generalized Go [30] or QSAT [49]; however, if we were to formulate such a game as a Markov game, we would require an exponential number of states, so these results do not imply PSPACE-hardness for (straightforwardly represented) Markov games. Still, one might suspect that Markov games are hard to solve because the strategy spaces are extremely rich. However, in this section we show PSPACE-hardness for a variant where the strategy spaces are quite simple: in this variant, the players cannot condition their actions on events in the game.

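Definition 10 A Markov game consists of
• A set of players $A$;
• A set of states $S$, among which the game transits, one of which is the starting state;
• For each player $i$, a set of actions $\Sigma_i$ that can be played in any state;
• A transition probability function $p : S \times \Sigma_1 \times \cdots \times \Sigma_{|A|} \times S \to [0, 1]$, where $p(s_1, a_1, \ldots, a_{|A|}, s_2)$ gives the probability of the game being in state $s_2$ in the next stage, given that the current state of the game is $s_1$ and the players play actions $a_1, \ldots, a_{|A|}$;
• For each player $i$, a payoff function $u_i : S \times \Sigma_1 \times \cdots \times \Sigma_{|A|} \to \mathbb{R}$, where $u_i(s, a_1, \ldots, a_{|A|})$ gives the payoff to player $i$ when the players play actions $a_1, \ldots, a_{|A|}$ in state $s$;
• A discount factor $\delta$ such that the total utility of player $i$ is $\sum_{k=0}^{\infty} \delta^k u_i(s^k, a^k_1, \ldots, a^k_{|A|})$, where $s^k$ is the state of the game at stage $k$ and the players play actions $a^k_1, \ldots, a^k_{|A|}$ in stage $k$.
In general, a player is not always aware of the current state of the game, the actions the others played in previous stages, or even the payoffs that the player has accumulated. In the extreme case, players never receive any information about any of these. We call such a Markov game unobserved. It is relatively easy to specify a pure strategy in an unobserved Markov game, because there is nothing on which the player can condition her actions. Hence, a strategy for player $i$ is "simply" an infinite sequence of actions $\{a^k_i\}$. In spite of this apparent simplicity of the game, we show that determining whether pure-strategy equilibria exist is extremely hard. We do not need to worry about issues of credible threats and subgame perfection in this setting, so we can simply use Nash equilibrium as our solution concept.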
Definition 11 (PURE-STRATEGY-UNOBSERVED-MARKOV-NE) We are given an unobserved Markov game. We are asked whether there exists a Nash equilibrium where all the strategies are pure.
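Because a pure strategy in an unobserved Markov game is just a sequence of actions, the utility of a pure-strategy profile can be evaluated by propagating the distribution over states forward stage by stage. The sketch below is our own illustration, not part of the original analysis: it evaluates only a finite prefix of the infinite discounted sum, and the function name and the callable-based encoding of the transition and payoff functions are assumptions.

```python
def discounted_utility_prefix(start, transition, payoff, delta,
                              action_sequences, player):
    """Expected discounted utility of `player` over a finite prefix.

    start: the starting state.
    transition(s, actions): dict mapping next states to probabilities.
    payoff(player, s, actions): stage payoff of `player` in state s.
    action_sequences[i][k]: action of player i in stage k; the prefix
        length is the length of the shortest sequence given.
    """
    horizon = min(len(seq) for seq in action_sequences)
    belief = {start: 1.0}  # distribution over the current state
    total = 0.0
    for k in range(horizon):
        actions = tuple(seq[k] for seq in action_sequences)
        next_belief = {}
        for s, prob in belief.items():
            total += (delta ** k) * prob * payoff(player, s, actions)
            for s2, p in transition(s, actions).items():
                next_belief[s2] = next_belief.get(s2, 0.0) + prob * p
        belief = next_belief
    return total
```

Truncation ignores the tail of the sum, so if stage payoffs are bounded in absolute value by some $M$, the error after $T$ stages is at most $M \delta^T / (1 - \delta)$. Checking whether a profile is an equilibrium would additionally require comparing against all alternative action sequences, which this sketch does not attempt.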
Additionally, the game played in state $r$ is some symmetric zero-sum game without a pure-strategy equilibrium (for example, a generalization of rock-paper-scissors) with very small payoffs. Finally, the discount factor is $\delta = (\frac{1}{2})^{\frac{1}{2n+1}}$ (so that $\delta^{2n} > \frac{1}{2}$).
We start our analysis with a few observations. First, there can be no pure-strategy equilibrium in which state $r$ is reached at some point, because (since $r$ is an absorbing state) this would require that some pure-strategy equilibrium of the game in state $r$ were played whenever state $r$ occurred. (Otherwise, a player who is not best-responding in one of these stages could simply switch to a best response in this stage, and because the game is unobserved, the rest of the game would remain unaffected, so this would give higher utility. This is using the fact that in a pure-strategy equilibrium, on the path of play, every player always knows the current state, because the transition process is deterministic.) But such an equilibrium does not exist. Second, if we ever reach one of the $t^j_{i,c}$ states, we will inevitably reach state $r$ at some point after this. It follows that all pure-strategy Nash equilibria never leave the $s_i$ states.
Now suppose an assignment satisfying the periodic SAT formula exists. Let both players play as follows: in stage $kn + i$ (with $1 \le i \le n$), play $b$, where $b$ is the value that the variable $x^k_i$ is set to. Clearly, both players receive utility 0 with these strategies. Does either player have an incentive to deviate? The only deviation of any significance is to play some $c \in C$ when the current state is $s_1$. So, without loss of generality (because of the symmetry of the game), say player 2 deviates to playing $c \in C$ in stage $kn + 1$ (when the state is $s_1$). We know that in the satisfying assignment, some variable $x^l_i$ among $x^k_1, \ldots, x^k_n, x^{k+1}_1, \ldots, x^{k+1}_n$ is set to some $b$ such that setting $x^{l-k}_i$ to $b$ satisfies $c$. If it is $x^k_1$, which is set to $b$, then in stage $kn + 1$ player 1 plays $b$, and player 2 gets payoff $-1$ in this stage since we are in state $s_1$ and setting $x^0_1$ to $b$ satisfies $c$. Otherwise, if it is $x^l_i$ with $l = k + 1$ or $i \ne 1$, which is set to $b$, then player 2 will get payoff 1 in stage $kn + 1$, but in stage $ln + i$ player 1 plays $b$, and player 2 gets payoff $-4$ in this stage since we are in state $t^2_{(l-k)n+i,c}$ and setting $x^{l-k}_i$ to $b$ satisfies $c$. The discounting is insignificant enough that this more than cancels out the 1 earned in stage $kn + 1$. Player 2 will get (at most) 0 in the other stages up to the first stage in state $r$, and given that we made the payoffs in the game in state $r$ sufficiently small relative to $\delta$, player 2 will not earn enough in the remaining stages to cancel out her losses so far. So there is no incentive to deviate. Thus, a pure-strategy NE exists.
On the other hand, suppose that no assignment satisfying the periodic SAT formula exists. Let us investigate whether a Nash equilibrium could exist. We know that in such a Nash equilibrium we never leave the $s_i$, so both players receive utility 0, and no $c$ is ever played in a stage with state $s_1$. Since playing a $c$ in one of the other stages can have no deterrent value, we may suppose that only elements of $\{t, f\}$ are played. Now consider the following assignment to the $x^k_i$: if player 1 plays $b$ in stage $kn + i$, $x^k_i$ is set to $b$. Since no assignment satisfying the periodic SAT formula exists, we know there is some clause $c$ and some $k$ such that no variable
and $b \in \{t, f\}$ such that setting variable $x_i$ to $b$ does satisfy $c$, and all $x \in \Sigma$;
• All other utilities are 0.
We now proceed to show that the instances are equivalent. First suppose there exists an assignment of truth values to the variables such that every clause is satisfied. Then, if variable $x_i$ is set to $b_i \in \{t, f\}$ in this assignment, let each player play $b_i$ in the $i$th stage. This will give both players a total utility of 0. The only deviation for a player that may change this is to play some clause $c$ in the first stage. However, some variable $x_i$ occurring in that clause must be set to a value that satisfies $c$. If it is $x_1$, the deviating player will receive utility $-1$ in the first stage, and no positive utilities after that. Otherwise, it is some $x_i$ with $i > 1$, and the deviating player will receive 1 in the first stage, but $-2$ in the $i$th stage, and no positive utilities anywhere else. It follows that there is no incentive to deviate, and this is a pure-strategy Nash equilibrium.
Now suppose there exists a pure-strategy Nash equilibrium. If both players play a clause in the first stage, then both players would receive a utility of $-1$, and either player would be better off playing some $b \in \{t, f\}$ in the first stage, to get a total utility of at least 0. So this cannot be the case in a pure-strategy equilibrium. If only one player plays a clause $c$ in the first stage, any best response for the other player plays a truth value in the first stage, and plays $b_i$ in stage $i$ whenever setting $x_i$ to $b_i$ satisfies $c$. But then, the clause-playing player receives negative utility overall, and is better off playing some $b \in \{t, f\}$ in the first stage, to get a total utility of at least 0. So this also cannot be the case in a pure-strategy equilibrium. It follows that in any pure-strategy equilibrium, both players play a truth value in the first stage, and thus both players receive a total utility of 0.
However, if there were no satisfying solution to the SAT instance, then there must be some clause c such that whenever setting x i to b i satisfies c, player 2 does not play b i in stage i.But then, player 1 is better off playing c in the first stage, to get a total utility of 1, contradicting the fact that we have a pure-strategy Nash equilibrium.It follows that there exists a satisfying solution to the SAT instance.
It is instructive to compare these two hardness results to known hardness results for partially observable Markov decision processes (POMDPs). A Markov decision process is a Markov game with a single player. A partially observable Markov decision process is a Markov decision process in which the current state is not directly observable, but a player may observe noisy signals about the state. Papadimitriou and Tsitsiklis [41] show that computing the optimal policy (strategy) for a POMDP is PSPACE-hard even with a finite horizon. (In fact, they show this for a special kind of POMDP in which the states are partitioned, and the player always observes the element of the partition to which the current state belongs.) Unlike Theorem 3, their reduction makes use of both probabilistic transitions and nontrivial observations about the current state. Papadimitriou and Tsitsiklis also mention that their reduction can be modified to show NP-hardness for the unobserved case, leading to a result that is more similar to our Theorem 4 (though neither result directly implies the other).

Conclusions and future research
We provided a single reduction that demonstrates that in normal-form games: 1) it is NP-complete to determine whether Nash equilibria with certain natural properties exist (these results are similar to those obtained by Gilboa and Zemel [17]), 2) more significantly, the problems of maximizing certain properties of a Nash equilibrium are inapproximable (unless P = NP), and 3) it is #P-hard to count the Nash equilibria. We also showed that determining whether a pure-strategy Bayes-Nash equilibrium exists in a Bayesian game is NP-complete, and that determining whether a pure-strategy Nash equilibrium exists in a Markov (stochastic) game is PSPACE-hard even if the game is unobserved (and that this remains NP-hard if the game has finite length). All of our hardness results hold even if there are only two players and the game is symmetric.
Another topic of interest is how the game is represented, that is, in what form the game is presented to the solver. A polynomial-time algorithm for normal-form games is of little use if the normal form is too large for the computer to store. In this case, the computer needs to operate directly on a more concise representation of the game. Examples of such representations (other than the extensive form) include graphical games [22], action-graph games [5,29], and multiagent influence diagrams [27]. While changing the way the game is represented does not change it strategically, it does affect the computational complexity of solving the game [20,46]. However, as long as the representation can capture any game, the computational problem cannot become any easier than under the straightforward representation. Therefore, our hardness results apply to such other representations as well.
Finally, we should consider the implications of complexity results in game theory for the modeling of human behavior. It seems unreasonable to expect humans to play according to solutions that are too hard for computers to find, so perhaps we should consider new solution concepts. On the other hand, as rationality and computational resources increase, it seems that the standard concepts should result in the limit.