Prisoner's Dilemma, Chicken, and Mixed-strategy Evolutionary Equilibria

This is the author's draft of an article published in Behavioral and Brain Sciences. 
http://journals.cambridge.org/action/displayJournal?jid=BBS

the absence of a definitive historical record, therefore, the empirical strength of the current paper rests entirely on the charm, as a metaphor for human interactions, of the random matching one-shot Prisoner's Dilemma model with imperfect identification of types. But this is not the only charming model available, and some of the others (including the same one with different parameter values) do not admit a stable proportion of cheaters, or do not admit cheaters at all. Also, if taken seriously, the model chosen makes predictions about sociopathy that do not seem to be true.
I will address these criticisms in reverse order. Recall that if a mixed strategy profile is evolutionarily stable, then if a particular strategy A increases (or decreases) in frequency, the payoffs to all the strategies must change so that A is relatively disadvantaged (or advantaged). Natural selection will then ensure that strategy A goes back to its proper place. If we are to take the current application of Frank's (1988) model seriously, then the proportion of sociopaths in a society should follow a similar dynamic. But does it? Would primary sociopaths in an ancient society with "too many" of their own kind have had difficulty gaining access to resources and mating partners? We just do not know, and the target article is not helpful. Perhaps cooperators would become more diligent, but if cheaters ganged up under a charismatic Attila one suspects large numbers would be an advantage.
The dynamic for secondary sociopathy is discussed in the paper, but things seem to go the wrong way. One has to be careful here: if I move from Kingston to New York City and as a result my kids are more likely to become sociopathic, this could be because environmental differences lead New York to have a higher proportion of cheaters. Rather, suppose that a small group of young sociopaths move to Kingston, all else the same. According to the Frank/Mealey model, Kingston children will now be less likely to become sociopathic. I have no evidence, but like most parents, I think not.
Frank's (1988) model is a variant of the round robin, infinitely repeated Prisoner's Dilemma introduced by Axelrod (1984) in his celebrated competition. But there are other models that have equal "charm" and are therefore equally (un)likely to capture the essence of the conditions facing primitive Homo sapiens. Here are two examples: Consider a model where individuals match up randomly, play a one-shot Prisoner's Dilemma, and then have the choice of continuing or terminating the match (Carmichael & MacLeod 1994; Stanley 1993). If the match ends, both parties go back, anonymously, to the matching market. If not, the partners may stay matched until death, continuing to play a Prisoner's Dilemma each period. For a modern image, think of the matching market as a large, dimly lit singles' bar.
Even though cooperators play a repeated Prisoner's Dilemma, the strategy "tit for tat" does very poorly. It is quickly invaded by cheaters who defect at the first opportunity and then move on to a new match. An interesting evolutionarily stable strategy is for cooperators to offer (and demand) an exchange of gifts at the beginning of any new match (Carmichael & MacLeod 1993). Cheaters in this society would have to buy a succession of gifts, and this effectively screens them out. This model makes quite a few predictions about the form of the gifts that must be used (Note 1).
Here is another one (Carmichael 1994). Suppose we retain the one-shot matching framework of Frank but change the game from a Prisoner's Dilemma to a bargain. People meet and have to decide how to divide the spoils of some joint venture (the carcass of some animal, perhaps). If they can agree quickly on a division, then all is well. If they cannot, the spoils disappear, dragged away by a hyena.
Strategies that do well in these bargains will proliferate into the future. An intraspecies arms race might develop, where "bargaining ability" grows over time. Rational and unemotional bargainers will be vulnerable to the "terrible twos" strategy of demanding almost everything, backed up with the emotional threat to ensure that otherwise the hyena gets everything. Faced with such an opponent, a rational bargainer cuts his losses, takes what is offered, and moves on. Of course an entire society of two-year-olds does very poorly, and can be invaded by a group whose members fight if they do not receive at least half the spoils. This strategy is evolutionarily stable: it quickly reaches agreement with itself and can do no worse than any invader it meets (Note 2). Again, in this simple model, there is no room for cheaters.

BEHAVIORAL AND BRAIN SCIENCES (1995) 18:3

Readers will recognize the "bourgeois" strategy of Maynard Smith (1982), but there are some new twists. In particular, if people are of two types, there are many equilibria where one type does better than the other. If men fight whenever they get less than one-third and women fight whenever they get less than two-thirds, for example, this is evolutionarily stable. Equilibria like these require that one's notion of territory be socially determined. There must be a way for early experience and teaching to establish and coordinate internal notions of what one deserves out of life. Sociobiology can therefore account for the existence of systemic discrimination, and society may indeed, at least partly, be to blame.
The point, of course, is not that these are better models of reality than the one used in the target article, but they do seem equally plausible, and they have implications that are at least as attractive and intriguing. Perhaps they each capture relevant but separate aspects of reality. If so, Mealey's conclusion, that evolution is unable to rid society of a small proportion of cheaters, is not robust. (The role of emotions in these models, by contrast, is robust.) Cheaters do prosper, no doubt. But until we have excellent evidence about the exact nature of early human society, or until we can show that in any sensible evolutionary model there will survive a small proportion of cheaters, sociobiology will not be able to tell us why.

NOTES
1. Among other things, gifts should have low use value, be overpriced, and should be hard to recycle as gifts in a subsequent match. Cut flowers and chocolates work; house plants and money do not.
2. Unless, of course, the invader has a weapon. The arms race in this model is real.

Abstract: Mealey's interesting interpretation of sociopathy is based on an inappropriate two-person game model. A multiperson, compound game version of Chicken would be more suitable, because a population engaging in random pairwise interactions with that structure would evolve to an equilibrium in which a fixed proportion of strategic choices was exploitative, antisocial, and risky, as required by Mealey's interpretation.
In a target article of exceptional scholarship and originality, Mealey has put forward an interesting new interpretation of sociopathy. Given the vast range of material covered by the article and the limited space available for my commentary, I shall confine my comments to the specific game theoretic model that underpins Mealey's interpretation. I shall argue that it cannot do what is required of it, and I shall suggest an alternative.
Like most earlier theorists who have used game theory to explain the evolution of social behavior, starting with Maynard [...]
The row and column players each choose between two pure strategies, C and D; the payoffs shown in the matrix are those to the row player. Thus the payoff to the row player following a C choice is R or S, depending on whether the column player chooses C or D, respectively, and following a D choice it is T or P, depending on whether the column player chooses C or D, respectively. In the Prisoner's Dilemma game, C represents cooperation and D defection, and by definition T > R > P > S, so that the row player receives the best payoff by choosing D (defect) while the column player chooses C (cooperate), the second-best payoff by choosing C while the column player chooses C, and so on. Although it is customary to show only the row player's payoffs, the game is the same from the column player's point of view, so that the column player also gets the best payoff by choosing D while the row player chooses C, and so on.
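The payoff matrix referred to in this paragraph is not reproduced here. A reconstruction from the verbal description alone (rows are the row player's choice, columns the column player's choice, and each entry is the row player's payoff) would look like this; the layout is an assumption, only the four entries are fixed by the text:

```latex
\[
\begin{array}{c|cc}
    & C & D \\ \hline
  C & R & S \\
  D & T & P
\end{array}
\qquad \text{Prisoner's Dilemma: } T > R > P > S
\]
```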
The standard two-person model is of limited value in determining evolutionary processes. We need to establish what will happen in an entire population in which individuals interact with one another in pairwise games with this strategic structure, assuming that the payoffs represent units of Darwinian fitness (the lifetime reproductive success of the individual players) and that the propensity to choose C or D is heritable. For this purpose, we need to construct a multiperson compound game (Colman 1982), in which it is assumed that every player plays the same average number of two-person games either with each of the others or with a random sample of the others. Considering the situation from a single player's viewpoint, suppose that the number of other players is n and the number of other players choosing C is c. The total payoff to a player choosing C, denoted by P(C), and the total payoff to a player choosing D, denoted by P(D), are then defined by the following payoff functions:
P(C) = Rc + S(n - c),
P(D) = Tc + P(n - c).
The total payoff to a player adopting a mixed strategy is just a weighted average of P(C) and P(D).
The values of the P(C) and P(D) payoff functions at their end points are found by setting c = 0 and c = n. Thus, if none of the other players chooses C (i.e., c = 0), the payoff to a solitary C chooser is Sn and the payoff to a D chooser is Pn. If all of the other players choose C (i.e., c = n), then a C chooser gets Rn and a solitary D chooser gets Tn. It is clear that in the case of the Prisoner's Dilemma game Tn can be interpreted as the temptation to be the sole D chooser, Rn the reward for collective cooperation, Pn the punishment for collective defection, and Sn the sucker's payoff for being the sole C chooser.
Figure 1(a) shows clearly that, in the case of the Prisoner's Dilemma game (with T > R > P > S), the P(D) payoff function strictly dominates the P(C) payoff function, which means that a D choice pays better than a C choice irrespective of the number of others choosing C. The evolutionarily optimal strategy is therefore not frequency-dependent, and the population will (regrettably) evolve to a stable equilibrium in which every player chooses D in every two-person encounter. This means that the Prisoner's Dilemma game cannot provide a basis for [...]
In this case, the population will evolve to a mixed-strategy equilibrium point, where the two payoff functions intersect. To the left of the intersection, when relatively few of the others choose C (c is small), the C function lies above the D function, which means that the fitness payoff from a C choice is higher than from a D choice, so the number of C choosers will increase relative to D choosers and the outcome will move to the right as c increases. To the right of the intersection, exactly the reverse holds: D choosers will increase relative to C choosers and the outcome will move to the left as c decreases. At the intersection, and only there, the strategies are best against each other and are in equilibrium, and any deviation from the mixture at that point will tend to be self-correcting. By setting the parameters (values of the payoffs T, R, S, and P) appropriately, the intersection point, and thus the proportion of predatory D choices, can be made as small as required.
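The frequency-dependence argument can be checked numerically. The sketch below is a minimal illustration, not the author's own computation. It assumes linear payoff functions P(C) = Rc + S(n - c) and P(D) = Tc + P(n - c), which match the end points Sn, Pn, Rn, and Tn given in the text, and it assumes the standard Chicken ordering T > R > S > P. The function names and the numerical payoffs are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def payoff_c(c, n, R, S):
    """Total payoff to a C chooser when c of the n other players choose C."""
    return R * c + S * (n - c)


def payoff_d(c, n, T, P):
    """Total payoff to a D chooser when c of the n other players choose C."""
    return T * c + P * (n - c)


def stable_c_share(T, R, S, P):
    """Share c/n of C choosers at a stable interior crossing of P(C) and P(D).

    Setting P(C) = P(D) gives c/n = (S - P) / ((S - P) + (T - R)).
    The crossing is stable only if C pays better when C is rare (S > P)
    and D pays better when D is rare (T > R), as in Chicken.
    Returns None otherwise, e.g. for the Prisoner's Dilemma, where D
    dominates and no interior equilibrium exists.
    Payoffs are expected to be integers (or Fractions).
    """
    c_edge = S - P  # per-game advantage of C over D when nobody else plays C
    d_edge = T - R  # per-game advantage of D over C when everybody else plays C
    if c_edge <= 0 or d_edge <= 0:
        return None
    return Fraction(c_edge, c_edge + d_edge)


if __name__ == "__main__":
    # Illustrative (assumed) payoffs, not taken from the commentary.
    chicken = dict(T=4, R=3, S=2, P=1)
    harsher_chicken = dict(T=4, R=3, S=2, P=-7)
    prisoners_dilemma = dict(T=5, R=3, S=0, P=1)
    for name, game in [("Chicken", chicken),
                       ("Chicken, costlier mutual D", harsher_chicken),
                       ("Prisoner's Dilemma", prisoners_dilemma)]:
        print(name, "stable share of C choosers:", stable_c_share(**game))
```

With the assumed payoffs T = 4, R = 3, S = 2, P = 1 the two functions cross where half of the others choose C. Lowering P, that is, making the mutual D outcome costlier, moves the crossing to the right and shrinks the share of D choices, which is the sense in which that share can be made as small as required.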
It appears, therefore, that the Prisoner's Dilemma game cannot underpin an evolutionary explanation of sociopathic behavior, but that a multiperson compound game version of Chicken, in which cheating is at least frequency-dependent, might be more promising. Chicken is the archetypal dangerous game, because a player can outdo a co-player only by cheating (choosing D) while the co-player behaves cautiously (by choosing C), and any such attempt to get the best payoff (T in the payoff matrix above) involves a necessary risk of the worst possible payoff (P). The interpretation of criminal, delinquent, and generally antisocial behavior in terms of strategic choices seems more natural in the game of Chicken. (For a more detailed discussion of the strategic properties of Chicken and some observations on its application to antisocial and criminal behavior, see Colman 1982, pp. 98-104; 1995, sect. 9.G.)

ACKNOWLEDGMENT
Preparation of this commentary was facilitated by Grant No. L122251002 from the Economic and Social Research Council as part of the Framing, Salience and Product Images project.

Figure 1 (Colman). Multiperson compound games based on 2 × 2 matrices. Panel (a) on the left is multiperson Prisoner's Dilemma; (b) on the right is multiperson Chicken. The P(C) and P(D) functions indicate the payoffs to a player choosing C or D when c of the other players choose C. Dashed circles indicate stable equilibria.