Let’s Program A Prisoner’s Dilemma Part 1: Game Theory Is Less Fun Than It Sounds

A Different Kind Of Gaming

Game Theory, despite the name, is unfortunately not “The study of videogames”. Instead it is the science of mathematically analyzing how decisions are made.

It’s called game theory because one popular way to analyze real world decision making is to come up with a simplistic “game” that mimics the core dynamic of the real world problem. The game can then be mathematically analyzed and the results applied back to the more complex real world situation.

For instance, a game theorist studying crime might come up with a “game” where people who follow the law get 1 point and people who break the law have a chance of gaining two points but also a chance of losing three points. He can then mathematically analyze how different risk levels influence whether or not crime is a winning strategy.
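
To make that a little more concrete, here is a rough Python sketch of the crime game. It assumes (and this is my assumption, not part of the example above) that a criminal either succeeds with some chance p and gains two points or fails with chance 1 - p and loses three. Under that assumption crime only beats the law-abiding 1 point when 2p - 3(1 - p) > 1, which works out to p > 0.8. The function name and the sample risk levels are made up for illustration.

```python
# Hypothetical model of the "crime game": the success chance p_success is an
# assumption added for illustration, not something the game itself specifies.
LAW_ABIDING_SCORE = 1  # points for following the law


def expected_crime_score(p_success, gain=2, loss=3):
    """Average points from breaking the law if it succeeds with chance p_success."""
    return p_success * gain - (1 - p_success) * loss


for p in (0.5, 0.75, 0.9):
    crime = expected_crime_score(p)
    verdict = "crime pays" if crime > LAW_ABIDING_SCORE else "crime does not pay"
    print(f"success chance {p:.0%}: expected {crime:+.2f} points, {verdict}")
```

Only the 90% case comes out ahead of simply following the law, which is the kind of "how risky does crime have to be?" question this sort of toy game is built to answer.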

The Prisoner’s Dilemma

The most famous example of a game theory game is the “Prisoner’s Dilemma”. Two criminals are caught red-handed committing a small crime that will land them in jail for one year. The police also suspect the two criminals are guilty of a larger crime worth two years of jail time, but they don’t have enough evidence to prove it.

The police eventually come up with a plan to interview each criminal separately and offer them this deal: Testify about the other guy’s involvement in the larger crime and we’ll drop the current small charge against you.

And now here’s the “game”: As one of the prisoners do you protect your buddy and keep quiet even though it means going to jail for a year? Or do you tattle in hopes of going free? And how do you know whether or not you can trust the other guy? If he confesses you go to jail for an extra two years.

|                | Other Guy Keeps Quiet | Other Guy Confesses |
|----------------|-----------------------|---------------------|
| You Keep Quiet | You both go to jail for one year | You go to jail for three years; other guy goes free |
| You Confess    | You go free; other guy goes to jail for three years | You both go to jail for two years |

Now as mathematicians we want to be able to analyze this choice with actual numbers, so let’s do a little game theorizing. Instead of two prisoners let’s imagine a two player game where each player is allowed to choose to either play nice and cooperate or to betray their partner and defect. If you cooperate you lose 1 point. If you defect the OTHER player loses 2 points.

|                     | Player B Cooperates | Player B Defects |
|---------------------|---------------------|------------------|
| Player A Cooperates | Player A: -1 point; Player B: -1 point | Player A: -3 points; Player B: -0 points |
| Player A Defects    | Player A: -0 points; Player B: -3 points | Player A: -2 points; Player B: -2 points |
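
If you want to double check the table, the two scoring rules are simple enough to turn into a few lines of Python. This is just a minimal sketch; the names `play_round`, `COOPERATE_COST` and `DEFECT_PENALTY` are placeholders I made up for illustration.

```python
COOPERATE_COST = 1  # points you lose when you cooperate
DEFECT_PENALTY = 2  # points the OTHER player loses when you defect


def play_round(a_defects, b_defects):
    """Return (score_a, score_b) for one round; scores are zero or negative."""
    score_a = 0
    score_b = 0
    if a_defects:
        score_b -= DEFECT_PENALTY  # A's betrayal hurts B
    else:
        score_a -= COOPERATE_COST  # A pays the price of playing nice
    if b_defects:
        score_a -= DEFECT_PENALTY  # B's betrayal hurts A
    else:
        score_b -= COOPERATE_COST  # B pays the price of playing nice
    return score_a, score_b


# Print all four outcomes so they can be compared against the table.
for a_defects in (False, True):
    for b_defects in (False, True):
        a_move = "defects" if a_defects else "cooperates"
        b_move = "defects" if b_defects else "cooperates"
        print(f"A {a_move}, B {b_move}: {play_round(a_defects, b_defects)}")
```

The four printed pairs are (-1, -1), (-3, 0), (0, -3) and (-2, -2), matching the four cells of the table.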

What makes this such an interesting game to study is the fact that the best group strategy and the best individual strategy are complete opposites.

As a group the obvious solution is for both players to cooperate so that between them they only lose 2 points. If one player defects the total loss jumps up to 3 points and if both players defect the total loss is 4 points, as bad as it gets.

But as an individual the equally obvious solution is to defect. After all, if you cooperate you are guaranteed to lose 1 point and might even lose 3 points if the other player defects. But if you yourself defect you might not lose any points and even if the other player defects you only lose 2. There is just no good reason to purposely give up one point.
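
Both halves of that argument can be checked mechanically. The sketch below hard-codes the four outcomes from the game as a lookup table (the names `PAYOFFS`, `my_score` and `group_total` are invented for this example) and then asks two questions: which pair of moves loses the fewest points in total, and is defecting better for me no matter what the other player does?

```python
# (my move, their move) -> (my score, their score); "C" = cooperate, "D" = defect
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}


def my_score(mine, theirs):
    """Points I end up with for one round."""
    return PAYOFFS[(mine, theirs)][0]


def group_total(mine, theirs):
    """Combined points of both players for one round."""
    return sum(PAYOFFS[(mine, theirs)])


# Best outcome for the group: the pair of moves with the smallest total loss.
best_for_group = max(PAYOFFS, key=lambda moves: group_total(*moves))

# Defecting "dominates" if it beats cooperating against every possible opponent move.
defect_dominates = all(my_score("D", theirs) > my_score("C", theirs) for theirs in "CD")

print("best for the group:", best_for_group)
print("defecting is always better for me:", defect_dominates)
```

The group does best when both players cooperate, yet defecting is still the better individual move against either opponent choice.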

So the smartest thing to do as an individual is to defect but the smartest thing to do as a group is to cooperate and there doesn’t seem to be any logical way to bridge the gap between these two strategies. Or is there?

Iterated Prisoner’s Dilemma

The core problem in the prisoner’s dilemma is the fact that it’s a one-time choice. You either cooperate or defect and a few seconds later you find out who won and who lost, end of story. This is why there is no logical reason not to betray the other player. If you defect you’re guaranteed to at least tie so why risk anything else?

But what if you had to play the game multiple times?

This could change things up. Now your decision to cooperate or defect might impact how future games are played. There is now a possibility to build or lose trust.

But sadly you can show that even with multiple games, as long as both players know exactly how many rounds there will be, the best logical strategy is still to always defect. The logic goes like this:

1) During the last round you might as well defect since there are no more future rounds in which the other player could punish you.

2) Since a logical opponent will defect during the last round you might as well defect during the second to last round. It’s not like your opponent was going to cooperate in that last round anyways.

3) Since a logical opponent will defect during the second to last round you might as well defect during the third to last round…

4) Continue the pattern and you can see that always defecting is the best strategy.
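
Here is one way to watch that unraveling happen in code. To keep things small this sketch only looks at a simplified, hypothetical family of strategies: "cooperate until round N, then defect for the rest of the match, and also defect for good as soon as the other player betrays you". It then repeatedly asks "what is my best reply to an opponent who starts defecting in round N?". It illustrates the argument rather than proving it, and all of the function names are my own.

```python
ROUNDS = 5  # both players know the match lasts exactly this many rounds


def play_round(a_defects, b_defects):
    """Cooperating costs you 1 point; being defected against costs you 2."""
    score_a = (0 if a_defects else -1) + (-2 if b_defects else 0)
    score_b = (0 if b_defects else -1) + (-2 if a_defects else 0)
    return score_a, score_b


def match_score(my_start, their_start, rounds=ROUNDS):
    """My total if I begin defecting in round my_start and they begin in their_start.

    Rounds are numbered from 0, so a start equal to `rounds` means "never defect first".
    Each player also defects for good once the other side has defected.
    """
    total = 0
    i_was_betrayed = False
    they_were_betrayed = False
    for r in range(rounds):
        i_defect = r >= my_start or i_was_betrayed
        they_defect = r >= their_start or they_were_betrayed
        total += play_round(i_defect, they_defect)[0]
        i_was_betrayed = i_was_betrayed or they_defect
        they_were_betrayed = they_were_betrayed or i_defect
    return total


def best_reply(their_start, rounds=ROUNDS):
    """The defection round that scores best against a given opponent."""
    return max(range(rounds + 1), key=lambda s: match_score(s, their_start, rounds))


# Start from "never defect first" and keep switching to the best reply.
start = ROUNDS
history = [start]
while best_reply(start) != start:
    start = best_reply(start)
    history.append(start)

print(history)  # [5, 4, 3, 2, 1, 0]
```

Each best reply is to start defecting one round earlier than the opponent, so the chain walks all the way down to 0, which is just "always defect".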

There are only two ways to break out of this cycle:

One is to have an infinite number of rounds so that players are always at risk of being punished in the future for bad behavior in the present. We’re not going to do that though for the simple reason that there is no way to run an infinite number of Prisoner’s Dilemmas on a finite computer.

The other way to break the cycle is to introduce more players.

Iterated Group Prisoner’s Dilemma

The iterated group prisoner’s dilemma works the same as the normal iterated prisoner’s dilemma except that you have several different players who get randomly paired up before every round.

This drastically changes the dynamics of the game.

In a two-player prisoner’s dilemma double defection is a “stable” strategy because it means neither player will get ahead of the other. You might be losing two points every round but the other guy is too so you aren’t actually losing.

But when there’s a group? Now a player who is constantly losing two points from double defection can end up falling behind other pairs of players who cooperate and thus only lose one point per round. Suddenly it’s not enough to try and figure out whether your current partner is going to cooperate or defect; you also have to worry about whether other players in other matches are cooperating or defecting.
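
A bare-bones version of that group game might look something like the sketch below. It is only a rough illustration under some assumptions of mine: every player is stuck with one fixed move (always cooperate or always defect), pairing is done by shuffling the player list with a seeded random number generator, and with an odd number of players the leftover player just sits the round out. `run_tournament` and its parameters are hypothetical names.

```python
import random


def play_round(a_defects, b_defects):
    """Cooperating costs you 1 point; being defected against costs you 2."""
    score_a = (0 if a_defects else -1) + (-2 if b_defects else 0)
    score_b = (0 if b_defects else -1) + (-2 if a_defects else 0)
    return score_a, score_b


def run_tournament(defects, rounds, seed=0):
    """Play `rounds` rounds with fresh random pairings and return every player's total.

    `defects` holds one boolean per player: True means that player always defects.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same pairings
    scores = [0] * len(defects)
    for _ in range(rounds):
        order = list(range(len(defects)))
        rng.shuffle(order)
        # Neighbours in the shuffled order become partners for this round.
        for a, b in zip(order[0::2], order[1::2]):
            gain_a, gain_b = play_round(defects[a], defects[b])
            scores[a] += gain_a
            scores[b] += gain_b
    return scores


print(run_tournament([False, False, False, False], rounds=10))  # [-10, -10, -10, -10]
print(run_tournament([True, True, True, True], rounds=10))      # [-20, -20, -20, -20]
print(run_tournament([False, False, True, True], rounds=10))    # depends on the pairings
```

The first two lines show the gap described above: a group of pure cooperators loses half as many points as a group of pure defectors. The mixed group is where things get interesting, because each player’s total depends on who they happened to get paired with.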

Interesting… But Is It Useful?

The Iterated Group Prisoner’s Dilemma is not only mathematically interesting, it is highly relevant to real life. Every day people are faced with scenarios where they can choose to either work together or try to cheat each other. In the long run it’s best for society when people work together but on an individual level exploiting others can provide huge short term benefits.

So by analyzing the prisoner’s dilemma we can get some interesting insights into why different societies work the way they do along with some ideas on what sort of ethical systems are and aren’t stable.

Of course, real life has all sorts of issues that aren’t fully reflected in a simple tool like the prisoner’s dilemma, so don’t think it somehow has all the answers to all our problems. But it’s still better than nothing and at the very least provides a nice topic for small talk with other math geeks.

What Does Any Of This Have To Do With Programming?

As AI enthusiasts we are obviously very interested in writing programs that can make logical decisions, and the prisoner’s dilemma provides a great sandbox for practicing that skill. It’s a simple enough scenario to be implemented in less than a page of code while still being complex enough to produce interesting results.

So here’s the plan for this Let’s Program: We’re going to write a Prisoner’s Dilemma simulation along with a few different types of players each with their own strategy for deciding when to cooperate and when to defect. We’ll then run a bunch of simulations and see if we can find any interesting trends.

Like usual this isn’t exactly an original idea. Game theorists have been writing and testing prisoner’s dilemma programs for years and if you’re familiar with their research you probably already know everything we’re going to cover in this Let’s Program.

But who cares? The goal here is to practice our programming, not push the cutting edge of computer science. And there’s bound to be at least one reader in my audience who isn’t already a game theory expert.

Actually, quick show of hands: Is there anyone here who has never studied game theory before reading this blog?

There, see? That guy over there is the reason we’re doing this. He’s about to learn something cool.