Let’s Program A Prisoners Dilemma 5: What Is A Decision Making Process?

The saint and devil prisoners were easy to write but, as we’ve seen, they’re not actually very good at playing the game. The always cooperating saints are wide open to exploitation and the devils will always defect even when cooperating would score them more points in the long run.

We clearly need a prisoner who can actually make choices instead of just doing the same thing again and again.

1d6 Points of SAN Loss

As any game designer can tell you, the easiest way to simulate decision making is with random numbers. Just give the computer a list of decisions and let it pick one at random.

So on that note I give you: The madman.

class MadMan < Prisoner
   def initialize(id)
      super(id)
      @strategy = "MadMan"
   end

   # Cooperate or defect at random, ignoring who the opponent is.
   def cooperate?(opponentID)
      # rand(2) returns 0 or 1; comparing it to 1 gives us a real boolean
      return rand(2) == 1
   end
end

Ruby weirdness warning: In a lot of languages “0” is the same as false, so you might be tempted to have cooperate? just return the number generated by rand. Don’t do that. In Ruby only “false” and “nil” are considered false. Everything else, including the number 0, is considered true. This is useful because it means the number 0 always acts like just a number. On the other hand it messes up a lot of other useful programming shortcuts so all in all it sort of breaks even.
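If you want to see that truthiness rule for yourself, here is a small standalone sketch. The helper names (truthy?, coin_flip_by_compare, coin_flip_by_sample) are made up for this illustration and are not part of the prisoner code; they just show two ways of turning a random pick into a real true or false.

```ruby
# In Ruby only false and nil count as false; everything else is truthy.
def truthy?(value)
  value ? true : false
end

puts truthy?(0)      # prints true
puts truthy?(nil)    # prints false
puts truthy?(false)  # prints false

# Option 1: compare the random number so the result is a real boolean.
def coin_flip_by_compare(rng = Random.new)
  rng.rand(2) == 1
end

# Option 2: skip the number entirely and pick from a list of decisions.
def coin_flip_by_sample(rng = Random.new)
  [true, false].sample(random: rng)
end

puts coin_flip_by_compare
puts coin_flip_by_sample
```

Either option would work inside cooperate?. Picking from a list is probably the closer match to the “list of decisions” idea, while the comparison stays closer to the rand call used in the class.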

Anyways, don’t forget to tell our createPrisoners method that there’s a new type of prisoner object for it to work with.

def createPrisoners(saintCount, devilCount, madmanCount)
   prisoners = Array.new
   playerCounter = 0
   saintCount.times{ prisoners.push(Saint.new(playerCounter += 1)) }
   devilCount.times{ prisoners.push(Devil.new(playerCounter += 1)) }
   madmanCount.times{ prisoners.push(MadMan.new(playerCounter += 1)) }
   return prisoners
end

Inmates Are Running The Asylum

Let’s be honest here: Making decisions at random is almost never a good idea. So how does our new class of insane prisoners perform in an actual game?

prisoners = createPrisoners(4, 4, 4)
playPrisonersDilemma(prisoners, 1000)

Ten doesn’t divide evenly into thirds so this time we’ll have four of each type of prisoner. Please note that this changes the perfect score to -12,000.

The Group’s Overall Score was -17905 in 1000 rounds with 12 prisoners

ID: 6 Score: -876 Strategy: Devil

ID: 7 Score: -882 Strategy: Devil

ID: 8 Score: -892 Strategy: Devil

ID: 5 Score: -906 Strategy: Devil

ID: 12 Score: -1499 Strategy: MadMan

ID: 9 Score: -1504 Strategy: MadMan

ID: 10 Score: -1538 Strategy: MadMan

ID: 11 Score: -1564 Strategy: MadMan

ID: 2 Score: -2026 Strategy: Saint

ID: 1 Score: -2060 Strategy: Saint

ID: 4 Score: -2074 Strategy: Saint

ID: 3 Score: -2084 Strategy: Saint

Madmen randomly flip between acting like saints and acting like devils so it makes sense they would wind up scoring squarely in between the two. They don’t just let the devils betray them; sometimes they betray right back. And they alternate between cooperating with saints for small gains and betraying them for big gains.
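For the curious, those scores can be roughly predicted with some quick expected value math. The sketch below rests on two assumptions that are not spelled out in this post: the payoff values in PAYOFF (picked because they agree with the -12,000 perfect score and the results above) and the idea that each round pairs a prisoner with a uniformly random member of the rest of the group. The helper names are hypothetical too.

```ruby
# ASSUMED payoffs (not stated in this post): points for [my_move, their_move],
# where true means cooperate and false means defect.
PAYOFF = {
  [true, true]   => -1,  # both cooperate
  [true, false]  => -3,  # I cooperate, they defect
  [false, true]  =>  0,  # I defect, they cooperate
  [false, false] => -2   # both defect
}

# Chance that each strategy cooperates in any given round.
COOPERATE_CHANCE = { saint: 1.0, devil: 0.0, madman: 0.5 }

# Expected points for one round of `me` against `them`.
def expected_payoff(me, them)
  p_me = COOPERATE_CHANCE[me]
  p_them = COOPERATE_CHANCE[them]
  [true, false].sum do |mine|
    [true, false].sum do |theirs|
      weight = (mine ? p_me : 1 - p_me) * (theirs ? p_them : 1 - p_them)
      weight * PAYOFF[[mine, theirs]]
    end
  end
end

# Expected points per round for `me`, ASSUMING the opponent is drawn
# uniformly from the other prisoners in the group.
def expected_per_round(me, counts)
  others = counts.merge(me => counts[me] - 1)
  total = others.values.sum
  others.sum { |strategy, n| n * expected_payoff(me, strategy) } / total
end

counts = { saint: 4, devil: 4, madman: 4 }
counts.each_key do |strategy|
  puts "#{strategy}: about #{(expected_per_round(strategy, counts) * 1000).round} per 1000 rounds"
end
```

Under those assumptions the madmen land at about -1500, halfway between the saints and the devils, which lines up with the results above. Passing a different counts hash lets you try other group mixes.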

So all in all it seems like mild insanity is actually a pretty well rounded fit for the cutthroat world of the prisoner’s dilemma.

Also, as promised, the fact that madmen can make actual decisions means that our overall group score now has some variation to it even when running multiple games with the same group.

The Group’s Overall Score was -17995 in 1000 rounds with 12 prisoners

The Group’s Overall Score was -18046 in 1000 rounds with 12 prisoners

The Group’s Overall Score was -17938 in 1000 rounds with 12 prisoners

The Lost And The Damned

So madmen seem to do pretty well in a mixed group… but maybe that’s just because they had some saints to act as backup. What happens when we pair up only devils and madmen?

The Group’s Overall Score was -17523 in 1000 rounds with 10 prisoners

ID: 4 Score: -1440 Strategy: Devil

ID: 1 Score: -1450 Strategy: Devil

ID: 2 Score: -1456 Strategy: Devil

ID: 3 Score: -1456 Strategy: Devil

ID: 5 Score: -1472 Strategy: Devil

ID: 9 Score: -2013 Strategy: MadMan

ID: 7 Score: -2014 Strategy: MadMan

ID: 8 Score: -2068 Strategy: MadMan

ID: 6 Score: -2072 Strategy: MadMan

ID: 10 Score: -2082 Strategy: MadMan

About the same thing as when there were saints, it turns out. The madmen’s habit of cooperating roughly half the time means they still can’t directly compete with the vicious defecting devils, but at least randomly defecting half of the time allows them to sort of defend themselves.

In fact, if you compare this to the time we evenly paired up devils and saints you’ll see that the madmen scored about the same as the saints. But the big difference is that the madmen did much more damage to the devils in the process.

Although it’s up in the air as to whether this is a good thing or not. Taking the devils down a notch is certainly a satisfying feeling but the madmen still lost and the group score is much, much worse than when the saints had their match with the devils.

The More Randomness You Have The Less Random It Is

For our final experiment I just want to point out that while I call them “Madmen” the random decision prisoners actually managed to achieve some pretty reliable results, consistently scoring halfway between a saint and a devil.

This is of course because random numbers tend to average out over time. So “cooperates at random” eventually transforms into “consistently cooperates 50% of the time”.
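As a quick standalone check of that averaging effect, separate from the prisoner simulation, here is a minimal sketch that flips the same kind of coin the madmen use and reports how often it comes up “cooperate”. The cooperation_rate helper and the seed value are arbitrary choices for this illustration.

```ruby
# Fraction of "cooperate" results in `rounds` coin flips.
# A seeded generator is passed in so that runs are repeatable.
def cooperation_rate(rounds, rng)
  cooperations = rounds.times.count { rng.rand(2) == 1 }
  cooperations.to_f / rounds
end

rng = Random.new(42)  # arbitrary seed, any number works
[10, 100, 1_000, 100_000].each do |rounds|
  percent = (cooperation_rate(rounds, rng) * 100).round(2)
  puts "#{rounds} flips: #{percent}% cooperation"
end
```

The short runs can wander a fair way from 50% while the long runs should sit very close to it, which is the same pattern the madmen’s scores follow.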

To show this off I’m going to have a bunch of madmen play increasingly long games against each other.

prisoners = createPrisoners(0, 0, 10)
playPrisonersDilemma(prisoners, 10)
playPrisonersDilemma(prisoners, 100)
playPrisonersDilemma(prisoners, 1000)
playPrisonersDilemma(prisoners, 1000000)

The Group’s Overall Score was -141 in 10 rounds with 10 prisoners

ID: 10 Score: -7 Strategy: MadMan

ID: 1 Score: -10 Strategy: MadMan

ID: 6 Score: -12 Strategy: MadMan

ID: 8 Score: -13 Strategy: MadMan

ID: 9 Score: -14 Strategy: MadMan

ID: 5 Score: -15 Strategy: MadMan

ID: 2 Score: -16 Strategy: MadMan

ID: 3 Score: -17 Strategy: MadMan

ID: 7 Score: -18 Strategy: MadMan

ID: 4 Score: -19 Strategy: MadMan

The Group’s Overall Score was -1493 in 100 rounds with 10 prisoners

ID: 8 Score: -132 Strategy: MadMan

ID: 3 Score: -135 Strategy: MadMan

ID: 6 Score: -139 Strategy: MadMan

ID: 5 Score: -144 Strategy: MadMan

ID: 1 Score: -145 Strategy: MadMan

ID: 9 Score: -154 Strategy: MadMan

ID: 7 Score: -156 Strategy: MadMan

ID: 4 Score: -159 Strategy: MadMan

ID: 2 Score: -163 Strategy: MadMan

ID: 10 Score: -166 Strategy: MadMan

The Group’s Overall Score was -14976 in 1000 rounds with 10 prisoners

ID: 3 Score: -1437 Strategy: MadMan

ID: 8 Score: -1457 Strategy: MadMan

ID: 1 Score: -1482 Strategy: MadMan

ID: 4 Score: -1487 Strategy: MadMan

ID: 7 Score: -1491 Strategy: MadMan

ID: 10 Score: -1492 Strategy: MadMan

ID: 5 Score: -1503 Strategy: MadMan

ID: 6 Score: -1514 Strategy: MadMan

ID: 9 Score: -1537 Strategy: MadMan

ID: 2 Score: -1576 Strategy: MadMan

The Group’s Overall Score was -15001829 in 1000000 rounds with 10 prisoners

ID: 3 Score: -1498113 Strategy: MadMan

ID: 10 Score: -1499339 Strategy: MadMan

ID: 1 Score: -1499525 Strategy: MadMan

ID: 9 Score: -1500065 Strategy: MadMan

ID: 7 Score: -1500128 Strategy: MadMan

ID: 2 Score: -1500445 Strategy: MadMan

ID: 4 Score: -1500662 Strategy: MadMan

ID: 5 Score: -1500894 Strategy: MadMan

ID: 8 Score: -1501085 Strategy: MadMan

ID: 6 Score: -1501573 Strategy: MadMan

As you can see, the longer the game lasts the less difference there is in the scores.

After ten rounds the worst score (-19) was almost three times as bad as the best score (-7).

After one hundred rounds the worst score (-166) was only about 25% worse than the best score (-132).

After a thousand rounds there was only a 10% difference between best (-1437) and worst (-1576).

And after a million rounds there was less than a 1% difference between best and worst.
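If you would rather have the computer do that comparison, here is a small sketch that takes the best and worst scores from the four games above and works out how much worse the worst prisoner did. The RESULTS table is copied by hand from the output and percent_worse is a made up helper name.

```ruby
# Best and worst scores copied from the results above, keyed by round count.
RESULTS = {
  10        => { best: -7,       worst: -19 },
  100       => { best: -132,     worst: -166 },
  1_000     => { best: -1437,    worst: -1576 },
  1_000_000 => { best: -1498113, worst: -1501573 }
}

# How much worse the worst score is than the best, as a percentage of the best.
def percent_worse(best, worst)
  ((worst - best).abs.to_f / best.abs * 100).round(1)
end

RESULTS.each do |rounds, scores|
  gap = percent_worse(scores[:best], scores[:worst])
  puts "#{rounds} rounds: worst is #{gap}% worse than best"
end
```

The gap shrinks from well over 100% at ten rounds to a fraction of a percent at a million rounds.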

So in the long run cooperating at random is a viable way to take a middle path between cooperation and defection.

We Can Be Smarter Than This

Random decision making led to some interesting outcomes but it failed to come even close to beating the devils. Plus it’s an embarrassingly simple algorithm for AI enthusiasts like ourselves. Surely we can come up with a smarter prisoner. One that actually thinks instead of guessing.

But that’s going to have to wait for next time.