Let’s Program A Prisoner’s Dilemma Part 7: Survival of the Fittest

So we’ve built a bunch of different prisoner models and thrown them together in all sorts of different combinations to see what happens. And we could keep doing that.

But isn’t running all these experiments by hand kind of boring? I mean, who really has the time to sit around all day adjusting prisoner ratios and recording results?

Fortunately, as programmers we know that the answer to boredom is automation, so let’s build us a system that can test different prisoner populations for us.

What we’re going to be doing specifically is a simple survival of the fittest algorithm. We’ll still give our program a starting mix of prisoners, but after their first game is done the worst 10% of the prisoners will be removed while the best 10% will be duplicated. This new group will then play the game again. We’ll repeat this pattern a few dozen or hundred times producing all sorts of interesting numbers on what sorts of groups seem to work best.

It Runs In The Family

So what should our new multi-generational code look like?

For starters the main game playing algorithm should stay the same. We’ll still have a group of prisoners that need to be randomly paired up for a certain number of rounds and we’ll still want to keep track of everyone’s score.

The only difference is that at the end of the round instead of just reporting those scores and exiting we’re going to want to generate a new group of prisoners and then run the entire experiment again. We’ll do this for however many generations we think the experiment should run.

So there’s our first hint on what this should look like. Instead of just asking for a group of prisoners and how many rounds they should play we are going to ask for a group of prisoners, how many rounds they should play and how many times (or generations) we should repeat the experiment.

def playMultiGenerationalPrisonersDilemma(prisoners, generations, roundsPerGeneration)

Now inside of that we’re going to need a loop that runs once for every generation we asked for. And inside of that loop we’re going to want to copy our existing prisoner dilemma code. Standard Disclaimer: Copying code is sloppy and should be avoided on serious projects. If you were building a professional prisoner dilemma product you would want to figure out some way that playPrisonersDilemma and playMultiGenerationalPrisonersDilemma could both reference the same piece of code instead of copying each other.

Anyways…

generations.times{ |generation|
   #Copy paste our code from the normal playPrisonersDilemma function here
end

That’s enough to make our code play the prisoners dilemma again and again as many times as we want, but it will use the same group every time. What we want instead is to create a new group based off of the last group but with the bottom 10% of prisoners removed and the top 10% of prisoners doubled. We can do this by adding a little bit of code after the copy pasted dilemma logic, right after the part where it prints out the score.

#Generate a new group of prisoners based off of the last group.
#Prisoners that scored well get duplicated. Prisoners that scored poorly get dropped

newPrisoners = Array.new
newPrisonerCount = 0
newSaintCount = 0
newDevilCount = 0
newMadmenCount = 0
newJudgeCount = 0

prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner|
   #Top ten percent get an extra prisoner in the next generation
   if newPrisonerCount < (prisoners.length / 10).floor
      case prisoner.strategy
      when "Saint"
         newSaintCount = newSaintCount + 1
      when "Devil"
         newDevilCount = newDevilCount + 1
      when "MadMan"
         newMadmenCount = newMadmenCount + 1
      when "Judge"
         newJudgeCount = newJudgeCount + 1
      end
   end

   #Bottom ten percent don't get added to the next generation at all
   if newPrisonerCount >= prisoners.length - (prisoners.length / 10).floor
      break
   end

   #If it's not the bottom ten percent add one to the next generation
   case prisoner.strategy
   when "Saint"
      newSaintCount = newSaintCount + 1
   when "Devil"
      newDevilCount = newDevilCount + 1
   when "MadMan"
      newMadmenCount = newMadmenCount + 1
   when "Judge"
      newJudgeCount = newJudgeCount + 1
   end

   newPrisonerCount = newPrisonerCount + 1
}

#Create new generation and load it into the prisoners variable for the next game
prisoners = createPrisoners(newSaintCount, newDevilCount, newMadmenCount, newJudgeCount)

Nothing fancy here. We just sort the prisoners and then iterate through them while keeping track of how many prisoners of each type we’ve seen. For the first ten percent of prisoners we count double and we skip the last ten percent entirely.

We then use our good old createPrisoners function to build a new generation of prisoners. Starting with fresh prisoners like this is actually better than trying to modify our old prisoner group because it ensures no lingering data gets left behind to mess things up. For instance, it would be a bad thing if Judge style prisoners kept their memory between matches (especially since IDs might wind up changing between games).

With the new prisoners created the generation loop then repeats and our new group of prisoners play the game again. All that’s left to do is add a nice little message to our score output letting us know which generation we’re on and we have a complete little program.

def playMultiGenerationalPrisonersDilemma(prisoners, generations, roundsPerGeneration)
        if !prisoners.length.even?
            throw "Prisoners Dilemma requires an even number of participants"
        end
        
        generations.times{ |generation|
        
            # Make sure each prisoner starts out with a clean slate
            prisoners.each{ |prisoner| prisoner.score = 0}
        
            roundsPerGeneration.times{
                pairs = prisoners.shuffle.each_slice(2)
                pairs.each{ |pair|
                    firstPlayerCooperates = pair[0].cooperate?(pair[1].id)
                    secondPlayerCooperates = pair[1].cooperate?(pair[0].id)
                
                    if(firstPlayerCooperates)
                        pair[0].score -= 1
                    else
                        pair[1].score -= 2
                    end
                
                    if(secondPlayerCooperates)
                        pair[1].score -= 1
                    else
                        pair[0].score -= 2
                    end
                
                    pair[0].learnResults(pair[1].id, secondPlayerCooperates)
                    pair[1].learnResults(pair[0].id, firstPlayerCooperates)
                }
            }
        
            #Show the stats for the group as a whole as well as for each individual prisoner
            groupScore = 0
            prisoners.each{ |prisoner| groupScore += prisoner.score }
            puts "In generation #{generation+1} the group's overall score was #{groupScore} in #{roundsPerGeneration} rounds with #{prisoners.length} prisoners"
            prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner| prisoner.report }
            
            #Generate a new group of prisoners based off of the last group.
            #Prisoners that scored well get duplicated. Prisoners that scored poorly get dropped
            newPrisoners = Array.new
            newPrisonerCount = 0
            newSaintCount = 0
            newDevilCount = 0
            newMadmenCount = 0
            newJudgeCount = 0
            prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner|
                #Top ten percent get an extra prisoner in the next generation
                if newPrisonerCount < (prisoners.length / 10).floor
                    case prisoner.strategy
                    when "Saint"
                        newSaintCount = newSaintCount + 1
                    when "Devil"
                        newDevilCount = newDevilCount + 1
                    when "MadMan"
                        newMadmenCount = newMadmenCount + 1
                    when "Judge"
                        newJudgeCount = newJudgeCount + 1
                    end
                end
            
                #Bottom ten percent don't get added to the next generation at all
                if newPrisonerCount >= prisoners.length - (prisoners.length / 10).floor
                    break
                end
                
                #If it's not the bottom ten percent add one to the next generation
                case prisoner.strategy
                when "Saint"
                    newSaintCount = newSaintCount + 1
                when "Devil"
                    newDevilCount = newDevilCount + 1
                when "MadMan"
                    newMadmenCount = newMadmenCount + 1
                when "Judge"
                    newJudgeCount = newJudgeCount + 1
                end
                
                newPrisonerCount = newPrisonerCount + 1
            }        
                    
            #Create new generation and load it into the prisoners variable for the next game
            prisoners = createPrisoners(newSaintCount, newDevilCount, newMadmenCount, newJudgeCount)
        }
end

Me, Papa and Granpappy

Now let’s give our code a whirl. For testing purposes I’m going to create a small 10 prisoner group and have it run for a mere three generations.

prisoners = createPrisoners(2, 3, 3, 2)
#playPrisonersDilemma(prisoners, 1000)
playMultiGenerationalPrisonersDilemma(prisoners, 3, 1000)

In generation 1 the group’s overall score was -15444 in 1000 rounds with 10 prisoners

ID: 5 Score: -1176 Strategy: Devil

ID: 3 Score: -1190 Strategy: Devil

ID: 4 Score: -1192 Strategy: Devil

ID: 9 Score: -1495 Strategy: Judge

ID: 10 Score: -1505 Strategy: Judge

ID: 6 Score: -1589 Strategy: MadMan

ID: 8 Score: -1616 Strategy: MadMan

ID: 7 Score: -1639 Strategy: MadMan

ID: 1 Score: -1994 Strategy: Saint

ID: 2 Score: -2048 Strategy: Saint

In generation 2 the group’s overall score was -16716 in 1000 rounds with 10 prisoners

ID: 4 Score: -1440 Strategy: Devil

ID: 3 Score: -1444 Strategy: Devil

ID: 5 Score: -1448 Strategy: Devil

ID: 2 Score: -1498 Strategy: Devil

ID: 10 Score: -1602 Strategy: Judge

ID: 9 Score: -1629 Strategy: Judge

ID: 7 Score: -1793 Strategy: MadMan

ID: 6 Score: -1815 Strategy: MadMan

ID: 8 Score: -1839 Strategy: MadMan

ID: 1 Score: -2208 Strategy: Saint

In generation 3 the group’s overall score was -17991 in 1000 rounds with 10 prisoners

ID: 2 Score: -1642 Strategy: Devil

ID: 5 Score: -1656 Strategy: Devil

ID: 3 Score: -1680 Strategy: Devil

ID: 1 Score: -1688 Strategy: Devil

ID: 4 Score: -1690 Strategy: Devil

ID: 10 Score: -1738 Strategy: Judge

ID: 9 Score: -1749 Strategy: Judge

ID: 6 Score: -2039 Strategy: MadMan

ID: 7 Score: -2045 Strategy: MadMan

ID: 8 Score: -2064 Strategy: MadMan

Looking at the first generation you’ll notice that the devils picked on the saints (like usual) which put the devils in the lead and sent the saints to last place. That means that the top 10% of our group was a single devil and the bottom 10% was a single saint. Because of this the next generation has four devils (up from the starting three) and only one saint (down from the starting two).

Another generation then passes with results very much like the first, causing us to lose another saint and gain another devil. Which is honestly a little ominous…

Next Time: Turning Boring Data Overflow Into Interesting Visuals

So that was just ten prisoners playing three generations of fairly short games. For a real experiment we’re going to want hundreds of prisoners playing significantly longer games over at least a dozen generations. Things like:

prisoners = createPrisoners(250, 250, 250, 250)
playMultiGenerationalPrisonersDilemma(prisoners, 25, 10000)

Only problem is that a big experiment like that is going to rapidly fill up your terminal with junk. Sending the results to a text file helps but even that produces a lot more numbers than the human brain can conveniently work with.

But you know what the human brain can conveniently work with? Animated bar graphs!