So we’ve built a bunch of different prisoner models and thrown them together in all sorts of different combinations to see what happens. And we could keep doing that.
But isn’t running all these experiments by hand kind of boring? I mean, who really has the time to sit around all day adjusting prisoner ratios and recording results?
Fortunately, as programmers we know that the answer to boredom is automation, so let’s build us a system that can test different prisoner populations for us.
What we’re going to be doing specifically is a simple survival of the fittest algorithm. We’ll still give our program a starting mix of prisoners, but after their first game is done the worst 10% of the prisoners will be removed while the best 10% will be duplicated. This new group will then play the game again. We’ll repeat this pattern a few dozen or hundred times producing all sorts of interesting numbers on what sorts of groups seem to work best.
It Runs In The Family
So what should our new multi-generational code look like?
For starters the main game playing algorithm should stay the same. We’ll still have a group of prisoners that need to be randomly paired up for a certain number of rounds and we’ll still want to keep track of everyone’s score.
The only difference is that at the end of the round instead of just reporting those scores and exiting we’re going to want to generate a new group of prisoners and then run the entire experiment again. We’ll do this for however many generations we think the experiment should run.
So there’s our first hint on what this should look like. Instead of just asking for a group of prisoners and how many rounds they should play we are going to ask for a group of prisoners, how many rounds they should play and how many times (or generations) we should repeat the experiment.
def playMultiGenerationalPrisonersDilemma(prisoners, generations, roundsPerGeneration)
Now inside of that we’re going to need a loop that runs once for every generation we asked for. And inside of that loop we’re going to want to copy our existing prisoner dilemma code. Standard Disclaimer: Copying code is sloppy and should be avoided on serious projects. If you were building a professional prisoner dilemma product you would want to figure out some way that playPrisonersDilemma and playMultiGenerationalPrisonersDilemma could both reference the same piece of code instead of copying each other.
Anyways…
generations.times{ |generation| #Copy paste our code from the normal playPrisonersDilemma function here end
That’s enough to make our code play the prisoners dilemma again and again as many times as we want, but it will use the same group every time. What we want instead is to create a new group based off of the last group but with the bottom 10% of prisoners removed and the top 10% of prisoners doubled. We can do this by adding a little bit of code after the copy pasted dilemma logic, right after the part where it prints out the score.
#Generate a new group of prisoners based off of the last group. #Prisoners that scored well get duplicated. Prisoners that scored poorly get dropped newPrisoners = Array.new newPrisonerCount = 0 newSaintCount = 0 newDevilCount = 0 newMadmenCount = 0 newJudgeCount = 0 prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner| #Top ten percent get an extra prisoner in the next generation if newPrisonerCount < (prisoners.length / 10).floor case prisoner.strategy when "Saint" newSaintCount = newSaintCount + 1 when "Devil" newDevilCount = newDevilCount + 1 when "MadMan" newMadmenCount = newMadmenCount + 1 when "Judge" newJudgeCount = newJudgeCount + 1 end end #Bottom ten percent don't get added to the next generation at all if newPrisonerCount >= prisoners.length - (prisoners.length / 10).floor break end #If it's not the bottom ten percent add one to the next generation case prisoner.strategy when "Saint" newSaintCount = newSaintCount + 1 when "Devil" newDevilCount = newDevilCount + 1 when "MadMan" newMadmenCount = newMadmenCount + 1 when "Judge" newJudgeCount = newJudgeCount + 1 end newPrisonerCount = newPrisonerCount + 1 } #Create new generation and load it into the prisoners variable for the next game prisoners = createPrisoners(newSaintCount, newDevilCount, newMadmenCount, newJudgeCount)
Nothing fancy here. We just sort the prisoners and then iterate through them while keeping track of how many prisoners of each type we’ve seen. For the first ten percent of prisoners we count double and we skip the last ten percent entirely.
We then use our good old createPrisoners function to build a new generation of prisoners. Starting with fresh prisoners like this is actually better than trying to modify our old prisoner group because it ensures no lingering data gets left behind to mess things up. For instance, it would be a bad thing if Judge style prisoners kept their memory between matches (especially since IDs might wind up changing between games).
With the new prisoners created the generation loop then repeats and our new group of prisoners play the game again. All that’s left to do is add a nice little message to our score output letting us know which generation we’re on and we have a complete little program.
def playMultiGenerationalPrisonersDilemma(prisoners, generations, roundsPerGeneration) if !prisoners.length.even? throw "Prisoners Dilemma requires an even number of participants" end generations.times{ |generation| # Make sure each prisoner starts out with a clean slate prisoners.each{ |prisoner| prisoner.score = 0} roundsPerGeneration.times{ pairs = prisoners.shuffle.each_slice(2) pairs.each{ |pair| firstPlayerCooperates = pair[0].cooperate?(pair[1].id) secondPlayerCooperates = pair[1].cooperate?(pair[0].id) if(firstPlayerCooperates) pair[0].score -= 1 else pair[1].score -= 2 end if(secondPlayerCooperates) pair[1].score -= 1 else pair[0].score -= 2 end pair[0].learnResults(pair[1].id, secondPlayerCooperates) pair[1].learnResults(pair[0].id, firstPlayerCooperates) } } #Show the stats for the group as a whole as well as for each individual prisoner groupScore = 0 prisoners.each{ |prisoner| groupScore += prisoner.score } puts "In generation #{generation+1} the group's overall score was #{groupScore} in #{roundsPerGeneration} rounds with #{prisoners.length} prisoners" prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner| prisoner.report } #Generate a new group of prisoners based off of the last group. #Prisoners that scored well get duplicated. Prisoners that scored poorly get dropped newPrisoners = Array.new newPrisonerCount = 0 newSaintCount = 0 newDevilCount = 0 newMadmenCount = 0 newJudgeCount = 0 prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner| #Top ten percent get an extra prisoner in the next generation if newPrisonerCount < (prisoners.length / 10).floor case prisoner.strategy when "Saint" newSaintCount = newSaintCount + 1 when "Devil" newDevilCount = newDevilCount + 1 when "MadMan" newMadmenCount = newMadmenCount + 1 when "Judge" newJudgeCount = newJudgeCount + 1 end end #Bottom ten percent don't get added to the next generation at all if newPrisonerCount >= prisoners.length - (prisoners.length / 10).floor break end #If it's not the bottom ten percent add one to the next generation case prisoner.strategy when "Saint" newSaintCount = newSaintCount + 1 when "Devil" newDevilCount = newDevilCount + 1 when "MadMan" newMadmenCount = newMadmenCount + 1 when "Judge" newJudgeCount = newJudgeCount + 1 end newPrisonerCount = newPrisonerCount + 1 } #Create new generation and load it into the prisoners variable for the next game prisoners = createPrisoners(newSaintCount, newDevilCount, newMadmenCount, newJudgeCount) } end
Me, Papa and Granpappy
Now let’s give our code a whirl. For testing purposes I’m going to create a small 10 prisoner group and have it run for a mere three generations.
prisoners = createPrisoners(2, 3, 3, 2) #playPrisonersDilemma(prisoners, 1000) playMultiGenerationalPrisonersDilemma(prisoners, 3, 1000)
In generation 1 the group’s overall score was -15444 in 1000 rounds with 10 prisoners
ID: 5 Score: -1176 Strategy: Devil
ID: 3 Score: -1190 Strategy: Devil
ID: 4 Score: -1192 Strategy: Devil
ID: 9 Score: -1495 Strategy: Judge
ID: 10 Score: -1505 Strategy: Judge
ID: 6 Score: -1589 Strategy: MadMan
ID: 8 Score: -1616 Strategy: MadMan
ID: 7 Score: -1639 Strategy: MadMan
ID: 1 Score: -1994 Strategy: Saint
ID: 2 Score: -2048 Strategy: Saint
In generation 2 the group’s overall score was -16716 in 1000 rounds with 10 prisoners
ID: 4 Score: -1440 Strategy: Devil
ID: 3 Score: -1444 Strategy: Devil
ID: 5 Score: -1448 Strategy: Devil
ID: 2 Score: -1498 Strategy: Devil
ID: 10 Score: -1602 Strategy: Judge
ID: 9 Score: -1629 Strategy: Judge
ID: 7 Score: -1793 Strategy: MadMan
ID: 6 Score: -1815 Strategy: MadMan
ID: 8 Score: -1839 Strategy: MadMan
ID: 1 Score: -2208 Strategy: Saint
In generation 3 the group’s overall score was -17991 in 1000 rounds with 10 prisoners
ID: 2 Score: -1642 Strategy: Devil
ID: 5 Score: -1656 Strategy: Devil
ID: 3 Score: -1680 Strategy: Devil
ID: 1 Score: -1688 Strategy: Devil
ID: 4 Score: -1690 Strategy: Devil
ID: 10 Score: -1738 Strategy: Judge
ID: 9 Score: -1749 Strategy: Judge
ID: 6 Score: -2039 Strategy: MadMan
ID: 7 Score: -2045 Strategy: MadMan
ID: 8 Score: -2064 Strategy: MadMan
Looking at the first generation you’ll notice that the devils picked on the saints (like usual) which put the devils in the lead and sent the saints to last place. That means that the top 10% of our group was a single devil and the bottom 10% was a single saint. Because of this the next generation has four devils (up from the starting three) and only one saint (down from the starting two).
Another generation then passes with results very much like the first, causing us to lose another saint and gain another devil. Which is honestly a little ominous…
Next Time: Turning Boring Data Overflow Into Interesting Visuals
So that was just ten prisoners playing three generations of fairly short games. For a real experiment we’re going to want hundreds of prisoners playing significantly longer games over at least a dozen generations. Things like:
prisoners = createPrisoners(250, 250, 250, 250) playMultiGenerationalPrisonersDilemma(prisoners, 25, 10000)
Only problem is that a big experiment like that is going to rapidly fill up your terminal with junk. Sending the results to a text file helps but even that produces a lot more numbers than the human brain can conveniently work with.
But you know what the human brain can conveniently work with? Animated bar graphs!