Let’s Program A Compression Algorithm Part 3: In Which A General Compression Algorithm Becomes Working Lisp Code

Last time we invented and tested a super-simple compression algorithm that revolves around replacing the eight most common symbols in the English language with tiny 4-bit codes instead of their normal 8-bit representations (at the cost of replacing everything else with bigger 9-bit versions). We even did some examples by hand.

But this is a coding blog, so it’s time to write some actual code. When it comes to language choice our only real requirement is that the language be capable or working directly with bits and bytes and files and since pretty much every language ever has no problem doing that we’re open to choose whatever we want.

I’m personally going to choose Lisp and be keeping all my code in a file named “wrc.lisp” for “White Rabbit Compression”. You, of course, can follow along in whatever language you want. In fact, that would probably be the most educational approach to this series.

Anyways, like we talked about in the design phase, this project will mostly focus on processing lists of bits in batches of 8, 4 and 9. That means we’re going to need a convenient data structure for holding these lists.

Fortunately Lisp is built entirely around list processing and has some very powerful built in tools for creating and managing lists of numbers, so I will be representing our binary fragments as a plain old list of integers which will just so happen to always be either a 0 or a 1.

For those of you not in Lisp I bet your language has it’s own list structures. Could be a vector, a queue, a linked list, whatever. Anything that lets you add new stuff to the end and pull old stuff off the front will be fine.

Now the efficiency buffs in the audience might have noticed that we’re wasting space by using entire integers to keep track of mere bits. Shouldn’t we be using something else, like a bit vector?

Probably! But this is just the proof of concept first pass so doing things the easy way is more important than doing things the best way. If it turns out the program is too slow or memory hungry then we’ll revisit this decision.

With that in mind our first attempt at letter bit lists is going to look something like this:

; An 8-bit ASCII letter
(list 0 1 0 0 0 0 0 1)

;One of our 4 bit compressed letters
(list 0 0 1 1)

;One of our 9 bit labeled ASCII letters
(list 1 0 1 0 0 0 0 0 1)

Next up let’s write some helper functions that can turn real 8-bit bytes into our custom bit lists or turn our bit lists back into bytes.

(defun 8-bit-list-to-byte (bitlist)
    (+     (* 128 (first bitlist))
        (* 64 (second bitlist))
        (* 32 (third bitlist))
        (* 16 (fourth bitlist))
        (* 8 (fifth bitlist))
        (* 4 (sixth bitlist))
        (* 2 (seventh bitlist))
        (* 1 (eighth bitlist))))
        
(defun byte-to-8-bit-list (byte)
    (let ((bitlist nil))
        (if (>= byte 128)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 128)))
            (push 0 bitlist))
        (if (>= byte 64)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 64)))
            (push 0 bitlist))
        (if (>= byte 32)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 32)))
            (push 0 bitlist))
        (if (>= byte 16)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 16)))
            (push 0 bitlist))
        (if (>= byte 8)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 8)))
            (push 0 bitlist))
        (if (>= byte 4)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 4)))
            (push 0 bitlist))
        (if (>= byte 2)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 2)))
            (push 0 bitlist))
        (if (>= byte 1)
            (progn
                (push 1 bitlist)
                (setf byte (- byte 1)))
            (push 0 bitlist))
        (nreverse bitlist)))

 

The logic here is pretty simple although the Lisp syntax can be a bit weird. I think 8-bit-list-to-byte is self-explanatory but byte-to-8-bit-list might need some explaining. The basic idea is that every bit in a byte has a specific value: 128, 64, 32, 16, 8, 4, 2 or 1. Because of how binary works any number greater than 128 must have the first bit set, any number smaller than 128 but larger than 64 must have the second bit set and so on.

So we can turn a byte into a list by first checking if the number is bigger than 128. If not we put a 0 into our list and move on. But if it is we put a 1 into our list and then subtract 128 before working with the remainder of the number. Then the next if statement checks if the numbe ris bigger than 64 and so on. The only Lispy trick here is that Lisp if statements by default can only contain two lines of logic: the first for when the if is true and the second when the if is false. Since we want to run two lines of logic when things are true we wrap them in a progn, which just lumps multiple lines of code into one unit.

Clear as mud? Let’s move on then.

With those generic helpers out of the way let’s move on to writing the core of our compressor: A function that can accept a byte and then transform it into either a 4-bit short code or a 9-bit extended code.

The first thing we’ll want is a single place in the code where we can keep the “official” list of which letters translate to which short codes. It’s important we have only one copy of this list because later on we might find out we need to change it and updating one list is a lot easier than hunting through our code for half a dozen different lists.

(defparameter *compression-map*
   '((160 (0 0 0 0)) 
     (101 (0 0 0 1))
     (116 (0 0 1 0))
     (97  (0 0 1 1))
     (111 (0 1 0 0))
     (105 (0 1 0 1))
     (110 (0 1 1 0))
     (115 (0 1 1 1))))

Here I’m using the ‘ shortcut to create a list of lists since (list (list 160 (list 0 0 0 0))…) would get really tiring to type really fast. Syntax aside the first item in each of the eight items in our list is the ASCII code for the letter we want to compress and the second item is the bit list we want it to compress to.

Now that we’ve gor our compression map in whatever format you prefer all we need is an easy way to get the data we need out of that map. During compression we want to be able to take a byte and find out it’s short code and during decompression we want to take a short-code and find out which byte it used to be.

Off the top of my head there are a few ways to do this. One would be to just loop through the entire list every time we want to do a lookup, stopping when we find an item that starts with the byte we want or ends with the list we want (or returning false if we don’t find anything). The whole list is only eight items long so this isn’t as wasteful as it might sound.

An easier solution (depending on your language) might be to use our master list to create a pair of hash tables or dictionaries. As you probably know these are one way lookup structures that link “keys” to “values”. Give the hash a key and it will very efficiently tell you whether or not it has stored a value for it and what that value is; perfect for our needs. The only trick is that since they are only one way and we want to both compress and decompress our data we’ll actually need two hashes that mirror each other. One would use the bytes as keys and reference the bit-lists. The other would use the bit-lists as keys and reference the bytes.

I think I’ll go with the hash approach since they are a more universal language feature than Lisp’s particular approach to list parsing.

Hash tables seem pretty weird until you understand how they work internally.

To translate my universal compression mapping into a pair of usable hash tables I’m going to first create the hash tables as global variables and then I’m going to use a simple Lisp loop to step through every pair in the compression map. During each step of the loop we will insert the data from one pair into each hash.

(defparameter *byte-to-list-compression-hash* (make-hash-table))
(defparameter *list-to-byte-compression-hash* (make-hash-table :test 'equal))

(loop
    for compression-pair in *compression-map*
    do    (setf 
            (gethash (car compression-pair) *byte-to-list-compression-hash*) 
            (cadr compression-pair))
        (setf 
            (gethash (cadr compression-pair) *list-to-byte-compression-hash*) 
            (car compression-pair)))

 

A little Lisp trivia here for anybody following along in my language of choice:

1) The :test keyword lets you tell a hash how to compare keys. The default value works great with bytes but not so great with lists so I use :test ‘equal to give the *list-to-byte-compression-hash* a more list friendly compartor.

2) To pull data out of or put data into a hash you use the gethash function. The fist argument is a key value, the second argument is the hash you want to use. I tend to get this backwards a lot :-(

3) “car” and “cdr” and combinitions like “cadar” are all old fashioned keywords for grabbing different parts of lists. I could have just as easily used chains of “first” and “second” but what’s the fun in that?

Lisp aside, we now have an official compression map and two easy to search hashes for doing compression lookups and decompression reverse lookups. Let’s put them to good use by actually writing a compression function!

The compression function should take an ASCII byte and check if it’s in the compression map. If it is it will return the proper 4-bit short code. If it isn’t in the map it should transform it to an 8-bit list, glue on a leading 1 and then return the new 9-bit list.

(defun compress-byte (byte-to-compress)
    (let ((short-code (gethash byte-to-compress *byte-to-list-compression-hash*)))
        (if short-code 
            short-code 
            (append '(1) (byte-to-8-bit-list byte-to-compress)))))

Nothing all that clever here. We take the byte-to-compress and lookup whatever value it has in the *byte-to-list-compression-hash*. We then use let to store that result in a local short-code variable. We then us a simple if statment to see whether short-code has an actual value (meaning we found a compression) or if it is empty (meaning that byte doesn’t compress). If it has a compressed value we just return it. Otherwise we take the original byte-to-compress, turn it into a bit list, glue a 1 to the front and return it.

Let’s see it all in action:

[1]> (load “wrc.lisp”)

;; Loading file wrc.lisp …

;; Loaded file wrc.lisp

T

[2]> (compress-byte 111)

(0 1 0 0)

[3]> (compress-byte 82)

(1 0 1 0 1 0 0 1 0)

Cool. It successfully found 111 (“o”) in our compression map and shrunk it from eight bytes down to four. It also noticed that 82 (“R”) was not in our map and so returned the full 8-bits with a ninth “1” marker glued to the front.

Next time we’ll start looking into how to use this function to compress and save an actual file.

Let’s Program A Compression Algorithm Part 2: In Which A General Idea Becomes A Specific Algorithm

Last time we talked about Alice in Wonderland and the deep cruelty of stores that play the same handful of movie trailers on endless loop.

More importantly we also talked about how computers use compression to shrink files down for easier storage and faster transfer. Without good compression algorithms the Internet would crawl to a halt and half the electronics you use on a daily basis wouldn’t exist. For instance, without compression a 2 hour HD movie would be over 300 GB; good luck fitting that all on one disc!

To better understand this vital computer science breakthrough we’re going to be writing our own ASCII text compression program.

Disclaimer: It’s not going to be a very good compression program. Like all of my let’s programs the goal here is education and not the creation of usable code. We’re going to be skipping out on all the security and error checking issues a professional compressor would include and our end goal is a modest 20% reduction in file size compared to the 75%+ reduction well known tools like Zip can pull off.

Now that your expectations have been properly lowered let’s talk about the structure of text files. After all, we can’t compress them if we don’t know what they’re supposed to look like or how they work.

The ASCII text standard is a set of 256 symbols that includes all 26 letters of the English language in both upper and lower case as well as all standard punctuation, some useful computer symbols (like newline) and a bunch of random symbols and proto-emoticons that seem like they were included mostly to fill up space.

The fact that there are 256 symbols is very important because 256 is exactly how high you can count using a single 8-bit computer byte. That means you can store one letter in one byte by simply using binary counting to mark down where the letter is in the ASCII chart.

For example, the capital letter “A” is in spot 65 of the ASCII chart. 65 in binary is 01000001. So to store the letter “A” in our computer we would grab a byte worth of space on our hard drive and fill it with the electronic pattern 01000001.

An ASCII text file is just a bunch of these 8-bit character bytes all in a row. If you want to save a 120 character long text you need a 120-byte long ASCII file. If you want to save a 5,000 character short story you need a roughly 5 kilobyte ASCII file.

Ok, cool. Now we know what an ASCII file’s guts look like. Time to start looking for patterns we can use for compression.

Patterns… patterns… here’s one!

The ASCII files we’ll be working with are all based on the English alphabet, and in English not all letters are used evenly. Things like “s”, “e”, and “a” get used all the time while poor little letters like “x” and “z” hardly ever see the light of day. And don’t forget “ ”! You might not think of the blank space as a letter but just imagine tryingtowritewithoutit.

So some letters are much more common than others but ASCII stores them all in identical 8-bit bytes anyways. What if we were to change that? What if we stored the most common letters in smaller spaces, like 4-bit nibbles*?

The biggest challenge here is figuring out a way to let the computer know when it should expect a nibble instead of a byte. In normal ASCII every letter is eight bits long which makes it easy for the computer to figure out where one letter ends and the next begins.

But since we’re going to have letters of different lengths we need some way to point out to the computer what to expect. A sort of virtual name-tag to say “I’m a 4-bit letter” or “I’m a normal 8-bit letter”.

Here’s a simple idea: Our short 4-bit letters will always start with a 0, on the other hand our 8-bit ASCII letters will always start with a 1.

This solution does have some drawbacks. If the first bit of our 4-bit letters is always 0 that means we only have three bits left for encoding the actual letter. Three bits is only enough to count up to eight so that means we will only be able to compress eight letters.

A bigger problem comes from the 8-bit ASCII letters. They need their full 8-bits to work properly so the only way to mark them with a leading 1 is by gluing it to the front and turning them into 9-bit letters. So while our common letters had their size cut in half our uncommon letters are actually getting bigger. Hopefully we’ll still come out ahead but it might be close.

It’s amazing how many problems are caused by people trying to apply averages to things that ought not be averaged

Anyways, it sounds like we’re going to have eight different shortcut codes to work with. What letters should we use them for? Well, according to Wikipedia the eight most common letters in the English language are, in order: E, T, A, O, I, N, S, H. So that’s probably a good bet if we want as much compression as possible.

But Wikipedia doesn’t count the blank space as a letter. However because it’s so common in text it’s definitely something we want to compress. Let’s add it to the front of the list and drop “H”. That means the letters we will be compressing are “ ”, E, T, A, O, I, N, S.

Or more accurately “ ”, e, t, a, o , i, n, s. Remember that in ASCII upper and lower case letters are coded differently so we have to choose which we want. Since lowercase letters are more common than uppercase it makes sense to focus on them.

Now that we have our eight compression targets all we have to do is assign them one of our short codes, all of which are just the number 0 followed by some binary. Let’s go with this:

“ ” = 0000

“e” = 0001

“t” = 0010

“a” = 0011

“o” = 0100

“i” = 0101

“n” = 0110

“s” = 0111

Also remember that any letter not on this list will actually be expanded by putting a “1” in front of it’s binary representation.

One Makes You Smaller

Whew! That was a lot of abstract thinking but believe it or not we now have a complete compression algorithm. And just to prove it we’re going to do a compression by hand.

But what should we practice on? Well, our theme is “Wonderland” and I seem to recall that Alice was able to shrink herself by fooling around with a bottle labeled “drink me”. In ASCII that looks like this:

d r i n k m e
01100100 01110010 01101001 01101110 01101011 00100000 01101101 01100101

Eight bits times eight letter means 64 bits total. But if we replace the space, ‘i’, ‘e’, and ‘n’ with our 4-bit shortcuts while adding a 1 flag in front of the remaining 8-bit (soon to be 9-bit) letters we get

d r i n k m e
101100100 101110010 0101 0110 101101011 0000 101101101 0001

Which is 9*4 + 4*4 bits long for a total of only 52 bits. So we saved ourselves 12 bits, which is almost 20% less space than the original. Not bad.

One Makes You Grow Taller

Of course, taking text and compressing it is pretty useless unless we also know how to take compressed text and expand it back into normal readable ASCII. So please take a look at the following bit sequence and see if you can figure out what it used to say:

0001001111111010000001011011010001

I don’t want anybody accidentally looking ahead so let’s push the answer down a page or so with some another random comic.

Information theory jokes are funny, right?

So the first thing to do here is to take that big messy data stream and split it into individual letters. Remember that according to our rules the length of each letter is determined by whether it starts with a 0 or a 1. The 0s are 4-bit letters and the 1s are 9-bit letters.

So here we go. 0001001111111010000001011011010001 starts with 0 so the first letter must be four bits long: 0001

After removing those four bits we’re left with 001111111010000001011011010001, which also starts with a 0 meaning our next letter is also four bits long: 0011

Removing those four letters leaves us with 11111010000001011011010001. Since that starts with a 1 that means our next letter is 9 bits long: 111110100

By doing this again and again we finally get these six letters:

0001 0011 111110100 0000 101101101 0001

Now that we have our individual letters it’s time to turn them into, well, letters. But the kind of letters people can read.

For the four bit letters we just reference the list of short codes we made up. Scroll up in the likely event that you neglected to commit them to long term memory.

0001 0011 111110100 0000 101101101 0001
e a ? ? e

For the nine bit letters we need to remove the leading 1 and then look up the remaining eight bit code in the official ASCII chart. For instance 111110100 becomes 11110100 which is the code for “t”.

0001 0011 111110100 0000 101101101 0001
e a t m e

And there we go, the compressed binary has successfully been turned back into human readable data.

I Don’t Like Pretending To Be A Compression Algorithm

Doing these examples by hand was a great way to prove our proposed compression algorithm actually works but I don’t think any of us want to to try and compress an entire book or even an entire email by hand. It would be much better to teach the computer how to do this all for us. Which is exactly what we’re going to start working on next time.

After all, this is a Let’s Program, not a Let’s Spend A Small Eternity Doing Mathematical Busywork.

* Yes, nibble is the actual official term for half a byte. Programmers are weird like that.

Let’s Program A Compression Algorithm Part 1: How To Fit A Byte Into A Bit And Other Curious Tricks

So a while ago there was a new Alice in Wonderland movie coming out and my local geek store apparently decided it would be fun to replay the trailer again and again every few minutes. So it should be no surprise I’ve got the lyrics to “White Rabbit” stuck in my head.

One pill makes you larger
And one pill makes you small
And the ones that mother gives you
Don’t do anything at all
Go ask Alice
When she’s ten feet tall

Speaking of growing and shrinking, have you ever really thought about data compression? It’s that amazing thing that let’s you take an entire folder full of vacation photos and shrink it down into a single file that’s small enough to email to grandma. Data compression is also the star player behind streaming Youtube videos, fast loading image files, compact MP3 songs, reasonable computer backups, home DVD players and anywhere else you need to take a very big file and make it very small for either storage or transport.

But how exactly does that work? You would think that making a file smaller would result in lost data, kind of like how shrinking a picture in paint and then blowing it back up makes everything fuzzy. A 10MB file has 10MB of data, right? You’d think that the only way to shrink it is by throwing some of that data away.

After years of playing really old games on newer, bigger monitors I’ve grown fond of the fuzzy pixelated look

And yet somehow magic compression algorithms like zip manage to shrink things down and then return them to normal size without ever losing any of the original data.

To explain how that is possible we’re going to have to talk information theory, but that sounds boring so instead let’s talk fast food.

Your average fast food restaurant offers a dozen or so different kinds of sandwiches along with another dozen or so sides such as soda, onion rings and soft serve ice cream. Customers can then order whatever mix of items they want.

But not all combination of items are equally popular. Every day hundreds and hundreds of people ordered sandwiches and sodas for lunch but only one or two customers a week ask for a six pack of onion rings plus an ice cream cone.

This leads to a brilliant idea: What if we make the most popular combinations of food items easier to order by giving them simple numbers? A lot of people order hamburgers, a drink and some fries so let’s call that the “#1 Combo”. Other people order chicken strips and a soda so let’s call that the “#2 Combo”. We get a lot of visitors from the gym next door who just want a salad and some water so let’s call that the “#3 Combo”.

Fast food combos are basically a way to “compress” common orders into simple numbers that chef’s can “expand” by just looking at a chart. Customer wants a #3? Let’s check the menu and see which items go with that combo.

Customer service is thus much faster because now most people can just shout out a number instead of having to spend half a minute or more carefully listing out exactly what they want.

Information theory in a nutshell

Now information science has a lot of fancy words and equations for talking about this stuff but all you really need to know for now is that most types of data have patterns and those patterns can be used to create shortcuts. Restaurants use these patterns to replace complex orders with simple combo numbers and compression algorithms use these patterns to replace big files with simplified smaller files.

As an extreme example, imagine a file that’s just the word “Jabberwoky” repeated a million times. That’s roughly 8 MB of disk space, but do you really need all that? Not really. You could replace it with a file that says “JabberwokyX1000000” and as long as your code knew to interpret that as a million item list everything would work the exact same.

And that’s the secret to compression: A 10MB file contains 10MB worth of 0s and 1s but not necessarily 10MB worth of unique information. By finding a more efficient way to express that same information you can shrink your files, sometimes quite dramatically.

For a more realistic example: imagine an image file made up of a few million pixels. In a raw image file each pixel is a 32 bit number telling the computer which of several million colors it should draw on the screen. But what if your image doesn’t have millions of different colors? What if it’s a plain black and white drawing? You don’t need a full 32 bits just to mark whether a pixel is black or white so you could save a ton of space by inventing a new compressed image format where each pixel is just a single bit, 0 for black and 1 for white.

Or maybe you have a cartoon video file with a lot of identical still shots. Instead of storing a complete image for all those identical frames maybe you could create a compressed video format with the ability to say “This frame should look just like the last one”.

And maybe you have a text file that’s just a little too big for an email attachment. You could just email it in parts but I bet with a little clever thought you could come up with a way to compress it.

That’s going to be the topic of this let’s program.

Well, that and Alice in Wonderland.

Let’s Program A Prisoner’s Dilemma Part 7: Survival of the Fittest

So we’ve built a bunch of different prisoner models and thrown them together in all sorts of different combinations to see what happens. And we could keep doing that.

But isn’t running all these experiments by hand kind of boring? I mean, who really has the time to sit around all day adjusting prisoner ratios and recording results?

Fortunately, as programmers we know that the answer to boredom is automation, so let’s build us a system that can test different prisoner populations for us.

What we’re going to be doing specifically is a simple survival of the fittest algorithm. We’ll still give our program a starting mix of prisoners, but after their first game is done the worst 10% of the prisoners will be removed while the best 10% will be duplicated. This new group will then play the game again. We’ll repeat this pattern a few dozen or hundred times producing all sorts of interesting numbers on what sorts of groups seem to work best.

It Runs In The Family

So what should our new multi-generational code look like?

For starters the main game playing algorithm should stay the same. We’ll still have a group of prisoners that need to be randomly paired up for a certain number of rounds and we’ll still want to keep track of everyone’s score.

The only difference is that at the end of the round instead of just reporting those scores and exiting we’re going to want to generate a new group of prisoners and then run the entire experiment again. We’ll do this for however many generations we think the experiment should run.

So there’s our first hint on what this should look like. Instead of just asking for a group of prisoners and how many rounds they should play we are going to ask for a group of prisoners, how many rounds they should play and how many times (or generations) we should repeat the experiment.

def playMultiGenerationalPrisonersDilemma(prisoners, generations, roundsPerGeneration)

Now inside of that we’re going to need a loop that runs once for every generation we asked for. And inside of that loop we’re going to want to copy our existing prisoner dilemma code. Standard Disclaimer: Copying code is sloppy and should be avoided on serious projects. If you were building a professional prisoner dilemma product you would want to figure out some way that playPrisonersDilemma and playMultiGenerationalPrisonersDilemma could both reference the same piece of code instead of copying each other.

Anyways…

generations.times{ |generation|
   #Copy paste our code from the normal playPrisonersDilemma function here
end

That’s enough to make our code play the prisoners dilemma again and again as many times as we want, but it will use the same group every time. What we want instead is to create a new group based off of the last group but with the bottom 10% of prisoners removed and the top 10% of prisoners doubled. We can do this by adding a little bit of code after the copy pasted dilemma logic, right after the part where it prints out the score.

#Generate a new group of prisoners based off of the last group.
#Prisoners that scored well get duplicated. Prisoners that scored poorly get dropped

newPrisoners = Array.new
newPrisonerCount = 0
newSaintCount = 0
newDevilCount = 0
newMadmenCount = 0
newJudgeCount = 0

prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner|
   #Top ten percent get an extra prisoner in the next generation
   if newPrisonerCount < (prisoners.length / 10).floor
      case prisoner.strategy
      when "Saint"
         newSaintCount = newSaintCount + 1
      when "Devil"
         newDevilCount = newDevilCount + 1
      when "MadMan"
         newMadmenCount = newMadmenCount + 1
      when "Judge"
         newJudgeCount = newJudgeCount + 1
      end
   end

   #Bottom ten percent don't get added to the next generation at all
   if newPrisonerCount >= prisoners.length - (prisoners.length / 10).floor
      break
   end

   #If it's not the bottom ten percent add one to the next generation
   case prisoner.strategy
   when "Saint"
      newSaintCount = newSaintCount + 1
   when "Devil"
      newDevilCount = newDevilCount + 1
   when "MadMan"
      newMadmenCount = newMadmenCount + 1
   when "Judge"
      newJudgeCount = newJudgeCount + 1
   end

   newPrisonerCount = newPrisonerCount + 1
}

#Create new generation and load it into the prisoners variable for the next game
prisoners = createPrisoners(newSaintCount, newDevilCount, newMadmenCount, newJudgeCount)

Nothing fancy here. We just sort the prisoners and then iterate through them while keeping track of how many prisoners of each type we’ve seen. For the first ten percent of prisoners we count double and we skip the last ten percent entirely.

We then use our good old createPrisoners function to build a new generation of prisoners. Starting with fresh prisoners like this is actually better than trying to modify our old prisoner group because it ensures no lingering data gets left behind to mess things up. For instance, it would be a bad thing if Judge style prisoners kept their memory between matches (especially since IDs might wind up changing between games).

With the new prisoners created the generation loop then repeats and our new group of prisoners play the game again. All that’s left to do is add a nice little message to our score output letting us know which generation we’re on and we have a complete little program.

def playMultiGenerationalPrisonersDilemma(prisoners, generations, roundsPerGeneration)
        if !prisoners.length.even?
            throw "Prisoners Dilemma requires an even number of participants"
        end
        
        generations.times{ |generation|
        
            # Make sure each prisoner starts out with a clean slate
            prisoners.each{ |prisoner| prisoner.score = 0}
        
            roundsPerGeneration.times{
                pairs = prisoners.shuffle.each_slice(2)
                pairs.each{ |pair|
                    firstPlayerCooperates = pair[0].cooperate?(pair[1].id)
                    secondPlayerCooperates = pair[1].cooperate?(pair[0].id)
                
                    if(firstPlayerCooperates)
                        pair[0].score -= 1
                    else
                        pair[1].score -= 2
                    end
                
                    if(secondPlayerCooperates)
                        pair[1].score -= 1
                    else
                        pair[0].score -= 2
                    end
                
                    pair[0].learnResults(pair[1].id, secondPlayerCooperates)
                    pair[1].learnResults(pair[0].id, firstPlayerCooperates)
                }
            }
        
            #Show the stats for the group as a whole as well as for each individual prisoner
            groupScore = 0
            prisoners.each{ |prisoner| groupScore += prisoner.score }
            puts "In generation #{generation+1} the group's overall score was #{groupScore} in #{roundsPerGeneration} rounds with #{prisoners.length} prisoners"
            prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner| prisoner.report }
            
            #Generate a new group of prisoners based off of the last group.
            #Prisoners that scored well get duplicated. Prisoners that scored poorly get dropped
            newPrisoners = Array.new
            newPrisonerCount = 0
            newSaintCount = 0
            newDevilCount = 0
            newMadmenCount = 0
            newJudgeCount = 0
            prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner|
                #Top ten percent get an extra prisoner in the next generation
                if newPrisonerCount < (prisoners.length / 10).floor
                    case prisoner.strategy
                    when "Saint"
                        newSaintCount = newSaintCount + 1
                    when "Devil"
                        newDevilCount = newDevilCount + 1
                    when "MadMan"
                        newMadmenCount = newMadmenCount + 1
                    when "Judge"
                        newJudgeCount = newJudgeCount + 1
                    end
                end
            
                #Bottom ten percent don't get added to the next generation at all
                if newPrisonerCount >= prisoners.length - (prisoners.length / 10).floor
                    break
                end
                
                #If it's not the bottom ten percent add one to the next generation
                case prisoner.strategy
                when "Saint"
                    newSaintCount = newSaintCount + 1
                when "Devil"
                    newDevilCount = newDevilCount + 1
                when "MadMan"
                    newMadmenCount = newMadmenCount + 1
                when "Judge"
                    newJudgeCount = newJudgeCount + 1
                end
                
                newPrisonerCount = newPrisonerCount + 1
            }        
                    
            #Create new generation and load it into the prisoners variable for the next game
            prisoners = createPrisoners(newSaintCount, newDevilCount, newMadmenCount, newJudgeCount)
        }
end

Me, Papa and Granpappy

Now let’s give our code a whirl. For testing purposes I’m going to create a small 10 prisoner group and have it run for a mere three generations.

prisoners = createPrisoners(2, 3, 3, 2)
#playPrisonersDilemma(prisoners, 1000)
playMultiGenerationalPrisonersDilemma(prisoners, 3, 1000)

In generation 1 the group’s overall score was -15444 in 1000 rounds with 10 prisoners

ID: 5 Score: -1176 Strategy: Devil

ID: 3 Score: -1190 Strategy: Devil

ID: 4 Score: -1192 Strategy: Devil

ID: 9 Score: -1495 Strategy: Judge

ID: 10 Score: -1505 Strategy: Judge

ID: 6 Score: -1589 Strategy: MadMan

ID: 8 Score: -1616 Strategy: MadMan

ID: 7 Score: -1639 Strategy: MadMan

ID: 1 Score: -1994 Strategy: Saint

ID: 2 Score: -2048 Strategy: Saint

In generation 2 the group’s overall score was -16716 in 1000 rounds with 10 prisoners

ID: 4 Score: -1440 Strategy: Devil

ID: 3 Score: -1444 Strategy: Devil

ID: 5 Score: -1448 Strategy: Devil

ID: 2 Score: -1498 Strategy: Devil

ID: 10 Score: -1602 Strategy: Judge

ID: 9 Score: -1629 Strategy: Judge

ID: 7 Score: -1793 Strategy: MadMan

ID: 6 Score: -1815 Strategy: MadMan

ID: 8 Score: -1839 Strategy: MadMan

ID: 1 Score: -2208 Strategy: Saint

In generation 3 the group’s overall score was -17991 in 1000 rounds with 10 prisoners

ID: 2 Score: -1642 Strategy: Devil

ID: 5 Score: -1656 Strategy: Devil

ID: 3 Score: -1680 Strategy: Devil

ID: 1 Score: -1688 Strategy: Devil

ID: 4 Score: -1690 Strategy: Devil

ID: 10 Score: -1738 Strategy: Judge

ID: 9 Score: -1749 Strategy: Judge

ID: 6 Score: -2039 Strategy: MadMan

ID: 7 Score: -2045 Strategy: MadMan

ID: 8 Score: -2064 Strategy: MadMan

Looking at the first generation you’ll notice that the devils picked on the saints (like usual) which put the devils in the lead and sent the saints to last place. That means that the top 10% of our group was a single devil and the bottom 10% was a single saint. Because of this the next generation has four devils (up from the starting three) and only one saint (down from the starting two).

Another generation then passes with results very much like the first, causing us to lose another saint and gain another devil. Which is honestly a little ominous…

Next Time: Turning Boring Data Overflow Into Interesting Visuals

So that was just ten prisoners playing three generations of fairly short games. For a real experiment we’re going to want hundreds of prisoners playing significantly longer games over at least a dozen generations. Things like:

prisoners = createPrisoners(250, 250, 250, 250)
playMultiGenerationalPrisonersDilemma(prisoners, 25, 10000)

Only problem is that a big experiment like that is going to rapidly fill up your terminal with junk. Sending the results to a text file helps but even that produces a lot more numbers than the human brain can conveniently work with.

But you know what the human brain can conveniently work with? Animated bar graphs!

Let’s Program A Prisoner’s Dilemma Part 6: Eye For An Eye

It’s finally time to write a prisoner with some actual brain power. Let’s start by looking at our current prisoners and their unique issues:

Saints always cooperate, making them an easy target for defectors.

Devils always defect, which leads to bad scores for everyone when multiple devils get together.

Madmen act randomly, which means whether or not they make good choices is up to chance.

What we really want is a prisoner that cooperates when cooperating is a good idea and defects in self defense when it suspects its about to be betrayed. A prisoner that can judge whether or not its partner deserves to be trusted.

Trust No One?

But how will our new “judge” type prisoner decide who to trust? The only thing it will know about its partner is their ID. There’s no way to force the other prisoner to reveal their strategy*, or at least there shouldn’t be. After all, the prisoner’s dilemma is meant to model real life and most real humans aren’t mind readers.

So how will our judge decide how to make decisions if he can’t read minds or see the future? The same way we humans do. We trust people who act trustworthy and we are suspicious of people who betray us.

This leads to a strategy called “tit-for-tat”, where you treat your partner the same way they treat you. If they betray you then the next time you meet them you betray them. If they cooperate then the next time you see them you pay them back by cooperating.

To pull this off we’re going to have to give our judges a memory.

Remember that reportResults function that we programed into the original prisoner class? We’ve been ignoring that so far because none of our prisoners needed it. Saints, devils and madmen always act the same way so they don’t really care how their partners act.

But our judge can use the info from reportResults to build a list of people he can and can’t trust. All we have to do is initialize a blank hash when the judge is created. Then every time reportResults is called we update that hash. We then use this hash in cooperate? by taking the provided partner ID, looking up in our hash whether or not they like to cooperate and then matching their behavior.

But wait! What do we do the first time we meet a new player? We can’t copy their behavior if we’ve never seen how they behave!

The answer is to remember the Golden Rule and “Treat others like you want to be treated”. That means the first time a judge meets a new player he will cooperate. Not only is this good manners, it helps make sure that when judges meet each other they will get locked into a loop of infinite cooperation instead of a loop of infinite betrayal.

Designing The Judicial Branch

The two things that make a judge a judge are:

1) It keeps track of how other prisoners treat it

2) If it meets a prisoner more than once it copies their last behavior

So let’s start with feature 1 by giving our prisoner a memory. To do that we’re going to have to create a hash during prisoner initialization and then use the learnResults function to fill it with useful data.

def initialize(id)
   super(id)
   @strategy = "Judge"
   @opponentBehavior = Hash.new()
end

def learnResults(opponentID, opponentCooperated)
   @opponentBehavior[opponentID] = opponentCooperated
end

Now that our opponentBehavior hash is filling up all that’s left is to use it as part of our decision making process. This is a pretty simple two step process. When our judge is asked whether or not it wants to cooperate with a given prisoner it starts by checking whether or not it has an opponentBehvaior entry for that ID. If it does it just mimics back whatever behavior it recorded for that ID, but if it’s empty it instead defaults to cooperation.

def cooperate?(opponentID)
   if( @opponentBehavior.key?(opponentID))
      return @opponentBehavior[opponentID]
   else
      return true
   end
end

Put it all together and what do you get?

class Judge < Prisoner

   def initialize(id)
      super(id)
      @strategy = "Judge"
      @opponentBehavior = Hash.new()
   end

   def cooperate?(opponentID)
      if( @opponentBehavior.key?(opponentID))
         return @opponentBehavior[opponentID]
      else
         return true
      end
   end

   def learnResults(opponentID, opponentCooperated)
      @opponentBehavior[opponentID] = opponentCooperated
   end
end

Featuring Four Different Character Classes!

Now let’s give our code a test by setting up a group with four judges, four saints, four devils and four madmen.

The Group’s Overall Score was -23546 in 1000 rounds with 16 prisoners

ID: 8 Score: -1126 Strategy: Devil

ID: 7 Score: -1170 Strategy: Devil

ID: 6 Score: -1198 Strategy: Devil

ID: 5 Score: -1254 Strategy: Devil

ID: 14 Score: -1391 Strategy: Judge

ID: 16 Score: -1394 Strategy: Judge

ID: 15 Score: -1398 Strategy: Judge

ID: 13 Score: -1425 Strategy: Judge

ID: 12 Score: -1479 Strategy: MadMan

ID: 9 Score: -1483 Strategy: MadMan

ID: 11 Score: -1498 Strategy: MadMan

ID: 10 Score: -1510 Strategy: MadMan

ID: 3 Score: -1766 Strategy: Saint

ID: 2 Score: -1790 Strategy: Saint

ID: 4 Score: -1802 Strategy: Saint

ID: 1 Score: -1862 Strategy: Saint

In our mixed group you can see that the judges outperform both the madmen and the saints and come pretty close to matching the devils.

But that’s not good enough. We didn’t want to just match the devils, we wanted to beat them! It’s absolutely infuriating that our genuinely intelligent judges are somehow being outscored by a bunch of dumb always-defectors.

Grudge Match

OK, time to get to the bottom of this. Why can’t our judges beat the devils? Let’s get rid of all the saints and madmen and isolate our problem.

The Group’s Overall Score was -17859 in 1000 rounds with 10 prisoners

ID: 10 Score: -1558 Strategy: Judge

ID: 7 Score: -1581 Strategy: Judge

ID: 6 Score: -1581 Strategy: Judge

ID: 9 Score: -1592 Strategy: Judge

ID: 8 Score: -1597 Strategy: Judge

ID: 2 Score: -1990 Strategy: Devil

ID: 3 Score: -1990 Strategy: Devil

ID: 4 Score: -1990 Strategy: Devil

ID: 5 Score: -1990 Strategy: Devil

ID: 1 Score: -1990 Strategy: Devil

Well would you look at that. When all you have are devils and judges the judges scream ahead into first place no problem.

It’s not that our judges weren’t smart enough to out-think the devils in the mixed game, it’s that the devils had a bunch of saints sitting around they could exploit for easy points. As soon as we took away the ever-forgiving all cooperators the devils sank like a rock while the judges pulled off a brilliant group victory by cooperating with each other and paying back the deceitful devils with defection after defection.

Impossible To Trust

Let’s finish up with a few more what-if scenarios, like “What if you stuck a bunch of judges with a bunch of saints?”

The Group’s Overall Score was -10000 in 1000 rounds with 10 prisoners

ID: 1 Score: -1000 Strategy: Saint

ID: 2 Score: -1000 Strategy: Saint

ID: 3 Score: -1000 Strategy: Saint

ID: 4 Score: -1000 Strategy: Saint

ID: 5 Score: -1000 Strategy: Saint

ID: 6 Score: -1000 Strategy: Judge

ID: 7 Score: -1000 Strategy: Judge

ID: 8 Score: -1000 Strategy: Judge

ID: 9 Score: -1000 Strategy: Judge

ID: 10 Score: -1000 Strategy: Judge

A ten way tie with a perfect group score, just like when we had a group of pure saints. As long as you’re nice to the judges they’ll be perfectly nice to you.

What if we get rid of the saints and just have judges?

The Group’s Overall Score was -10000 in 1000 rounds with 10 prisoners

ID: 1 Score: -1000 Strategy: Judge

ID: 2 Score: -1000 Strategy: Judge

ID: 3 Score: -1000 Strategy: Judge

ID: 4 Score: -1000 Strategy: Judge

ID: 5 Score: -1000 Strategy: Judge

ID: 6 Score: -1000 Strategy: Judge

ID: 7 Score: -1000 Strategy: Judge

ID: 8 Score: -1000 Strategy: Judge

ID: 9 Score: -1000 Strategy: Judge

ID: 10 Score: -1000 Strategy: Judge

Yet another perfect score! Since judges start off by cooperating they inevitably end up trusting each other and create just as nice a world as a group of pure saints would.

That’s enough happy thoughts for now. Why not try something more sinister? Like a group full of devils with only one judge?

The Group’s Overall Score was -19991 in 1000 rounds with 10 prisoners

ID: 1 Score: -1998 Strategy: Devil

ID: 2 Score: -1998 Strategy: Devil

ID: 3 Score: -1998 Strategy: Devil

ID: 4 Score: -1998 Strategy: Devil

ID: 5 Score: -1998 Strategy: Devil

ID: 9 Score: -1998 Strategy: Devil

ID: 7 Score: -1998 Strategy: Devil

ID: 8 Score: -1998 Strategy: Devil

ID: 6 Score: -1998 Strategy: Devil

ID: 10 Score: -2009 Strategy: Judge

As we’ve seen time and time again there is literally no way to win in a world full of devils. They will always betray you and so you can never get ahead.

But unlike the saint, which was absolutely destroyed by a gang of devils, the judge manages to at least keep his loses pretty low. As you can see from the numbers the judge cooperated exactly once with every single devil and then never trusted them again, thereby preventing them from leeching more points out of him.

* There is actually a variation of the prisoner dilemma where each prisoner is given a complete copy of their partner’s code as an input. We’re not doing that.

Let’s Program A Prisoners Dilemma 5: What Is A Decision Making Process?

The saint and devil prisoners were easy to write but, as we’ve seen, they’re not actually very good at playing the game. The always cooperating saints are wide open to exploitation and the devils will always defect even when cooperating would score them more points in the long run.

We clearly need a prisoner who can actually make choices instead of just doing the same thing again and again.

1d6 Points of SAN Loss

As any game designer can tell you, the easiest way to simulate decision making is with random numbers. Just give the computer a list of decisions and let it pick one at random.

So on that note I give you: The madman.

class MadMan < Prisoner
   def initialize(id)
      super(id)
      @strategy = "MadMan"
   end

   def cooperate?(opponentID)
      choice = rand(2)
      if( choice == 1)
         return true
      else
         return false
      end
   end
end

Ruby weirdness warning: In a lot of languages “0” is the same as false, so you might be tempted to have cooperate? just return the number generated by rand. Don’t do that. In Ruby only “false” and “null” are considered false. Everything else, including the number 0, are considered true. This is useful because it means the number 0 always acts like just a number. On the other hand it messes up a lot of other useful programming shortcuts so all in all it sort of break even.

Anyways, don’t forget to tell our create Prisoners method that there’s a new type of prisoner object for it to work with.

def createPrisoners(saintCount, devilCount, madmanCount)
   prisoners = Array.new
   playerCounter = 0
   saintCount.times{ prisoners.push(Saint.new(playerCounter += 1)) }
   devilCount.times{ prisoners.push(Devil.new(playerCounter += 1)) }
   madmanCount.times{ prisoners.push(MadMan.new(playerCounter += 1)) }
   return prisoners
end

Inmates Are Running The Asylum

Let’s be honest here: Making decisions at random is almost never a good idea. So how does our new class of insane prisoners perform in an actual game?

prisoners = createPrisoners(4, 4, 4)
playPrisonersDilemma(prisoners, 1000)

Ten doesn’t divide evenly into thirds so this time we’ll have four of each type of prisoner. Please not that this changes the perfect score to -12,000.

The Group’s Overall Score was -17905 in 1000 rounds with 12 prisoners

ID: 6 Score: -876 Strategy: Devil

ID: 7 Score: -882 Strategy: Devil

ID: 8 Score: -892 Strategy: Devil

ID: 5 Score: -906 Strategy: Devil

ID: 12 Score: -1499 Strategy: MadMan

ID: 9 Score: -1504 Strategy: MadMan

ID: 10 Score: -1538 Strategy: MadMan

ID: 11 Score: -1564 Strategy: MadMan

ID: 2 Score: -2026 Strategy: Saint

ID: 1 Score: -2060 Strategy: Saint

ID: 4 Score: -2074 Strategy: Saint

ID: 3 Score: -2084 Strategy: Saint

Madmen randomly flip between acting like saints and acting like devils so it makes sense they would wind up scoring squarely in between the two. They don’t just let the devils betray them; sometimes they betray right back. And they alternate between cooperating with saints for small gains and betraying them for big gains.

So all in all it seems like mild insanity is actually a pretty well rounded fit for the cutthroat world of the prisoner’s dilemma.

Also, as promised, the fact that madmen can make actual decisions means that our overall group score now has some variation to it even when running multiple games with the same group.

The Group’s Overall Score was -17995 in 1000 rounds with 12 prisoners

The Group’s Overall Score was -18046 in 1000 rounds with 12 prisoners

The Group’s Overall Score was -17938 in 1000 rounds with 12 prisoners

The Lost And The Damned

So madmen seem to do pretty well in a mixed group… but maybe that’s just because they had some saints to act as backup. What happens when we pair up only devils and madmen?

The Group’s Overall Score was -17523 in 1000 rounds with 10 prisoners

ID: 4 Score: -1440 Strategy: Devil

ID: 1 Score: -1450 Strategy: Devil

ID: 2 Score: -1456 Strategy: Devil

ID: 3 Score: -1456 Strategy: Devil

ID: 5 Score: -1472 Strategy: Devil

ID: 9 Score: -2013 Strategy: MadMan

ID: 7 Score: -2014 Strategy: MadMan

ID: 8 Score: -2068 Strategy: MadMan

ID: 6 Score: -2072 Strategy: MadMan

ID: 10 Score: -2082 Strategy: MadMan

About the same thing as when there were saints, it turns out. The madmen’s habit of cooperating roughly half the time means they still can’t directly compete with the vicious defecting devils, but at least randomly defecting half of the time allows them to sort of defend themselves.

In fact, if you compare this to the time we evenly paired up devils and saints you’ll see that the madmen scored about the same as the saints. But the big difference is that the madmen did much more damage to the devils in the process.

Although it’s up in the air as to whether this is a good thing or not. Taking the devils down a notch is certainly a satisfying feeling but the madmen still lost and the group score is much much worse then when the saints had their match with the devil.

The More Randomness You Have The Less Random It Is

For our final experiment I just want to point out that while I call them “Madmen” the random decision prisoners actually managed to achieve some pretty reliable results, consistently scoring halfway between a saint and a devil.

This is of course because random numbers tend to average out over time. So “cooperates at random” eventually transforms into “consistently cooperates 50% of the time”.

To show this off I’m going to have a bunch of madmen play increasingly long games against each other.

prisoners = createPrisoners(0, 0, 10)
playPrisonersDilemma(prisoners, 10)
playPrisonersDilemma(prisoners, 100)
playPrisonersDilemma(prisoners, 1000)
playPrisonersDilemma(prisoners, 1000000)

The Group’s Overall Score was -141 in 10 rounds with 10 prisoners

ID: 10 Score: -7 Strategy: MadMan

ID: 1 Score: -10 Strategy: MadMan

ID: 6 Score: -12 Strategy: MadMan

ID: 8 Score: -13 Strategy: MadMan

ID: 9 Score: -14 Strategy: MadMan

ID: 5 Score: -15 Strategy: MadMan

ID: 2 Score: -16 Strategy: MadMan

ID: 3 Score: -17 Strategy: MadMan

ID: 7 Score: -18 Strategy: MadMan

ID: 4 Score: -19 Strategy: MadMan

The Group’s Overall Score was -1493 in 100 rounds with 10 prisoners

ID: 8 Score: -132 Strategy: MadMan

ID: 3 Score: -135 Strategy: MadMan

ID: 6 Score: -139 Strategy: MadMan

ID: 5 Score: -144 Strategy: MadMan

ID: 1 Score: -145 Strategy: MadMan

ID: 9 Score: -154 Strategy: MadMan

ID: 7 Score: -156 Strategy: MadMan

ID: 4 Score: -159 Strategy: MadMan

ID: 2 Score: -163 Strategy: MadMan

ID: 10 Score: -166 Strategy: MadMan

The Group’s Overall Score was -14976 in 1000 rounds with 10 prisoners

ID: 3 Score: -1437 Strategy: MadMan

ID: 8 Score: -1457 Strategy: MadMan

ID: 1 Score: -1482 Strategy: MadMan

ID: 4 Score: -1487 Strategy: MadMan

ID: 7 Score: -1491 Strategy: MadMan

ID: 10 Score: -1492 Strategy: MadMan

ID: 5 Score: -1503 Strategy: MadMan

ID: 6 Score: -1514 Strategy: MadMan

ID: 9 Score: -1537 Strategy: MadMan

ID: 2 Score: -1576 Strategy: MadMan

The Group’s Overall Score was -15001829 in 1000000 rounds with 10 prisoners

ID: 3 Score: -1498113 Strategy: MadMan

ID: 10 Score: -1499339 Strategy: MadMan

ID: 1 Score: -1499525 Strategy: MadMan

ID: 9 Score: -1500065 Strategy: MadMan

ID: 7 Score: -1500128 Strategy: MadMan

ID: 2 Score: -1500445 Strategy: MadMan

ID: 4 Score: -1500662 Strategy: MadMan

ID: 5 Score: -1500894 Strategy: MadMan

ID: 8 Score: -1501085 Strategy: MadMan

ID: 6 Score: -1501573 Strategy: MadMan

As you can see, the longer the game lasts the less difference there is in the scores.

After ten rounds the worst score (-19) was almost three times as bad as the best score (-7).

After one hundred rounds the worst score (-166) was only about 25% worse than the best score (-132).

After a thousand rounds there was only a 10% difference between best (-1437) and worst (-1576).

And after a million rounds there was less than a 1% difference between best and worst.

So in the long run cooperating at random is a viable way to take a middle path cooperation and defection.

We Can Be Smarter Than This

Random decision making lead to some interesting outcomes but it failed to come even close to beating the devils. Plus it’s an embarrassingly simple algorithm for AI enthusiasts like ourselves. Surely we can come up with a smarter prisoner. One that actually thinks instead of guessing.

But that’s going to have to wait for next time.

Let’s Program A Prisoner’s Dilemma Part 3: Fight In Cell Block D

Last time we programmed a couple prisoners and ran them through their paces. This time we’re going to write the code for the actual prisoner’s dilemma game.

We start by pseudo-coding our algorithm:

  • First: generate a bunch of prisoners objects.
  • Second: decide how many rounds the game should last.
  • During each round:
    •    Randomly pair up prisoners.
    •    Each prisoner is asked whether or not they want to cooperate with their partner
    •    Prisoners lose points based on the decision they and their partner made
    •    Move on to the next round by randomly sorting prisoners into new pairs
  • When all the rounds are done show the stats of every prisoners so we know who won (by losing the least points).
  • Also show the total sum of all the prisoners’ scores so we can compare different mixes of prisoners.

Breaking News: Prisons Filling Up At An Alarming Rate

Let’s start by figuring out how to set up a group of prisoners. We could just hard code an array of prisoner objects but that would leave us in the bad situation of having to rewrite actual code every time we wanted to try a new mix of prisoners. That sounds both boring and error prone so instead let’s come up with a function we can use to create a mix of prisoners on the fly.

def createPrisoners(saintCount, devilCount)
   prisoners = Array.new
   playerCounter = 0
   saintCount.times{ prisoners.push(Saint.new(playerCounter += 1)) }
   devilCount.times{ prisoners.push(Devil.new(playerCounter += 1)) }
   return prisoners
end

We just tell this function how many saints and devils we want and it gives us exactly what we asked for packaged inside a convenient array. It also keeps count of how many prisoners it has created so far and uses that count to make sure every prisoner has a unique ID.

This function also shows off a rather handy Ruby shortcut for writing a classic “for loop”. If you want a loop that runs a certain number of times just take that number, or a variable holding that number, and add .times{ your code here } to the end. Ex: 4.times{ puts “This will print four times” }

Exciting Next-gen Gameplay!

To actually play the prisoners dilemma we need a group of prisoners and an idea of how many rounds we want the game to last, which means our function definition probably needs to look a little like this:

def playPrisonersDilemma(prisoners, rounds)

Then inside the function itself we want to set up a loop that runs once for every round in the game. During those rounds we want to randomly shuffle our prisoners, divide them up into pairs and then ask every prisoner whether they want to cooperate with their temporary partner or not.

At that point we subtract one point from any player who chose to cooperate and subtract two points from any player who was betrayed by their partner (Yes, a player can be hit by both penalties in the same round).

rounds.times{
   pairs = prisoners.shuffle.each_slice(2)
   pairs.each{ |pair|
      firstPlayerCooperates = pair[0].cooperate?(pair[1].id)
      secondPlayerCooperates = pair[1].cooperate?(pair[0].id)

      if(firstPlayerCooperates)
         pair[0].score -= 1
      else
         pair[1].score -= 2
      end

      if(secondPlayerCooperates)
         pair[1].score -= 1
      else
         pair[0].score -= 2
      end

      pair[0].learnResults(pair[1].id, secondPlayerCooperates)
      pair[1].learnResults(pair[0].id, firstPlayerCooperates)
   }
}

Once again we’re using handy Ruby shortcuts to save a bit of typing. We use rounds.times to set up a loop that will play the game the proper number of times. We then use prionser.shuffle to randomly mix up the prisoners and then chain it to each_slice(2) to divide the random mix into random pairs.

Important note: shuffle and slice don’t change the original array. They instead return a transformed copy that has to be assigned to a variable before you can use it. But for us this is actually a good thing because it means we can just shuffle and slice the same prisoners array at the start of each loop without having to worry about it somehow getting messed up between iterations.

Once we’ve got our random pair array we can use pairs.each to write some quick code we want to run once for each piece of data in our array. The each loop starts out by grabbing an item from the array and storing it inside whatever variable we’ve named with the | variable | syntax.

In our case the pairs array is full of tiny two item arrays, so we call our variable pair. From there it’s pretty simple to each half ot he pair whether or not it wants to cooperate with the other half and then we can assign points. Remember our scoring rules: A player who cooperates loses one point. A player who refuses to cooperate forces the other player to lose two points. We also call the learnResults function to let each prisoner know how the round played out.

After the rounds.times loop has finished playing the game all that’s left is to print out the final score.

#Show the stats for the group as a whole as well as for each individual prisoner
groupScore = 0
prisoners.each{ |prisoner| groupScore += prisoner.score }
puts "The Group's Overall Score was #{groupScore} in #{rounds} rounds with #{prisoners.length} prisoners"
prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner| prisoner.report }

Nothing complicated here. We use an each loop to tally up the scores from every player so we can print a group score. Then we sort the prisoners by score and use another each loop to print out the individual stats for every prisoner.

All Together Now

Gluing together all the bits of code we just wrote leaves us with this nice function:

def playPrisonersDilemma(prisoners, rounds)
   if !prisoners.length.even?
      throw "Prisoners Dilemma requires an even number of participants"
   end

   # Make sure each prisoner starts out with a clean slate
   prisoners.each{ |prisoner| prisoner.score = 0}
   
   rounds.times{
      pairs = prisoners.shuffle.each_slice(2)
      pairs.each{ |pair|
         firstPlayerCooperates = pair[0].cooperate?(pair[1].id)
         secondPlayerCooperates = pair[1].cooperate?(pair[0].id)

         if(firstPlayerCooperates)
            pair[0].score -= 1
         else
            pair[1].score -= 2
         end

         if(secondPlayerCooperates)
            pair[1].score -= 1
         else
            pair[0].score -= 2
         end
      }
   }

   #Show the stats for the group as a whole as well as for each individual prisoner
   groupScore = 0
   prisoners.each{ |prisoner| groupScore += prisoner.score }
   puts "The Group's Overall Score was #{groupScore} in #{rounds} rounds with #{prisoners.length} prisoners"
   prisoners.sort{ |x, y| y.score <=> x.score}.each{ |prisoner| prisoner.report }
end

And now we can test it by having a group of prisoners play the game. Let’s try it out with a group of two saints and two devils playing for a thousand rounds:

prisoners = createPrisoners(2, 2)
playPrisonersDilemma(prisoners, 1000)

This should give you output kind of like this:

The Group’s Overall Score was -6000 in 1000 rounds with 4 prisoners

ID: 4 Score: -650 Strategy: Devil

ID: 3 Score: -650 Strategy: Devil

ID: 2 Score: -2350 Strategy: Saint

ID: 1 Score: -2350 Strategy: Saint

Hmm… looks like the Devils had no trouble at all completely crushing the Saints. We’ll look at that match up in more detail next time.

Let’s Program A Prisoner’s Dilemma Part 2: Crime And Punishment

Last time we learned about a game theory thought experiment called the “Iterated Group Prisoner’s Dilemma”. Basically players get randomly paired up and have to decide to either cooperate by accepting a small penalty for themselves or defect by shoving a big penalty onto the other player. The players then get shuffled around and repeat the game with a new partner. After a few hundred rounds (or more) the game comes to a stop and the winner is whoever has lost the fewest points.

This time we’re going to start programing a simulator for that game so we can watch how different strategies play out.

I will be writing this project in Ruby. It’s not really “better” for this program than any of the various other high level languages we could choose from, but I’ve been using it a lot as part of my game programming hobby and figured I might as well stick with it while it was in my head.

So if you want to follow along with my exact code open up a file named prisonerdilemma.rb in your favorite text editor and get ready for some Ruby. Of course, rewriting the project in your favorite language is always an option.

Introducing Prisoner 24601

The most important part of the prisoner’s dilemma is the prisoners. So let’s make a list of what our little digital prisoners need to be able to do in order for this simulation to work.

1) They need to have a unique ID so that other prisoners (and us humans) can recognize them.

2) They need to be able to decide whether to cooperate or defect with other prisoners based on ID.

3) They need to be able to see whether their opponent decided to cooperate or defect.

4) They need to be able to report what strategy they are using so we can keep track of how different strategies perform.

5) They need to be able to keep track of their score so we can calculate who is winning and losing.

Based on all this I’ve thrown together a very simple prisoner class for us to start working with. It can’t really do much except book keeping though. It has variables to keep track of it’s own ID, score and strategy and even exposes the score to outside code so we can update it during the game. It also has a convenient “report” function that will print out all that information in human friendly format.

As for actual game playing code… it does technically have a “cooperate?” function for deciding how to play the game along with a “learnResults” function for keeping track of its opponents. But at the moment both of these functions are empty because this is just skeleton code; it isn’t meant to actually do anything. In fact, if you try to call the “cooperate?” function it will throw an error!

class Prisoner
   attr_reader :id, :strategy, :score
   attr_writer :score

   def initialize(id)
      @id = id;
      @score = 0;
      @strategy = "Prisoner Parent Class"
   end

   def cooperate?(opponentID)
      throw "Prisoner child class must implement a cooperate? method"
   end

   def learnResults(opponentID,oponentCooperated)
   end

   def report
      puts "ID: #{@id} Score: #{@score} Strategy: #{@strategy}"
   end
end

Still, even with two incomplete functions we can still give this a test run. If you were to add some code like this to your project:

testprisoner = Prisoner.new(6)
testprisoner.score = -10
testprisoner.report

You would get some output like this:

ID: 6 Score: -10 Strategy: Prisoner Parent Class

Ye Not Guilty – The “Saint” Strategy

Now that we’ve got a nice generic class up and running we can move on to writing a prisoner with actual dilemma solving code!

For our first attempt we’re going to be writing the simplest possible of all prisoner algorithms: A player who always always chooses to cooperate. We’ll be calling this the “Saint” strategy.

Now the actual code for this will be virtually identical to the generic prisoner. It will just have a different name inside of the “strategy” variable and its “cooperate” function will return true instead of throwing an error.

But we all know that copy pasting code is horrible and hard to maintain, so we’ll be using the miracle of inheritance to avoid any copy pasting.

class Saint < Prisoner
   def initialize(id)
      super(id)
      @strategy = "Saint"
   end

   def cooperate?(opponentID)
      return true
   end
end

That “Saint < Prisoner” tells the compiler that Saint is a type of Prisoner and should get a free copy of all of its code. Then instead of writing an entire new class from scratch we only have to write the few bits that are different.

First up we write a new initialize function so we can set the strategy variable to “Saint” instead of the default from the Prisoner class. But we still want the Prisoner class to handle things like setting our ID and initializing our score so we start out by calling the generic Prisoner initialize function with a call to “super”.

The super call checks whether the parent has a function with the same name as the current function and then calls it. It’s really useful for when you want a child function to do everything the parent did and then a little extra. Just start with super to get the parent’s behavior and then add in the unique child logic afterwards.

Next we write a new cooperate? function, which for our super trusting saint just involves returning “true”. In this case we want to completely override the parent’s version of the function so we leave out the call to super and just write new code.

Father of Lies

That was pretty easy, so let’s keep going by programming a second type of prisoner that perfectly mirrors our first. This time we’ll create a player that always always chooses to defect. Let’s call this the “Devil” strategy.

class Devil < Prisoner
   def initialize(id)
      super(id)
      @strategy = "Devil"
   end

   def cooperate?(opponentID)
      return false
   end
end

Just like with the “Saint” we use inheritance to grab our basic Prionser code and then just add in our new strategy name and cooperate? Algorithm. Couldn’t be easier.

A Not So Epic Clash Between Good And Evil

And now here’s a little code snippet you can use for testing that both of our new classes work:

testSaint = Saint.new(1)
testSaint.report
testDevil = Devil.new(2)
testDevil.report
puts "Saint Cooperates?"
puts testSaint.cooperate?(testDevil.id)
puts "Devil Cooperates?"
puts testDevil.cooperate?(testSaint.id)

Which results in this output:

ID: 1 Score: 0 Strategy: Saint

ID: 2 Score: 0 Strategy: Devil

Saint Cooperates?

true

Devil Cooperates?

False

Let’s Get Ready To Rumble

Now that we have some players assembled our next task will be setting up an actual game for them to play in. But that will have to wait until the next post.

Let’s Program A Prisoner’s Dilemma Part 1: Game Theory Is Less Fun Than It Sounds

A Different Kind Of Gaming

Game Theory, despite the name, is unfortunately not “The study of videogames”. Instead it is the science of mathematically analyzing how decisions are made.

The reason it’s called game theory is because one popular way to analyze real world decision making is to come up with a simplistic “game” that mimics the core dynamic of the real world problem. The game can then be mathematically analyzed and the results applied back to the more complex real world situation.

For instance, a game theorist studying crime might come up with a “game” where people who follow the law get 1 point and people who break the law have a chance of gaining two points but also a chance of losing three points. He can then mathematically analyze how different risk levels influence whether or not crime is a winning strategy.

The Prisoner’s Dilemma

The most famous example of a game theory game is the “Prisoner’s Dilemma”. Two criminal are caught red-handed committing a small crime that will land them in jail for one year. The police also suspect the two criminals are guilty of a larger crime worth two years of jail time, but they don’t have enough evidence to prove it.

The police eventually come up with a plan to interview each criminal separately and offer them this deal: Testify about the other guy’s involvement in the larger crime and we’ll drop the current small charge against you.

And now here’s the “game”: As one of the prisoners do you protect your buddy and keep quiet even though it means going to jail for a year? Or do you tattle in hopes of going free? And how do you know whether or not you can trust the other guy? If he confesses you go to jail for an extra two years.

Other Guy Keeps Quiet Other Guy Confesses
You Keep Quiet You both go to jail for one year You go to jail for three years

Other guy goes free

You Confess You go free

Other guy goes to jail for three years

You both go to jail for two years

Now as mathematicians we want to be able to analyze this choice with actual numbers, so let’s do a little game theorizing. Instead of two prisoners let’s imagine a two player game where each player is allowed to choose to either play nice and cooperate or to betray their partner and defect. If you cooperate you lose 1 point. If you defect the OTHER player loses 2 points.

Player B Cooperates Player B Defects
Player A Cooperates Player A: -1 point

Player B: -1 point

Player A: -3 points

Player B: -0 points

Player A Defects Player A: -0 points

Player B: -3 points

Player A: -2 points

Player B: -2 points

What makes this such an interesting game to study is the fact that the best group strategy and the best individual strategy are complete opposites.

As a group the obvious solution is for both players to cooperate so that between them they only lose 2 points. If one player defects the total loss jumps up to 3 points and if both players defect the total loss is 4 points, as bad as it gets.

But as an individual the equally obvious solution is to defect. After all, if you cooperate you are guaranteed to lose 1 point and might even lose 3 points if the other player defects. But if you yourself defect you might not lose any points and even if the other player defects you only lose 2. There is just no good reason to purposely give up one point.

So the smartest thing to do as an individual is to defect but the smartest thing to do as a group is to cooperate and there doesn’t seem to be any logical way to bridge the gap between these two strategies. Or is there?

Iterated Prisoner’s Dilemma

The core problem in the prisoners dilemma is the fact that it’s a one-time choice. You either cooperate or defect and a few seconds later you find out who won and who lost, end of story. This is why there is no logical reason not to betray the other player. If you defect you’re guaranteed to at least tie so why risk anything else?

But what if you had to play the game multiple times?

This could change things up. Now your decision to cooperate or defect might impact how future games are played. There is now a possibility to build or lose trust.

But sadly you can show that even with multiple games the best logical strategy is still to always defect. The logic goes like this:

1) During the last round you might as well defect since there are no more future rounds in which the other player could punish you.

2) Since a logical opponent will defect during the last round you might as well defect during the second to last round. It’s not like your opponent was going to cooperate in that last round anyways.

3) Since a logical opponent will defect during the second to last round you might as well defect during the third to last round…

4) Continue the pattern and you can see that always defecting is the best strategy.

There are only two ways to break out of this cycle.:

One is to have an infinite number of rounds so that players are always at risk of being punished in the future for bad behavior in the present. We’re not going to do that though for the simple reason that there is no way to run an infinite number of Prisoner’s Dilemmas on a finite computer.

The other way to break the cycle is to introduce more players.

Iterated Group Prisoner’s Dilemma

The iterated group prisoners dilemma works the same as the normal iterated prisoner dilemma except that you have several different players who get randomly paired up before every round.

This drastically changes the dynamics of the game.

In a two-player prisoner’s dilemma double defection is a “stable” strategy because it means neither player will get ahead of the other. You might be losing two points every round but the other guy is too so you aren’t actually losing.

But when there’s a group? Now a players who is constantly losing two points from double defection can end up falling behind other pairs of players who cooperate and thus only lose one point per round. Suddenly it’s not enough to try and figure out whether your current partner is going to cooperate or defect; you also have to worry about whether other players in other matches are cooperating or defecting.

Interesting… But Is It Useful?

The Iterated Group Prisoners Dilemma is not only mathematically interesting, it is highly relevant to real life. Ever day people are faced with scenarios where they can choose to either work together or try to cheat each other. In the long run it’s best for society when people work together but on an individual level exploiting others can provide huge short term benefits.

So by analyzing the prisoners dilemma we can get some interesting insights into why different societies work the way they do along with some ideas on what sort of ethical systems are and aren’t stable.

Of course, real life has all sorts of issues that aren’t fully reflected in a simple tool like the prisoners dilemma so don’t think it somehow has all the answers to all our problems. But it’s still better than nothing and at the very least provides a nice topic for small talk with other math geeks.

What Does Any Of This Have To Do With Programming?

As AI enthusiasts we are obviously very interested in writing programs that can make logical decisions, and the prisoners dilemma provides a great sandbox for practicing that skill. It’s simple enough of a scenario to be implemented in less than a page of code while still being complex enough to produce interesting results.

So here’s the plan for this Let’s Program: We’re going to write a Prisoner’s Dilemma simulation along with a few different types of players each with their own strategy for deciding when to cooperate and when to defect. We’ll then run a bunch of simulations and see if we can find any interesting trends.

Like usual this isn’t exactly an original idea. Game theorists have been writing and testing prisoners dilemma programs for years and if you’re familiar with their research you probably already know everything we’re going to cover in this Let’s Program.

But who cares? The goal here is to practice our programming, not push the cutting edge of computer science. And there’s bound to be at least one reader in my audience who isn’t already a game theory expert.

Actually, quick show of hands: Is there anyone here who has never studied game theory before reading this blog?

There, see? That guy over there is the reason we’re doing this. He’s about to learn something cool.

Let’s Program A JavaScript Game: Index and Code

JavaScript is a powerful programming language that’s built right into most web browsers. Because of this it’s rapidly become the default way to provide user friendly interactive web experiences. The Internet experience as we know it today wouldn’t exist without good old JS.

But who cares about silly little things like that? The real question is: Can we use it to program games? And thanks to the introduction of HTML5 elements like the “canvas” the answer is now a resounding “YES”.

So come along and join me as we program ourselves a simple JavaScript web game.

Index

Let’s Program A JavaScript Game 1: Browser Games With JavaScript

Let’s Program A JavaScript Game 2: The Canvas Of Dreams

Let’s Program A JavaScript Game 3: Designing “Defrag Cycle”

Let’s Program A JavaScript Game 4: V8 600HP Game Engine

Let’s Program A JavaScript Game 5: Press Start To Play

Let’s Program A JavaScript Game 6: When Worlds Collide

Let’s Program A JavaScript Game 7: When Worlds Collide For Real

Let’s Program A JavaScript Game 8: That Sinking Feeling

Let’s Program A JavaScript Game 9: Look Before You Leap

Let’s Program A JavaScript Game 10: Letting The World Pass By

Let’s Program A JavaScript Game 11: This Post Isn’t About Malware, Honest

Let’s Program A JavaScript Game 12: Making A STATEment

Let’s Program A JavaScript Game 13: Some Much Needed Polish

Let’s Program A JavaScript Game 14: A Winner Is You

Let’s Program A JavaScript Game 15: The Computer Is Out To Get You

Let’s Program A JavaScript Game 16: Do You Believe In Magic?

Let’s Program A JavaScript Game 17: Can You Hear Me Now?

Let’s Program A JavaScript Game 18: Moving Pictures

Let’s Program A Javascript 19: Cross Browser Chaos

Let’s Program A JavaScript Game 20: BONUS STAGE!

Code

You can play a complete demo of the game we’re building here: http://scottcornaby.com/games/debugcycle/defrag-cycle-final.html

To run the game locally so you can examine the code and make changes just download all the html, png and ogg files in this folder: http://scottcornaby.com/games/debugcycle/