Book Review: The Apollo Guidance Computer: Architecture and Operation

Like most programmers I have spent my entire career working with general purpose “do anything” computers. Every machine I’ve touched, from $3 micro-controllers to $3000 servers, has been based around the same basic design: a processor that supports universal math and logic combined with a chunk of flexible memory that can hold whatever code and data you want to put in it.

Need your universal computer to do something new? Just load different code into the memory and suddenly your spreadsheet machine is a web server or a fluid simulator.

But what if your computer only ever had to do one very specific thing? What if your only goal was to get a spaceship to the moon with near-perfect reliability? And what if you had an entire team of engineers willing to build you a custom computer from scratch? How would your one-thing-only computer be different from a generic off-the-shelf processor and memory combo?

That is the question answered by The Apollo Guidance Computer: Architecture and Operation, a four-hundred-page book dedicated entirely to the design and programming of the computer that sent the first man to the moon.

As you’ve probably guessed the aspect of the book I found most interesting was how profoundly weird the guidance computer was compared to the generic mass produced machines we use for 99.999% of computation.

For example, normal computers have flexible memory that lets the user program whatever they want. After all, the manufacturer has no idea whether their customers are going to be building stop lights, washing machines or air conditioners. But the guidance computer was only ever going to be used for flight calculations. It didn’t need to be flexible. What it did need to be was reliable in the harsh, slightly radioactive void of space. So instead of programmable memory it stored its code in the form of wires woven around and through hundreds and hundreds of tiny magnets, creating an extremely reliable form of read only memory.

Or what about mathematics? In normal computers a lot of silicon and logic is spent making sure processors can gracefully and accurately work with numbers both extremely large and extremely small. But implementing that sort of mathematical power in a machine with as many constraints as the guidance computer would have been difficult.

The solution? To not solve the problem. After carefully studying the physics problems they needed to solve, the engineers realized they never needed to multiply a very very large number by a very very small number. That meant they didn’t need a full range of flexible numbers. They could instead save processor complexity and instruction space by doing relatively simple math and leaving it up to the programmers to remember which equations were working in dozens of kilometers and which in fractions of seconds. Magnitude is relative, and as long as the computer fires the thrusters at the right moment it doesn’t really matter whether it was doing 10 + 3 instead of 10,000,000 + 3,000,000.
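That trick is essentially fixed-point arithmetic with programmer-managed scale factors. Here’s a minimal Python sketch of the idea; the unit names and the scale of 100 km per unit are made up for illustration and are not the AGC’s actual number format:

```python
# Illustrative sketch of programmer-managed scaling (not the AGC's real format).
# Distances are stored as small integers in units of 100 km; the programmer,
# not the hardware, remembers the scale factor.

KM_PER_UNIT = 100

def to_units(km):
    """Convert kilometers to scaled integer units."""
    return round(km / KM_PER_UNIT)

def add_distances(a_units, b_units):
    # The hardware only ever sees small, simple integers
    return a_units + b_units

a = to_units(10_000_000)   # 10,000,000 km -> 100,000 units
b = to_units(3_000_000)    #  3,000,000 km ->  30,000 units
total = add_distances(a, b)
print(total * KM_PER_UNIT)  # prints 13000000, same answer as full-range math
```

As long as every equation in a calculation agrees on the scale factor, the small-integer math produces the same answers as the big-number math would.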

These design decisions, and hundreds of others, are meticulously described over hundreds of pages of thorough technical detail. Technical enough that following along might be difficult if you’ve never taken a class or done some reading on the fundamentals of computer architecture.

So overall a very niche book, but for someone with the right background it’s a true pleasure to read. It’s always interesting learning about the technical accomplishments of NASA and thinking about the design decisions of a very specific computer really helps highlight that modern computer architecture, while very powerful, isn’t actually mandatory or universal.

Let’s Program Some Weird Markov Chains Part 7: Overly Abstract Algorithmic Art

For our final weird Markov Chain experiment we will be working with images. This is definitely the least likely project to result in something interesting. After all, Markov Chains are designed to model and simulate sets of linear data where the next item in the chain is heavily dependent on the items that came before. How are we even going to apply that idea to a 2D image?

I suppose there are a lot of ways you could handle this but what I settled on revolves around the question “What pixels tend to group together?” or more specifically “If I have a 2 x 2 square of pixels and I know what the top left, top right, and lower left pixels are, can I predict what the lower right pixel should be?”

That’s enough to turn an image into a Markov Model. We just loop through every 2 x 2 square in the entire image, use the first three pixels as a key and then keep a count of how often different colors show up for the fourth and final pixel.

We can then use this model to create new images by starting with some randomized corner pixels and then filling in the gap by randomly choosing a pixel based on our model. Adding that one extra pixel creates a new corner with a new gap that we can once again probabilistically paint until we have an entire image that hopefully somewhat reflects the original we based our model on.

What should that model be though? Obviously it needs to be something old enough to be public domain. So how about… this?

A low resolution black and white version of the Mona Lisa

The low resolution might catch you by surprise, but it’s an important pre-processing step. There are so many different colors of pixels that trying to build a Markov Model of a full color picture would probably just produce hundreds of thousands of unique keys each with one unique next pixel. For a more meaningful model we need something a little more generic, something that captures general patterns instead of hyper specific patterns. We achieve this by flattening the colors of the original to a limited number of shades of grey.
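The flattening itself is just integer arithmetic: drop the remainder and every pixel snaps down to a multiple of some step size. A quick sketch (the step of 128, which leaves only two shades, matches what the full program uses):

```python
# Flatten 0-255 greyscale values to multiples of `step` by dropping the remainder
def flatten(pixel, step=128):
    return pixel - pixel % step

print([flatten(p) for p in [0, 100, 127, 128, 200, 255]])
# prints [0, 0, 0, 128, 128, 128] -- every pixel is either dark or light
```

A smaller step keeps more shades and therefore more unique Markov keys; 128 is about as aggressive as flattening can get while still leaving an image.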

We can then turn this into a model and use that model to generate new images with some fairly simple code:

from PIL import Image
from random import randint

def getRandomFromProbabilityList(probabilityList):
    # Pick a key at random, weighted by its stored count
    total = 0
    for shade, count in probabilityList.items():
        total = total + count
    chosenProbability = randint(0, total - 1)
    for shade, count in probabilityList.items():
        if(chosenProbability < count):
            return shade
        chosenProbability = chosenProbability - count

shadeMarkovChain = {}
        
filename = "images/mona_lisa.jpg"
with Image.open(filename) as image: 
    width, height = image.size
    resizeRatio = 64/width
    newWidth = ((int)(width * resizeRatio))
    newHeight = ((int)(height*resizeRatio))
    image = image.resize((newWidth, newHeight))
    image = image.convert('LA')
    image.save("images/output.png")
    
    for x in range(0, newWidth):
        for y in range(0, newHeight):
            # Flatten each pixel down to a small number of grey shades
            oldPixel = image.getpixel((x, y))[0]
            newPixel = oldPixel - oldPixel % 128
            # -1 marks neighbors that fall outside the image
            topLeft = -1
            top = -1
            left = -1
            if(x > 0):
                tempPixel = image.getpixel((x-1, y))[0]
                left = tempPixel - tempPixel % 128
            if(y > 0):
                tempPixel = image.getpixel((x, y-1))[0]
                top = tempPixel - tempPixel % 128
            if(x > 0 and y > 0):
                tempPixel = image.getpixel((x-1, y-1))[0]
                topLeft = tempPixel - tempPixel % 128

            # Count how often each shade follows this three-pixel key
            if (topLeft, top, left) not in shadeMarkovChain:
                shadeMarkovChain[(topLeft, top, left)] = {}
            if newPixel not in shadeMarkovChain[(topLeft, top, left)]:
                shadeMarkovChain[(topLeft, top, left)][newPixel] = 1
            else:
                shadeMarkovChain[(topLeft, top, left)][newPixel] += 1
            
            
print(shadeMarkovChain)

print(getRandomFromProbabilityList(shadeMarkovChain[(-1,-1,-1)]))


img = Image.new('LA', (newWidth, newHeight))
for x in range(0,newWidth):
    for y in range(0, newHeight):
        topLeft = -1
        top = -1
        left = -1
        if(x > 0):
            left = img.getpixel((x-1, y))[0]

        if(y > 0):
            top = img.getpixel((x, y-1))[0]
          
        if(x > 0 and y > 0):
            topLeft = img.getpixel((x-1, y-1))[0]

        if (topLeft, top, left) in shadeMarkovChain:
            img.putpixel((x,y),(getRandomFromProbabilityList(shadeMarkovChain[(topLeft, top, left)]), 255))
        else:
            img.putpixel((x,y),(getRandomFromProbabilityList(shadeMarkovChain[(-1, -1, -1)]),255))
img.save("images/markov.png")


The results aren’t exactly impressive though.

I played around with the program a bit to try and get different results. I pre-processed the image less, I pre-processed it more, I generated smaller images, I generated larger images, I looked at more neighboring pixels.

Nothing really helped. We wind up with this sort of diagonal storm cloud pattern every time.

Which honestly makes sense. As we mentioned earlier images aren’t really linear. You can’t draw a person just by looking at a few dozen pixels at a time. You need context; you need to know where the entire face is to draw the eyes. You need posture to put the hands in the right place. And while there are certainly generative models that can track this sort of data they are much more complex than the little bit of toy code we have built here.

But I promised you a weird Markov chain and I believe I have delivered.

 

Let’s Program Some Weird Markov Chains Part 6: Listening To Music Reeeeaaaally Closely

Last time we learned how to represent musical chords with a Markov Chain as well as how to use them to programmatically write music. Unfortunately the music sounds really really bad because I have no idea what chords are supposed to sound like or what chords are supposed to follow each other.

That’s why this time we’re going to focus on reading files instead of writing them. That way our program can learn from actual musicians what notes and chords should follow each other. Because I’m certainly never going to figure that stuff out.

Of course, my lack of musical knowledge also limits my ability to write code that analyzes existing music, but if I can get even close we should see some interesting results. And then all you out in the audience can do me one better by learning enough about real music and MIDI to write something better than the upcoming prototype.

Let’s start by importing some libraries, writing a couple helper functions and setting up a variable to hold our musical markov chain data.

from mido import Message, MidiFile, MidiTrack
from random import randint
       
def getRandomFromProbabilityList(probabilityList):
    total = 0
    for word, count in probabilityList.items():
        total = total + count
    chosenProbability = randint(0, total -1)
    for word, count in probabilityList.items():
        if(chosenProbability < count):
            return word
        chosenProbability = chosenProbability - count
        
def addChordToTrack(track, musicalTuple):
    track.append(Message('note_on', note=musicalTuple[0], velocity=64, time=500))
    for subNote in musicalTuple[1:]:
        track.append(Message('note_on', note=subNote, velocity=64, time=0))
 
markovChain = {}

As you can see we’ve got our classic function for pulling an option out of a list of weighted options as well as a simple function for taking a set of musical notes and adding them to a midi track as a half second long chord.
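As a quick sanity check (not part of the original program), we can hammer the weighted picker with repeated calls and confirm the observed frequencies roughly match the stored counts. The helper is repeated here so the snippet runs on its own:

```python
from collections import Counter
from random import randint

def getRandomFromProbabilityList(probabilityList):
    # Pick a key at random, weighted by its stored count
    total = 0
    for word, count in probabilityList.items():
        total = total + count
    chosenProbability = randint(0, total - 1)
    for word, count in probabilityList.items():
        if(chosenProbability < count):
            return word
        chosenProbability = chosenProbability - count

# 'a' is stored with three times the count of 'b', so over many samples it
# should show up about three times as often (roughly 7500 to 2500 here)
samples = Counter(getRandomFromProbabilityList({'a': 3, 'b': 1}) for _ in range(10000))
print(samples)
```

The same check works for chord tuples as keys, since the picker never looks at what kind of key it is handed.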

Next we need a lot of raw music. I found a nice collection of publicly available midis of public domain classical music. I downloaded several from the piano section, put them in a folder named “piano” and then threw their file names into a big list in the code.

inputFiles = ['piano/alice.mid',
                'piano/alkan-op31-1.mid',
                'piano/andre-sonatine.mid',
                'piano/anna-magdalena-20a.mid',
                'piano/bach-invention-01.mid',
                'piano/bach-invention-07.mid',
                'piano/bach-invention-13.mid',
                'piano/bach-invention-14.mid',
                'piano/bluemtns.mid',
                'piano/bwv_462.mid',
                'piano/BWV-117a.mid',
                'piano/bwv-259.mid',
                'piano/bwv347.mid',
                'piano/BWV-511.mid',
                'piano/cpe-bach-rondo.mid',
                'piano/giselle.mid',
                'piano/lanative.mid',
                'piano/lesgraces.mid',
                'piano/Rumores_de_la-caleta.mid',
                'piano/swallows.mid',]

And then we can loop through all the files, use our midi library to loop through all the chords in the piece and build up a one deep markov chain from them.

Getting the chords from a midi file is a little bit more difficult than getting the words from a book. As we saw last time a midi chord is created by using “time = 0” to make several notes play at the same time as the note before them. So to build a chord we loop through “note_on” type events. If the time is 0 we add the note to the current chordTemp item. If the time isn’t 0 it must be the start of a new chord, so we push the current chord onto our main list of chords, clear it out, add the new note and then start looping again.

And of course, once we have our big list of every chord in the order it happened we can build a one deep markov chain just the same way we did for words.

for inputFile in inputFiles:
 
    mid = MidiFile(inputFile)

    chordList = []
    chordTemp = []

    for msg in mid:
        info = msg.dict()
        if(info['type'] == 'note_on'):
            # A nonzero time means this note starts a new chord, so store
            # the previous chord first (skipping the empty initial buffer)
            if(msg.time != 0 and chordTemp):
                chordList.append(tuple(sorted(set(chordTemp))))
                chordTemp = []
            chordTemp.append(msg.note)

    # Don't lose the final chord left in the buffer
    if(chordTemp):
        chordList.append(tuple(sorted(set(chordTemp))))

    for i in range(len(chordList) - 1):
        if chordList[i] not in markovChain:
            markovChain[chordList[i]] = {}
        if chordList[i+1] not in markovChain[chordList[i]]:
            markovChain[chordList[i]][chordList[i+1]] = 1
        else:
            markovChain[chordList[i]][chordList[i+1]] += 1

The last thing we need is a list of starting points for generating new songs. Once again this is a little tricky. A book has thousands upon thousands of sentences, each with an easy to find first word. But music? It’s much harder to pick out the start of a musical phrase within a midi.

So we’ll just take a shortcut and make a list of every chord in our markov chain that links to at least one other chord. And then we’ll grab one at random to be the start of our song.

possibleStartingChords = list(markovChain.keys())
startingChord = possibleStartingChords[randint(0,len(possibleStartingChords)-1)]       

And from there building a new song is easy peasy. The only difference from working with sentences is that our goal here is to produce a song of a certain length instead of just running until we reach a data point with no next possible value. So we loop through 480 chords, using the markov chain whenever we can and choosing a new random chord to continue with every time we hit a dead end.

newMid = MidiFile()
track = MidiTrack()
newMid.tracks.append(track)

track.append(Message('program_change', program=0, time=0))
            
currentChord = startingChord

for x in range(480):
    if(currentChord in markovChain):
        currentChord = getRandomFromProbabilityList(markovChain[currentChord])
    else:
       currentChord = possibleStartingChords[randint(0,len(possibleStartingChords)-1)] 
    addChordToTrack(track, currentChord)

newMid.save('markov_song.mid')

Run it a few times and give the results a listen. What do you think?

Awful isn’t it?

But awful in a way that at times sounds almost like music.

I think the main problem is that several of the chords don’t sound very good. This is probably because our naïve chord collector doesn’t distinguish between different instruments. Nor does it have any way to distinguish between treble and bass clef chords that happen to be played at the same time.

This is going to make me sound pretty lazy, but I’m happy leaving this experiment where it is. It’s not a great program for generating music… but as a proof of concept written by someone who barely understands music it’s pretty good. Or at least, it’s good enough that another programmer who actually understands music could probably take it, improve it and make something cool.

So we’ll call this a sort-of success and next time we’ll move on to our final and least likely to succeed experiment: writing a Markov chain system for painting pictures.

Let’s Program Some Weird Markov Chains Part 5: Digital Cat Walking On A Virtual Piano

Now that we’ve programmed a very standard Markov Chain let’s fulfill our original promise and try to program something slightly weird: A Markov Chain for music.

This will be a little tricky, but isn’t quite as hard as it could be since music and text actually have a lot in common. Both are linear sequences of symbols that represent sounds and in both there are definite patterns of what symbols/sounds should and should not follow each other.

So our plan is pretty straightforward. In the same way that we analyzed public domain books to create a data structure of what words follow what other words we are now going to analyze public domain music to create a data structure of what notes follow what other notes.

The biggest trick here is that most digital music files don’t actually contain information about notes. Instead they just contain information about sound. Your average MP3 file or CD is in reality just a list of millions of sound wavelengths that need to be played one after another at a rate of several thousand a second. This is great for things like recording the exact reverberation of a single guitar string but completely useless if all you want to know is what single note that guitar was playing.

The one major exception is MIDI. MIDI files don’t store sound samples but instead store a list of instruments, the notes they play and when those notes should be played. This is exactly what we need!

The last missing piece is a tool for actually reading and writing those MIDI files. We could definitely write one on our own, but for convenience I’d suggest using a third party library. For Python I’ll be using a library called mido (https://github.com/olemb/mido/).

Make Some Noise

There are a lot of cool and complicated things you can do with MIDI, but I don’t understand even half of it so we’re going to just stick with something super simple. For instance, to make a single piano play a series of random notes we can do something like this in a new musictest.py file:

from mido import Message, MidiFile, MidiTrack
import random

newMid = MidiFile()
track = MidiTrack()
newMid.tracks.append(track)

track.append(Message('program_change', program=0, time=0))

for x in range(480):
    # Pick a random note roughly around middle C (value 60)
    noteNum = random.randint(50, 70)
    track.append(Message('note_on', note=noteNum, velocity=64, time=250))

newMid.save('new_song.mid')

I seriously don’t quite understand how this works but I “think” the initial ‘program_change’ message is used to set the instrument. If you don’t like piano change program=0 to some other number and see what happens!

We then randomly generate 480 notes roughly around middle C (value 60) and have each of them play 250 milliseconds after each other.

Finally we write it all to file, resulting in a roughly two minute music clip that sounds like someone randomly banging on a piano. Not terribly impressive but it shows we can programmatically generate music.

We can also generate chords instead of single notes. To do this only the first note in each chord has a delay. The other notes in the chord have a delay time of “0” so that they play at the exact same time as the first note.

from mido import Message, MidiFile, MidiTrack
import random

newMid = MidiFile()
track = MidiTrack()
newMid.tracks.append(track)

for x in range(480):
    choice = random.randint(0,3)
    if(choice==0):
        track.append(Message('note_on', note=60, velocity=64, time=500))
        track.append(Message('note_on', note=63, velocity=64, time=0))
        track.append(Message('note_on', note=65, velocity=64, time=0))
    elif(choice == 1):
        track.append(Message('note_on', note=58, velocity=64, time=500))
        track.append(Message('note_on', note=63, velocity=64, time=0))
        track.append(Message('note_on', note=65, velocity=64, time=0))
    elif(choice == 2):
        track.append(Message('note_on', note=60, velocity=64, time=500))
        track.append(Message('note_on', note=63, velocity=64, time=0))
        track.append(Message('note_on', note=67, velocity=64, time=0))
    elif(choice == 3):
        track.append(Message('note_on', note=60, velocity=64, time=500))
        track.append(Message('note_on', note=65, velocity=64, time=0))
        track.append(Message('note_on', note=68, velocity=64, time=0))

newMid.save('new_song.mid')

Markov Models By Hand

So… how do we take these notes and chords and arrange them into a Markov Chain? In Python it’s pretty easy because Python dictionaries can use tuples of numbers as keys just as easily as they can use strings. And we already know how to build Markov Chains using Python dictionaries, so we can just translate our old code to this new use case.

markovChain = {(60, 63, 65): {(60,63,65): 1, (58, 63, 65): 2, (60,65): 1},
   (58, 63, 65): {(60,63,65): 3, (58, 63, 65): 1, (60,65): 1},
   (60, 65): {(60,63,65): 3, (58, 63, 65): 2}}

There you have it, a dictionary that connects chords to other chords and how often each pair happened. Of course the chords and counts are all made up here but hopefully you can see how we could use this same strategy to represent the chord pairs found in a real piece of music.

Let’s finish this proof of concept up by using this musical Markov chain to randomly put together a piece of “music”. (Spoiler: It’s not going to sound good since we’re using made up numbers instead of real musical data).

First off we’ll need to pull in our getRandomFromProbabilityList from our string based experiments. We can just copy paste it directly thanks to the fact that it doesn’t care what kind of keys it works with. Words, numbers, tuples, whatever. It all works.

Next up let’s figure out a way to turn our tuples into actual musical notes. The basic idea is that every tuple will have at least one note that needs to be played 500 milliseconds after the last note. If the tuple has more than one note the rest need to be set up to play at the same time as the first note of the tuple, which we can pull off with a 0 millisecond delay. We can get the extra notes in a chord by using [1:] syntax to get everything in a tuple except the first entry. So something like this:

def addChordToTrack(track, musicalTuple):
    track.append(Message('note_on', note=musicalTuple[0], velocity=64, time=500))
    for subNote in musicalTuple[1:]:
        track.append(Message('note_on', note=subNote, velocity=64, time=0))

And now it’s pretty easy to write code that uses our probability list to semi-randomly string together a few hundred notes:

currentChord = (60, 63, 65)

for x in range(480):
    currentChord = getRandomFromProbabilityList(markovChain[currentChord])
    addChordToTrack(track, currentChord)

Or putting it all together:

from mido import Message, MidiFile, MidiTrack
from random import randint

def getRandomFromProbabilityList(probabilityList):
    total = 0
    for word, count in probabilityList.items():
        total = total + count
    chosenProbability = randint(0, total - 1)
    for word, count in probabilityList.items():
        if(chosenProbability < count):
            return word
        chosenProbability = chosenProbability - count

def addChordToTrack(track, musicalTuple):
    track.append(Message('note_on', note=musicalTuple[0], velocity=64, time=500))
    for subNote in musicalTuple[1:]:
        track.append(Message('note_on', note=subNote, velocity=64, time=0))

markovChain = {(60, 63, 65): {(60,63,65): 1, (58, 63, 65): 2, (60,65): 1},
   (58, 63, 65): {(60,63,65): 3, (58, 63, 65): 1, (60,65): 1},
   (60, 65): {(60,63,65): 3, (58, 63, 65): 2}}

startingChord = (60, 63, 65)

newMid = MidiFile()
track = MidiTrack()
newMid.tracks.append(track)
track.append(Message('program_change', program=0, time=0))

currentChord = startingChord

for x in range(480):
    currentChord = getRandomFromProbabilityList(markovChain[currentChord])
    addChordToTrack(track, currentChord)

newMid.save('new_song.mid')

All right! We now have a program for creating random midi files that sound like a three year old banging on a piano.

Next time we’ll get rid of our horrible hand-built test Markov Chain and build one based off of actual chord patterns found in actual music. And then we’ll try using that real world data to generate a music file that might actually sound like music.

Let’s Program Some Weird Markov Chains Part 4: Quantifiably Bad

We’re just about done with text based Markov Chains but there’s one thing that’s still bothering me. We saw that as our chain got deeper and deeper it got more and more likely that we would duplicate entire sentences from the original text instead of generating interesting new text.

Last time we dealt with this problem by increasing and decreasing the depth of our chain manually and then going with our gut instinct on what felt “right” and what was “too many” copies of original text.

But we’re doing computer science here! Surely we can do better than gut instinct. Maybe there’s some way we can statistically or mathematically analyze how good or bad a given Markov Chain is when it comes to generating unique text?

Well… one of the main causes of duplicated text is when entries in the Markov Chain have only one possible next word associated with them. If “I” always leads to “like” and “like” always leads to “programming” then you can’t generate anything except “I like programming”.

So maybe we should take a quick look through our one deep, two deep and three deep Markov Chains and take a look at how often their words have lots of different options vs just one or two. The code for this is pretty easy:

markovChain1OptionCount = {}
totalOptions1 = 0

for word, options in markovChain.items():
    optionCount = len(options)
    if optionCount not in markovChain1OptionCount:
        markovChain1OptionCount[optionCount] = 0
    markovChain1OptionCount[optionCount] += 1
    totalOptions1 += 1

markovChain2OptionCount = {}
totalOptions2 = 0

for layer1word, layer1options in markovChain2Deep.items():
    for layer2word, layer2options in layer1options.items():
        optionCount = len(layer2options)
        if optionCount not in markovChain2OptionCount:
            markovChain2OptionCount[optionCount] = 0
        markovChain2OptionCount[optionCount] += 1
        totalOptions2 += 1

markovChain3OptionCount = {}
totalOptions3 = 0

for layer1word, layer1options in markovChain3Deep.items():
    for layer2word, layer2options in layer1options.items():
        for layer3word, layer3options in layer2options.items():
            optionCount = len(layer3options)
            if optionCount not in markovChain3OptionCount:
                markovChain3OptionCount[optionCount] = 0
            markovChain3OptionCount[optionCount] += 1
            totalOptions3 += 1

Everybody following along? We just go to the end of each entry in our chain and count how many options are available for it. We then keep track of how often that count has shown up. Did our first word only have one possible next word? OptionCount[1] = 1. Did our second word have three possible next words? OptionCount[3] = 1. Another one-option-only word? OptionCount[1] = 2.

And then we keep going on and on until we have a complete count for every possible number of next words. We can then look at how often we found words that only had one possible next word:

print("1 deep chains with only 1 option: {0} ({1:.0f}%)".
    format(markovChain1OptionCount[1],markovChain1OptionCount[1]/totalOptions1 * 100))
print("2 deep chains with only 1 option: {0} ({1:.0f}%)".
    format(markovChain2OptionCount[1],markovChain2OptionCount[1]/totalOptions2 * 100))
print("3 deep chains with only 1 option: {0} ({1:.0f}%)".
    format(markovChain3OptionCount[1],markovChain3OptionCount[1]/totalOptions3 * 100))

1 deep chains with only 1 option: 17917 (61%)

2 deep chains with only 1 option: 111426 (86%)

3 deep chains with only 1 option: 189202 (95%)

Hmmm… that is interesting. Pretty clear evidence that the deeper our chain is the less flexibility there is.

(You can use the markovChainNOptionCount dictionaries to look at how often words have 2 or 3 or whatever options too! That won’t help us with our current questions but it can be interesting to explore.)

Sadly, this simple experiment doesn’t tell us the whole story.

The fact that 86% of 2 deep chains only have 1 option does NOT mean that there is an 86% chance that any given word has only one option. For instance, the 2 deep chains that come at the beginning of sentences have a very different probability spread:

chain2Starters = markovChain2Deep['sentencestart']
chain2StarterOptionCount = {}
chain2StarterTotalCount = 0

for word in chain2Starters:
    optionCount = len(chain2Starters[word])
    if optionCount not in chain2StarterOptionCount:
        chain2StarterOptionCount[optionCount] = 0
    chain2StarterOptionCount[optionCount] += 1
    chain2StarterTotalCount+=1

print("2 deep chain starters with only 1 option: {0} ({1:.0f}%)".
    format(chain2StarterOptionCount[1], chain2StarterOptionCount[1]/chain2StarterTotalCount * 100))

2 deep chain starters with only 1 option: 1047 (69%)

Well look at that! While 86% of 2 deep chains only have 1 possible next word only 69% of starting 2 deep pairs have only one possible next word. This sort of makes intuitive sense. Sentences in English tend to start out in just a handful of ways that can each link to a wide variety of topics. It’s only when you get near the end of a sentence that you start feeling locked into just a few specific possibilities.

Of course, even that isn’t the whole story because not every one of those 2 deep starting possibilities is equally likely. Lots of sentences start with “The” so that word will be randomly chosen a lot more often than something like “Pythagoras”.

So let’s do another experiment to figure out just how likely it is that the first word randomly chosen for our 2 deep Markov Chain will only have one possible next word.

It’s not too hard to do this. We already know how to calculate the probability of a specific starting word pair by comparing how often that word pair was seen (ex: sentencestart, the) compared to all possible word pairs that start with sentencestart. So to calculate the total probability of all starting word pairs that only have one possible next word we just have to count up how many times we saw word pairs with only one option and then divide that by all sentencestart word pairs.

total = 0
totalWithOnlyOneOption = 0

for word, count in markovChain['sentencestart'].items():
    total = total + count
    if len(markovChain2Deep['sentencestart'][word]) == 1:
        totalWithOnlyOneOption += count

print("Odds of starting with a phrase that only has 1 possible next option using 2 deep chain: {0:.0f}%".
    format(totalWithOnlyOneOption/total * 100))

Odds of starting with a phrase that only has 1 possible next option using 2 deep chain: 16%

So that’s a pretty interesting tidbit. Even though 86% of word pairs only have 1 possible next word, the odds of a randomly generated sentence starting with that sort of pair is only 16%. So now we have numbers showing us why a 2 deep chain does a pretty good job of creating different sentence starts.

But wait! Even that’s not the entire story. We may have done some good analysis on how sentences start but that doesn’t tell us much about the middle or end of sentences. Plus our analysis of how sentences start is still pretty weak. We might know the odds of a sentence starting with only one option of what to do next, but that doesn’t tell us very much about whether that option will also have only one option or if it might be the gateway to hundreds of different possibilities.

Now we’re not going to take these statistical experiments any further (although we certainly could). But I think this is a really good demonstration of my original point: Markov Chains are easy to understand and program but are also a very powerful tool for capturing and using statistical patterns in data.

I think we’re done with traditional text based Markov Models. Next time we’ll jump into the world of music. Will that work out well for us? No idea!

Let’s Program Some Weird Markov Chains Part 3: Tell Me A Story

So last time we wrote some basic tools for creating Markov Models from a text file. We just send some sentences to the program and it takes them apart in order to figure out what words follow what other words and how often.

But can we do anything interesting with this other than collect obscure trivia on the writing style of, say, Herman Melville?

Of course we can!

The most obvious and amusing thing you can do with a text based Markov Model is use it to chain words together into a sentence. This is pretty easy to do since we already wrote a function that can take any word in our model and randomly return a possible next word.

So if we use that function with “sentencestart” we will randomly get back one of the words that the program found at the start of a sentence. If we then use the function on that first word we will randomly get back a second word. And that second word will lead to a third word until at some point we randomly get “sentenceend”.

The code for that is super straightforward. Just add something like this to the end of the program you wrote for creating the Markov model in the first place:

currentWord = 'sentencestart'
stopWord = 'sentenceend'
message = []

while currentWord != stopWord:
    currentWord = getRandomFromProbabilityList(markovChain[currentWord])
    if currentWord != stopWord:
        message.append(currentWord)

By the end of this while loop we will have an array holding a complete sentence.

The only real trick here is that you want to make sure that your sentence array does not actually include “sentencestart” or “sentenceend”. That’s why we don’t start adding items to the message array until after the first call to getRandomFromProbabilityList and why we make sure the loop ends as soon as “sentenceend” shows up and before it can wind up in the output array.

You can now sort of see a sentence by looking at the message array, but for convenience you probably want to join all the individual words together with spaces to make a single line of output. In Python that looks like this:

print(' '.join(message))

Now if you run the code and feed it our mini text file about apples you will get random output like “I have another apple pie”. You’ll notice that this sentence doesn’t exist in the original sample but that every word pair (“I have”, “have another”, “another apple”, “apple pie”) does.
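You can actually verify that claim with a few lines of code. This is just a sketch: `allPairsInModel` is a made-up helper name, and the mini `markovChain` below is hand-copied from the counts the apple story produces (in the real program you would pass in the `markovChain` dictionary you built earlier).

```python
# Hand-built model matching the counts from the mini apple story
markovChain = {
    'sentencestart': {'i': 5, 'it': 1, 'the': 1},
    'i': {'have': 3, 'will': 1, 'like': 1},
    'have': {'an': 1, 'another': 1, 'some': 1},
    'another': {'apple': 1},
    'apple': {'sentenceend': 2, 'pie': 3},
    'pie': {'sentenceend': 3, 'will': 1},
}

def allPairsInModel(message, chain):
    # Every word must list the word that follows it as a possible next word
    return all(message[i + 1] in chain[message[i]]
               for i in range(len(message) - 1))

print(allPairsInModel(['i', 'have', 'another', 'apple', 'pie'], markovChain))  # True
```

Any sentence the generator produces will pass this check, because the generator can only ever follow links that exist in the model.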

Great! Now let’s try this on something much much larger: the complete text of Moby Dick as found on Project Gutenberg. Please note that their files are stored in UTF-8 instead of ASCII format, so depending on what language you’re following along in you might need to change how you read files in order to avoid crashing the first time you run across a strange symbol.

In Python that means we need to update our file reading code to this:

with open(sys.argv[1], encoding="utf8") as myfile:
    data = myfile.read()

And with that we can now summon the digital ghost of Herman Melville! Just grab a public domain copy of Moby Dick and feed it into your program:

“how now,” whispered something like a sentry-box with one matter from the next movement about a mariner’s needle, is, that, parsee:—a hearse can endure; at such prodigies as if the whale-fishery contains it is this arm

Hmmm… Ok? Those are sure some words, at least. Let’s try again.

cook! ho, ho! there’s no small gold-fish has footed it beast, boat, carrying his pupils of real are of the flashing upon the people of his shrill voice, “drinking hot fire burning martyr, or two, for it

Maybe that’s a little better?

Alright, let’s be honest. This is all pure nonsense. Real books have complex sentences that are hard to mimic by looking at them just one word at a time. That’s why we’re going to need to build a better Markov chain based off of looking at multiple words.

It is important to note that even when working with a multi-word chain we sometimes still need a way to look at only one word at a time. For instance, when trying to figure out how to start a sentence we won’t have multiple words to look at. All we’ll have is “sentencestart” so we either need to build a 1 deep model to go along with our 2 deep model or we need some way to make a 2 deep sentence starting symbol.

Since we already have a 1 deep model we’ll just keep that, and the whole algorithm that builds it, and then add a 2 deep model after it.

markovChain2Deep = {}

for i in range(len(tokenList) - 2):
    if tokenList[i] not in markovChain2Deep:
        markovChain2Deep[tokenList[i]] = {}
    if tokenList[i+1] not in markovChain2Deep[tokenList[i]]:
        markovChain2Deep[tokenList[i]][tokenList[i+1]] = {}

    if tokenList[i+2] not in markovChain2Deep[tokenList[i]][tokenList[i+1]]:
        markovChain2Deep[tokenList[i]][tokenList[i+1]][tokenList[i+2]] = 1
    else:
        markovChain2Deep[tokenList[i]][tokenList[i+1]][tokenList[i+2]] += 1

Not hard to write, just a little tedious. Instead of grabbing every pair of words we grab every triplet of words in our text and then build a dictionary of first words that link to a dictionary of second words that link to a final dictionary keeping track of how often that triplet happened.

With that in place we can improve our message generating code to work a little bit better. Update it like this:

currentWord = 'sentencestart'
nextWord = getRandomFromProbabilityList(markovChain[currentWord])
stopWord = 'sentenceend'
message = []

while nextWord != stopWord:
    tempWord = getRandomFromProbabilityList(markovChain2Deep[currentWord][nextWord])
    currentWord = nextWord
    nextWord = tempWord

    if(currentWord != stopWord):
        message.append(currentWord)

print(' '.join(message))

Now that we’re linking two deep we get things like:

i fancied that the frenchman had a creditor

That actually makes sense!

Of course it can still do strange things, like here where it got caught in an “oil, nor” loop for a while:

he cannot be olive oil, nor macassar oil, nor train oil, nor train oil, nor macassar oil, nor cod-liver oil

But we still get our fair share of nonsense:

but as the standard-bearer of this ship, and make the bed corner, slips out the smoke from the rim of the enemy defray the current expenses of the west and south

What if we boost it to be three deep? Maybe a little less nonsense?

markovChain3Deep = {}

for i in range(len(tokenList) - 3):
    if tokenList[i] not in markovChain3Deep:
        markovChain3Deep[tokenList[i]] = {}
    if tokenList[i+1] not in markovChain3Deep[tokenList[i]]:
        markovChain3Deep[tokenList[i]][tokenList[i+1]] = {}
    if tokenList[i+2] not in markovChain3Deep[tokenList[i]][tokenList[i+1]]:
        markovChain3Deep[tokenList[i]][tokenList[i+1]][tokenList[i+2]] = {}

    if tokenList[i+3] not in markovChain3Deep[tokenList[i]][tokenList[i+1]][tokenList[i+2]]:
        markovChain3Deep[tokenList[i]][tokenList[i+1]][tokenList[i+2]][tokenList[i+3]] = 1
    else:
        markovChain3Deep[tokenList[i]][tokenList[i+1]][tokenList[i+2]][tokenList[i+3]] += 1

And then replace our old message producing code:

currentWord = 'sentencestart'
nextWord1 = getRandomFromProbabilityList(markovChain[currentWord])
nextWord2 = getRandomFromProbabilityList(markovChain2Deep[currentWord][nextWord1])
stopWord = 'sentenceend'
message = []

while nextWord2 != stopWord:
    tempWord = getRandomFromProbabilityList(markovChain3Deep[currentWord][nextWord1][nextWord2])
    currentWord = nextWord1
    nextWord1 = nextWord2
    nextWord2 = tempWord
    message.append(currentWord)

    #Make sure we don't exit the loop without first recording what's left in the pipeline
    if(nextWord2 == stopWord):
        message.append(nextWord1)

print(' '.join(message))

Now what sort of sentences do we get?

besides, such is the endlessness, yea, the intolerableness of all earthly ills, and that on no account kick back; for you can’t help yourself, wise stubb

Not bad. Not that good either, but it does almost sound like an old sailor giving depressing life advice.

Let’s try again…

but though the lakeman had induced the seamen to dip their ship-biscuit into the huge oil-pots and let them fry there awhile

Hey! That one actually makes a lot of sense. Success!

Or is it? Compare it to this line from the original book:

In the long try watches of the night it is a common thing for the seamen to dip their ship-biscuit into the huge oil-pots and let them fry there awhile.

So you can see that even at only three words deep we’re already reaching the point where entire segments of the original book are being reproduced word for word.

For another example consider this generated sentence:

eakable thing called his “flurry,” the monster horribly wallowed in his blood, at last he partially disclosed a strangely discoloured bunch or protuberance, the size of the iron part of a hoe

To this original sentence:

Still rolling in his blood, at last he partially disclosed a strangely discoloured bunch or protuberance, the size of a bushel, low down on the flank.

Once again we’re just duplicating big blocks of text. That’s because there are quite a few three word phrases that only happen once in the entire book and once you wander into those you’re trapped.
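If you want to put a number on that trapped feeling, you can count how many three-word contexts in the model have exactly one possible continuation. This is only a sketch: `fractionWithOneOption` is a made-up helper, and the tiny `toyChain` below is an invented stand-in for the real `markovChain3Deep` built earlier.

```python
def fractionWithOneOption(chain3Deep):
    # Walk every (first, second, third) word context in the 3 deep model and
    # count how many of them have only one possible next word
    total = 0
    oneOption = 0
    for first in chain3Deep:
        for second in chain3Deep[first]:
            for third in chain3Deep[first][second]:
                total += 1
                if len(chain3Deep[first][second][third]) == 1:
                    oneOption += 1
    return oneOption / total

# Toy stand-in: two contexts, one of which has a single continuation
toyChain = {
    'the': {'red': {'apple': {'tastes': 1},
                    'car': {'goes': 2, 'stops': 1}}},
}

print('{0:.0f}% of contexts have only one option'.format(
    fractionWithOneOption(toyChain) * 100))  # 50% for this toy chain
```

Run against the real Moby Dick model you would expect a very high percentage, which is exactly why the three deep generator keeps quoting the book verbatim.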

Also, wow, this book is actually pretty bloody. I should have done something like Pride and Prejudice instead…

collins and maria were gone on business into the village, when she was startled by a ring at the door, and elizabeth saw her sister’s countenance change as she read were scarcely to be defined

What a relief, no blood and gore. Although we have randomly generated a pretty good way to start a thriller. Who is at the door? What does Elizabeth’s sister know? What will Collins and Maria find when they finish their business and get back to the house?

So there you go. A complete, if mediocre, tool for tearing a story apart, analyzing word pattern frequency and then stitching things back together into sentences. By this point you should feel pretty comfortable with the basic ideas behind using Markov Models and understand the pros and cons of making them deep vs shallow.

But maybe you still want a little more practice? Well then, here are some fun projects for you:

1) Writing deeper and deeper Markov Models is tedious. Figure out a way to automate that so you can do something like generateDeepModel(text, depth)

2) Looking at the output, we did a really bad job of removing punctuation. Fix that.

3) Our code also treats all periods as the end of a sentence, which is a problem for titles like Mr., Mrs., Dr. and so on. Figure out a way around that.
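For what it’s worth, here is one possible sketch of project 1. The name generateDeepModel comes from the exercise itself; the tuple-key approach is just one way to do it, replacing the hand-written nested dictionaries with a single dictionary keyed on tuples of words.

```python
def generateDeepModel(tokenList, depth):
    # Use a tuple of 'depth' consecutive words as the dictionary key instead
    # of nesting dictionaries deeper and deeper by hand
    model = {}
    for i in range(len(tokenList) - depth):
        key = tuple(tokenList[i:i + depth])   # 'depth' words of context
        nextWord = tokenList[i + depth]
        model.setdefault(key, {})
        model[key][nextWord] = model[key].get(nextWord, 0) + 1
    return model

tokens = 'sentencestart i have an apple sentenceend'.split()
model = generateDeepModel(tokens, 2)
print(model[('i', 'have')])  # {'an': 1}
```

The lookup side changes too, of course: instead of markovChain2Deep[a][b] you would look up model[(a, b)].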

Let’s Program Some Weird Markov Chains Part 2: Need More Data

Last time we learned that a Markov Chain is a way to model lists of events and objects by basically saying “If your current item/event is X then here is a list of items/events that might come next PLUS a list of how likely each choice is.”

This is a remarkably easy thing to program in pretty much any language.

All we really need is:

  1. A way to connect an input to a list of possible outputs
  2. A way to keep track of the probability of those outputs
  3. A way to randomly select one output from a list based on those probabilities

For this Let’s Program we’ll be using Python since it’s cross platform and already on all of my computers. Be warned that I’m not much of a Python expert so please don’t take this project as an example of how to write “proper” Python code.

Step 1, connecting inputs to possible outputs, is super easy because we can just use the dictionary/map/hashtable that is undoubtedly included in whatever language you’re using. These are data structures that let you link a “key” to a “value” of any sort. Observe:

>>> dictionaryTest = {'apple': 'pie', 'chocolate': 'ice cream'}

>>> dictionaryTest['apple']

'pie'

The trick here is that we don’t want to link a word to another word, we want to link a word to a set of possible next words.

Believe it or not we can actually represent that list as a dictionary as well, a dictionary where the keys are words we have seen and the value is how often we have seen them.

>>> dictionaryTest2 = {'apple': {'pie': 15, 'tart': 6}, 'chocolate': {'ice cream': 6, 'milk': 3}}

>>> dictionaryTest2['apple']

{'pie': 15, 'tart': 6}

Now all we need is a way to use those word counts to randomly choose a word.

Pulling Marbles Out Of Bags

The most obvious (but probably not most efficient) algorithm here is to cycle through the entire dictionary and count up the total number of word occurrences in it. For example, if pie showed up 15 times and tart showed up 6 times the total is 21.

We then randomly generate a number between 0 and one less than our total, then loop through the dictionary a second time. For each word count we check if our random number is less than the count. If so we return that word. If not we decrease our random number by that word count and then try again with the next key-value pair.

Example 1: We randomly generate “3”. That’s less than the “pie” count of 15 so we return “pie”.

Example 2: We randomly generate “19”. That’s higher than the “pie” count of 15 so we subtract 15 from 19 and now have “4”. We then move on to “tart” and see that “4” is less than the “tart” count of 6, so we return “tart”.

Let's see that in code! These experiments are getting a little long to just type into the command line so let's put this in a file with a convenient name like “markov.py”.

from random import randint

def getRandomFromProbabilityList(probabilityList):
    #First loop: add up the total of all the word counts
    total = 0
    for item, count in probabilityList.items():
        total = total + count

    #Second loop: use a random number to pick one word, weighted by its count
    chosenProbability = randint(0, total - 1)
    for item, count in probabilityList.items():
        if(chosenProbability < count):
            return item
        chosenProbability = chosenProbability - count

markovModel = {'apple': {'pie':15, 'tart':6}, 'chocolate': {'ice cream': 6, 'milk': 3}}

print(getRandomFromProbabilityList(markovModel['apple']))

Here you can see we brought in our mini food themed Markov Model and then used our getRandomFromProbabilityList function to randomly generate an appropriate next word for the starting word “apple”.

Inside the getRandomFromProbabilityList you should see the two “key value” for loops that we planned. The “key” is the word and the “value” is how often we saw it. The first loop ignores the keys and just focuses on adding up all the counts. The second loop then uses a randomly chosen number to choose a word key to return.

Running this several times in a row shows that we do indeed get semi-random answers for what word should come after “apple” and that we see ‘pie’ (count 15) a little more than twice as often as ‘tart’ (count 6), which is exactly what we want.

And that simple code really is the core of our Markov Chain engine. Obviously we’ll need to build a few other tools in order to do things like automatically build Markov Chains from raw data or to automatically generate interesting output from our input but the real logic is all right here. That function will help us link our chains together.

Let’s Data Mine Some Text

It didn’t take long at all to build some code for pulling probabilistic events out of our Markov Model. The problem though is that to test that code we had to build a Markov Model by hand. That’s the sort of thing that can get awfully boring awfully fast so let’s figure out a way to automatically build a Markov Model from existing text.

Of course, for debugging purposes it would be silly to jump straight into working with a massive book like Moby Dick. Instead let’s cut our teeth on this fascinating little piece of micro-fiction:

I have an apple. I have another apple. I have some apples. I will make them into a pie. It will be an apple pie. The apple pie will taste good. I like apple pie.

To turn this into a Markov Chain we need to extract all the word pairs and count how often each one happened, but before that we need to do a little pre-processing to make the text more computer friendly.

Specifically we want to make sure the Markov Chain knows when sentences start and end. Knowing when and how sentences start will help the chain create sentences with a natural sounding beginning while knowing how they end will give the program a way to gracefully finish off sentences instead of just looping through an infinite chain of “the man saw the dog saw the man saw the dog saw the…”

After we’ve figured out the start and stop info we’ll also want to remove all punctuation and capitalization. Otherwise our program might accidentally think “Apple”, “apple”, “apple,” and “apple.” are four different words.

After pre-processing our silly little sample story should look more like this:

sentencestart i have an apple sentenceend sentencestart i have another apple sentenceend sentencestart i have some apples sentenceend sentencestart i will make them into a pie sentenceend sentencestart it will be an apple pie sentenceend sentencestart the apple pie will taste good sentenceend sentencestart i like apple pie sentenceend

Obviously I did that by hand, but equally obvious is the fact that I don’t want to have to edit the entirety of War and Peace just to use it in our program so let’s figure out how we might write a program to pre-process our text for us.

Hmmm… as humans we keep track of the boundary between sentences by using periods. Maybe our code can do the same? Perhaps we can scan through the text looking for periods and then replace them all with symbols showing that one sentence has ended and another is about to start. We can use whatever symbols we want but should try to avoid anything that might show up in a real piece of text. Mushing words together to create sentenceend and sentencestart is how I did it, but you could use randomly generated strings like BBSDFDSstart and BBSDFDend if you wanted to be extra careful.

As usual there are some edge cases, very literal ones here. The first word of any text should be sentencestart even though there is no period there. And the final period of a story should be replaced by just sentenceend with no following sentencestart.

Fortunately the code for this is pretty easy thanks to regular expressions:

import re #Regular expressions
import sys #Lets us get command line arguments

#Get the first argument to the program and use it as a filename
with open(sys.argv[1], 'r') as myfile:
    data = myfile.read()

#Text should start with sentencestart
text = 'sentencestart ' + data

#Replace all periods with 'sentenceend sentencestart'
text = re.sub(r'\.', ' sentenceend sentencestart', text)

#Remove the unnecessary sentencestart left over from the last period in the file
text = re.sub(r'sentencestart\s*$', '', text)

#Make everything lowercase
text = text.lower()

#Remove everything that isn't a letter, number or whitespace
text = re.sub(r'[^a-zA-Z\d\s]', '', text)

#Split the pre-processed text in to an array of words
tokenList = text.split()

#Double check that the pre-processed array turned out right
print(tokenList)

Once we have our list split into a nice array it’s pretty easy to walk through that array and add every word pair to our Markov Model.

markovChain = {}

for i in range(len(tokenList) - 1):
    if tokenList[i] not in markovChain:
        markovChain[tokenList[i]] = {}

    if tokenList[i+1] not in markovChain[tokenList[i]]:
        markovChain[tokenList[i]][tokenList[i+1]] = 1
    else:
        markovChain[tokenList[i]][tokenList[i+1]] += 1


Pretty simple. We index through the array of tokens. At every step the word at the current index i is treated as the first word of our pair and the word at i + 1 is treated as the second word in the pair. We can use this to build our nested dictionary of dictionaries. The end result will eventually work sort of like this: markovChain[firstWord][secondWord] = count.

The only trick here is that you have to make sure you end the loop on the second to last token, not the last token. This is because the last token doesn’t have a “next” token so we can’t build a pair. And, depending on your language, trying to will probably throw an error.

And with that we have automatically turned a text file into a Markov Chain full of interesting facts like:

  • 5 sentences started with ‘I’
  • The word “apple” was followed by the word “pie” three times
  • The word “will” can be followed by “be”, “make” or “taste”
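Facts like these are easy to double check. Here is a self-contained re-run of the whole pipeline on the mini story, condensed with `setdefault` and `get` instead of the explicit if/else used above but performing the same steps:

```python
import re

story = ('I have an apple. I have another apple. I have some apples. '
         'I will make them into a pie. It will be an apple pie. '
         'The apple pie will taste good. I like apple pie.')

# Same pre-processing as the earlier code
text = 'sentencestart ' + story
text = re.sub(r'\.', ' sentenceend sentencestart', text)
text = re.sub(r'sentencestart\s*$', '', text)
text = text.lower()
text = re.sub(r'[^a-zA-Z\d\s]', '', text)
tokenList = text.split()

# Same pair counting as the earlier code, just condensed
markovChain = {}
for i in range(len(tokenList) - 1):
    markovChain.setdefault(tokenList[i], {})
    markovChain[tokenList[i]][tokenList[i + 1]] = \
        markovChain[tokenList[i]].get(tokenList[i + 1], 0) + 1

print(markovChain['sentencestart']['i'])  # 5
print(markovChain['apple']['pie'])        # 3
print(sorted(markovChain['will']))        # ['be', 'make', 'taste']
```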

Great! We now have a tool that can help us build large Markov Models with significant amounts of real data. Tune in next time to see what interesting stuff we can do with that sort of information.

Let’s Program Some Weird Markov Chains Part 1: We’re Making A What?

Markov Chains are, in my personal opinion, one of the niftiest things to be found in the computer science bag of tricks. They are easy to understand and simple to program but still powerful enough to do some really cool stuff.

So what is a Markov Chain?

A Markov Chain is a way to model or simulate any sort of list of things or events where the next item in the list is influenced by the item that came before.

One of the best examples is human language. Here, try to fill in the blank: “He saw a bright red _____”.

You probably came up with something like “a bright red apple” or “a bright red car”.

You might have come up with something like “a bright red star” or “a bright red dragon”.

You probably didn’t think anything obviously wrong like “a bright red bluebird” or “a bright red emerald”.

And you definitely didn’t think of anything grammatically nonsensical like “a bright red sleeping” or “a bright red quickly”.

See? All I had to do was tell you the start of a sentence and your years of experience with human language let you instinctively guess what sort of words probably would or wouldn’t come next.

A Markov Chain is just a way to formally define this idea of “what will probably come next”.

For instance, a Markov Chain describing a family cookbook might include mathematical facts like:

  • The word “red” is followed by the word “sauce” 50% of the time, the word “apple” 25% of the time and the word “pepper” 25% of the time.
  • The phrase “preheat the oven to” is followed by “350 degrees” 80% of the time.
  • The phrase “2 cups of” is followed by the word “water” 75% of the time, the word “milk” 20% of the time and the word “butter” 5% of the time.

Great. We now know what words tend to follow other words. That’s interesting trivia, but what can we do with it?

What we can do is simulate new cook book instructions by chaining our rules together. That’s why they’re called Markov Chains and not Markov Factoids.

Example: if you know the word “the” is sometimes followed by “red” and that “red” is sometimes followed by “apple” and that “apple” is sometimes followed by “tastes” and that “tastes” is sometimes followed by “good” then you can logically chain these rules together and create the full sentence “The red apple tastes good”.
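The chaining idea is small enough to sketch in a few lines of Python. Everything here is invented for illustration; the word lists below are a toy stand-in for real follow-a-word rules, and a real chain would also carry the probabilities we talked about.

```python
import random

# Toy chain: each word maps to a list of possible next words.
# None marks the end of the sentence.
chain = {
    'the': ['red'],
    'red': ['apple', 'sauce'],
    'apple': ['tastes'],
    'tastes': ['good'],
    'good': [None],
    'sauce': [None],
}

word = 'the'
sentence = []
while word is not None:
    sentence.append(word)
    word = random.choice(chain[word])

print(' '.join(sentence))  # e.g. "the red apple tastes good"
```

Starting from “the” and repeatedly asking “what might come next?” chains the individual rules together into a full sentence, which is exactly what we’ll build for real in Part 2.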

That Seems Too Convenient. What’s the Catch?

Sadly, not every chain makes sense.

Consider the sentence “Sally ate red apple pie into pieces of oil”.

Every individual step in this sentence makes sense. “Sally ate” makes sense. So does “ate red” and “red apple” and “apple pie”. And there’s nothing wrong with “pie into” or “into pieces”. And “of oil” is something you’d expect to see a lot of in a cook book.

It’s only when you link all these individually valid chains together that you wind up with nonsense. So while our cooking book Markov Chain has the potential to put together consistent sounding cooking instructions it also has the potential to spout complete gibberish.

Oh No! Can We Make The Markov Chains Smarter?

The reason our imaginary Markov Chain generated nonsense is because it is only smart enough to look one word ahead or behind at a time, but in English you often need two or three or more words of context for things to make sense.

That suggests we could improve the quality of our Markov Chain by getting rid of “word A leads to either word B or word C” and instead upgrade to “phrase A B C leads to either word X or word Y”.

But watch out! The more context you force on your Markov Chain the less flexible it becomes, until eventually it becomes entirely useless.

How could too much context be a problem? Well, suppose you are trying to build a Markov Chain based off of Lord of the Rings with the goal of creating unique and original sentences that sound like something Tolkien would actually write. You want to make sure you get high quality sentences, so you design your Markov Chain to base itself on the past 10 words of context.

Unfortunately when you run your program you find that you aren’t getting original sentences at all! Your program does nothing but randomly print entire sentences taken word for word from the original books.

The problem was you got too specific. Tolkien might have used the individual word “elves” in hundreds of different ways throughout his books but the specific phrase “For the Elves the world moves, and it moves both…” only shows up once, leading to a boring Markov Model that does nothing but tell you with 100% certainty how specific sentences actually ended.

So building useful Markov Chains requires a certain amount of experimentation in order to find an input length with enough context to provide intelligent results but short enough that it doesn’t just directly copy whatever it is you’re modeling.

What Are They Good For?

So are Markov Chains good for anything other than generating inedible recipe fragments or producing text that sounds vaguely like it came from famous books?

Of course!

Search engines use them for ranking pages and predicting user behavior. Smart phone keyboards use them for auto complete. Statisticians use them for studying medicine, economics, sociology and pretty much everything else where things happen in a specific order.

They’re also a handy way to build video game AIs that act in intelligent semi-random ways.

So understanding the basics of Markov Chains will come in handy regardless of whether you plan to try and cure cancer or just program a really awesome giant robot videogame.

Or both. Dare to be a modern day Renaissance Man!

The Title Is “Weird” Markov Chains. What is that about?

Like I said, you can use Markov Chains to model anything that has an order. But whether or not said Markov Chain is actually good for anything is a completely separate matter.

So for this Let’s Program we’re going to start with a classic word based Markov Chain to analyze some public domain books and then try to mimic their writing style. That should be enough to prove we know what we’re doing. But after that we’re going to try a couple experimental ideas that may or may not produce anything worthwhile.

The first experiment will be to try and grab a bunch of musical scores and then build some Markov Chains that model what notes should come next in a song given the last couple of notes. This might let us generate some interesting sounding semi-random music. Or it might just generate the sound of a cat walking on a piano.

After that we’ll finish up with an even stranger experiment based around grabbing a bunch of thematically related images and using them to build a Markov Chain that predicts what color/shade a pixel should be based on the pixels around it. We will then use this to generate new images that I am almost certain will look like static but might include recognizable fragments that actually look related to our original images. It’s an experiment I’ve been wanting to try for a while, honestly, and I’m glad to finally be getting around to it.

Spoiler Free Dark Souls Tips For New Players

So the Dark Souls remaster has been released and you’re thinking about finally giving the series a try. But at the same time you’re a little bit worried about the game’s legendary difficulty and reputation for frustration. Maybe the game isn’t for you?

Well don’t worry!

It’s true that Dark Souls is a difficult game that requires patience, practice and lots of trial and error. But most of the worst frustration comes from the game’s steep learning curve. A learning curve that I’m going to try and smooth out for you with some spoiler free tips.

Tip #1: Dark Souls Is a Puzzle Game as Much as Anything Else

A big part of enjoying any game is understanding what kind of game it is, and Dark Souls has strong puzzle elements to go along with its action RPG core.

Figuring out how to get from one bonfire checkpoint to the next requires you to figure out where all the enemies and traps are, how they behave and how to outsmart them. This requires a lot of trial and error and, yes, death. Now if you think of each death as a sign that you are “losing” this can be very frustrating. But instead I would encourage you to think of each death as one more piece of the puzzle. There is no more shame in dying in Dark Souls than there is in twisting a Rubik’s Cube and finding you moved some colors into the wrong spot.

Did you try to cross a bridge only to get ambushed by enemies throwing fire bombs from above? Don’t think of that death as a loss, think of it as the price for learning an important piece of information. Did your next three attempts to get past that bridge also end in death? Still not failures, those were experiments on different strategies for solving this particular puzzle. Just keep thinking and experimenting and practicing and you’ll eventually find the “solution” that will let you dodge traps, avoid ambushes, outsmart enemies and fight on your terms.

Tip #2: Equipment (almost) Never Goes Obsolete

In most RPGs you expect to find better weapons and armor on a pretty regular basis. Some games (looking at you, Diablo!) even revolve around enemies dropping better equipment every few minutes.

Not Dark Souls.

The equipment from the first few hours of the game, when fully upgraded, is only a few percentage points behind the best late game hidden weapons and armor. And since those late game weapons and armor often require extremely high stats you might decide to not use them even after you find them!

That means that, depending on your character build, you might only switch to new weapons and armor two or three times throughout the entire game. This means that you can safely spend resources upgrading early game weapons and armor without feeling like you’re going to find something better in just a few minutes. Speaking of which…

Tip #3: Leveling Up Your Equipment Is More Important Than Leveling Up Your Character

Dark Souls doesn’t have gold and XP like most fantasy games but instead just gives you universal “souls” that can be used for everything from shopping to upgrading weapons to leveling up and buying new stat points. In the long run you’ll want to do a little of all three but for most of the game you won’t have enough souls to focus on everything you want.

So if you have to make a tough decision on what to do with your most recent haul of souls here’s a good rule of thumb: leveling up your equipment at the blacksmith once is worth several character levels. Simply put, boosting your sword from +2 to +3 is just downright better than boosting your character from level 20 to 21 or even 25.

And don’t worry about upgrade materials! The game has an unlimited supply of all the non-top-tier upgrade materials so there’s no risk that upgrading your early game weapons is somehow going to get in the way of upgrading future equipment.

There are, of course, some exceptions. If you’re only one point of strength away from being able to use a new sword or a couple points of endurance away from being able to wear heavier armor then by all means focus on that. But in general when you have a ton of souls and aren’t sure where to spend them a quick trip to the blacksmith to see if there are any upgrades available is a good choice.

Tip #4: Find a Shield with 100% Block ASAP

Just because you have a shield doesn’t mean you can fully prevent being injured by enemy attacks. In fact, some of the weaker shields only prevent half of the damage you take, meaning you can still wind up dead even if you block every incoming attack.

To prevent this take a good look at the stats of every shield you find. You should notice a set of defensive numbers listing what percentage of each type of damage it blocks. You want to find one that blocks 100% of physical damage. This will let you fully block the vast majority of attacks in the game, perfect for staying safe while trying to learn the attack patterns of new enemies.

Don’t worry about this too much. There are a lot of 100% physical block shields in the game. You’ll probably find one within the first few hours no sweat. The only trick is knowing what they are, how to recognize them and why you want them.

Tip #5: Level Up Vitality, Endurance and Enough Attack Stats To Use Your Favorite Weapon

Dark Souls has a lot of stats, and deciding which ones to raise when you level up can be very intimidating. Here is a basic rundown of a solid build to help you through your first playthrough:

Vitality gives you more health. As a new player you absolutely need this. A good place to put a couple dozen points.

Endurance gives you more stamina and equipment load. Stamina lets you run, dodge, attack and block for longer. Equipment load influences how much armor you can wear before slowing down. This means high endurance will let you move quickly while wearing heavy armor and continue fighting for longer without having to rest. Definitely worth a few dozen points.

Finally every weapon you find will have minimum stat requirements. These are a good guide for what you need to raise. Do you like fighting with axes? Notice that late game axes need a lot of strength? Raise your strength. Prefer spears? Notice they need medium amounts of strength and dexterity? Make that your goal.

This basic approach should give you a nice solid foundation. You can worry about the exact benefits of every stat later on when you’re more comfortable with the basics. (Hint: while all stats can be raised to 99, there is almost no benefit to raising them higher than 50.)

Tip #6: Watch Your Equip Load

You’ll probably notice pretty quickly that the heavier your weapons and armor are the slower your character moves. Load yourself up too much and dodging becomes almost impossible.

Prevent this by keeping track of your equip load, which compares the weight of your currently equipped gear to the maximum weight your endurance allows. Keep this below 50% to move and dodge at a decent speed. And note that only equipped items count. You can have 500 pounds of spare swords in your inventory no problem.

What if your maximum weight isn’t enough for the weapons and armor you want? Either settle for lighter armor or raise your endurance until you can handle a higher equip load.
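The equip load math here is simple enough to sketch out. Here’s a tiny illustrative function; the 50% threshold comes from the tip above, but the item weights and the max load of 60 are made-up example numbers, not values pulled from the game’s data:

```javascript
// Equip load is just total equipped weight divided by your maximum load.
// Keeping the ratio at or below 0.5 preserves your fast dodge roll.
function equipLoadRatio(equippedWeights, maxLoad) {
  const total = equippedWeights.reduce((sum, w) => sum + w, 0);
  return total / maxLoad;
}

function canDodgeWell(equippedWeights, maxLoad) {
  return equipLoadRatio(equippedWeights, maxLoad) <= 0.5;
}

// Example: sword (6.0) + shield (5.5) + armor set (28.5) against a 60 max load.
const weights = [6.0, 5.5, 28.5];
console.log(equipLoadRatio(weights, 60)); // 40 / 60 ≈ 0.667
console.log(canDodgeWell(weights, 60));   // false: shed some armor or raise endurance
```

Spare swords sitting in your inventory would simply never be passed into `equippedWeights`, which is why they don’t slow you down.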

Tip #7: Consider Avoiding Magic For Your First Playthrough

Magic spells in Dark Souls aren’t skills you learn by leveling up but are instead pieces of equipment you have to find from hidden treasures or buy from hidden merchants and teachers. On your first playthrough it’s entirely possible you’ll completely miss most of the spells, leaving a magic focused character crippled.

So for your first, blind attempt at the game play it safe and focus on using weapons. Then after you’ve beaten the game once and know where to find the spells (or don’t mind the spoilers involved in looking their locations up) you can start a second magic focused playthrough.

Let’s Program a 3D Browser Game: Index and Code

Did you know that modern browsers support full 3D graphics?

Most people don’t. Probably because most websites don’t use it. Probably because it’s not really a useful feature for the average news site or online shopping center.

But it’s still a really cool technology and definitely something to keep in mind in case you ever find yourself needing to offer users a degree of interactivity or visualization above and beyond 2D text and images. I can imagine all sorts of scenarios where being able to offer a 3D model viewer or simple simulation would go a long way to making a complex topic understandable.

So let’s dive into the topic by building a simple 3D maze based off of old school first person dungeon crawlers. We’ll be using a library called three.js to handle the nitty gritty details and will cover enough of the basics of user input and game loop logic to at least give you a starting point for any interactive 3D websites you may find yourself needing to build.
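To give a flavor of the game loop logic involved, here’s a rough sketch of classic grid-based dungeon crawler movement in plain JavaScript. This is my own illustrative version, not the series’ actual code; the map layout and function names are invented for the example:

```javascript
// An old school dungeon crawler lives on a grid: the player has a cell
// position and one of four facing directions, and "walking" just means
// stepping into the adjacent cell if it isn't a wall.
const FLOOR = 0, WALL = 1;

// A tiny made-up map: 1 = wall, 0 = walkable floor.
const map = [
  [1, 1, 1, 1],
  [1, 0, 0, 1],
  [1, 0, 1, 1],
  [1, 1, 1, 1],
];

// Facing order: north, east, south, west, with matching x/y step offsets.
const DIRS = [
  { dx: 0, dy: -1 }, // north
  { dx: 1, dy: 0 },  // east
  { dx: 0, dy: 1 },  // south
  { dx: -1, dy: 0 }, // west
];

const player = { x: 1, y: 1, facing: 1 }; // start at (1,1) facing east

function turnRight(p) { p.facing = (p.facing + 1) % 4; }
function turnLeft(p)  { p.facing = (p.facing + 3) % 4; }

// Step one cell forward, but only if the target cell is floor.
function moveForward(p) {
  const { dx, dy } = DIRS[p.facing];
  const nx = p.x + dx;
  const ny = p.y + dy;
  if (map[ny][nx] === FLOOR) {
    p.x = nx;
    p.y = ny;
  }
}

moveForward(player); // east from (1,1) to (2,1)
moveForward(player); // blocked by the wall at (3,1): position unchanged
turnRight(player);   // now facing south
console.log(player.x, player.y, player.facing); // 2 1 2
```

Once you have this cell-and-facing state, hooking it up to 3D rendering is mostly a matter of mapping the player’s grid position and facing onto a camera’s position and rotation.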

Show Me The Finished Code

You can find the final product here. Play around with it to get a feel for what we’re going to build, or save the page to your local machine to look at the source code. It’s all JavaScript, so no need for compilers; the whole program is right in the <script> tags of that single page (although you’ll need to download a copy of the three.js library as well to get it working locally).

Show Me The Articles

Let’s Program A 3D Browser Game Part 1: You Can Do That?!

Let’s Program A 3D Browser Game Part 2: Three.js To The Rescue

Let’s Program A 3D Browser Game Part 3: Feeling Boxed In

Let’s Program A 3D Browser Game Part 4: A Different Kind Of Software Architect

Let’s Program A 3D Browser Game Part 5: Before You Can Learn To Walk You Must Learn To Dungeon Crawl

Let’s Program A 3D Browser Game Part 6: Forward To Tomorrow!

Let’s Program A 3D Browser Game Part 7: You Shall Not Pass

Let’s Program A 3D Browser Game Part 8: Dungeon Maintenance

Let’s Program A 3D Browser Game Part 9: Look, Shiny

Let’s Program A 3D Browser Game Part 10: Sometimes I Surprise Myself

Let’s Program A 3D Browser Game Part 11: Hunting For Treasure