Let’s Program Some Weird Markov Chains Part 5: Digital Cat Walking On A Virtual Piano

Now that we’ve programmed a very standard Markov Chain let’s fulfill our original promise and try to program something slightly weird: A Markov Chain for music.

This will be a little tricky, but isn’t quite as hard as it could be since music and text actually have a lot in common. Both are linear sequences of symbols that represent sounds and in both there are definite patterns of what symbols/sounds should and should not follow each other.

So our plan is pretty straightforward. In the same way that we analyzed public domain books to create a data structure of what words follow what other words we are now going to analyze public domain music to create a data structure of what notes follow what other notes.

The biggest trick here is that most digital music files don’t actually contain information about notes. Instead they just contain information about sound. Your average MP3 file or CD is in reality just a list of millions of sound wavelengths that need to be played one after another at a rate of several thousand a second. This is great for things like recording the exact reverberation of a single guitar string but completely useless if all you want to know is what single note that guitar was playing.

The one major exception is MIDI. MIDI files don’t store sound samples but instead store a list of instruments, the notes they play and when those notes should be played. This is exactly what we need!

The last missing piece is a tool for actually reading and writing those MIDI files. We could definitely write one on our own but for convenience I’d suggest using a third party library. For python I’ll be using a library called mido (https://github.com/olemb/mido/)

Make Some Noise

There is a lot of cool and complicated things you can do with MIDI but I don’t understand even half of it so we’re going to just stick with something super simple. For instance, to make a single piano play a series of random notes we can do something like this in a new musictest.py file:

from mido import Message, MidiFile, MidiTrack
import random

newMid = MidiFile()
track = MidiTrack()
newMid.tracks.append(track)

track.append(Message('program_change', program=0, time=0))

for x in range(480):
    noteNum = 60
    noteNum = random.randint(50,70)
    track.append(Message('note_on', note=noteNum, velocity=64, time=250))

newMid.save('new_song.mid')

I seriously don’t quite understand how this works but I “think” the initial ‘program_change’ message is used to set the instrument. If you don’t like piano change program=0 to some other number and see what happens!

We then randomly generate 480 notes roughly around middle C (value 60) and have each of them play 250 milliseconds after each other.

Finally we write it all to file, resulting in a roughly two minute music clip that sounds like someone randomly banging on a piano. Not terribly impressive but it shows we can programatically generate music.

We can also generate chords instead of single notes. To do this only the first note in each cord has a delay. The other notes in the chord have a delay time of “0” so that they play at the exact same time as the first note.

from mido import Message, MidiFile, MidiTrack
import random

newMid = MidiFile()
track = MidiTrack()
newMid.tracks.append(track)

for x in range(480):
    choice = random.randint(0,3)
    if(choice==0):
        track.append(Message('note_on', note=60, velocity=64, time=500))
        track.append(Message('note_on', note=63, velocity=64, time=0))
        track.append(Message('note_on', note=65, velocity=64, time=0))
    elif(choice == 1):
        track.append(Message('note_on', note=58, velocity=64, time=500))
        track.append(Message('note_on', note=63, velocity=64, time=0))
        track.append(Message('note_on', note=65, velocity=64, time=0))
    elif(choice == 2):
        track.append(Message('note_on', note=60, velocity=64, time=500))
        track.append(Message('note_on', note=63, velocity=64, time=0))
        track.append(Message('note_on', note=67, velocity=64, time=0))
    elif(choice == 3):
        track.append(Message('note_on', note=60, velocity=64, time=500))
        track.append(Message('note_on', note=65, velocity=64, time=0))
        track.append(Message('note_on', note=68, velocity=64, time=0))

newMid.save('new_song.mid')

Markov Models By Hand

So… how do we take these notes and cords and arrange them into a Markov Chain? In Python it’s pretty easy because Python dictionaries can use tuples of numbers as keys just as easily as they can use strings. And we already know how to build Markov Chains using Python dictionaries, so we can just translate our old code to this new use case.

markovChain = {(60, 63, 65): {(60,63,65): 1, (58, 63, 65): 2, (60,65): 1},
   (58, 63, 65): {(60,63,65): 3, (58, 63, 65): 1, (60,65): 1},
   (60, 65): {(60,63,65): 3, (58, 63, 65): 2}}

There you have it, a dictionary that connects chords to other chords and how often each pair happened. Of course the chords and counts are all made up here but hopefully you can see how we could use this same strategy to represent the chord pairs found in a real piece of music.

Let’s finish this proof of concept up by using this musical Markov chain to randomly put together a piece of “music”. (Spoiler: It’s not going to sound good since we’re using made up numbers instead of real musical data).

First off we’ll need to pull in our getRandomFromProbabilityList from our string based experiments. We can just copy paste it directly thanks to the fact that it doesn’t care what kind of keys it works with. Words, numbers, tuples, whatever. It all works.

Next up let’s figure out a way to turn our tuples into actual musical notes. The basic idea is that every tuple will have at least one note that needs to be played 500 milliseconds after the last note. If the tuple has more than one note the rest need to be set up to play at the same time as the first note of the tuple, which we can pull off with a 0 millisecond delay. We can get the extra notes in a chord by using [1:] syntax to get everything in a tuple except the first entry. So something like this:

def addChordToTrack(track, musicalTuple):
    track.append(Message('note_on', note=musicalTuple[0], velocity=64, time=500))
    for subNote in musicalTuple[1:]:
        track.append(Message('note_on', note=subNote, velocity=64, time=0))

And now it’s pretty easy to write code that uses our probability list to semi-randomly string together a few hundred notes:

currentChord = (60, 63, 65)

for x in range(480):
    currentChord = getRandomFromProbabilityList(markovChain[currentChord])
    addChordToTrack(track, currentChord)

Or putting it all together:

from mido import Message, MidiFile, MidiTrack
from random import randint

def getRandomFromProbabilityList(probabilityList):
    total = 0
    for word, count in probabilityList.items():
        total = total + count
        chosenProbability = randint(0, total -1)
    for word, count in probabilityList.items():
        if(chosenProbability < count):
            return word
        chosenProbability = chosenProbability - count

def addChordToTrack(track, musicalTuple):
    track.append(Message('note_on', note=musicalTuple[0], velocity=64, time=500))
    for subNote in musicalTuple[1:]:
        track.append(Message('note_on', note=subNote, velocity=64, time=0))

markovChain = {(60, 63, 65): {(60,63,65): 1, (58, 63, 65): 2, (60,65): 1},
   (58, 63, 65): {(60,63,65): 3, (58, 63, 65): 1, (60,65): 1},
   (60, 65): {(60,63,65): 3, (58, 63, 65): 2}}

startingChord = (60, 63, 65)

newMid = MidiFile()
track = MidiTrack()
newMid.tracks.append(track)
track.append(Message('program_change', program=0, time=0))

currentChord = startingChord
    for x in range(480):
        currentChord = getRandomFromProbabilityList(markovChain[currentChord])
        addChordToTrack(track, currentChord)
        newMid.save('new_song.mid')

All right! We now have a program for creating random midi files that sound like a three year old banging on a piano.

Next time we’ll get rid of our horrible hand-built test Markov Chain and build one based off of actual chord patterns found in actual music. And then we’ll try using that real world data to generate a music file that might actually sound like music.