Super Brains, Super Data and Chaos Theory

I just finished reading Count to a Trillion by John C. Wright. To summarize without spoiling: It’s a story about the futuristic space adventures of a man who suffers from occasional bouts of post-human hyper-intelligence. Expect mystery, treachery, high-tech duels and the mandatory beautiful space princess.

 

A major theme in Count to a Trillion is the existence of what I’m going to call “super brains”, extremely intelligent entities that make normal human geniuses look like mere children. Inspired by this book I’ve spent the past few days playing with the question: What are the sort of things that a super brain should and should not be able to do? What are the realistic limits that even a super high IQ entity would have to deal with?

 

After much thought I think it is safe to say that all brains are limited by the data available to them. For example, even the most talented of accountants will have trouble helping you balance your checkbook if you don’t keep your receipts or if you lie about your expenses.

 

This limit applies to super brains too. A super brain might be able to instantly calculate the proper flight path to put a satellite in orbit around Jupiter but he’s still going to need to know how much the satellite weighs, how much stress it can handle and what kind of engine it uses. Give the super brain bad information about the satellite’s weight or construction and his answer will be just as disastrously wrong as a random guess. Like we say in computer science: Garbage in, garbage out.

 

But there is more to data than just whether you have it or not. You also have to consider the quality of your data. When I eyeball a jug of milk and claim there are “about two cups left” I probably actually have anywhere between one and a half and two and a half cups of milk. This data is less accurate and precise than if I were to spend an afternoon carefully measuring the remaining milk with scientific instruments in order to report that I was 99.9% sure that the jug held 2.13 cups of milk, plus or minus half a teaspoon.

 

This is important because different problems require different levels of data precision. If you want a super brain to help you decide what to make for dinner “we have about 2 cups of milk left” is more than enough information. But if you want his help sending a satellite to Jupiter you’re going to need to do a lot better than “I think it weighs somewhere between 2 and 3 tons”.

 

But it gets even more interesting! Chaos Theory shows that there are certain problems that require infinitely accurate data to solve, or at the very least data so close to infinitely accurate that no human could ever collect it. Even worse, not only do you need infinitely accurate “super data” to solve a chaos problem, without super data you can’t even predict how wrong you’ll be.

 

You see, with a normal, well-behaved problem, data that is 10% wrong will lead to an answer that is roughly 10% wrong. Which is actually very useful because it lets you say “our data isn’t perfect, but it’s close enough, so our answer will also be close enough.”

 

But chaos theory introduces problems where data that is 10% wrong will lead to an answer that is unpredictably wrong. You might be 10% wrong or you might be 80% wrong or you might only be 1% wrong. There’s no way to tell. In the words of Edward Lorenz, father of chaos theory:

 

Chaos: When the present determines the future, but the approximate present does not approximately determine the future.

 

The classic example of a chaotic problem is predicting the weather. Weather is a complex, iterative system where today’s weather leads to tomorrow’s weather which leads to the next day’s weather and so on. Theoretically you should be able to predict tomorrow’s weather just by looking at today’s weather. You can then use that prediction to calculate the next day’s weather again and again as far out into the future as you want.

 

The problem is that making even a tiny mistake when recording today’s weather will lead to small but unpredictable errors in your prediction for tomorrow. A small error in your prediction for tomorrow means a moderate error in your prediction for next week. A moderate error in your prediction for next week means a huge error in your prediction for next month. And a huge, unpredictable error in your prediction for next month means we have no idea what the weather will be like exactly one year from now.
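You don’t need a weather simulator to watch this cascade happen. The logistic map, one of the simplest chaotic systems there is, shows the same behavior in a few lines of Lisp. (This is a toy illustration, not a weather model, and the 0.1% “measurement error” is just a number I picked for the demonstration.)

```lisp
;; The logistic map x' = 4x(1 - x) is a textbook chaotic system.
;; Iterate it from two nearly identical starting measurements and
;; the tiny initial error snowballs until the two "forecasts" have
;; nothing to do with each other.
(defun logistic-step (x)
  (* 4.0d0 x (- 1.0d0 x)))

(defun iterate-map (x steps)
  (loop repeat steps do (setf x (logistic-step x)))
  x)

;; Same system, same math, but one measurement is off by 0.1%.
(format t "perfect data: ~f~%" (iterate-map 0.2d0 50))
(format t "sloppy data:  ~f~%" (iterate-map 0.2002d0 50))
```

Run it and the two numbers come out wildly different, even though the starting measurements agreed to three decimal places.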

 

The only way to avoid this endless cascade of errors is to make sure you start with perfect super data. But collecting super data is somewhere between “ridiculously hard” and “genuinely impossible”. For example: in order to put together some super weather data you would need an accurate list of air pressure and temperatures from every single point on the globe. And to keep that data accurate you would need to know every time a child laughed, a butterfly took flight or anyone lit a match. Unless you’ve got a plan for turning the entire planet into a giant self-monitoring barometer* you might as well give up.

 

Without this super data even a super brain would have no chance at predicting the far future of weather. It doesn’t matter that the super brain perfectly understands the mechanics of weather and can manipulate billions of numbers in his head; without perfect data he is stuck with the same seven-day forecasts as normal weathermen. Chaos Theory paints a clear limit beyond which pure intelligence cannot take you.

 

All of which suggests a good rule of thumb for writing realistic super brain characters: The more chaotic a problem is the less useful the super brain is. Even characters with post-human levels of super-intelligence can’t predict the future of chaotic systems.

 

A super brain might be able to invent a new branch of chemistry, jury rig a spaceship and learn an entire language just by skimming a dictionary but he still won’t be able to predict next year’s weather or make more than an educated guess at which countries will still be around 1000 years in the future.

 

But this is just a rule of thumb. If you have a really interesting story idea that requires the main character to be able to predict the future of the human race with 99% accuracy then go ahead and write it. As long as it is entertaining enough (tip: lots of explosions and space princesses) I’m hardly going to throw a fit that you bent a few laws of mathematics.

 

 

 

* Fun story idea: Humanity builds a hyper-intelligent computer and asks it to solve the climate so we can finally stop arguing about what the climate is doing and how much of it is our fault. The computer concludes that it needs super data to do this and unleashes swarms of nano-bots to turn the entire surface of the earth into a giant weather monitoring station. Can our heroes stop the nanobots and shut down the super brain AI before it wipes out the entire human race?

Let’s Program A Swarm Intelligence 5: Defining The Swarm

A Closer Look At Particles

 

Up to this point I’ve been really vague about how the particle swarm itself actually works. How exactly do the particles search for new answers? How do they communicate with each other? How do we program a particle? Time to find out.

 

Let’s start by remembering back to the very first post of this Let’s Program. The one where I explained that a particle swarm optimizer is just a system for making a lot of random guesses and then adjusting those guesses.

 

This gives us a pretty good guess of what our particles need to look like. If every particle represents a random guess at the best possible solution to our problem then every particle needs some sort of data structure for keeping track of its current guess. And since we want to update those guesses as we go we are also going to need some sort of data structure for keeping track of the next guess we want to make. Finally, since we want to keep track of the guesses we made each particle should have some sort of record of past guesses.

 

To summarize, our particle needs three things:

  1. A guess at a best answer
  2. An idea of what it should guess next
  3. A history of all its guesses

 

For example, if we’re optimizing a two variable chemistry equation one of our particles might look like this:

  1. Current Guess: Set temperature to 400 Kelvin and Pressure to 2 atmospheres
  2. Next Guess: Decrease temperature by 20 Kelvin and increase pressure by 0.3 atmospheres
  3. History: 450 Kelvin and 1.8 atmospheres, 420 Kelvin and 1.9 atmospheres

 

But wait, there’s more! Back in the first post I also mentioned that the key to particle swarm optimization is to think of your random guesses like co-ordinates on a graph. So instead of thinking of your current guess as a temperature and a pressure think of it as an x-coordinate and a y-coordinate.

 

By the same logic you can think of the next-guess not as an adjustment, but as a velocity. Don’t say that you want to decrease the first variable’s value by 10 and increase the second variable’s value by 20. Instead say that the particle has an x velocity of -10 and a y velocity of 20.

 

With this perspective we can take the two variable chemistry equation example from above and turn it into a particle that looks like this:

  1. Position: x: 400, y: 2
  2. Velocity: x-velocity: -20, y-velocity: 0.3
  3. History: (450, 1.8), (420, 1.9)

 

Now just imagine another 99 particles with their own random positions and velocities and you’ve got a swarm!

 

The Logic Of The Swarm

 

You’ve just seen how a little mental juggling lets us turn a set of random guesses into a two dimensional particle zipping through a two dimensional world. Nifty, but why bother?

 

Well, in order for our swarm intelligence to actually act intelligently the particles need some way to adjust their search patterns based on the information they and their neighbors have found. They need some way to compromise between searching off in their own corner and helping the swarm examine known good answers. Giving them a velocity makes this easy.

 

The process works like this: Each particle regularly checks in with the swarm and asks for the co-ordinates of the best answer that any particle has found so far. The particle then compares its current velocity to the location of the best answer so far. If the particle is not already moving towards the best answer so far it adjusts its own velocity to a compromise between where it was already going and the direction it would need to go to reach the best answer.

 

For example, imagine a particle is currently at position (10, 10) with a velocity of (1, 1). The particle checks with the swarm and finds out that the best answer so far is at position (-5, -5). Since it turns out that the particle is moving in the exact opposite direction of the best answer so far it adjusts its velocity a little, probably to something like (.8, .8). Still the wrong direction, but slower. If no better answer than (-5, -5) is found the particle will keep adjusting its velocity little by little until it’s eventually moving directly towards (-5, -5).
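The exact update rule comes a few posts from now, but as a rough sketch the compromise could look something like this. (The pull factor of 0.02 is purely an illustrative assumption, not the real constant we’ll end up using.)

```lisp
;; Blend a particle's current velocity with the direction of the
;; best answer found so far. A small pull factor means the particle
;; turns gradually instead of snapping straight toward the best point.
(defun nudge-velocity (velocity position best-position &optional (pull 0.02))
  (+ (* (- 1 pull) velocity)
     (* pull (- best-position position))))

;; The particle from the example, one dimension at a time:
;; at x = 10, moving at +1, while the best answer so far sits at x = -5.
(nudge-velocity 1 10 -5) ; still positive (wrong direction), just slower
```

Apply the nudge over and over and the velocity eventually flips sign and points at the best answer, which is exactly the behavior described above.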

 

We’ll cover the exact math behind moving particles and updating velocity in a few more posts. For now just make sure you’re comfortable with the idea.

 

Also, this isn’t the only particle swarm algorithm. Other more complicated algorithms exist. One popular approach* has each particle update its velocity by looking not only at the global best answer so far, but also the best answer it has personally found. This can help avoid the problem of particles being so obsessed with the global best answer that they skip over even better answers near their random starting location. We’re not going to be bothering with that right now though. Just thought you might like to know.

 

Object Orienting Your Lisp

 

With all that theory out of the way we can finally write some Lisp. Not a lot of Lisp just yet though. We’ll mostly just be writing a simple data structure to represent a particle.

 

Like we mentioned above, every particle in our 2D swarm needs the following pieces of data: an x-position, a y-position, an x-speed, a y-speed and a history. Creating a Lisp data structure to hold all that is as simple as using defclass. The first argument is the name of the class, the second argument is just a blank list (unless you’re using class inheritance, which we aren’t) and the final argument is a list of the data slots you want inside the class.

 

(defclass particle-2d ()
    (x-coord
    y-coord
    x-vel
    y-vel
    history))

 

 

Now that we’ve defined a particle-2d class creating particles is as easy as a call to make-instance. Here I’ve nested it inside a defparameter in order to assign the new particle to a test variable I have unimaginatively called *test-particle*.

 

(defparameter *test-particle*
    (make-instance 'particle-2d))

 

Once you have an actual particle-2d you can access specific data slots using the slot-value function. The first argument is the class object you want to get data out of and the second argument is the name of the data-slot you’re interested in. Here I’ve nested the slot access inside of a setf in order to assign some actual data to the *test-particle* that I just created.

 

(setf (slot-value *test-particle* 'x-coord) 10)
(setf (slot-value *test-particle* 'y-coord) 5)
(setf (slot-value *test-particle* 'x-vel) 1)
(setf (slot-value *test-particle* 'y-vel) 2)
(setf (slot-value *test-particle* 'history) '((10 5)))

 

We can also use slot-value to read data without setting it. For instance, here’s a convenient little function for printing out a particle in a human friendly format. This uses the powerful format function. If you ever find yourself needing to write a Lisp function that ties lots of data together into one big string you’ll definitely want to look up all the cool things you can do with format.

 

(defun print-particle-2d (particle)
    (format t "x:~a~%y:~a~%x-vel:~a~%y-vel:~a"
        (slot-value particle 'x-coord)
        (slot-value particle 'y-coord)
        (slot-value particle 'x-vel)
        (slot-value particle 'y-vel)))

 

Now let’s define the swarm. The swarm really has only two jobs: to keep track of the best answer so far and to provide a convenient link to all the particles.

 

(defclass swarm-2d ()
    (best-x
    best-y
    best-answer
    particle-list))

 

The Promise Of More Compact Code

 

The previous code was a little bulky. Do we really want to have to write five lines of setf and slot-value every time we need a new particle? Fortunately the answer is no. We can hide all that tedious work inside a nice, clean function. Even better, Lisp has some built in tricks that make creating and manipulating objects like our particle-2d much easier. So don’t worry, if our code really starts weighing us down with its bulk we have plenty of options for slimming it down.
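For the curious, here’s a preview of one of those tricks: CLOS slot options. Giving each slot an :initarg and an :accessor lets us build and read a particle in single calls. (The slot options are standard Common Lisp, but don’t take this as the final version of our class.)

```lisp
;; Same particle-2d idea, but with slot options so creation and
;; access no longer need a pile of setf and slot-value calls.
(defclass particle-2d ()
  ((x-coord :initarg :x-coord :accessor x-coord)
   (y-coord :initarg :y-coord :accessor y-coord)
   (x-vel   :initarg :x-vel   :accessor x-vel)
   (y-vel   :initarg :y-vel   :accessor y-vel)
   (history :initform nil     :accessor history)))

;; Creation collapses to one call...
(defparameter *test-particle-2*
  (make-instance 'particle-2d :x-coord 10 :y-coord 5 :x-vel 1 :y-vel 2))

;; ...and so does access.
(x-coord *test-particle-2*) ; => 10
```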

 

Until then let’s just celebrate that we’ve got some great foundation code for our swarm. With our particle object and swarm object defined we’re all set to start building the code to generate an entire swarm filled with dozens of particles eager to rush out and solve our problems.

 

So be sure to tune in next time!

 

 

* At least I assume it’s popular since it’s the one that made it into the Wikipedia article on particle swarms

Let’s Program A Swarm Intelligence 4: Specifying A Specific Use Case

Making Hard Problems Easier

 

When coding a difficult program it usually helps to try and split your problem into multiple, smaller problems that are individually much easier to solve. For example, you’d be crazy to try and build an entire 3D graphics engine all at once. It would be much smarter to start out by focusing just on drawing simple shapes. And then move on to calculating lighting. And then move on to adding color and texture. Eventually all the simple pieces come together to help create the big complicated program you wanted in the first place.

 

In this case, my eventual goal is to write a generic particle swarm optimizer that can handle equations with any number of variables. The swarm AI will be flexible enough to do equally well whether it’s trying to find the low point in a 2D graph or balance a chemical equation with twenty-seven different variables. Hardly cutting-edge research, but still a moderately challenging problem for a part-time blogger with limited free time. Finding some way to split this project into smaller more blog-able parts would be a big help!

 

Which is why I’m going to start by programming a simple swarm that can only handle two variable equations. This very specific goal will let me focus on the fundamental logic of our AI without worrying about the added complexity of trying to figure out how many variables we’re trying to optimize. We can take the easy way out and just hard-code all of our functions to always expect two variables. Only after the core AI logic is out of the way will I worry about the issue of flexibility.

 

By splitting the problem of “Build a generic particle swarm AI” into the two mini-problems “Build a 2D particle swarm” and “Upgrade a 2D particle swarm” I hope to save myself a lot of confusion and frustration.

 

But why start specifically with a 2D swarm? Why not a 3D or 4D or 11D swarm? Mostly because two dimensional equations are easy to visualize and hand-optimize, which will make this program much easier to test and debug. Focusing entirely on optimizing eleven variable equations would still let me start with a specific AI and then upgrade it later, but double checking my logic in eleven dimensions would be a lot more difficult and take a lot more time than with a 2D swarm.

 

Designing The Function Call

 

Now that we know that we’re going to be specifically building a function for optimizing two variable equations we can start thinking about what our function should look like. What do we want to call it? What information does it need? How do we want to pass it that information?

 

Well, an equation optimizer obviously needs a reference to the function representing the equation we want optimized. We also probably need to give the optimizer some boundaries so it knows the biggest and smallest variable values it should work with. We might only want the swarm to consider positive numbers or maybe restrict it to x and y values between positive and negative ten.

 

Putting our wish list together, here is a sample of what our function call might look like if we wanted to optimize another function called “reactor-output” and restrict our x inputs to between positive and negative one thousand while restricting our y input to somewhere between zero and fifty:

 

(swarm-minimize-2d #'reactor-output '(-1000 1000) '(0 50))

 

Two new bits of Lisp syntax to look out for here. First, the pound-apostrophe-function-name syntax of #'reactor-output. This is a Lisp shortcut that says “this is a function reference”. That’s all it takes to pass a function around like data in Lisp.

 

The next thing you should notice is the apostrophe before the two lists. Normally when Lisp evaluates a list it tries to interpret the first symbol as a function and the rest of the list as arguments. Prefacing a list with a ' (quote) lets Lisp know that this particular list is actually a list and shouldn’t be evaluated.
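Both pieces of syntax are easy to poke at in the REPL. Here’s a tiny sketch (double-number and apply-to-ten are throwaway functions invented just for this demonstration):

```lisp
;; A quoted list is data; an unquoted list is a function call.
'(+ 1 2) ; => the three-item list (+ 1 2)
(+ 1 2)  ; => 3

;; #' hands us a function as a value; funcall invokes it later.
(defun double-number (n)
  (* 2 n))

(defun apply-to-ten (fn)
  (funcall fn 10))

(apply-to-ten #'double-number) ; => 20
```

This funcall pattern is exactly what swarm-minimize-2d will do with the function reference it receives.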

 

Now that we have a sample of how we want to call our function let’s take a stab at actually defining it. We know we want to call it “swarm-minimize-2d” and that it has three arguments (a function reference, a set of x boundaries and a set of y boundaries).

 

(defun swarm-minimize-2d (fn x-limits y-limits)
   "A particle swarm optimizer for two dimensional equations"
   ;; Optimization code will go here
   )

 

Good start!

 

Designing The Return Value

 

The next step is to ask ourselves what we want the output to look like. The most obvious and important piece of output will be the coordinates of the most optimal solution found by the swarm. For convenience we can just put this in a two item list. Ex: (4 7)

 

Honestly, the coordinates of the optimal solution are the only thing we need for a working solution. But I think it would also be interesting to get a report of how the swarm found that answer. So in addition to the solution let’s include a chronological record of every point searched by every particle in the swarm. We can implement this with a list of lists of coordinates structured something like this:

 

( ((1 2) (2 2) (3 2))
  ((-1 -4) (0 -3) (1 -2))
  ((5 4) (4 3) (3 2)) )

 

Give your eyes a minute to adjust to the parentheses. I’ll wait.

 

This sample list represents three particles that each tried three different locations. The first particle started at (1, 2), moved to (2, 2) and finished at (3, 2). The second particle started at (-1, -4), moved to (0, -3) and finished at (1, -2). I’ll leave it up to you to interpret what the third particle did.

 

… done yet?

 

If you came up with any answer other than “The third particle started at (5, 4), moved to (4, 3) and finished at (3, 2)” you need to go back and try again.
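Because the history is just a list of lists, pulling any one particle’s path back out takes nothing more than the standard list functions. A quick sketch using the sample data above:

```lisp
;; The swarm history: one list of coordinate pairs per particle.
(defparameter *swarm-history*
  '(((1 2) (2 2) (3 2))
    ((-1 -4) (0 -3) (1 -2))
    ((5 4) (4 3) (3 2))))

(nth 2 *swarm-history*)                ; the third particle's whole path
(first (nth 2 *swarm-history*))        ; where it started: (5 4)
(first (last (nth 2 *swarm-history*))) ; where it finished: (3 2)
```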

 

So we’ve got a good grasp on both the values we want to return from our function. But how do we go about returning multiple values from one function? Well, the most obvious answer would be to put both values into some sort of list or array and then return that. This is going to create an almost painfully nested data structure, but it will work fine. Lisp does have some other interesting ways to return multiple values but I think they are overkill for this particular situation.

 

So let’s update our function skeleton with a prototype return line. I’m also going to introduce the Lisp list function, a straightforward bit of code that just takes all of its arguments and builds a list out of them. We can use it for linking together our optimal solution and swarm history into a single output list.

 

(defun swarm-minimize-2d (fn x-limits y-limits)
   "A particle swarm optimizer for two dimensional equations"
   ;; Optimization code will go here
   (list optimal-solution particle-history))

 

It’s important to note that this code is currently broken. If you type it into your REPL and try to run it you’ll probably get some sort of error like “SWARM-MINIMIZE-2D: variable OPTIMAL-SOLUTION has no value”. That’s OK for now. This function skeleton was mostly just a thought experiment to help us get a handle on the real function we’re going to build starting next post.

 

So be sure to check back next time where I start writing some Lisp code you can actually run.

Let’s Program A Swarm Intelligence 3: All Your Max Are Min

A Closer Look At Optimization

 

We’ve got a problem and we’ve got a programming language. Anything else we need to get this Let’s Program going?

 

Well, we really probably should spend a little more time talking about exactly what it means to optimize an equation. Just saying that we want our swarm to find “better” answers, hopefully the “best” answer, doesn’t really tell us all that much. What does a better answer look like?

 

The definition of good and better answers actually depends a lot on the problem being solved. If you’re trying to optimize your company’s profits you probably want the biggest value you can find. A two million dollar profit is more optimal than a one million dollar profit.

 

On the other hand imagine you are trying to optimize a chemical factory that produces toxic waste. Toxic waste is expensive to dispose of so in this case the optimal answer is the smallest answer you can come up with. Producing one ton of waste per month is more optimal than producing ten tons.

 

Now let’s imagine that in order to minimize your toxic waste you figure out that you’re going to need exactly one thousand gallons of fresh Chemical X every hour. So you start developing a Chemical X production system. Optimizing this system means producing as close to one thousand gallons per hour as possible. Making too little Chemical X slows down your system. Making too much Chemical X wastes money. You have an exact target to hit.

 

Summary: The Three Types Of Optimization

 

All numeric optimization problems can be classified as one of the following three types:

 

Minimum: Optimizing towards the smallest result possible.

 

Maximum: Optimizing towards the biggest result possible.

 

Target: You already have a specific, ideal result in mind. Optimization is the process of finding out how close you can get to that desired result.

 

Quick example. After collecting some data you find five possible solutions with the following answers: 50, 75, 100, 120 and 160.

 

If you are optimizing towards a minimum then 50 is the most optimal answer of the five.

 

If you are optimizing towards a maximum then 160 is the most optimal answer of the five.

 

If you are optimizing towards a target of 90 then 100 is the most optimal answer. If your target was 130 instead then 120 would be the most optimal answer.

 

Three In One Optimization: Everything Is A Minimum

 

If there are three different kinds of optimization then you might think that means we need to write three different optimization algorithms. But with just a little cleverness you can get away with only one optimization function. This is because all three types of optimization can be reduced to different kinds of minimums.

 

Minimums can be treated as minimums because that’s what they actually are. That was easy.

 

A maximum can be turned into a minimum by multiplying all values by -1. So instead of looking for the max of 1, 2 and 3 (which would be 3) we look for the minimum of -1, -2 and -3 (which is -3, pointing us right back at the original 3).

 

A target goal can be turned into a minimum by focusing not on the answers themselves but on the absolute difference between the answers you find and the result you want. If your target is 100 and your answers are 90, 105 and 115 then you would reduce them to differences of 10, 5 and 15. We minimize to find the smallest difference (5) which lets us know which value was closest to our target (105).
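All three reductions take only a line of Lisp apiece. Here’s a sketch applying them to the five sample answers from earlier:

```lisp
(defparameter *answers* '(50 75 100 120 160))

;; Minimum: just take the smallest value.
(reduce #'min *answers*) ; => 50

;; Maximum as a minimum: negate everything, minimize, undo the negation.
(- (reduce #'min (mapcar #'- *answers*))) ; => 160

;; Target (90) as a minimum: minimize the distance to the target.
(reduce #'min (mapcar (lambda (a) (abs (- a 90))) *answers*)) ; => 10, from the answer 100
```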

 

Conclusion: We Need To Focus On Minimums

 

Now that we know that every optimization we want to do can be modeled as a minimization problem we’re ready to jump into actual problem solving. Check back in a few days and I’ll start outlining my design approach and share my first bits of skeleton code.

 

Note to readers from the future: You don’t have to check back in a few days. You can just click on the next post whenever you want. Aren’t you lucky?

Let’s Program A Swarm Intelligence 2: Learning To Love Lisp

Today’s Programming Language Of Choice Is…

 

Normally this is where I would launch into a long and tedious explanation of how important it is to choose the right language for your program. But this time I’m going to skip all that and just flat out tell you I’m going to use Lisp for our particle swarm optimizer.

 

You see, Lisp has become one of my favorite programming languages. But since my current job is focused entirely around PHP and Perl I don’t really have any professional opportunities to use it. So if I want to practice the language I have to do it on my own time. And, hey, this blog just happens to count as my own time.

 

Lisp Is Actually A Good Choice For This Problem

 

The number one reason I plan on writing this swarm intelligence in Lisp is because I like Lisp. But even if I didn’t I would still have to admit that it’s a decent match for the problem being solved.

 

Remember, the basic idea behind an optimizer is to create a program that can be handed an arbitrary mathematical function and then find a really good solution to it. In computer terms this means we probably want to create a function that can accept other functions as an argument. And Lisp is famous for the way it lets you toss functions around like any other piece of data. No messing around with function pointers or weird memory tricks: just grab a function and stuff it into a function call.

 

Of course, Lisp is hardly the only language to let you do this. In fact, most modern languages let you pass functions around like variables if you try hard enough. So if you really really don’t want to write in Lisp you could replicate everything I do here in a different language. In fact, that might be a good exercise for anybody wanting to practice their favorite language.

 

But why bother? Lisp is actually a pretty easy language to learn. It does have some super-advanced features that can be hard to wrap your head around but we don’t actually need any of those for this program. And the mere fact that you are willingly reading an article about AI programming suggests you are more than smart enough to learn basic Lisp in practically no time at all.

 

Everything You Need To Know About Lisp In One Example

 

If you really want to learn Lisp you should probably read a book on the topic. I would personally recommend Practical Common Lisp, mostly because the author has been kind enough to post the entire text online for free.

 

But you don’t need an encyclopedic knowledge of Lisp just to follow along with my particle swarm code. All you really need is a basic understanding of some simple Lisp ideas.

 

With that in mind I will now attempt to create a single code example that will demonstrate enough Lisp syntax to explain the entire language to anyone who already understands at least one other programming language. I’d like to ask the audience to please be quiet as I attempt this daring feat:

 

(function-name argument1 (another-function another-argument) argument3)

 

Tada!

 

This single line of code shows off three very important aspects of the Lisp programming language:

  1. Almost everything in Lisp is made out of lists. Lists are just parentheses with symbols inside them. You can include an entire list inside of another list.
  2. A function call is just a list where the first symbol is the name of a function and the rest of the symbols are arguments to that function. Ex: addition looks like this: (+ 1 2)
  3. If one of the arguments to your function is a function call it will be evaluated first and its return value will be passed as an argument. Ex: (+ 5 (* 2 6)) becomes (+ 5 12).

 

That’s pretty much it. Lisp can be written 90% the same as any other programming language once you get used to the fact that you’ll be writing (doStuff A B) instead of doStuff(A, B);

 

Everything You Need To Know About Lisp In A Few Examples

 

So… Lisp is all about lists and putting the function call inside the parentheses instead of before them. Cool. Anything else we need to know?

 

I guess there are one or two other things you probably need to know before you can really start programming. So let’s jump into a few quick examples of the Lisp way to do common programming tasks.

 

Variables In Lisp

 

The most obvious programming task is creating variables and giving them values. In Lisp you create global variables like this:

 

(defparameter variable-name starting-value)

 

And you can change variables like this:

 

(setf variable-name new-value)

 

Please note that there was no equals sign in that code. In Lisp the equal sign is actually used to test for equality instead of doing assignments.
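A quick REPL check makes the difference clear (*fuel* is just a made-up variable for the demonstration):

```lisp
(defparameter *fuel* 10)

(setf *fuel* 42) ; setf does the assigning...
(= *fuel* 42)    ; ...while = only compares, returning T here
(= *fuel* 99)    ; => NIL, and *fuel* is still 42
```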

 

Lisp also supports local variables for those situations where you have a variable that should only exist inside of a certain function. It looks kind of like this:

 

(let ( variable/value pairs) your code)

 

A more thorough and accurate example would be something like this:

 

(let ((x 10)
      (y 20))
   (* x y))

 

Don’t sweat it if you’re having trouble keeping track of all those nested lists and parentheses. It will come to you in time.

 

Functions In Lisp

 

The next question on everybody’s mind is probably “How do you define functions in Lisp?”. With the handy defun function of course!

 

(defun name-of-function (arguments for function)
   "Optional help string"
   body of function)

 

For a really quick (and stupid) example:

 

(defun add-two (number)
   "Adds two to a number"
   (+ 2 number))

 

You might be wondering, “Where’s the return statement in that function?” Doing some math isn’t very useful if we don’t actually return the answer.

 

Well, Lisp doesn’t have return statements. Instead it automatically returns whatever the result of the last line of code in the function happens to be. This is convenient because that’s usually what we wanted anyways. Of course, if it isn’t you can always mimic a return statement by making sure the variable you want is the last thing mentioned in the function.

 

(defun add-two-but-return-original (number)
   (+ 2 number)
   number)

 

Welcome To The REPL

 

So we can declare functions now, but how exactly do we run them?

 

Well, first off you’re going to need a Lisp interpreter/compiler. I personally use CLISP and can vouch that it works great on both Linux and Windows.

 

While you wait for that to download let’s talk a little bit about what exactly it is you’re about to install. CLISP isn’t just a compiler or interpreter, it’s an interactive tool known as a “Read Evaluate Print Loop”, or REPL.

 

Like the name suggests the REPL is just an endless loop that reads in programmer input, tries to evaluate that input as Lisp code and then prints the result of whatever it just evaluated. Not quite following me? Well then just boot up CLISP (it’s done downloading by now, right?) and follow along with these examples.

 

[1]> 7

7

 

We type in the number 7. The REPL reads the 7 and then evaluates it. 7 evaluates to… well… 7. Finally the REPL prints out that 7. Nothing interesting here.

 

[2]> (+ 7 5)

12

 

This time we type in the list (+ 7 5). The REPL reads the list in and tries to evaluate it as a function call. It succeeds and does some addition for us. Finally it prints the result of that addition, in this case 12.

 

[3]> (defun hello-world-printer () (print "Hello World"))

HELLO-WORLD-PRINTER

 

Now I’m defining a new function using the define function syntax I explained a few paragraphs ago. The REPL reads in my input, evaluates it as a function definition and then returns the function name. I can now run that function by typing (hello-world-printer) into the REPL.

 

But sometimes you don’t want to type your functions straight into the REPL. Sometimes it’s easier to write your code in a text file that can be saved and shared. In fact, trying to write a serious program entirely through the REPL would be a nightmare. Fortunately you can do this instead:

 

[4]> (load "myprogram.lisp")

 

That’s all it takes to get the Lisp REPL to act more like a normal compiler or interpreter. Just write your entire program and use the load function to process the whole thing all in one go. And with that last bit of advice we’re ready for the meat of this Let’s Program.

 

Is That Really All We Need To Know?

 

I think so. You know how to create variables, update variables and write functions. Hardly enough to be a professional Lisp programmer but more than enough to follow along with my code and understand it well enough to write your own version in whatever language you want.

 

So give yourself a pat on the back. You not only now know more about swarm intelligences than most people who ever set foot on this earth, you also now know more about Lisp. Learning new things is fun like that.

Let’s Program A Swarm Intelligence 1: Introducing Particle Swarm Optimizers

Welcome Back Dear Readers!

That last Let’s Program went pretty well. Built a nifty little chatbot and got some good reader feedback. So let’s give this a second go. Our topic this time? Swarm Intelligences!

What Is A Swarm Intelligence?

A swarm intelligence is a large-scale alien brain capable of controlling billions and billions of deadly warriors at one time. They grow stronger and smarter by absorbing the DNA and memories of other species. They are ruthless, efficient and unstoppable.

I’m Pretty Sure That Isn’t Right

Oh, sorry. Looks like my old Starcraft manual got mixed in with my research material. My bad. Good game though. I really ought to get around to buying the sequel. Rumor has it this one has 3D graphics and everything!

Anyways, let’s start over!

What Is A Swarm Intelligence?

The phrase “Swarm Intelligence” refers to a bunch of different problem solving algorithms that were all inspired by watching how real life animals make group decisions. Things like flocks of birds, schools of fish, colonies of ants and swarms of bees have all inspired their own types of swarm intelligence. While each swarm intelligence algorithm is different most are based around the idea of creating a large number of semi-independent problem solving programs that work together and share information. Why have just one AI working on your problem when you can have 100?

Of course, the fact that there are so many different kinds of swarm intelligences means we can’t really just program a generic “Swarm Intelligence”. We’re going to have to choose a specific kind. After a little research I decided that this Let’s Program is going to focus on a type of AI called a “Particle Swarm Optimizer”.

So for the rest of this series when I say “Swarm Intelligence” know that I’m probably referring to a “Particle Swarm Optimizer”. But always remember, there are lots of different kinds of swarm intelligence and they all work differently.

What Is A Particle Swarm Optimizer?

A “Particle Swarm Optimizer” is an optimizer that uses a particle swarm to do its optimizing.

That last sentence was very very accurate but it honestly wasn’t very useful. Let’s break it down a little further. First off let’s cover what an “Optimizer” is.

When we talk about “solving” a problem we usually think of finding one right answer. But not all problems have one right answer. Lots of real world problems actually have an infinite number of solutions that are all “right”. But even if they are all “right” some of those solutions will be better than others.

For example, imagine you run an oil refinery. By adjusting the refinery’s temperature and pressure you discover you can increase or decrease the amount of oil you refine per hour. Any temperature and pressure setting that refines enough oil to pay your bills is a “right” answer, but there are “better” settings that will refine more oil per hour and earn you enough cash to not just pay the bills but also give your employees a big bonus and buy yourself a Ferrari.

Obviously we aren’t going to settle for just any old “right” answer when there is a “better” answer and a Ferrari on the line. This process of trying to find better solutions even after finding one “right” answer is known as “Optimization” and there are lots of different ways to do it.

The “Particle Swarm Optimizer” approach to optimization is to basically make a whole bunch of random guesses at possible best solutions. You then look at which guess performed the best and use that information to adjust all of your guesses. You then see what the best solution from your adjusted guesses was and do the whole thing over again. After several thousand or million random guesses (depending on how much time you can spare) you should eventually have settled on one best guess.

But why is this called a “Particle Swarm Optimizer” and not an “Intelligent Mass Guessing Optimizer”? (Besides the obvious fact that the first name is much much cooler)

Well, let’s go back to our oil refinery example where we can adjust the temperature and pressure. Every guess we make will just be a temperature paired with a pressure. Now let’s make a graph where the X axis is temperature and the Y axis is pressure. Now let’s mark down every guess we’ve made so far. You should end up with a big cloud of random dots.

Now every time we update our guess we also update our graph. The cloud of random dots will start to move around and eventually drift towards good answers. If you squint a little it will look like the dots are working together to explore new possible solutions. In fact, it will look like a big swarm of particles drifting through space and gathering around good solutions.

Hence “Particle Swarm Optimizer”.
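That guess-adjust-repeat loop is short enough to sketch in code. Here is a minimal version in Python rather than the Lisp we’ll use later, and note the inertia and attraction constants (0.7 and 1.5) are just typical textbook values I’ve picked for illustration:

```python
import random

def particle_swarm_optimize(f, bounds, n_particles=30, n_steps=300):
    """Maximize f by flying a swarm of guesses through the search space."""
    dims = len(bounds)
    # Start with a cloud of random guesses scattered across the space
    particles = [[random.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dims for _ in range(n_particles)]
    best = list(max(particles, key=f))  # copy of the best guess seen so far

    for _ in range(n_steps):
        for p, v in zip(particles, velocities):
            for d in range(dims):
                # Keep some momentum, then nudge toward the swarm's best guess
                v[d] = 0.7 * v[d] + 1.5 * random.random() * (best[d] - p[d])
                p[d] += v[d]
        candidate = max(particles, key=f)
        if f(candidate) > f(best):
            best = list(candidate)
    return best
```

For the refinery example, f would score a (temperature, pressure) pair by barrels refined per hour. A full particle swarm optimizer also tracks each particle’s personal best guess alongside the swarm’s best; this sketch leaves that out to keep the core loop visible.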

The difference between a bunch of guesses and a swarm of particles is just a matter of perspective

Particle Swarm Weaknesses

Now that you know how particle swarm optimizers more or less work you can probably start to see a few potential problems.

First off, particle swarms only work in situations where you can make random guesses and get an answer back immediately. Making millions of guesses isn’t a good strategy if you have to wait several hours or days to find out which guesses did well and which flopped. Usually this means that you can only use a particle swarm to optimize problems that you understand well enough to program into your computer.

For instance, if there is a complex equation explaining how temperature and pressure influence your hypothetical oil refinery you could use a particle swarm to optimize that equation. On the other hand if you’re really not sure how your oil refinery works* particle swarm optimization would be a really bad idea. Every time your program made a new guess you would have to physically adjust the refinery, take some physical measurements and then type them into your computer. Yuck.

So particle swarms only work with complete equations or with automated testing equipment that can perform experiments and report results really really fast.

Also, because particle swarms operate by making intelligent guesses there is no actual guarantee that they will find the “best” solution available. There is always a risk that the swarm will drift past the “best” answer and instead get attached to a “good” answer. This is actually a common problem with lots of AI techniques though, not just particle swarms. And as you’ll see in a few paragraphs this is actually less of a problem for particle swarms than it is for many other algorithms.

Why Use A Particle Swarm Optimizer?

So why would we want to use an AI that requires a full mathematical description of the problem we want to solve, especially if it can’t even guarantee a truly optimal solution? If we’ve already managed to reduce our problem to an equation can’t we just use calculus to find the true best value?

Excellent question, clever reader. If you are just trying to optimize a two variable equation you probably are better off using calculus and just outright solving the thing. Not every problem needs an AI.

But lots of problems are too complicated to solve with calculus, or at least too complicated to solve quickly. Using calculus to solve a seventeen variable equation is really really hard. If some of the variables are dependent on each other it gets even harder. We’re talking about the sort of problem that would be worth a PhD.

But the particle swarm’s guessing approach works just as well in seventeen dimensions** as it does in two. It doesn’t really care how hard it is to derive or integrate an equation because it never has to do either of those things. It just guesses, checks and adjusts.

So you could spend ten years trying to solve an impossible calculus problem… or you could just plug it into a swarm intelligence, wait an hour or two and get an answer that’s probably close enough to the true optimum for all practical purposes.

But what if you are really worried about missing out on the true optimum? Well, particle swarms are less likely to miss true optimums than many other types of AI. This is because the particles start out spread all over the search space. This increases the chance that at least one particle will notice really good data and let the rest of the swarm know that they are looking in the wrong place. A less swarmy algorithm that starts looking in the wrong place is more likely to just get stuck there.

The particles help each other avoid getting stuck on a “merely good” answer

Finally, particle swarms are useful because of their ability to handle “rough” data. Imagine a step function where you use one equation when temperature is below 100 degrees and another equation when it is above 100 degrees. Lots of problem solving algorithms will throw a fit when they hit that gap between equations. But since the particle swarm is just guessing and checking as it flies through problem-space it doesn’t really care that the data does really weird things around 100 degrees. It just notes whether the end result is better or worse than before and keeps going.

The particle swarm has no problem with rough data that would destroy a calculus based AI
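To see this in action, here is a quick Python experiment with a made-up “rough” refinery model: one equation below 100 degrees and a completely different one above it, with a jump at the boundary. (The equations are invented for illustration.) Pure guess-and-check finds the best setting without ever noticing the discontinuity:

```python
import random

def refine_rate(temp):
    """Hypothetical barrels per hour: two unrelated equations, jump at 100."""
    if temp < 100:
        return 0.5 * temp                    # tops out at 50 just below the jump
    return 80 - 0.2 * (temp - 120) ** 2      # peaks at 80 when temp is 120

# Ten thousand random guesses; no derivatives needed, so the jump is harmless
best = max((random.uniform(0, 200) for _ in range(10_000)), key=refine_rate)
```

best lands near 120 degrees. A method that tried to take a derivative across the jump at 100 would choke; guess-and-check only ever asks “was this guess better or worse?”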

Conclusion

Now you know what a swarm intelligence is and have been introduced to particle swarm optimization. Congratulations! You are now part of the less than 1% minority of the population that has ever bothered to study AI in any depth.

But why settle for that? Let’s take it to the next level by actually programming one, putting ourselves in the 1% of the 1% of the population that enjoys amateur AI programming. I have no idea if this is a particularly useful super-minority to be part of… but why let that stop us?

* Why did you buy an oil refinery if you didn’t understand how it worked? Invest more responsibly in the future!

** Have fun imagining a seventeen dimensional swarm of particles flying through seventeen dimensional space.

Let’s Program A Chatbot: Index And Code

Introduction

 

Have you ever wondered how chatbots work? Do you want to get some practice with regular expressions? Need an example of test driven development? Want to see some Perl code?

 

If the answer to any of those questions was “Yes” then you’re in luck because I happen to have just finished a little series of posts on writing pattern matching chatbots using Perl, regular expressions and test driven development. Enjoy!

 

Index

 

Let’s Program A Chatbot 1: Introduction And Theory

Let’s Program A Chatbot 2: Design Before You Code

Let’s Program A Chatbot 3: Choosing A Programming Language

Let’s Program A Chatbot 4: Let’s Talk About Test Driven Development

Let’s Program A Chatbot 5: Finally, Code!

Let’s Program A Chatbot 6: Don’t Fear The Regex

Let’s Program A Chatbot 7: To Be Or Not To Be, That Is The Use Case

Let’s Program A Chatbot 8: A Little Housecleaning

Let’s Program A Chatbot 9: The Grammar Police

Let’s Program A Chatbot 10: Chatting With The Bot

Let’s Program A Chatbot 11: Bad Adjectives

Let’s Program A Chatbot 12: When The Answer Key Is Wrong

Let’s Program A Chatbot 13: What’s Mine Is Yours

Let’s Program A Chatbot 14: Variety Is The Spice Of Life

Let’s Program A Chatbot 15: “ELIZA Effect” Should Be A Movie Title

Let’s Program A Chatbot 16: Testing On Live Subjects

Let’s Program A Chatbot 17: Blitzcode!

Let’s Program A Chatbot 18: A Bit Better Than Before

Let’s Program A Chatbot 19: Third Time’s The Charm

Let’s Program A Chatbot 20: What Next?

 

Complete Code

 

If you follow along with the posts you should be able to write your own chatbot from scratch. But if you don’t have the time for that or just want some reference code I have also provided my complete chatbot, user interface and testing suite: Complete Chatbot Code

Let’s Program A Chatbot 20: What Next?

I’m Done… But Maybe You Aren’t

 

When I decided to do this whole “Let’s Program A Chatbot” I really only had two goals: Show how to build a very simple chatbot from scratch* and demonstrate the basics of test driven software development. And I feel like I’ve done that. Objective complete, gain 500xp and proceed to the next quest.

 

But I’m willing to bet that some of my readers aren’t satisfied yet. This humble set of posts was probably enough for those of you who were just curious or bored, but what about those of you with very specific chatbot related goals? Goals that you haven’t reached yet. What are you supposed to do now?

 

That really depends on what exactly your goals are.

 

If You Want More Practice Writing Software

 

DELPHI is a good practice project, just complicated enough to require some real thought and development strategies while still being simple enough that you only need a few weekends to get some serious work done. If your main goal is to just get a little bit better at software or practice your Perl and regular expressions then it might make sense to just keep working on improving DELPHI.

 

Here are some (relatively) simple activities that you might want to try:

 

Keep Fine Tuning DELPHI: DELPHI is far from perfect and there are lots of ways it could be enhanced or fixed. For example, a lot of my test users asked DELPHI about the weather so it might be interesting to write some high-priority rules that can recognize questions about weather and generate weather themed answers. For further ideas just find a test user, record their conversation and then look for any instance where DELPHI didn’t act as smart as you wanted. Then write up some test cases and see if you can’t get DELPHI acting a little smarter.

 

Turn DELPHI Into ELIZA: ELIZA is a famous chatbot that pretends to be a psychiatrist by turning user input into questions. With a little research I’m sure you could find a copy or summary of the basic patterns and responses it used to do that. Now erase all of DELPHI’s old chat patterns and create a brand new set based off of ELIZA. Creating new regex patterns to match ELIZA’s strategy will be great practice.

 

Create A Whole New Set Of Rules And Responses: Instead of replacing DELPHI’s rules with ELIZA’s rules, why not try creating your own brand new set of rules and building a completely unique chatbot? Like baseball? Create a new set of rules oriented around explaining the game and sharing trivia about famous players. Are you a history buff? Create rules for answering questions about your favorite historic event or person. This will give you great practice at designing chatbots, writing test cases and coding up regular expressions.

 

Of course, if all you ever do is write and replace chat rules you’ll always be limited by the innate weaknesses of simple pattern matching. For a better chatbot and some more serious coding practice why not try tackling one of these projects:

 

Give DELPHI A Memory: Wouldn’t it be nice if DELPHI could ask the user for their name and then remember it? Or if DELPHI could remember that the user’s last question was about their dog and then use that information to create a personalized error message the next time it gets confused? (I’M NOT SURE WHAT TO SAY TO THAT. TRY ASKING ME ANOTHER QUESTION ABOUT YOUR DOG). To pull this off you’d have to create a new data structure to hold DELPHI’s memories and rewrite the response generation system to figure out what information it should store and when it should retrieve it. Tricky, but satisfying.

 

Connect Input To Function Calls: Currently DELPHI does the exact same thing for all input patterns: It grabs an output pattern from an array and prints it to the screen. But what if we need more flexibility? Maybe when the user asks “What time is it?” we want to run a special sub-function that figures out what time it is and builds a response around that? Or maybe we want DELPHI to be able to answer “What is X?” questions by searching Wikipedia?

 

If you want DELPHI to do something different for each type of input you’ll need to find some way to associate input patterns with function calls instead of with simple output arrays. In Perl you can do this with function references. Exactly how would you do this? That’s for you to figure out. This is an advanced exercise after all.

 

Port DELPHI To A New Language: DELPHI uses regular expressions to match user input to chatbot output. Most modern languages support regular expressions or have easy-to-find regular expression libraries. So why not recreate DELPHI in C++ or Java or Lisp or Python or Ruby or whatever it is all the cool kids are doing these days. Depending on what language you choose and how unlike Perl it is you may have to redesign major features like how the chat patterns are stored, but that’s what makes this a good advanced exercise.

 

If You Want To Make Serious Chatbots

 

What if you aren’t here to practice your software skills? What if you’re here because you really want to build a fully featured chatbot as soon as possible? In that case messing around with a toy like DELPHI is probably a waste of your time. It would make much more sense to use an existing tool that already has all the features you want.

 

Now I’ve never had to deploy a professional quality chatbot, but from what I’ve seen ALICE is probably the way to go. ALICE uses the AIML standard to allow for very powerful pattern matching style chatbots. It also seems to already have support for advanced features like context and memory, which would be a challenge to shove into DELPHI.

 

Even better, there are tons of existing AIML libraries that can be downloaded to make your chatbot an instant expert on a wide variety of subjects. So not only do you not have to write your bot from scratch, you don’t have to write your responses from scratch either. Just grab a library and then tweak it for your specific needs.

 

So if you need to build a strong chatbot fast I’d recommend learning about AIML and downloading ALICE.

 

If You Want To Write Programs That Really Understand Language

 

By this point in the series you’ve probably realized that DELPHI (and most other chatbots) doesn’t actually understand English or how grammar works. DELPHI just looks at input, compares it to a list of patterns and then spits out a response. It’s a glorified “if… then…” statement.
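The whole trick fits in a few lines. Here is a toy Python version of that glorified “if… then…” loop; the patterns and responses are made up for illustration and are not DELPHI’s actual Perl rules:

```python
import re

# Each rule pairs a regex with a canned response template
rules = [
    (re.compile(r'will (.+)\?', re.IGNORECASE), 'I PREDICT THAT {0} WILL COME TO PASS'),
    (re.compile(r'why (.+)\?', re.IGNORECASE), 'THE REASONS ARE HIDDEN, EVEN FROM THE ORACLE'),
]

def respond(line):
    # Scan the rules top to bottom and use the first pattern that matches
    for pattern, template in rules:
        match = pattern.search(line)
        if match:
            return template.format(*match.groups())
    return 'I AM NOT SURE WHAT TO SAY TO THAT'
```

Ask it “Will it rain tomorrow?” and it answers “I PREDICT THAT it rain tomorrow WILL COME TO PASS”. No grammar, no understanding, just text captured by one pattern and pasted into another.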

 

On the one hand it’s pretty cool that such simple tricks can create an almost human-feeling program. On the other hand it’s pretty disappointing that DELPHI is only a trick. That’s like learning that the magicians in Vegas are just using sleight of hand. You wanted real magic. You wanted a program that really understands human language.

 

Lucky for you there is a lot of research going into teaching human language to computers. Just go to your nearest university level library or favorite search engine and do some research on “Natural Language Processing”. This is an extremely large area of research that includes topics like: Labeling words as their part of speech, statistically analyzing books, automatic language translation, parsing sentences, building sentences and using context to deal with ambiguity. Not to mention the endless philosophical debates on what it even means for a computer to “understand” or “know” something.

 

But be warned that this sort of stuff gets complicated fast. Dive into this topic and you’re very quickly going to find yourself learning more about linguistics, grammar, statistics and logic than you may have ever wanted. It’s a fun ride but don’t go into it thinking you can read a book or two and then slap together an AI that passes the Turing Test with flying colors. This is hard, serious research at the cutting edge of computer science.

 

But if you manage to push that cutting edge further, even by a little…

 

Conclusion

 

DELPHI is done and I hope all my wonderful readers have enjoyed themselves. I’ve done my best to show you a few fun tricks and point the insatiably curious towards more ambitious projects and sources of knowledge. All that’s really left now is clean up work. I should probably make an index page that links all these individual posts together. Probably ought to post my complete code as well.

 

 

 

*As a young college student I had the hardest time finding any books or tutorials on chatbots. That was really frustrating and this whole series has basically just been me writing the sort of article I wish I could have read five years ago.

Christmas In One Minute

Don’t you hate waiting for Christmas? Have you ever wished you could make it happen now?

With physics you can! Relativity says that the closer you get to the speed of light the slower your personal time moves compared to stationary objects. Move fast enough and years of Earth time can pass in the literal blink of an eye.
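In symbols, the slowdown is the Lorentz factor γ = 1/√(1 − v²/c²): at speed v, γ seconds of Earth time pass for every one second of your time. Here is a quick Python sketch of that formula and its inverse, the same algebra the Perl script below rearranges:

```python
import math

C = 299_792_458.0  # speed of light in meters per second

def dilation(v):
    """Earth-seconds that tick by per traveler-second at speed v (Lorentz factor)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def speed_for_dilation(gamma):
    """Invert the Lorentz factor: how fast you must go for a given slowdown ratio."""
    return C * math.sqrt(1.0 - 1.0 / gamma ** 2)
```

To squeeze ten hours into one minute you need γ around 600, which works out to roughly 99.99986% of the speed of light, in line with the sample run further down.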

But how fast is fast enough? For example, how fast would you have to move if you wanted it to only be sixty seconds until Christmas? Well, this handy little Perl program can tell you the answer. Be aware that it does use the DateTime module to calculate how many seconds from now until Christmas morning, so you may need to download that module before running it.

 

#! /usr/bin/perl

# A silly little script for calculating how fast you need to move
# to make Christmas happen in one relativistic minute

use DateTime;

my $c = 299792458; #speed of light in meters per second

# Calculate how fast you have to be moving to achieve a certain
# normal time to perceived time ratio. A.K.A. the Lorentz factor
sub speedFromTimeDilation{
    my $timeDilation = $_[0];
    return sqrt($c * $c * (-1/($timeDilation * $timeDilation)+1));
}

#You may need to adjust this for your timezone
$now = DateTime->now;
$now->set_time_zone( 'America/Chicago' );
# In my family Christmas starts at 7 in the morning. Change this if you need
$christmasMorning = DateTime->new(year => $now->year, month => 12, day => 25, hour => 7);
$christmasMorning->set_time_zone( 'America/Chicago' );

#Make sure we are comparing now to next Christmas
if($now > $christmasMorning){
    $christmasMorning->set_year($christmasMorning->year+1);
}

$secondsUntilChristmas = $christmasMorning->epoch - $now->epoch;

#We can't make Christmas come slower
if($secondsUntilChristmas < 60){
    print "It is less than a minute until Christmas morning. You can wait that long\n";
    exit;
}

#Ratio between actual time to Christmas and our desired one minute wait
$desiredTimeDilation = $secondsUntilChristmas/60;

$neededVelocity = speedFromTimeDilation($desiredTimeDilation);

print "It is $secondsUntilChristmas seconds until Christmas morning\n";
print "To compress this into sixty seconds you would need to dilate time to:\n";
print 60/$secondsUntilChristmas*100, "% of normal\n";
print "Which would require a velocity of:\n";
print "$neededVelocity meters per second\n";
print "which is: \n";
print $neededVelocity/$c*100, "% of the speed of light\n";

Here is some sample output:

It is 36423 seconds until Christmas morning
To compress this into sixty seconds you would need to dilate time to:
0.164731076517585% of normal
Which would require a velocity of:
299792051.236407 meters per second
which is:
99.9998643182701% of the speed of light

Ouch. It’s only ten hours until Christmas and we still have to go faster than 99% of the speed of light. Not exactly a feasible goal. Oh well, looks like we’ll just have to get to Christmas the old fashioned way: by waiting.