Super Brains, Super Data and Chaos Theory

I just finished reading Count to a Trillion by John C. Wright. To summarize without spoiling: It’s a story about the futuristic space adventures of a man who suffers from occasional bouts of post-human hyper-intelligence. Expect mystery, treachery, high-tech duels and the mandatory beautiful space princess.

 

A major theme in Count to a Trillion is the existence of what I’m going to call “super brains”, extremely intelligent entities that make normal human geniuses look like mere children. Inspired by this book I’ve spent the past few days playing with the question: What are the sort of things that a super brain should and should not be able to do? What are the realistic limits that even a super high IQ entity would have to deal with?

 

After much thought I think it is safe to say that all brains are limited by the data available to them. For example, even the most talented of accountants will have trouble helping you balance your checkbook if you don’t keep your receipts or if you lie about your expenses.

 

This limit applies to super brains too. A super brain might be able to instantly calculate the proper flight path to put a satellite in orbit around Jupiter but he’s still going to need to know how much the satellite weighs, how much stress it can handle and what kind of engine it uses. Give the super brain bad information about the satellite’s weight or construction and his answer will be just as disastrously wrong as a random guess. Like we say in computer science: Garbage in, garbage out.

 

But there is more to data than just whether you have it or not. You also have to consider the quality of your data. When I eyeball a jug of milk and claim there are “about two cups left” I probably actually have anywhere from one and a half to two and a half cups of milk. This data is less accurate and precise than if I were to spend an afternoon carefully measuring the remaining milk with scientific instruments in order to report that I was 99.9% sure that the jug held 2.13 cups of milk, plus or minus half a teaspoon.

 

This is important because different problems require different levels of data precision. If you want a super brain to help you decide what to make for dinner “we have about 2 cups of milk left” is more than enough information. But if you want his help sending a satellite to Jupiter you’re going to need to do a lot better than “I think it weighs somewhere between 2 and 3 tons”.

 

But it gets even more interesting! Chaos Theory shows that there are certain problems that require infinitely accurate data to solve, or at the very least data so close to infinitely accurate that no human could ever collect it. Even worse, not only do you need infinitely accurate “super data” to solve a chaos problem, without super data you can’t even predict how wrong you’ll be.

 

You see, with a normal, well-behaved problem, data that is 10% wrong will lead to an answer that is roughly 10% wrong. That is actually very useful because it lets you say “our data isn’t perfect, but it’s close enough, so our answer will also be close enough.”

 

But chaos theory introduces problems where data that is 10% wrong will lead to an answer that is unpredictably wrong. You might be 10% wrong or you might be 80% wrong or you might only be 1% wrong. There’s no way to tell. In the words of Edward Lorenz, father of chaos theory:

 

Chaos: When the present determines the future, but the approximate present does not approximately determine the future.

 

The classic example of a chaotic problem is predicting the weather. Weather is a complex, iterative system where today’s weather leads to tomorrow’s weather which leads to the next day’s weather and so on. Theoretically you should be able to predict tomorrow’s weather just by looking at today’s weather. You can then use that prediction to calculate the next day’s weather again and again as far out into the future as you want.

 

The problem is that making even a tiny mistake when recording today’s weather will lead to small but unpredictable errors in your prediction for tomorrow. A small error in your prediction for tomorrow means a moderate error in your prediction for next week. A moderate error in your prediction for next week means a huge error in your prediction for next month. And a huge, unpredictable error in your prediction for next month means we have no idea what the weather will be like exactly one year from now.
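If you’d like to see this cascade for yourself, here’s a tiny sketch in Common Lisp (the language this blog uses for its programming posts). The logistic map is a textbook chaotic system, far simpler than weather but broken in exactly the same way: feed it two starting measurements that differ by one part in a billion and after a few dozen iterations the results are completely unrelated.

(defun logistic-step (x)
    ;; One step of the logistic map x -> 4x(1 - x), a classic chaotic system
    (* 4.0d0 x (- 1.0d0 x)))

(defun iterate-map (x steps)
    ;; Apply logistic-step over and over, returning the final value
    (dotimes (i steps x)
        (setf x (logistic-step x))))

(iterate-map 0.2d0 50)         ; one measurement
(iterate-map 0.200000001d0 50) ; the "same" measurement, off by one billionth
;; After 50 iterations the two answers bear no resemblance to each other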

 

The only way to avoid this endless cascade of errors is to make sure you start with perfect super data. But collecting super data is somewhere between “ridiculously hard” and “genuinely impossible”. For example: in order to put together some super weather data you would need an accurate list of the air pressure and temperature at every single point on the globe. And to keep that data accurate you would need to know every time a child laughed, a butterfly took flight or anyone lit a match. Unless you’ve got a plan for turning the entire planet into a giant self-monitoring barometer* you might as well give up.

 

Without this super data even a super brain would have no chance at predicting the far future of weather. It doesn’t matter that the super brain perfectly understands the mechanics of weather and can manipulate billions of numbers in his head; without perfect data he is stuck with the same seven-day forecasts as normal weathermen. Chaos Theory paints a clear limit beyond which pure intelligence cannot take you.

 

All of which suggests a good rule of thumb for writing realistic super brain characters: the more chaotic a problem is, the less useful the super brain is. Even characters with post-human levels of super-intelligence can’t predict the future of chaotic systems.

 

A super brain might be able to invent a new branch of chemistry, jury-rig a spaceship and learn an entire language just by skimming a dictionary, but he still won’t be able to predict next year’s weather or make more than an educated guess at which countries will still be around 1000 years in the future.

 

But this is just a rule of thumb. If you have a really interesting story idea that requires the main character to be able to predict the future of the human race with 99% accuracy then go ahead and write it. As long as it is entertaining enough (tip: lots of explosions and space princesses) I’m hardly going to throw a fit that you bent a few laws of mathematics.


* Fun story idea: Humanity builds a hyper-intelligent computer and asks it to solve the climate so we can finally stop arguing about what the climate is doing and how much of it is our fault. The computer concludes that it needs super data to do this and unleashes swarms of nano-bots to turn the entire surface of the earth into a giant weather monitoring station. Can our heroes stop the nanobots and shut down the super brain AI before it wipes out the entire human race?

Let’s Program A Swarm Intelligence 5: Defining The Swarm

A Closer Look At Particles

 

Up to this point I’ve been really vague about how the particle swarm itself actually works. How exactly do the particles search for new answers? How do they communicate with each other? How do we program a particle? Time to find out.

 

Let’s start by remembering back to the very first post of this Let’s Program. The one where I explained that a particle swarm optimizer is just a system for making a lot of random guesses and then adjusting those guesses.

 

This gives us a pretty good guess of what our particles need to look like. If every particle represents a random guess at the best possible solution to our problem then every particle needs some sort of data structure for keeping track of its current guess. And since we want to update those guesses as we go we are also going to need some sort of data structure for keeping track of the next guess we want to make. Finally, since we want to keep track of the guesses we have already made, each particle should have some sort of record of its past guesses.

 

To summarize, our particle needs three things:

  1. A guess at a best answer
  2. An idea of what it should guess next
  3. A history of all its guesses

 

For example, if we’re optimizing a two variable chemistry equation one of our particles might look like this:

  1. Current Guess: Set temperature to 400 Kelvin and Pressure to 2 atmospheres
  2. Next Guess: Decrease temperature by 20 Kelvin and increase pressure by 0.3 atmospheres
  3. History: 450 Kelvin and 1.8 atmospheres, 420 Kelvin and 1.9 atmospheres

 

But wait, there’s more! Back in the first post I also mentioned that the key to particle swarm optimization is to think of your random guesses like co-ordinates on a graph. So instead of thinking of your current guess as a temperature and a pressure think of it as an x-coordinate and a y-coordinate.

 

By the same logic you can think of the next-guess not as an adjustment, but as a velocity. Don’t say that you want to decrease the first variable’s value by 10 and increase the second variable’s value by 20. Instead say that the particle has an x velocity of -10 and a y velocity of 20.

 

With this perspective we can take the two variable chemistry equation example from above and turn it into a particle that looks like this:

  1. Position: x: 400, y: 2
  2. Velocity: x-velocity: -20, y-velocity: 0.3
  3. History: (450, 1.8), (420, 1.9)

 

Now just imagine another 99 particles with their own random positions and velocities and you’ve got a swarm!

 

The Logic Of The Swarm

 

You’ve just seen how a little mental juggling lets us turn a set of random guesses into a two dimensional particle zipping through a two dimensional world. Nifty, but why bother?

 

Well, in order for our swarm intelligence to actually act intelligently the particles need some way to adjust their search patterns based on the information they and their neighbors have found. They need some way to compromise between searching off in their own corner and helping the swarm examine known good answers. Giving them a velocity makes this easy.

 

The process works like this: Each particle regularly checks in with the swarm and asks for the co-ordinates of the best answer that any particle has found so far. The particle then compares its current velocity to the location of the best answer so far. If the particle is not already moving towards the best answer so far it adjusts its own velocity to a compromise between where it was already going and the direction it would need to go to reach the best answer.

 

For example, imagine a particle is currently at position (10, 10) with a velocity of (1, 1). The particle checks with the swarm and finds out that the best answer so far is at position (-5, -5). Since the particle is moving in the exact opposite direction of the best answer so far it adjusts its velocity a little, probably to something like (.8, .8). Still the wrong direction, but slower. If no better answer than (-5, -5) is found the particle will keep adjusting its velocity little by little until it’s eventually moving directly towards (-5, -5).

 

We’ll cover the exact math behind moving particles and updating velocity in a few more posts. For now just make sure you’re comfortable with the idea.
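If it helps to see the idea as code, here is a toy one-dimensional sketch. Fair warning: the blending rule and the pull constant below are mine, invented purely for illustration, and not the actual update formula this series will end up using.

(defun nudge-velocity (velocity position best &optional (pull 0.0125))
    ;; Illustration only: blend the current velocity with the direction
    ;; from POSITION toward BEST. Not the real update rule.
    (+ (* (- 1 pull) velocity)
       (* pull (- best position))))

(nudge-velocity 1 10 -5) ; => roughly 0.8, like the (.8, .8) example above
;; Repeated nudges keep shrinking the velocity and then flip its sign,
;; until the particle is heading straight for the best answer so far.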

 

Also, this isn’t the only particle swarm algorithm. Other more complicated algorithms exist. One popular approach* has each particle update its velocity by looking not only at the global best answer so far, but also at the best answer it has personally found. This can help avoid the problem of particles being so obsessed with the global best answer that they skip over even better answers near their random starting location. We’re not going to be bothering with that right now though. Just thought you might like to know.

 

Object Orienting Your Lisp

 

With all that theory out of the way we can finally write some Lisp. Not a lot of Lisp just yet though. We’ll mostly just be writing a simple data structure to represent a particle.

 

Like we mentioned above, every particle in our 2D swarm needs the following pieces of data: an x-position, a y-position, an x-velocity, a y-velocity and a history. Creating a Lisp data structure to hold all that is as simple as using the defclass macro. The first argument is the name of the class, the second argument is just a blank list (unless you’re using class inheritance, which we aren’t) and the final argument is a list of the data slots you want inside the class.

 

(defclass particle-2d ()
    (x-coord
    y-coord
    x-vel
    y-vel
    history))

 
Now that we’ve defined a particle-2d class creating particles is as easy as a call to make-instance. Here I’ve nested it inside a defparameter in order to assign the new particle to a test variable I have unimaginatively called *test-particle*.

 

(defparameter *test-particle*
    (make-instance 'particle-2d))

 

Once you have an actual particle-2d you can access specific data slots using the slot-value function. The first argument is the class object you want to get data out of and the second argument is the name of the data slot you’re interested in. Here I’ve nested the slot access inside of a setf in order to assign some actual data to the *test-particle* that I just created.

 

(setf (slot-value *test-particle* 'x-coord) 10)
(setf (slot-value *test-particle* 'y-coord) 5)
(setf (slot-value *test-particle* 'x-vel) 1)
(setf (slot-value *test-particle* 'y-vel) 2)
(setf (slot-value *test-particle* 'history) '((10 5))) ; a list of past positions

 

We can also use slot-value to read data without setting it. For instance, here’s a convenient little function for printing out a particle in a human friendly format. This uses the powerful format function. If you ever find yourself needing to write a Lisp function that ties lots of data together into one big string you’ll definitely want to look up all the cool things you can do with format.

 

(defun print-particle-2d (particle)
    (format t "x:~a~%y:~a~%x-vel:~a~%y-vel:~a"
        (slot-value particle 'x-coord)
        (slot-value particle 'y-coord)
        (slot-value particle 'x-vel)
        (slot-value particle 'y-vel)))
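Calling it on the particle we filled in earlier prints something like this:

(print-particle-2d *test-particle*)
;; x:10
;; y:5
;; x-vel:1
;; y-vel:2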

 

Now let’s define the swarm. The swarm really has only two jobs: to keep track of the best answer so far and to provide a convenient link to all the particles.

 

(defclass swarm-2d ()
    (best-x
    best-y
    best-answer
    particle-list))

 

The Promise Of More Compact Code

 

The previous code was a little bulky. Do we really want to have to write five lines of setf and slot-value every time we need a new particle? Fortunately the answer is no. We can hide all that tedious work inside a nice, clean function. Even better, Lisp has some built in tricks that make creating and manipulating objects like our particle-2d much easier. So don’t worry, if our code really starts weighing us down with its bulk we have plenty of options for slimming it down.
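As a quick preview of those built in tricks (we’ll adopt them properly later in the series): defclass slots accept options like :initarg and :accessor, which let make-instance fill a particle in one call and replace slot-value with short reader functions. A sketch of how our class could look:

(defclass particle-2d ()
    ((x-coord :initarg :x-coord :accessor x-coord)
     (y-coord :initarg :y-coord :accessor y-coord)
     (x-vel :initarg :x-vel :accessor x-vel)
     (y-vel :initarg :y-vel :accessor y-vel)
     (history :initarg :history :accessor history :initform '())))

;; One make-instance call now does the work of five setf lines:
(defparameter *test-particle*
    (make-instance 'particle-2d
                   :x-coord 10 :y-coord 5
                   :x-vel 1 :y-vel 2
                   :history '((10 5))))

(x-coord *test-particle*) ; => 10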

 

Until then let’s just celebrate that we’ve got some great foundation code for our swarm. With our particle object and swarm object defined we’re all set to start building the code to generate an entire swarm filled with dozens of particles eager to rush out and solve our problems.

 

So be sure to tune in next time!

 
* At least I assume it’s popular since it’s the one that made it into the Wikipedia article on particle swarms

Let’s Program A Swarm Intelligence 4: Specifying A Specific Use Case

Making Hard Problems Easier

 

When coding a difficult program it usually helps to try and split your problem into multiple, smaller problems that are individually much easier to solve. For example, you’d be crazy to try and build an entire 3D graphics engine all at once. It would be much smarter to start out by focusing just on drawing simple shapes. And then move on to calculating lighting. And then move on to adding color and texture. Eventually all the simple pieces come together to help create the big complicated program you wanted in the first place.

 

In this case, my eventual goal is to write a generic particle swarm optimizer that can handle equations with any number of variables. The swarm AI will be flexible enough to do equally well whether it’s trying to find the low point in a 2D graph or balance a chemical equation with twenty-seven different variables. Hardly cutting-edge research, but still a moderately challenging problem for a part-time blogger with limited free time. Finding some way to split this project into smaller more blog-able parts would be a big help!

 

Which is why I’m going to start by programming a simple swarm that can only handle two variable equations. This very specific goal will let me focus on the fundamental logic of our AI without worrying about the added complexity of trying to figure out how many variables we’re trying to optimize. We can take the easy way out and just hard-code all of our functions to always expect two variables. Only after the core AI logic is out of the way will I worry about the issue of flexibility.

 

By splitting the problem of “Build a generic particle swarm AI” into the two mini-problems “Build a 2D particle swarm” and “Upgrade a 2D particle swarm” I hope to save myself a lot of confusion and frustration.

 

But why start specifically with a 2D swarm? Why not a 3D or 4D or 11D swarm? Mostly because two dimensional equations are easy to visualize and hand-optimize, which will make this program much easier to test and debug. Focusing entirely on optimizing eleven variable equations would still let me start with a specific AI and then upgrade it later, but double checking my logic in eleven dimensions would be a lot more difficult and take a lot more time than with a 2D swarm.

 

Designing The Function Call

 

Now that we know that we’re going to be specifically building a function for optimizing two variable equations we can start thinking about what our function should look like. What do we want to call it? What information does it need? How do we want to pass it that information?

 

Well, an equation optimizer obviously needs a reference to the function representing the equation we want optimized. We also probably need to give the optimizer some boundaries to let it know the largest and smallest variable values it should work with. We might only want the swarm to consider positive numbers or maybe restrict it to x and y values between positive and negative ten.

 

Putting our wish list together, here is a sample of what our function call might look like if we wanted to optimize another function called “reactor-output” and restrict our x inputs to between positive and negative one thousand while restricting our y input to somewhere between zero and fifty:

 

(swarm-minimize-2d #'reactor-output '(-1000 1000) '(0 50))

 

Two new bits of Lisp syntax to look out for here. First, the pound-apostrophe-function-name syntax of #'reactor-output. This is a Lisp shortcut that says “this is a function reference”. That’s all it takes to pass a function around like data in Lisp.
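Inside the optimizer that reference gets invoked with funcall. The reactor-output body below is a stand-in I made up just so the example runs:

(defun reactor-output (x y)
    ;; Hypothetical stand-in equation, purely for demonstration
    (+ (* x x) (* y y)))

;; #'reactor-output is shorthand for (function reactor-output).
;; A function passed in as an argument is called with funcall:
(funcall #'reactor-output 3 4) ; => 25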

 

The next thing you should notice is the apostrophe before the two lists. Normally when Lisp evaluates a list it tries to interpret the first symbol as a function and the rest of the list as arguments. Prefacing a list with a ' tells Lisp that this particular list is just data and shouldn’t be evaluated.
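You can see the difference at the REPL:

(+ 1 2)  ; evaluated as a function call => 3
'(+ 1 2) ; quoted, so it stays a plain list => (+ 1 2)
'(0 50)  ; a literal two-item list, ready to serve as a boundary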

 

Now that we have a sample of how we want to call our function let’s take a stab at actually defining it. We know we want to call it “swarm-minimize-2d” and that it has three arguments (a function reference, a set of x boundaries and a set of y boundaries).

 

(defun swarm-minimize-2d (fn x-limits y-limits)
   "A particle swarm optimizer for two dimensional equations"
   ;; Optimization code will go here
   )

 

Good start!

 

Designing The Return Value

 

The next step is to ask ourselves what we want the output to look like. The most obvious and important piece of output will be the coordinates of the most optimal solution found by the swarm. For convenience we can just put this in a two item list. Ex: (4 7)

 

Honestly, the coordinates of the optimal solution are the only thing we need for a working solution. But I think it would also be interesting to get a report of how the swarm found that answer. So in addition to the solution let’s include a chronological record of every point searched by every particle in the swarm. We can implement this with a list of lists of coordinates structured something like this:

 

( ((1 2) (2 2) (3 2))
  ((-1 -4) (0 -3) (1 -2))
  ((5 4) (4 3) (3 2)) )

 

Give your eyes a minute to adjust to the parentheses. I’ll wait.

 

This sample list represents three particles that each tried three different locations. The first particle started at (1, 2), moved to (2, 2) and finished at (3, 2). The second particle started at (-1, -4), moved to (0, -3) and finished at (1, -2). I’ll leave it up to you to interpret what the third particle did.

 

… done yet?

 

If you came up with any answer other than “The third particle started at (5, 4), moved to (4, 3) and finished at (3, 2)” you need to go back and try again.

 

So we’ve got a good grasp on both the values we want to return from our function. But how do we go about returning multiple values from one function? Well, the most obvious answer would be to put both values into some sort of list or array and then return that. This is going to create an almost painfully nested data structure, but it will work fine. Lisp does have some other interesting ways to return multiple values but I think they are overkill for this particular situation.
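For the curious, the main built-in alternative alluded to here is Lisp’s native multiple values. A two-line taste, just so the term means something; we won’t be using it:

(defun two-answers ()
    (values '(4 7) 'history-would-go-here)) ; returns two separate values

(multiple-value-bind (solution history) (two-answers)
    (list solution history)) ; => ((4 7) HISTORY-WOULD-GO-HERE)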

 

So let’s update our function skeleton with a prototype return line. I’m also going to introduce the Lisp list function, a straightforward bit of code that just takes all of its arguments and builds a list out of them. We can use it for linking together our optimal solution and swarm history into a single output list.

 

(defun swarm-minimize-2d (fn x-limits y-limits)
   "A particle swarm optimizer for two dimensional equations"
   ;; Optimization code will go here
   (list optimal-solution particle-history))

 

It’s important to note that this code is currently broken. If you type it into your REPL and try to run it you’ll probably get some sort of error like “SWARM-MINIMIZE-2D: variable OPTIMAL-SOLUTION has no value”. That’s OK for now. This function skeleton was mostly just a thought experiment to help us get a handle on the real function we’re going to build starting next post.

 

So be sure to check back next time where I start writing some Lisp code you can actually run.