Let’s Program A Chatbot 1: Introduction And Theory

Scott Cornaby

What Is Let’s Program?

You’ve probably heard of “Let’s Play”, where someone plays a videogame and then posts their experience online. There are usually plenty of screen-shots or even videos with the player’s voice dubbed over the game.

They can be a lot of fun.

This has inspired me to try to create the first (that I know of) “Let’s Program”. I am going to write a program and blog about the process. There will be code samples and screen-shots. (UPDATE: A few months after writing this I stumbled upon a bunch of older coding blogs with titles like “Let’s Code”, which just goes to show that no matter how much online research you’ve done for your “original idea” you probably still haven’t done enough.)

I can hear your complaints already. “Scott,” you hypothetically ask, “How is this any different from the thousands of other code tutorials online like that handy MongoDB series you wrote last month?”

Good question. “Let’s Program” is different because it’s going to be much less formal than a normal tutorial. Also, I’m calling it something different. That’s two whole differences! More than enough for != to return true.

Sure, you might argue otherwise… but you’re on the other side of the Internet and I can’t hear you. Take that, hypothetical complaining reader!

Anyways, let’s get things started!

Chatbots A.K.A. How To Make Friends And Imitate People

A chatbot is a program that simulates human conversation, usually typed conversation. For example, here is a sample of someone interacting with ELIZA, a classic program that pretends to be one of those annoying psychiatrists who do nothing but parrot back everything you say in the form of a question. The program is the one talking in all caps:

Men are all alike.

IN WHAT WAY?

They're always bugging us about something or other.

CAN YOU THINK OF A SPECIFIC EXAMPLE?

Well, my boyfriend made me come here.

YOUR BOYFRIEND MADE YOU COME HERE?

He says I'm depressed much of the time.

I AM SORRY TO HEAR YOU ARE DEPRESSED.

ELIZA is a little stiff but not obviously inhuman. Pretty good for a bunch of ones and zeros zipping through a chunk of silicon. My goal is to build something similar. Which shouldn’t be too hard since the creator of ELIZA was nice enough to publish a paper on how he did what he did.

How Do Chatbots Work?

There are tons of theoretical ways to simulate human conversation, but the most popular and widely used technique is simple pattern matching. If the user says X the computer says Y. If the user says A the computer says B. If the user says “I need a vacation” the computer says “Why do you need a vacation?”.

But that seems really tedious. If we need a different rule for every single thing the user might say we would have to program billions and billions of rules. Isn’t there a better way?

Yes there is! Instead of matching specific user inputs to specific computer responses we can just look for generic patterns. So instead of creating a rule for “I need a vacation”, “I need a sandwich”, “I need a nap” and so on we just create a single rule for “I need X”.

We use the same technique for generating computer responses. Instead of writing responses like “Why do you need a vacation?”, “Why do you need a sandwich?” and “Why do you need a nap?” we just write one response “Why do you need X?” where X is pulled straight out of the user’s original input.
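To make the “I need X” idea concrete, here is a minimal sketch of one generic rule. It assumes Python and a regular expression, neither of which comes from the text above, and the names NEED_PATTERN, NEED_RESPONSE and respond are made up for this example.

```python
import re

# Hypothetical rule: the capture group plays the role of X in "I need X".
# Trailing punctuation is left out of the capture so it is not echoed back.
NEED_PATTERN = re.compile(r"^i need (.+?)[.!?]*$", re.IGNORECASE)

# Response template; {} is filled with whatever X matched in the input.
NEED_RESPONSE = "Why do you need {}?"


def respond(user_input):
    """Return a response if the input looks like "I need X", else None."""
    match = NEED_PATTERN.match(user_input.strip())
    if match is None:
        return None
    return NEED_RESPONSE.format(match.group(1))


print(respond("I need a vacation"))   # Why do you need a vacation?
print(respond("I need a sandwich."))  # Why do you need a sandwich?
print(respond("Hello there"))         # None
```

One rule and one template now cover every “I need …” sentence, which is the whole point of matching generic patterns instead of specific inputs.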

Strengths And Weaknesses Of Pattern Matching Chatbots

There is one super huge advantage to pattern matching: it’s easy to program.

And that is a really huge advantage. A program isn’t good for anything until it actually works. A medium quality program that takes a month to develop is much more useful than a high quality program that is so complex that it never actually gets completed.

But while pattern matching is probably the most pragmatic way to build a simple chatbot it does have a few downsides.

First, a simple pattern matching bot has no memory. You could build a pattern where the user says “My name is X” and the computer responds “Hello X”. But if the user’s next line was “What is my name?” there is no way for the computer to respond appropriately. The name variable X has already disappeared.

Second, pattern matching bots tend to panic when people don’t talk exactly like the program expects. If the user types “I really need a vacation” but the program only has an “I need X” pattern it won’t recognize the match. That extra word “really” breaks everything.

The same problem shows up with contractions. If the program is looking for “I can not” and the user types “I can’t” then we’ve got a problem.
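A quick sketch shows how brittle a naive pattern is. It assumes Python regular expressions purely for illustration; the two patterns and the matches helper are hypothetical.

```python
import re

# Two naive, hypothetical patterns that expect the user's exact phrasing.
NEED_PATTERN = re.compile(r"^i need (.+)$", re.IGNORECASE)
CAN_NOT_PATTERN = re.compile(r"^i can not (.+)$", re.IGNORECASE)


def matches(pattern, user_input):
    """Report whether the whole input fits the given pattern."""
    return pattern.match(user_input.strip()) is not None


print(matches(NEED_PATTERN, "I need a vacation"))         # True
print(matches(NEED_PATTERN, "I really need a vacation"))  # False
print(matches(CAN_NOT_PATTERN, "I can not sleep"))        # True
print(matches(CAN_NOT_PATTERN, "I can't sleep"))          # False
```

One extra word or one contraction is enough to make an otherwise obvious match fail.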

Third, pattern matching bots aren’t smart enough to deal with context. As a human you know that there is a huge difference between the phrases “I am Scott” and “I am sleepy” but to a pattern matcher both of those sentences just look like “I am X”.

There are ways around most of these problems. For instance, with a little extra work you can give your bot the ability to remember simple facts like the user’s name and then use those facts as part of its pattern matching and responses. And you can fix the contraction problem by having the program expand “can’t” into “can not” before trying to find a matching pattern.
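Both workarounds can be sketched in a few lines. This is only one possible approach, assuming Python; the CONTRACTIONS table, the memory dictionary and the function names are all hypothetical, and a real bot would need far more entries and rules.

```python
import re

# Hypothetical lookup table of contractions and their expanded forms.
CONTRACTIONS = {
    "can't": "can not",
    "don't": "do not",
    "i'm": "i am",
}


def expand_contractions(text):
    """Lowercase the input and replace known contractions with their long form."""
    text = text.lower().replace("’", "'")  # treat curly apostrophes like straight ones
    for short, long_form in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", long_form, text)
    return text


def respond(user_input, memory):
    """Answer using simple patterns plus facts stored in the memory dict."""
    text = expand_contractions(user_input).strip().rstrip(".!?")

    name_match = re.match(r"^my name is (.+)$", text)
    if name_match:
        # Store the name so later questions can use it.
        memory["name"] = name_match.group(1).title()
        return "Hello " + memory["name"]

    if text == "what is my name":
        if "name" in memory:
            return "Your name is " + memory["name"]
        return "You have not told me your name yet."

    cannot_match = re.match(r"^i can not (.+)$", text)
    if cannot_match:
        return "Why can you not " + cannot_match.group(1) + "?"

    return None


memory = {}
print(respond("My name is Scott", memory))  # Hello Scott
print(respond("What is my name?", memory))  # Your name is Scott
print(respond("I can’t sleep", memory))     # Why can you not sleep?
```

Expanding contractions before matching means the rules only ever have to know about “can not”, and the memory dictionary outlives any single line of conversation.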

Pattern Matching Priority

There is one final trick to pattern matching chatbots that we need to think about. What if the user’s input matches two different patterns? Let’s imagine we have three rules:

“I feel sick” => “You should see a doctor.”

“I feel X” => “Why do you feel X?”

“I X” => “Is it always about you?”

Hopefully you can see that if the user types “I feel sick” it will match all three input patterns. How do we decide which response to use? The solution is to assign every pattern rule a priority. When user input matches more than one pattern we use the response from the highest priority rule.

In this situation we would probably decide that “I feel sick”, being a very specific rule, should get a higher priority than the more generic “I feel X”. And we would probably give the super-generic “I X” a very low priority since we only want to use it as a last resort when nothing else matches.

Now we can use these priorities to determine that the appropriate response to “I feel sick” is “You should see a doctor”. These priorities also help us figure out the right response to input like “I feel sad” (Why do you feel sad?) and “I don’t want to talk to you anymore.” (Is it always about you?).
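Here is a minimal sketch of priority-based rule selection using the three rules above. It assumes Python, and the priority numbers are arbitrary values picked for this example; only their relative order matters.

```python
import re

# (priority, input pattern, response template); the highest priority match wins.
# The numeric priorities are assumptions made for this sketch.
RULES = [
    (30, re.compile(r"^i feel sick$"), "You should see a doctor."),
    (20, re.compile(r"^i feel (.+)$"), "Why do you feel {}?"),
    (10, re.compile(r"^i (.+)$"), "Is it always about you?"),
]


def respond(user_input):
    """Return the response of the highest priority rule that matches."""
    text = user_input.lower().strip().rstrip(".!?")
    best = None
    for priority, pattern, template in RULES:
        match = pattern.match(text)
        if match and (best is None or priority > best[0]):
            best = (priority, template, match)
    if best is None:
        return None
    _, template, match = best
    # Templates without a {} placeholder simply ignore the captured text.
    return template.format(*match.groups())


print(respond("I feel sick"))  # You should see a doctor.
print(respond("I feel sad"))   # Why do you feel sad?
print(respond("I don’t want to talk to you anymore."))  # Is it always about you?
```

Because the winner is chosen by priority rather than by position in the list, the rules can be stored in any order.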

Hey, I Found A Bunch Of Chatbot Programs Online! Why Write Another One?

One of the best ways to get better at programming is to practice programming. Whether or not the final product is useful really doesn’t matter; the important part is what you learned during the development process.

It’s actually a lot like jogging. You spend thirty minutes running around the block only to wind up back where you started. What a waste! Except for the fact that all that “pointless” jogging is keeping your heart, lungs and legs healthy.

So no, this chatbot isn’t going to ever be used for anything important. But the process of writing it will hopefully help me and my readers to become better software developers.

Plus, programming is fun and I always wanted to be able to say “I wrote an AI capable of basic human speech”.

Conclusion

Tune in next time as I take this Let’s Program to the next level with some actual design documents outlining where I hope to take this program.

References

Finding a good book on chatbot design is surprisingly hard. I personally had the most luck with Paradigms of Artificial Intelligence Programming: Case Studies In Common Lisp by Peter Norvig.

Chapter 5 is an in-depth study of ELIZA, the ancestor of pretty much all pattern matching chatbots. Later chapters also tackle the much more difficult problem of natural language processing, which could be interesting to anyone who wants to go beyond simple chatbots and explore the heavy duty problems and techniques involved in getting a computer to really handle human language.

Plus it has a chapter on writing an advanced AI for playing Othello. And who doesn’t love boardgames?

One warning: This book is a little on the difficult side and is written in a very formal, academic tone that takes some getting used to. It also focuses entirely on the programming language Lisp, which (unfortunately) isn’t exactly super popular anymore.

So if the idea of learning an entire new language with some very unique syntax seems overwhelming you might want to pass on this particular book. But that’s where I learned the techniques I’m going to use in this Let’s Program so I felt I owed it to Mr. Norvig to mention his excellent book.