Let’s Program A Chatbot 13: What’s Mine Is Yours

Posted on November 26, 2013 by Scott

The Last Test Case (For Now…)

We’re down to our final test case. Are you excited? I’m excited!

Test Case 4 Failed!!!

Input: Do my readers enjoy this blog?

Output: Fate indicates that my readers enjoy this blog

Expected: Fate indicates that your readers enjoy this blog

Hey, that’s just a “do” rule. We already solved that problem last time. What’s going on here?

Oh, wait. The problem isn’t the “do”. The problem is that the question mentioned “my readers” and DELPHI was supposed to be smart enough to switch the answer around to “your readers”. But DELPHI didn’t do that. We should fix that.

1st And 2nd Person Made Easy

The idea of first versus second person is way too complex for a simple pattern matching chatbot like DELPHI. But the idea of replacing word A with word B is simple enough. And it turns out that replacing first person words with second person words, and vice-versa, is good enough for almost every question that DELPHI is going to run into.

But be careful! When trying to swap A to B at the same time you are swapping B to A it is very possible to accidentally end up with all A. What do I mean? Consider this example:

My dog is bigger than your dog.

We switch the first person words to second person words:

Your dog is bigger than your dog.

Then we switch the second person words to first person:

My dog is bigger than my dog.

I’m sure you can see the problem.

The other big issue to look out for is accidentally matching words we don’t want to. Allow me to demonstrate:

You are young

We want to change that to:

I am young

But if all we do is blindly swap “I” for “you” we can easily end up with this:

I am Ing

For an even worse example consider this one:

I think pink is nifty.

You thyounk pyounk yous nyoufty.

Solving The Problems

Switching “you” to “I” while avoiding changing “young” to “Ing” is pretty simple with regular expressions. All we have to do is use the “word boundary” symbol \b. Like so:

\byou\b

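This will automatically skip over any instances of “you” that are directly attached to other letters, digits or underscores. A “you” that is followed by a space or by punctuation still counts as a whole word and still matches.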
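Here is a quick sketch of the difference the word boundary makes. The two helper functions (naiveSwap and boundedSwap) are names I made up for this demonstration and are not part of DELPHI. Also notice that neither one fixes the verb, so “are” stays “are”.

```perl
use strict;
use warnings;

# Naive swap: replaces every "you", even when it is buried inside another word
sub naiveSwap {
    my $text = $_[0];
    $text =~ s/you/I/gi;
    return $text;
}

# Bounded swap: \b only lets "you" match when it stands alone as a word
sub boundedSwap {
    my $text = $_[0];
    $text =~ s/\byou\b/I/gi;
    return $text;
}

print naiveSwap("You are young"), "\n";   # prints: I are Ing
print boundedSwap("You are young"), "\n"; # prints: I are young
```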

Making sure that we don’t accidentally switch words from first to second person and then back from second to first will be a little more tricky. There are several possible solutions, some involving some cool regex and Perl tricks, but for now I’m just going to use something very straightforward.

Basically I’m going to replace every first and second person word with a special placeholder value that I’m relatively certain won’t show up in normal DELPHI conversations. Then I will change all the placeholder values to their final values. Here is how this will work with the above example:

My dog is bigger than your dog.

We switch the first person words to placeholders

DELPHIyour dog is bigger than your dog.

Then we switch the second person words to placeholders. Because we used the placeholder “DELPHIyour” instead of plain “your” we don’t accidentally switch the first word back to “my”.

DELPHIyour dog is bigger than DELPHImy dog

Then we replace the placeholders

Your dog is bigger than my dog.

Here It Is In Code

I like foreach loops, so I’m going to implement this as two arrays and two foreach loops. The first array will contain regular expressions for finding first and second person words along with the placeholders we want to replace them with. The second will contain regular expressions for finding placeholders and replacing them with the proper first and second person words.

To implement this I just drop these variables and this function into DELPHI.pm right after generateResponse. The only new coding trick to look for is the ‘i’ modifier on the end of some of the regular expressions. This is the “case insensitive” switch and makes sure that DELPHI can match the words we want whether they are capitalized or not*.

#Dictionaries used to help the switchFirstAndSecondPerson function do its job
my @wordsToPlaceholders;

$wordsToPlaceholders[0][0]=qr/\bI\b/i;
$wordsToPlaceholders[0][1]='DELPHIyou';

$wordsToPlaceholders[1][0]=qr/\bme\b/i;
$wordsToPlaceholders[1][1]='DELPHIyou';

$wordsToPlaceholders[2][0]=qr/\bmine\b/i;
$wordsToPlaceholders[2][1]='DELPHIyours';

$wordsToPlaceholders[3][0]=qr/\bmy\b/i;
$wordsToPlaceholders[3][1]='DELPHIyour';

$wordsToPlaceholders[4][0]=qr/\byou\b/i;
$wordsToPlaceholders[4][1]='DELPHIi';

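$wordsToPlaceholders[5][0]=qr/\byour\b/i;
$wordsToPlaceholders[5][1]='DELPHImy'; #"your dog" should become "my dog", not "mine dog"

$wordsToPlaceholders[6][0]=qr/\byours\b/i;
$wordsToPlaceholders[6][1]='DELPHImine';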

my @placeholdersToWords;

#The \b anchors keep a short placeholder like DELPHIyou from matching inside DELPHIyour or DELPHIyours
$placeholdersToWords[0][0]=qr/\bDELPHIyou\b/;
$placeholdersToWords[0][1]='you';

$placeholdersToWords[1][0]=qr/\bDELPHIyour\b/;
$placeholdersToWords[1][1]='your';

$placeholdersToWords[2][0]=qr/\bDELPHIyours\b/;
$placeholdersToWords[2][1]='yours';

$placeholdersToWords[3][0]=qr/\bDELPHIi\b/;
$placeholdersToWords[3][1]='I';

$placeholdersToWords[4][0]=qr/\bDELPHImine\b/;
$placeholdersToWords[4][1]='mine';

$placeholdersToWords[5][0]=qr/\bDELPHImy\b/;
$placeholdersToWords[5][1]='my';

sub switchFirstAndSecondPerson{
    my $input =$_[0];

    foreach my $wordToPlaceholder (@wordsToPlaceholders){
        $input =~ s/$wordToPlaceholder->[0]/$wordToPlaceholder->[1]/g;
    }

    foreach my $placeholderToWord (@placeholdersToWords){
        $input =~ s/$placeholderToWord->[0]/$placeholderToWord->[1]/g;
    }

    return $input;
}
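For the curious, here is a rough sketch of one of those “cool regex and Perl tricks” I skipped: doing the whole swap in a single pass with a hash lookup. This is not what DELPHI uses. The hash %personSwaps and the function switchPersonSinglePass are names I invented for this example, and the word list is only an assumption about which words are worth swapping.

```perl
use strict;
use warnings;

# Hypothetical lookup table: each word maps straight to its opposite
my %personSwaps = (
    i     => 'you',
    me    => 'you',
    my    => 'your',
    mine  => 'yours',
    you   => 'I',
    your  => 'my',
    yours => 'mine',
);

# Build one big alternation, longest words first as a precaution
my $swapAlternation = join '|',
    sort { length($b) <=> length($a) } keys %personSwaps;
my $swapPattern = qr/\b($swapAlternation)\b/i;

sub switchPersonSinglePass {
    my $text = $_[0];
    # The /e modifier runs the replacement as code, so we can look up the matched word.
    # Replaced text is never rescanned, so a word can't get swapped back.
    $text =~ s/$swapPattern/$personSwaps{lc $1}/ge;
    return $text;
}

print switchPersonSinglePass("My dog is bigger than your dog."), "\n";
# prints: your dog is bigger than my dog.
```

Because every word is found and replaced exactly once there is no need for placeholders at all. The price is a slightly scarier regular expression and the /e modifier, which is why I stuck with the two loop version.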

Using The New Function In Generate Response

With that out of the way all that is left is to figure out where inside of generateResponse we should be calling this function. My first thought was to just stick it onto the end of the function by finding the original return statement:

return $response;

And replacing it with this:

return switchFirstAndSecondPerson($response);

Now this is where test driven development comes in handy because that simple change did indeed pass test case 4… but it also caused messes like this:

Test Case 0 Failed!!!

Input: Will this test pass?

Output: you predict that this test will pass

Expected: I predict that this test will pass

…

Test Case 8 Failed!!!

Input: Pumpkin mice word salad

Output: you don’t want to talk about that. Please ask you a question

Expected: I don’t want to talk about that. Please ask me a question

We’ve accidentally made it impossible for DELPHI to talk in first person, which wasn’t what we wanted at all. We only wanted to change first and second person words from the user’s input fragments, not from our carefully handwritten DELPHI responses. Which is a pretty good hint that we should have called switchFirstAndSecondPerson on the user’s input BEFORE we tried to parse it and generate a response, not after. Maybe right at the beginning of the function:

sub generateResponse{
    my $userInput = $_[0];
    $userInput = switchFirstAndSecondPerson($userInput);

    foreach my $chatPattern (@chatPatterns){

        if(my @UIF = ($userInput =~ $chatPattern->[0])){
            my $response = $chatPattern->[1];
            for(my $i=0; $i<@UIF; $i++){
                my $find = "UIF$i";
                my $replace = $UIF[$i];
                $response =~ s/$find/$replace/g;
            }
            return $response;
        }
    }
    return "Base Case Failure Error!";
}

The Moment Of Truth

Did we do it? Did we resolve our final use case?

Drum roll please…………

Test Case 0 Passed

Test Case 1 Passed

Test Case 2 Passed

Test Case 3 Passed

Test Case 4 Passed

Test Case 5 Passed

Test Case 6 Passed

Test Case 7 Passed

Test Case 8 Passed

Test Case 9 Passed

Test Case 10 Passed

Test Case 11 Passed

Test Case 12 Passed

Test Case 13 Passed

——————–

Passed 14 out of 14 tests

All Tests Passed!

WHOOO! GO US!

Note To Exceptionally Clever Readers

All my readers are clever, but some of you are exceptionally clever. And you may have noticed that switchFirstAndSecondPerson always returns lowercase words even when the original word was capitalized or at the beginning of the sentence. This isn’t a huge problem, but if you’re a perfectionist it might be bugging you to accidentally change “I care about grammar” to “you care about grammar” instead of “You care about grammar”.

One easy solution would be to update DELPHI to capitalize its entire output. People are used to computer programs SPEAKING IN ALL CAPS and it saves us the effort of having to actually teach DELPHI anything about proper capitalization.

If you don’t like the caps lock look you could instead update DELPHI to always make sure output starts with a capital. More often than not this is all it takes to make a sentence look like real English.
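If you want to try either idea, Perl’s built-in uc and ucfirst functions do all the heavy lifting. Here is a small sketch. The wrapper function capitalizeResponse is a made up name and DELPHI itself doesn’t contain anything like it.

```perl
use strict;
use warnings;

# Hypothetical helper: capitalize only the first character of a response
sub capitalizeResponse {
    my $response = $_[0];
    return ucfirst($response);
}

print capitalizeResponse("you care about grammar"), "\n"; # prints: You care about grammar
print uc("you care about grammar"), "\n";                 # prints: YOU CARE ABOUT GRAMMAR
```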

Or you can just do what I do and ignore the problem. I’m not going to worry too much about the occasional lowercase “you” or “my” unless users start complaining. And since this program isn’t intended for any real users that’s not likely to ever happen. Customer satisfaction is easy when you have no customers!

Conclusion

That’s it! We’ve passed all of our primary use cases. DELPHI is done.

Or is it? If you can remember all the way back to the original design document one thing we wanted out of DELPHI was the ability to generate random responses to questions. DELPHI currently just guesses “yes” to all questions which is both useless and boring. So while we hit a very important benchmark today we’re still not quite done.

* You know what else case insensitive regular expressions would be good for? Making DELPHI more accepting of user input that isn’t properly capitalized. Expect this to happen in a future blog post.

Let’s Program A Chatbot 12: When The Answer Key Is Wrong

Posted on November 23, 2013 by Scott

Unrealistic Expectations

Sometimes you get halfway through a project only to realize you don’t have the time or money to do what you originally planned to do*. When that happens you have no choice but to rethink your plans, either lowering your expectations or setting a new deadline. Admittedly both approaches generally involve getting frowned at by both management and your customers but sometimes you really have no choice. Even the best of developers have limits.

Why am I bringing this up? You’ll understand in a minute, but I will tell you that it involves these still unresolved use cases:

Test Case 2 Failed!!!

Input: Does this program work?

Output: I’m sorry, could you try rewording that?

Expected: Fate indicates that this program works

Test Case 3 Failed!!!

Input: Do computers compute?

Output: I’m sorry, could you try rewording that?

Expected: Fate indicates that computers compute

At first this doesn’t look so bad. The use cases are “Do X Y?” and “Does X Y?” and all DELPHI has to do is respond back “Yes X Y”. Hardly seems like a challenge. We’ll just slip this new rule into our list after the “or” rule and right before the “is” rule.

push(@chatPatterns,
   [qr/\A(?:Do|Does) (.+)\?\z/,
      "Fate indicates that UIF0"]);

Very simple. We look for any question that starts with some form of “Do” (notice the non-capture ?: symbol) and then we just replace that one question word with our “Fate indicates that” prediction. Is that really all it took?

Test Case 2 Failed!!!

Input: Does this program work?

Output: Fate indicates that this program work

Expected: Fate indicates that this program works

Test Case 3 Passed

A success and a failure is still an overall failure. So now we need to find out what went wrong with Test Case 2 that didn’t go wrong with Test Case 3. If you look closely at the expected vs actual output the only issue is verb agreement. It should be “program works”, with an ‘s’, but all we got was the original “program work” from the question.

This problem really only shows up in the third person where the question is phrased as “Does X VERB” and the answer needs to be in the form “X VERBs”. It’s really a pretty simple grammar rule. At least, it’s simple for a human. DELPHI is going to need a lot of help.

Hmmm… maybe we can solve this by just slipping an ‘s’ onto the end of our response. Of course, since this only applies to third person questions we’ll have to split the original rule into two rules. Notice that only the “does” version glues a final s onto the end of the User Input Fragment from the original input:

push(@chatPatterns,
   [qr/\ADo (.+)\?\z/,
      "Fate indicates that UIF0"]);

push(@chatPatterns,
   [qr/\ADoes (.+)\?\z/,
      "Fate indicates that UIF0s"]);

Test Case 2 Passed

I’m Still Not Sure This Is Really Working

Just gluing an ‘s’ to the end of the input doesn’t seem very sophisticated. Sure, it passed our test case but I’m not sure it will really work in all scenarios. So how about we write a new test case just to make extra sure we really solved our problem?

$testCases[13][0] = "Does adding an s work well?";
$testCases[13][1] = "Fate indicates that adding an s works well";

Nope!

Test Case 13 Failed!!!

Input: Does adding an s work well?

Output: Fate indicates that adding an s work wells

Expected: Fate indicates that adding an s works well

Adding an ‘s’ to the end of the sentence isn’t enough because what we truly want is an ‘s’ on the end of the verb and there is no guarantee that the verb will be the last word in the sentence. So to fix this problem we are going to need to either:

1. Develop a complex system for identifying the verb in an arbitrary sentence
2. Decide that we don’t care about adding ‘s’s to verbs

I’m going to go with option number 2 and come up with a new definition of what is considered a “correct” answer to a “does” question.

The New Test Case

There is an easy way around having to reformat our verbs and that is by including the word “does” inside the response. For instance, these two sentences basically mean the same thing:

This sentence looks equal to the other sentence

This sentence does look equal to the other sentence

This means that we can change the response to “Does X Y?” from “Yes, X Ys” to the much simpler “X does Y”. Now we are dealing with the exact same problem we already solved for “X is Y” and “X will Y”.

Here are our updated test cases:

$testCases[2][0] = "Does this program work?";
$testCases[2][1] = "Fate indicates that this program does work";

$testCases[13][0] = "Does this approach work better?";
$testCases[13][1] = "Fate indicates that this approach does work better";

And here is our updated “does” rule (the “do” rule can stay the same):

push(@chatPatterns,
   [qr/\ADoes ($noncaptureAdjectiveChain[a-zA-Z]+) (.+)\?\z/,
      "Fate indicates that UIF0 does UIF1"]);
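If you want to poke at the new rule without running the whole test suite, here is a small standalone sketch. The adjective variables are copied from the adjective post, while the function answerDoesQuestion is a made up name that stands in for what generateResponse does with this one rule. I also wrote the variable as ${noncaptureAdjectiveChain}, with braces, which forces Perl to treat the following [a-zA-Z] as a character class instead of guessing whether it is an array subscript.

```perl
use strict;
use warnings;

# Adjective helpers copied from the adjective post
my $commonAdjectives = qr/(?:this|the|that|a|an)/;
my $noncaptureAdjectiveChain = qr/(?:$commonAdjectives )*/;

# The braces keep [a-zA-Z] from being read as an array subscript
my $doesRule = qr/\ADoes (${noncaptureAdjectiveChain}[a-zA-Z]+) (.+)\?\z/;

# Hypothetical stand-in for generateResponse that only knows the "does" rule
sub answerDoesQuestion {
    my $question = $_[0];
    if (my @UIF = ($question =~ $doesRule)) {
        return "Fate indicates that $UIF[0] does $UIF[1]";
    }
    return "I'm sorry, could you try rewording that?";
}

print answerDoesQuestion("Does this program work?"), "\n";
# prints: Fate indicates that this program does work
print answerDoesQuestion("Does this approach work better?"), "\n";
# prints: Fate indicates that this approach does work better
```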

And, finally, here are the results

Passed 13 out of 14 tests

Test Failure!!!

Did We Learn Anything Useful Today?

The moral of today’s story is that sometimes a test case that is really hard to solve represents a problem with your expectations as much as your program. If you’re on a tight budget or schedule** sometimes it makes sense to stop and ask yourself “Can we downgrade this requirement to something simpler? Can we delay this requirement until a later release?”

After all, good software today and the promise of great software tomorrow is better than insisting on great software today and never getting it.

Although sometimes you can manage to deliver great software today and that’s even better. Reach for the stars, bold readers. I have faith in your skills!

Conclusion

Did you notice that the success rate on our last testing run was 13 out of 14? That means we’re almost done! At least, we’re almost done with the first test version of the code. I’m sure the instant we ask a human tester to talk to DELPHI we’re going to find all sorts of new test cases that we need to include.

But future test cases are a problem for the future. For now we’re only one test case away from a significant milestone in our project. So join me next time as I do my best to get the DELPHI test suite to finally announce “All Tests Passed!”

* Even worse, sometimes you’ll find out that what you want to do is mathematically impossible. This is generally a bad thing, especially if you’ve already spent a lot of money on the project.

** Or if you’re writing a piece of demo software for your blog and don’t feel like spending more than a few dozen hours on what is essentially a useless toy program

Book Review: Implementing Responsive Design by Tim Kadlec

Posted on November 19, 2013 by Scott

You Need Me To Do What?!

Let’s say that you’re a programmer with no real talent for or interest in web design. But the startup you work at really needs someone to redesign their product to be more mobile friendly and they don’t have time to hunt for and hire a real designer. What is a programmer to do?

For me the answer was “Buy a book”. Ideally something simple enough that you don’t need to be an expert designer, deep enough to give you a real understanding of the field and short enough that you can finish the material and get back to work fast.

Implementing Responsive Design by Tim Kadlec turned out to be just about perfect.

Programmer Friendly

Implementing Responsive Design seems to have been directed more towards designers than developers but is overall easy to follow as long as you know the technical basics of HTML and CSS (and as a web programmer you probably do). There is no tricky vocabulary and you aren’t expected to be a graphical wizard who Photoshops in their sleep. There is nothing in the book that requires any existing experience with design or any special software and as a programmer in a hurry I really appreciated that.

Even better, the book has half a dozen practical examples complete with screen shots and sample code showing how different techniques lead to different looks on both desktop and mobile. The book also does a good job of covering the theory behind responsive mobile-first design which really helped me get into the head of how designers think. Learning how to properly think about mobile design is much more useful than just memorizing a few CSS rules.

Covers A Lot Of Territory Very Quickly

The book weighs in at a slim 250 pages making it the sort of thing you can read in one or two evenings. It starts with the absolute basics of “What is responsive design?” (creating web pages that change their layout depending on screen size) and then spends a few chapters tackling both the basic tools of responsive design and the thought process behind deciding how to design a responsive page in the first place.

After that is taken care of the book spends a little time exploring some more advanced techniques for optimizing loading times and enhancing the user experience for specific platforms. It then briefly covers some promising responsive technologies being developed, muses a bit about the future of web design and before you know it the book is done, having covered a lot of valuable information in a very short amount of time. Once again this is a very good thing for people like me who need to learn a lot of new things very quickly.

A Starting Point, Not A Reference Book

The one thing you should be aware of is that Implementing Responsive Design doesn’t have all the answers. And some of the answers it does have will probably be obsolete by the time you buy the book. Web technology is changing fast!

But you don’t really need all the answers. As long as you know what questions to ask you can find pretty much anything on the Internet. What this book is for is teaching you enough about responsive design to figure out what questions to ask in the first place. It helps you understand fundamental theories and techniques and any programmer worth his salt should be able to use that as a springboard to start researching specific solutions to their own specific problem.

Final Thoughts: A Good Buy For People Who Don’t Know Anything And Want To Fix That

Before this book all I knew about mobile design was that you could theoretically get a page to render differently based on whether it was on a phone or on a computer. 250 pages later I have a big grab-bag of common techniques for making this happen and, more importantly, I feel like I understand the motivation behind responsive web design. It changed how I look at putting content together and in an age of smartphones and tablets I think that developing an expanded and more flexible idea of what layout means is an invaluable skill.

On the other hand, if you already have some experience with designing pages that work well on both mobile and desktop you probably won’t find too much in this book you don’t already know.

But as a programmer I thought Implementing Responsive Design was a worthwhile read, even if I never have to program a mobile website by hand again. After all, the better we programmers understand how the user hopes to browse our websites and how the designers hope to style them the better job we can do of making sure our code and data supports a future full of diverse devices.

Let’s Program A Chatbot 11: Bad Adjectives

Posted on November 19, 2013 by Scott

Not As Easy As It Looked

Eeny meeny miny moe, which test case do I want to show?

Test Case 0 Failed!!!

Input: Will this test pass?

Output: I’m sorry, could you try rewording that?

Expected: I predict that this test will pass

This doesn’t look so bad. We already wrote a rule for “Is X Y?” so writing a rule for “Will X Y?” should be as easy as copy pasting and switching a few words. Behold!

push(@chatPatterns,
   [qr/\AWill ([a-zA-Z]+) (.+)\?\z/,
      "I predict that UIF0 will UIF1"]);

I’ll just drop that into the rules list right after the “Is X Y?” rule and we should be good to go.

Test Case 0 Failed!!!

Input: Will this test pass?

Output: I predict that this will test pass

Expected: I predict that this test will pass

Uh oh. That didn’t quite work. DELPHI did manage to figure out that test 0 was a “Will X Y?” style question but when generating the answer it put the “will” in the wrong place. Can you figure out why?

[Please use this break in the blog’s flow to consider why this happened.]

The problem here has to do with how we defined the rule. We’ve been calling it “Will X Y?” but the rule is actually more like “Will Noun Verb?” or “Will Noun-Phrase Verb-Phrase?”.

Our current dumb rule assumes that the noun will always be the first word after the word “Will” and that everything else will be part of the verb phrase. This works out great for sentences like “Will Batman catch the villain?” but completely falls apart when you start adding adjectives to the noun and get things like “Will the police catch the villain?”

So what we really need is a “Will” rule that is smart enough to group common adjectives with the noun and treat them all like one big super-noun. Here is a quick first pass (WARNING: WEIRD REGULAR EXPRESSION AHEAD):

/\AWill ((?:(?:this|the|that|a|an) )*[a-zA-Z]+) (.+)\?\z/

Don’t panic just yet, this rule is actually a lot simpler than it looks. But first you need to understand what all those “?:” symbols are doing. Hopefully you remember that parentheses create capture groups that group patterns together and then store their matches for future use. But sometimes you want to group patterns together without storing them for later. You can accomplish this by starting your capture group with the special symbols “?:”, which then turns off the capturing and lets you use the parentheses as a simple grouping tool.

This is important for our “Will” rule because we want to capture the entire noun-phrase and the entire verb-phrase but we don’t want to capture any of the individual parts of those phrases. For example, we have improved our noun-phrase by adding in two sets of nested parentheses for handling common article adjectives. The inner parentheses match common adjectives and the outer parentheses make sure there is a space following each adjective. We mark both these groups as “?:” noncapturing because while we certainly do want to match nouns that start with a series of adjectives we only want to capture those adjectives as part of the noun and not on their own.

What would happen without those noncapturing symbols? Well, the first big parentheses set would capture the entire noun-phrase and substitute it into the output just like we want. But the second capture group wouldn’t be the verb-phrase like we originally wanted. Instead the second capture group would be the nested parentheses matching the articles, leading to all sorts of problems. See for yourself:

Input: Will this test pass?

Output: I predict that this test will this

Expected: I predict that this test will pass

See what I mean? We successfully grabbed “this test” and put it into the answer as a noun-phrase but we then grabbed “this ” as our second capture group while the verb-phrase “pass” got pushed into a later capture group slot. Not what we wanted at all.

Instead we’ll just tell the inner parentheses not to capture. Now the noun-phrase always goes in slot one and the verb-phrase always goes in slot two and everything works wonderfully.
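If you would like to see the capture slots for yourself, this little sketch runs both versions of the rule against the same question and prints whatever each one captured. The variable names are made up for this demonstration.

```perl
use strict;
use warnings;

my $question = "Will this test pass?";

# Capturing version: every set of parentheses claims a slot
my @withCaptures = $question =~
    /\AWill (((this|the|that|a|an) )*[a-zA-Z]+) (.+)\?\z/;

# Noncapturing version: only the noun-phrase and verb-phrase are stored
my @withoutCaptures = $question =~
    /\AWill ((?:(?:this|the|that|a|an) )*[a-zA-Z]+) (.+)\?\z/;

print join('|', @withCaptures), "\n";    # prints: this test|this |this|pass
print join('|', @withoutCaptures), "\n"; # prints: this test|pass
```

The capturing version ends up with four slots: the noun-phrase, the adjective plus its space, the bare adjective and finally the verb-phrase way down in slot four. The noncapturing version has just the two slots we actually care about.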

Test Case 0 Passed

…

Passed 7 out of 11 tests

Wait A Minute, Isn’t This A Problem For “Is” Rules Too?

Clever readers might be asking themselves “If adjectives broke our simple “Will X Y?” rule, then won’t they break our old “Is X Y?” rule too?” Well good job for noticing that clever readers, because that’s the exact problem we see in our next test case:

Test Case 1 Failed!!!

Input: Is the sky blue?

Output: Fate indicates that the is sky blue

Expected: Fate indicates that the sky is blue

Fortunately we can fix it the exact same way:

/\AIs ((?:(?:this|the|that|a|an) )*[a-zA-Z]+) (.+)\?\z/

Test Case 1 Passed

…

Passed 8 out of 11 tests

You Shouldn’t Copy Paste Code

There is one little problem with this approach to adjectives: I’m hard coding a big list of words and then copy pasting it into multiple functions. This will be a real pain if we ever have to update the list in the future. For instance, if we wanted to add possessive adjectives into the list (my, your, his, her, their) we would have to rewrite two different rules. And if we ever decide a third rule needs access to the list we’ll have to copy paste the whole thing.

Much better to turn that portion of the rules into a separate variable that can be included in multiple functions. Which in Perl you can do like this:

#put this before the code starts to build the pattern and response array
my $commonAdjectives=qr/(?:this|the|that|a|an)/;
my $noncaptureAdjectiveChain=qr/(?:$commonAdjectives )*/;

And now we can just update the rules to use these handy variables anywhere we want to match an arbitrarily long chain of adjectives with a single space after every word.

/\AIs ($noncaptureAdjectiveChain[a-zA-Z]+) (.+)\?\z/

/\AWill ($noncaptureAdjectiveChain[a-zA-Z]+) (.+)\?\z/

Those of you following along in a language other than Perl will have to figure out on your own how and if your language handles inserting variables into a regular expression. If all else fails you can always just go back to the copy pasting thing.

Let’s Test By Adding Some More Adjectives

Now that we can add new adjectives to two different rules by just updating a single variable we should write a few new tests and make sure it works. How about these?

$testCases[11][0] = "Will his code compile?";
$testCases[11][1] = "I predict that his code will compile";

$testCases[12][0] = "Is this big blue box actually a time machine?";
$testCases[12][1] = "Fate indicates that this big blue box is actually a time machine";

The first test is a straightforward test to make sure we can add possessives to the adjective list. The second test is a little bit more complex, requiring us to not only add two new adjectives to our list (big and blue) but also testing to make sure the code can chain multiple adjectives together in a row.

Of course, right now they both fail. The first test doesn’t recognize “his” as an adjective so it assumes it is a noun and puts the “will” in the wrong place. The second test recognizes “this” as an adjective but not “big” and does the same thing.

Test Case 11 Failed!!!

Input: Will his code compile?

Output: I predict that his will code compile

Expected: I predict that his code will compile

Test Case 12 Failed!!!

Input: Is this big blue box actually a time machine?

Output: Fate indicates that this big is blue box actually a time machine

Expected: Fate indicates that this big blue box is actually a time machine

But after updating our list of adjectives:

my $commonAdjectives=qr/(?:this|the|that|a|an|his|her|my|your|their|big|blue)/;

Test Case 11 Passed

Test Case 12 Passed

——————–

Passed 10 out of 13 tests

Test Failure!!!

How Many Adjectives Do We Need?

DELPHI now knows how to handle 12 different adjectives. And while that is pretty nifty it’s worth pointing out that the English language has a lot more than just 12 adjectives. In fact, English is one of the world’s largest languages* and easily has several tens of thousands of adjectives. Even worse, English allows you to “adjectivify”** other words to create new adjectives on the spot, like so:

“These new computery phones have a real future-licious feel to them but with the default battery they’re actually kind of brickish.”

My spell checker is convinced that sample sentence shouldn’t exist but even so you probably understood what I meant. Which just goes to show the huge gap in how good humans are at flexible language processing and how bad computers still are.

But what does this mean for DELPHI? Do we need to generate a giant adjective list? Do we need to teach it how to handle nouns and verbs that have been modified to act like adjectives? Do we need to spend twelve years earning multiple PhDs in computer science and linguistics in order to build a more flexible generateResponse function?

Well… no. Remember, our goal isn’t to create a program that can fully understand the human language. We just want a bot that can answer simple questions in an amusing way like some sort of high-tech magic eight ball. As long as DELPHI can handle simple input and gracefully reject complex input it should feel plenty intelligent to the casual user.

Furthermore, we can actually depend on users to play nice with DELPHI. Most people, after being scolded by DELPHI once or twice for trying to be clever will start to automatically pick up on what sorts of inputs do and don’t work. The fact that DELPHI can’t handle obscure adjectives will eventually teach users to stick to straightforward questions.

All things considered we can probably “solve” the adjective problem by teaching DELPHI the hundred most common adjectives in the English language and then hoping that users never bother going beyond that. Later on we can have some test users talk to DELPHI and use their experiences to decide whether or not we need to add more adjectives or build a more complex system.
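Typing a hundred adjectives into one giant regular expression by hand sounds like a typo waiting to happen, so here is a sketch of building the same pattern from a plain word list instead. The array @adjectiveList and the function answerIsQuestion are names I invented for this example, and the list itself is just the twelve words DELPHI already knows. A longer list could just as easily be loaded from a file.

```perl
use strict;
use warnings;

# Hypothetical word list; add as many adjectives as you like
my @adjectiveList = qw(this the that a an his her my your their big blue);

# quotemeta escapes any regex metacharacters that sneak into the list
my $adjectiveAlternation = join '|', map { quotemeta } @adjectiveList;
my $commonAdjectives = qr/(?:$adjectiveAlternation)/;
my $noncaptureAdjectiveChain = qr/(?:$commonAdjectives )*/;

# The braces keep [a-zA-Z] from being read as an array subscript
my $isRule = qr/\AIs (${noncaptureAdjectiveChain}[a-zA-Z]+) (.+)\?\z/;

# Hypothetical stand-in for generateResponse that only knows the "is" rule
sub answerIsQuestion {
    my $question = $_[0];
    if ($question =~ $isRule) {
        return "Fate indicates that $1 is $2";
    }
    return "I'm sorry, could you try rewording that?";
}

print answerIsQuestion("Is this big blue box actually a time machine?"), "\n";
# prints: Fate indicates that this big blue box is actually a time machine
```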

Conclusion

Today we caught a glimpse of how simple pattern matching chatbots can completely fall apart when confronted with real English. But we also saw a quick way to band-aid over the worst of these problems and we have hope that our bot can be written in such a way that users never notice that DELPHI is too dumb to understand that “house” and “that big house over there” are actually the same thing.

Next time, more test cases and more examples of English language features that are annoying to program around.

* As the popular saying goes: English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary.

** Look, I just verbed a noun!

Let’s Program A Chatbot 10: Chatting With The Bot

Posted on November 15, 2013 by Scott

There Is More To Life Than Tests

So far we’ve focused entirely on running our chatbot through automated tests. But eventually we’ll want some way for actual users to talk to DELPHI too. And since I just recently finished separating the chat code from the testing code I figure now is a great time to also introduce some user focused code.

Getting DELPHI to talk to a human is pretty easy. The generateResponse function already knows how to… well… generate responses to input. All that’s left is figuring out how to feed it human input instead of test input. Perl lets us do this in under ten lines (which I put in a file named “chat.pl”):

#! /usr/bin/perl -w

use strict;

require 'DELPHI.pm';

while(<>){
   chomp;
   print "DELPHI: ", DELPHI::generateResponse($_), "\n";
}

You Promised No Tricky Perl!

Oh, I did promise that. So I guess the only honorable thing to do is to write a new version of “chat.pl” that doesn’t use quite so many shortcuts.

#! /usr/bin/perl -w

use strict;

require 'DELPHI.pm';

while( my $userInput = <STDIN> ){
   chomp($userInput); #Remove trailing newline character with chomp
   my $response = DELPHI::generateResponse($userInput);
   print "DELPHI: ", $response, "\n";
}

There, that’s better. Everything is much easier to understand now. We have a simple while loop that grabs lines of input from standard input, letting the user type questions for DELPHI. And then since DELPHI doesn’t like newlines we use the handy Perl function chomp to remove them from the input. Now that we have a user input string with no nasty newline at the end we pass it to DELPHI::generateResponse and finally print out DELPHI’s reply for the user to read.

This Is A Horrible User Interface

If you tried to run “chat.pl” as is you probably noticed that it’s not very user friendly. When you first start the program it just sits on the command line and hopes that the user will eventually figure out he’s supposed to type something. And the only way to break out of the loop is to send the end of file character (ctrl-z or ctrl-d depending on your OS). We can do better than this!

#! /usr/bin/perl -w

use strict;

require 'DELPHI.pm';

print "DELPHI: The oracle greets you. What questions are on your mind?\n";
print '>';

while(my $userInput = <STDIN>){
    chomp($userInput); #Remove trailing newline character with chomp
    
    if(lc($userInput) eq 'goodbye' || lc($userInput) eq 'quit'){
        last; #Break out of while loop
    }
    
    my $response = DELPHI::generateResponse($userInput);
    print "DELPHI: ", $response, "\n";
    print '>';
}

print "DELPHI: Come back anytime\n";

Mostly cosmetic changes here. We start the program with some quick text that lets the user know they are dealing with a question answering bot and we end with a friendly farewell. We also put a helpful ‘>’ at the start of every input line so the user knows when they’re supposed to be typing.

I’ve also added the ability to exit the loop by typing “goodbye” or “quit”. To help with this I use the function lc, which returns an all lowercase version of whatever the user types. This way we don’t have to worry about users that capitalize “Goodbye” or people who try to “QUIT” with their caps lock key on. We then break out of the loop with the last command, which immediately exits it.

DELPHI’s First Words

Now DELPHI is currently only passing 6 out of 11 test cases so it’s really not ready for genuine human conversation. But we can still fire up our chat program and at least see what happens.

DELPHI: The oracle greets you. What questions are on your mind?

>Is this working?

DELPHI: Fate indicates that this is working

>That’s nice.

DELPHI: I don’t want to talk about that. Please ask me a question

>Goodbye

DELPHI: Come back anytime

Not a wonderful conversation, but it does prove that DELPHI works just as well interactively as it does during tests. Now we can get back to writing more rules and responses in hope of getting DELPHI to perform better the next time we decide to talk to it directly.

Is DELPHI Too Strict?

If you’ve been playing with “chat.pl” you might have noticed that DELPHI is currently really stuck up about how you do or don’t capitalize words. For example:

DELPHI: The oracle greets you. What questions are on your mind?

>is this working?

DELPHI: I’m sorry, could you try rewording that?

>Is this working?

DELPHI: Fate indicates that this is working

I don’t plan to do anything about this problem right now. I just wanted to point it out to everyone so that you won’t think I overlooked this fairly big usability flaw. I’ll (probably) eventually fix this by rewriting DELPHI to care less about case, but for now I’m just going to go back to writing rules.
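For readers who want to experiment in the meantime, here is a minimal sketch of one possible approach: adding Perl's /i modifier to a rule so that it ignores case. This is only an assumption about how the problem could be handled, not necessarily the fix the series ends up using, and the variable name $isPattern is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for DELPHI's "Is X Y?" rule. The trailing /i makes
# the whole pattern case-insensitive, so "is", "Is" and "IS" all match.
my $isPattern = qr/\AIs ([a-zA-Z]+) (.+)\?\z/i;

foreach my $input ("Is this working?", "is this working?") {
    if (my @parts = ($input =~ $isPattern)) {
        print "Matched: Fate indicates that $parts[0] is $parts[1]\n";
    }
    else {
        print "No match for: $input\n";
    }
}
```

Both inputs match here, whereas the rule without /i only accepts the capitalized version.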

Let’s Program A Chatbot 9: The Grammar Police

Posted on November 13, 2013 by Scott

Low Hanging Fruit

We still have lots of test cases to try and pass. Some of them are easy and some of them will be pretty hard. Today I’m going to knock out a few of the easier ones.

Specifically, I’m going to try to get DELPHI to notice when a user has made a simple grammar mistake in their question. Things like starting a sentence with “Why” but not including a question mark at the end or ending a sentence with a question mark but not beginning with any question words that DELPHI recognizes. When a user makes this kind of mistake we want to point it out to them and give them a quick hint about how to better format their questions to get a good response out of DELPHI.

I Don’t Understand The Question

Here’s the first test case I want to work with:

Test Case 7 Failed!!!

Input: Pumpkin mice word salad?

Output: I don’t want to talk about that. Please ask me a question

Expected: I’m sorry, could you try rewording that?

This test case represents a user who has typed a question that DELPHI doesn’t understand. At least, we think it’s a question since it has a question mark at the end. So we want to suggest to the user that they reword the question in a simpler way. Hopefully this will convince users who type things like “Dodgers win the world series?” to try again with the better formatted “Will the Dodgers win the world series?”

The rule for this is pretty simple. We just create a low priority rule that matches a question mark anywhere in the user’s input. We let the high priority rules catch all the good input with question marks and then use this rule to clean up whatever is left.

push(@chatPatterns, 
        [qr/\?/,
            "I'm sorry, could you try rewording that?"]);

I put this rule in the system right above the final catch all pattern. I’m sure you can figure out why I didn’t try to put it after the catch all pattern. In any case:

Test Case 7 Passed
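To see why the position of the question mark rule matters, here is a small self-contained sketch with a cut-down rule list. The helper name firstMatch is hypothetical, and it returns the raw response template of the first matching rule without doing any UIF substitution.

```perl
use strict;
use warnings;

# A cut-down rule list: one "good" rule, the question mark rule, the catch-all.
my @rules = (
    [qr/\AIs ([a-zA-Z]+) (.+)\?\z/, "Fate indicates that UIF0 is UIF1"],
    [qr/\?/, "I'm sorry, could you try rewording that?"],
    [qr/.*/, "I don't want to talk about that. Please ask me a question"],
);

# Hypothetical helper: walk the rules in order and return the response
# template of the first rule whose pattern matches the input.
sub firstMatch {
    my ($input) = @_;
    foreach my $rule (@rules) {
        return $rule->[1] if $input =~ $rule->[0];
    }
    return "Base Case Failure Error!";
}

print firstMatch("Pumpkin mice word salad?"), "\n"; # falls through to the question mark rule
print firstMatch("Pumpkin mice word salad"), "\n";  # falls through to the catch-all
```

If the question mark rule were listed first it would swallow every well-formed question too, and if it were listed after the catch-all it would never be reached.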

Is That Supposed To Be A Question?

Here’s the second test case I plan on fixing today:

Test Case 9 Failed!!!

Input: Why do you say things like that

Output: I don’t want to talk about that. Please ask me a question

Expected: Did you forget a question mark? Grammar is important!

This test case is the opposite of the last test case. This time the user input starts with a well recognized question word but it doesn’t end with a question mark. We want to remind the user that question marks are important so they can rewrite the question in a format that DELPHI will understand.

This is another easy low-priority rule. Once again we put it near the end of our rule list to make sure that it only catches question word input that didn’t match any of the previous, better rules.

I am introducing some new regex syntax here though. The ‘|’ symbol stands for “or” and lets us create a regular expression that will match any one item from a list of possibilities. That way we can build one rule to catch lots of different inputs that begin with a question word.

push(@chatPatterns, 
        [qr/\A(Why|Is|Are|Do|Does|Will)/,
            "Did you forget a question mark? Grammar is important!"]);

Besides the new “or” syntax there shouldn’t be anything surprising here. I start out with the \A anchor to indicate that we’re only looking for input that starts with a question word and then I make a list of all the common question words I expect DELPHI to run into.

I put this low priority rule after the question mark rule I just wrote but still before the catch all rule for the obvious reason that the catch all rule always needs to be last.

Test Case 9 Passed
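Here is a short standalone sketch of the alternation pattern at work, outside of DELPHI. The variable name $questionWord and the sample inputs other than the test case are just illustrations.

```perl
use strict;
use warnings;

# Same pattern as the new rule: anchored to the start of the input and
# matching any one of the listed question words.
my $questionWord = qr/\A(Why|Is|Are|Do|Does|Will)/;

foreach my $input ("Why do you say things like that",
                   "Will it rain tomorrow",
                   "Pumpkin mice word salad") {
    if ($input =~ $questionWord) {
        # $1 holds whichever alternative actually matched
        printf "%s -> starts with question word %s\n", $input, $1;
    }
    else {
        printf "%s -> no question word found\n", $input;
    }
}
```

One thing worth knowing: because the pattern has no word boundary after the alternatives, an input such as “Island life is nice” also matches (on the leading “Is”). For a low priority hint rule that is probably harmless, but adding \b after the closing parenthesis would be one way to tighten it.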

Conclusion

With these two new rules DELPHI is now ready to criticize the grammar of anyone who dares to try to ask it a poorly formatted question. And our success rate is slowly creeping upwards!

Passed 6 out of 11 tests

Test Failure!!!

Sadly the test cases we are still failing happen to be the tricky ones, so we’re going to have to start doing some clever programming in the near future. But before that let’s take a little detour and build an actual user interface for DELPHI so that we can talk to it. I’m tired of letting the automated tests have all the fun, I want to talk to the chatbot too!

Let’s Program A Chatbot 8: A Little Housecleaning

Posted on November 7, 2013 by Scott

Just Because It Works Doesn’t Mean It’s Good

I’ll be honest, I’m a neat freak. And our current code is starting to get a little messy. Keeping our chatbot code in the same file as all our test code is not neat. Creating our array of chat patterns inside of the generateResponse method is not good. So let’s take just a few minutes and fix both of those problems.

The first thing I’m going to do is separate the test code from the chatbot code by moving generateResponse into a file named DELPHI.pm. Then I’m going to wrap it all inside of a package called DELPHI, which is basically just a way to attach a prefix to a bunch of Perl variables and functions. This will transform generateResponse into DELPHI::generateResponse.

Then I’m going to clean up generateResponse by moving the array of input patterns and responses outside of the function. Now the chat array will be built, once, when the DELPHI.pm file is first referenced. This helps keep generateResponse small and readable; no matter how many rules we add to our chatbot generateResponse will always remain short and easy to understand.

Updating our tests to work with these changes is a pretty simple two line change. First, we need to tell it to load the DELPHI.pm file by adding this simple command near the top of our file:

require 'DELPHI.pm';

The second change is just as easy. Our test code used to directly reference generateResponse, which worked because they were both part of the same file and package. Now that generateResponse lives in its own package we’ll have to change the line:

my $output = generateResponse($test->[0]);

Now it needs to be:

my $output = DELPHI::generateResponse($test->[0]);

And that’s all there is to it. Now our test program knows to look for generateResponse inside of the DELPHI package that it grabbed out of the DELPHI.pm file.
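If the package idea is new to you, the following single-file sketch shows the same mechanics on a tiny scale. The package name Oracle and its contents are made up for illustration; the real code lives in DELPHI.pm and is loaded with require.

```perl
use strict;
use warnings;

package Oracle;    # hypothetical package name, standing in for DELPHI

# Built once, when this code is first loaded, not on every function call
my @responses = ("Fate indicates the former");

sub generateResponse {
    return $responses[0];
}

package main;      # back to the default package, where test code would live

# From outside the package the function needs its package prefix
print Oracle::generateResponse(), "\n";
```

A side note for anyone running the series on a newer Perl: starting with Perl 5.26 the current directory is no longer searched by default, so require 'DELPHI.pm' may need to become require './DELPHI.pm' (or the script may need to be run with -I.).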

Now for the moment of truth… did it work? Did we successfully clean our code without breaking anything? Sounds like a job for our automated tests.

Passed 4 out of 11 tests

Test Failure!!!

That’s a relief! Even after modifying all that code the tests still run, no errors are thrown and we’re passing and failing all the same tests we used to.

The Code So Far

I figure now is a good time for a complete code dump in case anyone wants to run my code themselves and see what I’m doing. On the other hand… maybe you’re from the future where I’ve already finished DELPHI and released the complete code. In that case this old incomplete DELPHI code might be worth a laugh:

First up is DELPHI.pm. Note the “1” at the end of the file. Perl expects all packages to end with something that evaluates to true and the number 1 is the traditional way to do this.

package DELPHI;

my @chatPatterns;

push(@chatPatterns, 
        [qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/,
            "Fate indicates the former"]);

push(@chatPatterns, 
        [qr/\AIs ([a-zA-Z]+) (.+)\?\z/, 
            "Fate indicates that UIF0 is UIF1"]);

push(@chatPatterns,
        [qr/\AWhy (.+)\?\z/,
            "Because of reasons"]);

push(@chatPatterns,
        [qr/.*/,
            "I don't want to talk about that. Please ask me a question"]);

sub generateResponse{
    my $userInput = $_[0];

    foreach my $chatPattern (@chatPatterns){
        
        if(my @UIF = ($userInput =~ $chatPattern->[0])){
            my $response = $chatPattern->[1];
            for(my $i=0; $i<@UIF; $i++){
                my $find = "UIF$i";
                my $replace = $UIF[$i];
                $response =~ s/$find/$replace/g;
            }
            return $response;
        }
    }
    return "Base Case Failure Error!";
}

1;

And then here is test.pl

#! /usr/bin/perl -w

use strict;

require 'DELPHI.pm';

my @testCases;

$testCases[0][0] = "Will this test pass?";
$testCases[0][1] = "I predict that this test will pass";

$testCases[1][0] = "Is the sky blue?";
$testCases[1][1] = "Fate indicates that the sky is blue";

$testCases[2][0] = "Does this program work?";
$testCases[2][1] = "Fate indicates that this program works";

$testCases[3][0] = "Do computers compute?";
$testCases[3][1] = "Fate indicates that computers compute";

$testCases[4][0] = "Do my readers enjoy this blog?";
$testCases[4][1] = "Fate indicates that your readers enjoy this blog";

$testCases[5][0] = "Is it better to be loved or feared?";
$testCases[5][1] = "Fate indicates the former";

$testCases[6][0] = "Why is natural language processing so hard?";
$testCases[6][1] = "Because of reasons";

$testCases[7][0] = "Pumpkin mice word salad?";
$testCases[7][1] = "I'm sorry, could you try rewording that?";

$testCases[8][0] = "Pumpkin mice word salad";
$testCases[8][1] = "I don't want to talk about that. Please ask me a question";

$testCases[9][0] = "Why do you say things like that";
$testCases[9][1] = "Did you forget a question mark? Grammar is important!";

$testCases[10][0] = "Is Perl a good choice for this program?";
$testCases[10][1] = "Fate indicates that Perl is a good choice for this program";

my $testCount=0;
my $successCount=0;

foreach my $test (@testCases){
    my $output = DELPHI::generateResponse($test->[0]);
    if( $output ne $test->[1] ){
        print "Test Case $testCount Failed!!!\n";
        print "Input: ".$test->[0]."\n";
        print "Output: $output\n";
        print "Expected: ".$test->[1]."\n";
    }
    else{
        print "Test Case $testCount Passed\n";
        $successCount++;
    }
    
    $testCount++;
}

print "--------------------";
print "\n";
print "Passed $successCount out of $testCount tests\n";
if($testCount == $successCount){
    print "All Tests Passed!\n";
}
else{
    print "Test Failure!!!\n";
}

A Note On Performance

My main goal here was to make the code easier to read and manage by separating my chatbot code from my test code and then further separating my response generation code from the response data. Not that my little 150 line script was actually that hard to read and manage in the first place. I probably could have safely procrastinated cleaning it up until I had a few dozen more test cases and a few dozen more rules.

But I was also curious if there would be any performance gain from moving the array creation outside of the generateResponse function. As things were the code had to rebuild the same nested array every time generateResponse was called. Sounds wasteful! Much better to move that array somewhere where it only has to be built once no matter how often generateResponse is called.

On the other hand, people who write compilers and interpreters* are really quite frightfully smart. One of their favorite optimization tricks is to find repetitive bits of code and then turn them into static data. Sometimes you can get away with something stupid like building a static array again and again and just rely on the compiler to fix it for you.

Which made me wonder: Would cleaning up generateResponse speed up my code or was the system already optimizing my poor design decision? Let’s run some tests!

My particular test was pretty simple. I created a quick little file that included both versions of generateResponse, one building the response array inside the method and one building the response array ahead of time. I then had both functions generate a response to a short phrase a million times in a row and kept track of how long it took for them to complete. Here are the results:

The method with array creation took 19.758241891861 seconds

The method without array creation took 9.73499703407288 seconds

So by moving array creation outside of the function we managed to roughly double how fast the function runs. Apparently this particular mistake was too stupid for the compiler to fix on its own. Whoops.
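The original timing file isn’t shown, so the following is only a rough sketch of how a comparison like this could be set up using the core Time::HiRes module. The function names, the two-rule list and the sample phrase are all assumptions, and the iteration count is kept lower than a million so it finishes quickly. Actual timings will vary from machine to machine.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # sub-second resolution version of time()

# Version 1: rebuilds the rule array on every single call.
sub respondRebuilding {
    my ($input) = @_;
    my @patterns = (
        [qr/\AWhy (.+)\?\z/, "Because of reasons"],
        [qr/.*/, "I don't want to talk about that. Please ask me a question"],
    );
    foreach my $rule (@patterns) {
        return $rule->[1] if $input =~ $rule->[0];
    }
}

# Version 2: uses an array that is built once, outside the function.
my @sharedPatterns = (
    [qr/\AWhy (.+)\?\z/, "Because of reasons"],
    [qr/.*/, "I don't want to talk about that. Please ask me a question"],
);

sub respondPrebuilt {
    my ($input) = @_;
    foreach my $rule (@sharedPatterns) {
        return $rule->[1] if $input =~ $rule->[0];
    }
}

# Hypothetical helper: call a function $count times and return elapsed seconds.
sub timeCalls {
    my ($function, $count) = @_;
    my $start = time();
    $function->("Why is the sky blue?") for (1 .. $count);
    return time() - $start;
}

my $count = 100_000;
printf "The method with array creation took %.3f seconds\n",
    timeCalls(\&respondRebuilding, $count);
printf "The method without array creation took %.3f seconds\n",
    timeCalls(\&respondPrebuilt, $count);
```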

Conclusion

A few quick changes and now our code is both cleaner and more efficient. Now to get back to adding new response patterns to DELPHI.

* Perl 5 is both interpreted and compiled depending on how you define interpret and compile.

Let’s Program A Chatbot 7: To Be Or Not To Be, That Is The Use Case

Posted on November 5, 2013 by Scott

Quick Review

Before we start writing new input patterns and response templates let’s take a look at the three we already have. If you’ve read this far I assume you understand enough about regular expressions that I don’t need to explain everything. If that is not true consider this a learning experience, like a mother bird pushing her chicks out of the nest and into the wide world of regex to help them learn to fly.

I don’t know if birds really do that, but it’s a nice metaphor. Anyways, on to the regular expressions!

Pattern: /\AIs ([a-zA-Z]+) (.+)\?\z/

Response: Fate indicates that UIF0 is UIF1

This is the basic pattern for finding sentences of the form “Is X Y?” and transforming them into the prediction “X is Y”.

Pattern: /\AWhy (.+)\?\z/

Response: Because I said so

This is the basic pattern for finding “Why X?” questions and then giving an unsatisfying excuse along the lines of “Just because”.

Pattern: /.*/

Response: I don’t want to talk about that. Please ask me a question

This is our catch-all base case. It matches anything and makes sure that even if the user types in something completely unexpected DELPHI can still generate a response instead of crashing.
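As a refresher on how the UIF placeholders get filled in, here is a small sketch that applies the “Is X Y?” pattern to one input and substitutes the captured fragments into the response template. The variable names are assumptions made for this example.

```perl
use strict;
use warnings;

my $pattern  = qr/\AIs ([a-zA-Z]+) (.+)\?\z/;
my $template = "Fate indicates that UIF0 is UIF1";

# In list context a successful match returns the captured groups in order
my @UIF = ("Is Perl a good choice for this program?" =~ $pattern);

my $response = $template;
for (my $i = 0; $i < @UIF; $i++) {
    my $find    = "UIF$i";      # placeholder text to look for
    my $replace = $UIF[$i];     # fragment captured from the user's input
    $response =~ s/$find/$replace/g;
}

print "$response\n";
# prints: Fate indicates that Perl is a good choice for this program
```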

Writing A New Rule

Ok, let’s fire up our automatic test and find a test we’re failing. We’ve barely written any code yet so there should be plenty to choose from.

Hmm… this one looks easy to fix.

Test Case 5 Failed!!!

Input: Is it better to be loved or feared?

Output: Fate indicates that it is better to be loved or feared

Expected: Fate indicates the former

This is the “or” question case and we want DELPHI to give us an answer about which of the two options is best. Instead it looks like DELPHI matched this with the “Is X Y?” pattern, leading to a rather dumb answer.

How to fix this? First, I’ll have to write an “or” pattern and response. Second, I’ll have to make sure that the “or” pattern has a higher priority than the “Is X Y?” pattern. Now please imagine me typing some code for about five minutes. Click click… and done:

$chatPatterns[0][0]=qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/;
$chatPatterns[0][1]="Fate indicates the former";

$chatPatterns[1][0]=qr/\AIs ([a-zA-Z]+) (.+)\?\z/;
$chatPatterns[1][1]="Fate indicates that UIF0 is UIF1";

$chatPatterns[2][0]=qr/\AWhy (.+)\?\z/;
$chatPatterns[2][1]="Because I said so";

$chatPatterns[3][0]=qr/.*/;
$chatPatterns[3][1]="I don't want to talk about that. Please ask me a question";

The regex for “or” was pretty simple. We just look for the word “or” with at least one word before it, one word after it and a final ‘?’ at the end. And since we want this rule to be high priority we put it at the very top of our list. Now to see if it worked:

Test Case 5 Passed

…

Passed 3 out of 11 tests

We passed the test case and we didn’t lose either of the two test cases we were already passing. Cool!
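The reason priority matters is that the failing input satisfies both patterns. A quick sketch, using hypothetical variable names, makes this visible:

```perl
use strict;
use warnings;

my $orPattern = qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/;
my $isPattern = qr/\AIs ([a-zA-Z]+) (.+)\?\z/;

my $input = "Is it better to be loved or feared?";

# Both rules accept this input, so whichever comes first in the list wins
print "The or rule matches\n" if $input =~ $orPattern;
print "The Is rule matches\n" if $input =~ $isPattern;
```

Both lines print, which is why the “or” rule has to sit above the “Is X Y?” rule to get the expected answer.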

Programmer Convenient Syntax

One issue with this latest change was that in order to put the new “or” rule in high priority slot index 0 I had to renumber every other entry in the array. That was kind of annoying and leaves me at risk of accidentally creating two rules with the same index (which would be bad).

So how about we switch to a syntax that lets me insert new rules wherever I want? And as long as I’m rewriting my rules why don’t I fix the “Why” rule. When I first wrote it I accidentally programmed in a different “because” response than the test was expecting. By replacing the old bad response with a new good response I should be able to get up to passing four tests.

my @chatPatterns;

    push(@chatPatterns, 
            [qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/,
                "Fate indicates the former"]);

    push(@chatPatterns, 
            [qr/\AIs ([a-zA-Z]+) (.+)\?\z/, 
                "Fate indicates that UIF0 is UIF1"]);
    
    push(@chatPatterns,
            [qr/\AWhy (.+)\?\z/,
                "Because of reasons"]);

    push(@chatPatterns,
            [qr/.*/,
                "I don't want to talk about that. Please ask me a question"]);

Same rules, but written in a slightly different way. Instead of explicitly choosing an index for each rule I use the push function to just glue new rules onto the end of the list. This means that I can change the priority of rules just by switching their order around, no need to recalculate indexes by hand. This will also make it easier to add new rules to the top or middle of the priority list.

You’ll also notice that I’m using the [ array, items, here ] syntax to build pattern and response arrays right inside of the push command. That feels a lot cleaner to me than trying to do something like this:

my @chatPatterns;

my @orPatternResponse;
$orPatternResponse[0] = qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/;
$orPatternResponse[1] = "Fate indicates the former";
push(@chatPatterns, @orPatternResponse);

my @isPatternResponse;
$isPatternResponse[0] = qr/\AIs ([a-zA-Z]+) (.+)\?\z/;
$isPatternResponse[1] = "Fate indicates that UIF0 is UIF1";
push(@chatPatterns, @isPatternResponse);

Yuck! Just look at all those temporary variables I’d have to come up with names for. And I’m not even sure this code would work. I think pushing an array onto an array just glues them together instead of nesting them like we want. Let’s stick with the anonymous array brackets.
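That suspicion is easy to check. The sketch below, with made-up placeholder data, pushes a plain array and then an anonymous array reference and counts what ends up in the list.

```perl
use strict;
use warnings;

my @pair = ("pattern", "response");

# Pushing a plain array flattens it: each push adds two separate elements
my @flattened;
push(@flattened, @pair);
push(@flattened, @pair);

# Pushing an anonymous array reference adds one nested element per push
my @nested;
push(@nested, [@pair]);
push(@nested, [@pair]);

print "Flattened has ", scalar(@flattened), " elements\n";    # 4
print "Nested has ", scalar(@nested), " elements\n";          # 2
print "Second nested response: $nested[1][1]\n";              # response
```

So the temporary-variable version would only work if each push used a reference, such as \@orPatternResponse, instead of the array itself.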

Conclusion

After adding the “or” rule and fixing our “why” response we now are passing 4 out of 11 test cases. That’s good progress! But adding more and more rules directly inside the generateResponse function is getting pretty messy. Maybe next time I’ll do something to clean that up.