Let’s Program A Chatbot 7: To Be Or Not To Be, That Is The Use Case
Scott Cornaby

Quick Review

Before we start writing new input patterns and response templates, let’s take a look at the three we already have. If you’ve read this far, I assume you understand enough about regular expressions that I don’t need to explain everything. If that is not true, consider this a learning experience, like a mother bird pushing her chicks out of the nest and into the wide world of regex to help them learn to fly.

I don’t know if birds really do that, but it’s a nice metaphor. Anyways, on to the regular expressions!

Pattern: /\AIs ([a-zA-Z]+) (.+)\?\z/

Response: Fate indicates that UIF0 is UIF1

This is the basic pattern for finding sentences of the form “Is X Y?” and transforming them into the prediction “X is Y”.
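To make the two capture groups concrete, here is a minimal sketch of how the captured fragments might be dropped into the response template. The fill_template helper and the sample question are made up for illustration, and the way the UIF placeholders get filled here is an assumption, not necessarily how DELPHI itself does it.

```perl
use strict;
use warnings;

my $pattern  = qr/\AIs ([a-zA-Z]+) (.+)\?\z/;
my $template = "Fate indicates that UIF0 is UIF1";

# Hypothetical helper: swap each UIFn placeholder for the n-th captured fragment.
sub fill_template {
    my ($text, @fragments) = @_;
    $text =~ s/UIF(\d+)/$fragments[$1]/ge;
    return $text;
}

my $input = "Is Perl fun?";   # made-up sample question

# In list context a successful match returns the captured groups in order.
if (my @fragments = $input =~ $pattern) {
    print fill_template($template, @fragments), "\n";
}
```

For the sample question the first group captures “Perl” and the second captures “fun”, so the filled-in template reads “Fate indicates that Perl is fun”.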

Pattern: /\AWhy (.+)\?\z/

Response: Because I said so

This is the basic pattern for finding “Why X?” questions and then giving an unsatisfying excuse along the lines of “Just because”.

Pattern: /.*/

Response: I don’t want to talk about that. Please ask me a question

This is our catch-all base case. It matches anything and makes sure that even if the user types in something completely unexpected DELPHI can still generate a response instead of crashing.

Writing A New Rule

Ok, let’s fire up our automatic test and find a test we’re failing. We’ve barely written any code yet so there should be plenty to choose from.

Hmm… this one looks easy to fix.

Test Case 5 Failed!!!

Input: Is it better to be loved or feared?

Output: Fate indicates that it is better to be loved or feared

Expected: Fate indicates the former

This is the “or” question case and we want DELPHI to give us an answer about which of the two options is best. Instead it looks like DELPHI matched this with the “Is X Y?” pattern, leading to a rather dumb answer.

How to fix this? First, I’ll have to write an “or” pattern and response. Second, I’ll have to make sure that the “or” pattern has a higher priority than the “Is X Y?” pattern. Now please imagine me typing some code for about five minutes. Click click… and done:

$chatPatterns[0][0]=qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/;
$chatPatterns[0][1]="Fate indicates the former";

$chatPatterns[1][0]=qr/\AIs ([a-zA-Z]+) (.+)\?\z/;
$chatPatterns[1][1]="Fate indicates that UIF0 is UIF1";

$chatPatterns[2][0]=qr/\AWhy (.+)\?\z/;
$chatPatterns[2][1]="Because I said so";

$chatPatterns[3][0]=qr/.*/;
$chatPatterns[3][1]="I don't want to talk about that. Please ask me a question";

The regex for “or” was pretty simple. We just look for the word “or” with at least one word before it, one word after it and a final ‘?’ at the end. And since we want this rule to be high priority we put it at the very top of our list. Now to see if it worked:
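Why does position in the list matter? Here is a rough sketch of a first-match-wins lookup, assuming the rules are tried from index 0 upward and the first pattern that matches supplies the response. The first_response name is hypothetical, and UIF substitution is left out to keep the focus on ordering.

```perl
use strict;
use warnings;

my @chatPatterns;

$chatPatterns[0][0] = qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/;
$chatPatterns[0][1] = "Fate indicates the former";

$chatPatterns[1][0] = qr/\AIs ([a-zA-Z]+) (.+)\?\z/;
$chatPatterns[1][1] = "Fate indicates that UIF0 is UIF1";

$chatPatterns[2][0] = qr/\AWhy (.+)\?\z/;
$chatPatterns[2][1] = "Because I said so";

$chatPatterns[3][0] = qr/.*/;
$chatPatterns[3][1] = "I don't want to talk about that. Please ask me a question";

# Hypothetical lookup: walk the rules in order and stop at the first match.
sub first_response {
    my ($input) = @_;
    for my $rule (@chatPatterns) {
        my ($pattern, $response) = @$rule;
        return $response if $input =~ $pattern;
    }
    return;   # unreachable while the catch-all rule is last
}

print first_response("Is it better to be loved or feared?"), "\n";
```

The test question matches both the “or” rule and the “Is X Y?” rule, so whichever one sits earlier in the list decides the answer. With the “or” rule at index 0, this prints “Fate indicates the former”.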

Test Case 5 Passed

…

Passed 3 out of 11 tests

We passed the test case and we didn’t lose either of the two test cases we were already passing. Cool!

Programmer Convenient Syntax

One issue with this latest change was that in order to put the new “or” rule in high priority slot index 0 I had to renumber every other entry in the array. That was kind of annoying and leaves me at risk of accidentally creating two rules with the same index (which would be bad).

So how about we switch to a syntax that lets me insert new rules wherever I want? And as long as I’m rewriting my rules, why don’t I fix the “Why” rule? When I first wrote it I accidentally programmed in a different “because” response than the test was expecting. By replacing the old bad response with a new good response I should be able to get up to passing four tests.

my @chatPatterns;

push(@chatPatterns,
    [qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/,
        "Fate indicates the former"]);

push(@chatPatterns,
    [qr/\AIs ([a-zA-Z]+) (.+)\?\z/,
        "Fate indicates that UIF0 is UIF1"]);

push(@chatPatterns,
    [qr/\AWhy (.+)\?\z/,
        "Because of reasons"]);

push(@chatPatterns,
    [qr/.*/,
        "I don't want to talk about that. Please ask me a question"]);

Same rules, but written in a slightly different way. Instead of explicitly choosing an index for each rule I use the push function to just glue new rules onto the end of the list. This means that I can change the priority of rules just by switching their order around, no need to recalculate indexes by hand. This will also make it easier to add new rules to the top or middle of the priority list.

You’ll also notice that I’m using the [ array, items, here ] syntax to build pattern and response arrays right inside of the push command. That feels a lot cleaner to me than trying to do something like this:

my @chatPatterns;

my @orPatternResponse;
$orPatternResponse[0] = qr/[a-zA-Z]+ or [a-zA-Z]+.*\?\z/;
$orPatternResponse[1] = "Fate indicates the former";
push(@chatPatterns, @orPatternResponse);

my @isPatternResponse;
$isPatternResponse[0] = qr/\AIs ([a-zA-Z]+) (.+)\?\z/;
$isPatternResponse[1] = "Fate indicates that UIF0 is UIF1";
push(@chatPatterns, @isPatternResponse);

Yuck! Just look at all those temporary variables I’d have to come up with names for. And this code wouldn’t even work as written: pushing an array onto an array flattens it, gluing the elements onto the end instead of nesting them like we want, so I’d have to push a reference (\@orPatternResponse) instead. Let’s stick with the anonymous array brackets.
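The difference between gluing and nesting is easy to check. Here is a small sketch, using made-up variable names, that compares pushing a plain array, pushing a reference to that array, and pushing an anonymous array:

```perl
use strict;
use warnings;

my @whyRule = (qr/\AWhy (.+)\?\z/, "Because of reasons");

# Pushing the array itself copies its elements in one by one (flattening).
my @flattened;
push(@flattened, @whyRule);

# Pushing a reference keeps the pair together as a single nested entry.
my @nestedByRef;
push(@nestedByRef, \@whyRule);

# The anonymous array brackets build that reference without a named temporary.
my @nestedAnon;
push(@nestedAnon, [qr/\AWhy (.+)\?\z/, "Because of reasons"]);

print "flattened entries: ", scalar(@flattened), "\n";    # 2
print "nested entries: ", scalar(@nestedByRef), "\n";     # 1
print "anonymous response: ", $nestedAnon[0][1], "\n";    # Because of reasons
```

The flattened list ends up with two top-level entries (a pattern and a string sitting side by side), while both nested versions hold one entry that is itself a two-item array, which is the shape a pattern and response list needs.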

Conclusion

After adding the “or” rule and fixing our “why” response we are now passing 4 out of 11 test cases. That’s good progress! But adding more and more rules directly inside the generateResponse function is getting pretty messy. Maybe next time I’ll do something to clean that up.