The Last Test Case (For Now…)
We’re down to our final test case. Are you excited? I’m excited!
Test Case 4 Failed!!!
Input: Do my readers enjoy this blog?
Output: Fate indicates that my readers enjoy this blog
Expected: Fate indicates that your readers enjoy this blog
Hey, that’s just a “do” rule. We already solved that problem last time. What’s going on here?
Oh, wait. The problem isn’t the “do”. The problem is that the question mentioned “my readers” and DELPHI was supposed to be smart enough to switch the answer around to “your readers”. But DELPHI didn’t do that. We should fix that.
1st And 2nd Person Made Easy
The idea of first versus second person is way too complex for a simple pattern matching chatbot like DELPHI. But the idea of replacing word A with word B is simple enough. And it turns out that replacing first person words with second person words, and vice-versa, is good enough for almost every question that DELPHI is going to run into.
But be careful! When trying to swap A to B at the same time you are swapping B to A it is very possible to accidentally end up with all A. What do I mean? Consider this example:
My dog is bigger than your dog.
We switch the first person words to second person words:
Your dog is bigger than your dog.
Then we switch the second person words to first person:
My dog is bigger than my dog.
I’m sure you can see the problem.
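Here is a minimal sketch of that mistake in Perl. The function name naiveSwap is made up for this illustration, and the substitutions are deliberately simplistic (case-sensitive, no word boundaries) so that they reproduce the example above exactly.

```perl
use strict;
use warnings;

# Deliberately broken two-pass swap, only meant to reproduce the problem.
# naiveSwap is a hypothetical name, not part of DELPHI.
sub naiveSwap {
    my $text = $_[0];

    # Pass 1: first person -> second person
    $text =~ s/My/Your/g;
    $text =~ s/my/your/g;

    # Pass 2: second person -> first person.
    # This also undoes everything pass 1 just did.
    $text =~ s/Your/My/g;
    $text =~ s/your/my/g;

    return $text;
}

print naiveSwap("My dog is bigger than your dog."), "\n";
# prints: My dog is bigger than my dog.
```

After pass 1 there is no way to tell the original “your” apart from the one that used to be “My”, so pass 2 flattens both.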
The other big issue to look out for is accidentally matching words we don’t want to. Allow me to demonstrate:
You are young
We want to change that to:
I am young
But if all we do is blindly swap every “you” for “I” we can easily end up with this:
I am Ing
For an even worse example, consider what happens to this sentence when every “i” is blindly swapped for “you”:
I think pink is nifty.
You thyounk pyounk yous nyoufty.
Solving The Problems
Switching “you” to “I” while avoiding changing “young” to “Ing” is pretty simple with regular expressions. All we have to do is use the “word boundary” symbol \b. Like so:
\byou\b
This will automatically skip over any instances of “you” that are directly attached to other letters, digits or underscores.
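A quick sketch of the difference, using the “You are young” example from earlier. The variable names are made up, and the i modifier on the end only makes the match ignore case so that the capital “You” is caught too. Note that this only swaps the pronoun; turning “are” into “am” is a separate problem.

```perl
use strict;
use warnings;

my $blind   = "You are young";
my $bounded = "You are young";

# No word boundaries: the "you" hiding inside "young" gets replaced too.
$blind =~ s/you/I/gi;

# \b on both sides: only "you" standing alone as a word is replaced.
$bounded =~ s/\byou\b/I/gi;

print "$blind\n";    # prints: I are Ing
print "$bounded\n";  # prints: I are young
```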
Making sure that we don’t accidentally switch words from first to second person and then back from second to first will be a little more tricky. There are several possible solutions, some involving some cool regex and Perl tricks, but for now I’m just going to use something very straightforward.
Basically I’m going to replace every first and second person word with a special placeholder value that I’m relatively certain won’t show up in normal DELPHI conversations. Then I will change all the placeholder values to their final values. Here is how this will work with the above example:
My dog is bigger than your dog.
We switch the first person words to placeholders
DELPHIyour dog is bigger than your dog.
Then we switch the second person words to placeholders. Because we used the placeholder “DELPHIyour” instead of plain “your” we don’t accidentally switch the first word back to “my”.
DELPHIyour dog is bigger than DELPHImy dog
Then we replace the placeholders
Your dog is bigger than my dog.
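For the curious, here is a rough sketch of one of the alternative tricks hinted at earlier: doing the whole swap in a single pass with a lookup hash, so no placeholders are needed. This is not what DELPHI uses. The names %swap, $words and swapInOnePass are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical lookup table: each word maps straight to its opposite.
my %swap = (
    i     => 'you',
    me    => 'you',
    my    => 'your',
    mine  => 'yours',
    you   => 'I',
    your  => 'my',
    yours => 'mine',
);

# One big alternation like "i|me|my|..." built from the hash keys.
my $words = join '|', map { quotemeta } keys %swap;

sub swapInOnePass {
    my $text = $_[0];
    # Each match is looked up and replaced exactly once. The replaced text
    # is never scanned again, so nothing can be swapped back.
    # The e modifier makes Perl evaluate the replacement as code.
    $text =~ s/\b($words)\b/$swap{lc $1}/gie;
    return $text;
}

print swapInOnePass("My dog is bigger than your dog."), "\n";
# prints: your dog is bigger than my dog.
```

The placeholder approach takes more lines but every step is easy to follow, which is a fair reason to prefer it in a tutorial.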
Here It Is In Code
I like foreach loops, so I’m going to implement this as two arrays and two foreach loops. The first array will contain regular expressions for finding first and second person words along with the placeholders we want to replace them with. The second will contain regular expressions for finding placeholders and replacing them with the proper first and second person words.
To implement this I just drop these variables and this function into DELPHI.pm right after generateResponse. The only new coding trick to look for is the ‘i’ modifier on the end of some of the regular expressions. This is the “case insensitive” switch and makes sure that DELPHI can match the words we want whether they are capitalized or not*.
#Dictionaries used to help the switchFirstAndSecondPerson function do its job

#Each entry pairs a regular expression with the placeholder that replaces it
my @wordsToPlaceholders = (
    [qr/\bI\b/i,     'DELPHIyou'],
    [qr/\bme\b/i,    'DELPHIyou'],
    [qr/\bmine\b/i,  'DELPHIyours'],
    [qr/\bmy\b/i,    'DELPHIyour'],
    [qr/\byou\b/i,   'DELPHIi'],
    [qr/\byour\b/i,  'DELPHImy'],
    [qr/\byours\b/i, 'DELPHImine'],
);

#Each entry pairs a placeholder with its final word
#Longer placeholders come first so DELPHIyou never matches the front of DELPHIyour
my @placeholdersToWords = (
    [qr/DELPHIyours/, 'yours'],
    [qr/DELPHIyour/,  'your'],
    [qr/DELPHIyou/,   'you'],
    [qr/DELPHImine/,  'mine'],
    [qr/DELPHImy/,    'my'],
    [qr/DELPHIi/,     'I'],
);

sub switchFirstAndSecondPerson{
    my $input = $_[0];

    #Pass 1: swap every first and second person word for a placeholder
    foreach my $wordToPlaceholder (@wordsToPlaceholders){
        $input =~ s/$wordToPlaceholder->[0]/$wordToPlaceholder->[1]/g;
    }

    #Pass 2: swap every placeholder for its final word
    foreach my $placeholderToWord (@placeholdersToWords){
        $input =~ s/$placeholderToWord->[0]/$placeholderToWord->[1]/g;
    }

    return $input;
}
Using The New Function In Generate Response
With that out of the way all that is left is to figure out where inside of generateResponse we should be calling this function. My first thought was to just stick it onto the end of the function by finding the original return statement:
return $response;
And replacing it with this:
return switchFirstAndSecondPerson($response);
Now this is where test driven development comes in handy because that simple change did indeed pass test case 4… but it also caused messes like this:
Test Case 0 Failed!!!
Input: Will this test pass?
Output: you predict that this test will pass
Expected: I predict that this test will pass
…
Test Case 8 Failed!!!
Input: Pumpkin mice word salad
Output: you don’t want to talk about that. Please ask you a question
Expected: I don’t want to talk about that. Please ask me a question
We’ve accidentally made it impossible for DELPHI to talk in first person, which wasn’t what we wanted at all. We only wanted to change first and second person words from the user’s input fragments, not from our carefully handwritten DELPHI responses. Which is a pretty good hint that we should have called switchFirstAndSecondPerson on the user’s input BEFORE we tried to parse it and generate a response, not after. Maybe right at the beginning of the function:
sub generateResponse{
    my $userInput = $_[0];
    $userInput = switchFirstAndSecondPerson($userInput);

    foreach my $chatPattern (@chatPatterns){

        if(my @UIF = ($userInput =~ $chatPattern->[0])){
            my $response = $chatPattern->[1];
            for(my $i=0; $i<@UIF; $i++){
                my $find = "UIF$i";
                my $replace = $UIF[$i];
                $response =~ s/$find/$replace/g;
            }
            return $response;
        }
    }
    return "Base Case Failure Error!";
}
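To see why the order matters, here is a small self-contained sketch. Everything in it is a stand-in: swapPerson is a cut-down version of switchFirstAndSecondPerson that only knows “my”, “I” and “me”, and respondRaw contains one made-up pattern plus a fallback, since DELPHI’s real chat patterns live elsewhere.

```perl
use strict;
use warnings;

# Cut-down stand-in for switchFirstAndSecondPerson (first -> second person only).
sub swapPerson {
    my $text = $_[0];
    $text =~ s/\bmy\b/your/gi;
    $text =~ s/\b(?:I|me)\b/you/gi;
    return $text;
}

# Stand-in for generateResponse with one hypothetical rule and a fallback.
sub respondRaw {
    my $input = $_[0];
    if (my @UIF = ($input =~ /\ADo (.+)\?\z/i)) {
        return "Fate indicates that $UIF[0]";
    }
    return "I don't want to talk about that. Please ask me a question";
}

# Swapping AFTER the response is built mangles DELPHI's own words.
my $after = swapPerson(respondRaw("Pumpkin mice word salad"));

# Swapping BEFORE only touches what the user typed.
my $before = respondRaw(swapPerson("Pumpkin mice word salad"));

print "$after\n";   # prints: you don't want to talk about that. Please ask you a question
print "$before\n";  # prints: I don't want to talk about that. Please ask me a question

print respondRaw(swapPerson("Do my readers enjoy this blog?")), "\n";
# prints: Fate indicates that your readers enjoy this blog
```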
The Moment Of Truth
Did we do it? Did we resolve our final use case?
Drum roll please…………
Test Case 0 Passed
Test Case 1 Passed
Test Case 2 Passed
Test Case 3 Passed
Test Case 4 Passed
Test Case 5 Passed
Test Case 6 Passed
Test Case 7 Passed
Test Case 8 Passed
Test Case 9 Passed
Test Case 10 Passed
Test Case 11 Passed
Test Case 12 Passed
Test Case 13 Passed
——————–
Passed 14 out of 14 tests
All Tests Passed!
WHOOO! GO US!
Note To Exceptionally Clever Readers
All my readers are clever, but some of you are exceptionally clever. And you may have noticed that switchFirstAndSecondPerson always returns lowercase words even when the original word was capitalized or at the beginning of the sentence. This isn’t a huge problem, but if you’re a perfectionist it might be bugging you to accidentally change “I care about grammar” to “you care about grammar” instead of “You care about grammar”.
One easy solution would be to update DELPHI to capitalize its entire output. People are used to computer programs SPEAKING IN ALL CAPS and it saves us the effort of having to actually teach DELPHI anything about proper capitalization.
If you don’t like the caps lock look you could instead update DELPHI to always make sure output starts with a capital. More often than not this is all it takes to make a sentence look like real English.
Or you can just do what I do and ignore the problem. I’m not going to worry too much about the occasional lowercase “you” or “my” unless users start complaining. And since this program isn’t intended for any real users that’s not likely to ever happen. Customer satisfaction is easy when you have no customers!
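For the perfectionists who do want to fix it, both options are one-liners with Perl’s built-in uc and ucfirst functions. This is only a sketch; where exactly you would apply it inside DELPHI is up to you, and the variable names are made up.

```perl
use strict;
use warnings;

my $response = "you care about grammar";

# Option 1: shout everything.
my $shouting = uc $response;

# Option 2: capitalize only the first letter.
my $capitalized = ucfirst $response;

print "$shouting\n";     # prints: YOU CARE ABOUT GRAMMAR
print "$capitalized\n";  # prints: You care about grammar
```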
Conclusion
That’s it! We’ve passed all of our primary use cases. DELPHI is done.
Or is it? If you can remember all the way back to the original design document one thing we wanted out of DELPHI was the ability to generate random responses to questions. DELPHI currently just guesses “yes” to all questions which is both useless and boring. So while we hit a very important benchmark today we’re still not quite done.
* You know what else case insensitive regular expressions would be good for? Making DELPHI more accepting of user input that isn’t properly capitalized. Expect this to happen in a future blog post.