Animatronic Talking Tree - Part 2 - Speech Recognition

by fjordcarver in Circuits > Robots


comic1.jpg
In my first Instructable, Animatronic Talking Christmas Tree, I showed you how to take an artificial tree, add some servos and an Arduino, connect it to a Processing sketch running on your computer, and make him talk and animate.

Now I want to take you through a few more steps so that you can turn your treebot into an interactive installation. I am not going to walk you through another motion-triggered, pre-recorded animatronic. Nor am I going to expand on remote controlling the tree; I have already shown you some simple techniques for that in the first Instructable. I am going to show you how to make your Processing sketch recognize verbal commands and perform some useful and some useless (but entertaining) things. If you follow through successfully, you will have turned your push-button-reactive robot into one that listens and responds to your voice, just as a good tree should.

Here's what the tree will do when finished.


Things That You Will Need

DSC03463.JPG
As this is a continuation of the Animatronic Talking Tree Instructable, you should already be in possession of everything that you will need. For those who are just joining in, you will require:
  • One Animatronic Christmas Tree, or a suitable serial-controlled animatronic. (Maybe you have made a talking teddy or grapefruit instead.)
  • A computer running Processing, which you already have if you have completed making the first item on the list.

Setting Up Voce

voce0.jpg
voce1.jpg
moremem2.jpg
First you will need to go and download the voce library. You can grab it here.

This is a Java library, and how I got it working in Processing is a little different from installing most libraries.

Start by unzipping it into a folder called “libraries” inside the folder where Processing is installed. You will need to create the “libraries” folder yourself in newer versions of Processing, as libraries are normally installed to the libraries folder in your sketchbook.

Start up Processing, and open a new sketch.

Now open Windows Explorer and navigate to the folder you just created.

Something like processing/libraries/voce-0.9.1

Now open the lib folder. Select all of the .jar files (all the files in the folder minus the folder called “gram”), and drag them into your new sketch.

You should get a message saying 10 files have been added down in the debug window.

Finally, we need to increase the memory available to Processing, as the voce library is demanding. Click on File/Preferences and increase the available memory to 256 MB.

If you did not follow the first Instructable, go read it now. We will be building on top of what we built there, so you may be a little lost if you don't give it a once-through.

Start Sketching

comic2.jpg
We will start with nearly the same sketch that we finished with in the first Instructable. The key press and mouse press functions have been left out as we will now be working on getting the tree to react to our voice.

Type the following into the new sketch. (Alternatively you can grab the text file “voce1.txt” and copy/paste it into your sketch)

//import the libraries
import guru.ttslib.*;
import processing.serial.*;

//give our instances names
Serial treePort;
TTS tts;


//A string for holding things to say
String message = "Ho Ho Ho";


void setup(){
  //the following initiates the voce library
  voce.SpeechInterface.init("libraries/voce-0.9.1/lib", true, true,"libraries/voce-0.9.1/lib/gram","digits");
  //start our port and also tts
  treePort = new Serial(this,Serial.list()[0],9600);
  tts = new TTS();
  //the following settings control the voice sound
  tts.setPitch( 180 );
  tts.setPitchRange( 90 );
  //tts.setPitchShift( -10.5 );
  treePort.write("73");   //send command to turn on the lights and open the eyes
}

void draw(){
 
  if (voce.SpeechInterface.getRecognizerQueueSize()>0){    //if voce recognizes anything being said
      String s = voce.SpeechInterface.popRecognizedString();      //assign the string that voce heard to the variable s
      println("you said: " + s);                          //print what was heard to the debug window.
      respond(s);
    } 
  
}

You will notice that I am no longer using the MBROLA voices; I found that they were conflicting with voce. By tweaking the pitch, pitch range, and pitch shift you can work out a voice that is similar. I understand that these are not Siri quality voices, but it is still a nice way to have your project talk. I feel that robots sound good when they sound like robots, but then that is just me.

You will also notice that we do not need an import statement for the voce library; we took care of that by adding the jar files directly to the sketch.

The structure of the initialization call in setup is as follows:

(location of library files, boolean for speech generation, boolean for speech recognition, location of the grammar files, and the name of the grammar file.)

Before we can start dealing with more dynamically generated speech, we will need to write a small function that takes care of dynamically animating the tree when it is speaking.

Go down to the bottom of your sketch, beneath the draw function, and add the following.

//This function splits the text up into words and decides how to animate for each one. A word containing "!" marks a pause.
void respond(String input){
  if (input.length() > 0){  //we actually have something to say
    voce.SpeechInterface.setRecognizerEnabled(false);    //stop listening, otherwise we will hear ourselves and go into a loop
    //split the input into words and send a motion command for each one
    String[] words = split(input," ");
    int howMany = words.length;

    for(int i=0;i<howMany;i++){
      String pieces[] = split(words[i],"!");  //if we see a ! then reading pauses slightly so it is a good time to blink
      if(pieces.length==2){
        treePort.write("1");
        int pause = int(random(100));
        if(pause>60){
          treePort.write("5");
        }
        else{
          treePort.write("7");
          delay(500);
        }
      }
      else{
        treePort.write("1");
      }
    }
    tts.speak(input);
    voce.SpeechInterface.setRecognizerEnabled(true);
  }
}

 
This will basically animate your tree depending on the string that it is currently processing. This is a simple attempt. You could take it as far as you like by adding more custom movements over on the Arduino side and parsing the strings further for more accurate syncing.
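To make the animation logic easier to experiment with away from the hardware, here is a minimal standalone sketch in plain Java. It mirrors the loop in respond() but collects the serial commands in a list instead of writing them to treePort. The class and method names (AnimationPlan, commandsFor) are hypothetical, and it treats any word containing "!" as a pause, which is a slight simplification of the split() test above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the original sketch: it lists the
// commands respond() would send for a given sentence.
class AnimationPlan {

    static List<String> commandsFor(String input, Random rng) {
        List<String> commands = new ArrayList<>();
        if (input.isEmpty()) {
            return commands;                 // nothing to say, nothing to animate
        }
        for (String word : input.split(" ")) {
            commands.add("1");               // every word sends command "1"
            if (word.contains("!")) {        // a "!" marks a pause in the speech
                // values 61..99 out of 0..99 choose "5", the rest choose "7"
                commands.add(rng.nextInt(100) > 60 ? "5" : "7");
            }
        }
        return commands;
    }

    public static void main(String[] args) {
        System.out.println(commandsFor("It is now 5! oh 7 pee em", new Random()));
    }
}
```

Printing the list for a few sentences is a quick way to check how often the pause commands fire before you try it on the tree itself.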
 
If you run this sketch now, your robot should be able to recognize the words “one”, “two”, “three”, “four”, “five”, “six”, “seven”, “eight”, “nine”, “zero”, and “o”, and repeat them back to you when it hears them. Go ahead, run the sketch and give it a try. Be patient, it takes a minute (figuratively) to load up and another couple of seconds before the microphone is turned on and properly listening.

Another thing to take note of is how the recognizer tries to make sense out of nearly all the speech it hears, and so will blurt out numbers no matter what words you are speaking. Don't worry, we will deal with that in the following steps. 

Grammar Files

gramfile.jpg
At this point you might be wondering why the sketch is only recognizing digits. Perhaps you took note of the fact that we referred to the digits grammar file in the voce.SpeechInterface.init() call in setup.

So what is a grammar file?

Even though voce has the ability to recognize around 120000 words, for most purposes only a few commands will be required. The words that are defined in a grammar file are the words that will be recognized for an application. So each program, or sketch, that you create using voce will require a grammar file.

Grammar files for voce conform to the Java Speech Grammar Format (JSGF).

The digits grammar file that we used is the example .gram file that was included with the download. Let's have a look at a simple .gram file.

grammar clothes;
public <clothesTypes> =  (pants | shirt | socks);

In this example, the name of the grammar is clothes, so we would also name our file clothes.gram. The grammar rule clothesTypes is satisfied when the recognizer hears any of the words in the list (pants, shirt, socks).

That is a basic grammar file. We will be using a slightly more complex one, although it is not actually required for a project of this scope. It just gives us a chance to build a decent .gram file so that we understand the structure when we want to implement it in something more complex.

Open up Notepad, or your favourite text editor, and type the following. (or grab it)

#JSGF V1.0;

/**
* Grammar File example for Animatronic Tree
*/

grammar tree;

public <tree> = <address> <request> <requestTypes>;

public <vocabulary> = (<address> hello | hello <address>| thank you) * ;

public <extra> = (know | how | why | who | you | hoo | shoo);

<address> = (tree);

<request> = (tell | get the | what);

<requestTypes> = (a joke | weather | time is it | day is it);


Now let's have a look at our definitions.

We start with the name of the grammar, in this case tree. We then have a grammar rule <tree> which requires three conditions to be met: address, request, and request type. Let's scroll down and look at those now.

Address is the word “tree”, which is what I call my tree. You can change this to whatever you like so long as it is in the dictionary of known words. There are a few regular names in the file, but for this project I liked tree. A definition can also be a phrase, so you can get creative; I used “Skull do we know” as a name for another project. (His proper name was Skullduino.)

Request and request type define how to ask for something, so the structure when we talk to the robot will be `Tree, request, requestType`. As you can see I put in a couple of ways to say things.

If we go back up to the two rules that we glossed over, we will first see a vocabulary rule, which is satisfied by some phrases with the address and some without. It is there for greeting the tree and saying thanks.

Then we have extra, which are just some words that are likely to be said between the question and the punch line of a joke. We put them in not so much to react to them as to hear something at that point. If we did not include these “buffer” words, the recognizer would sit around waiting to hear one of the commands or structures we have in place before finishing a joke.
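Because the sketch compares the recognized text against exact strings, it can help to list every phrase the <tree> rule accepts. The following plain Java sketch expands the rule by hand; the class name TreeGrammarPhrases is hypothetical and the word lists are simply copied from the grammar above. It assumes the recognizer hands back the words joined by single spaces, which is what the comparison strings used in this Instructable suggest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: expands <tree> = <address> <request> <requestTypes>
// into the full list of phrases it can match.
class TreeGrammarPhrases {
    static final String[] ADDRESS = {"tree"};
    static final String[] REQUEST = {"tell", "get the", "what"};
    static final String[] REQUEST_TYPES = {"a joke", "weather", "time is it", "day is it"};

    static List<String> allPhrases() {
        List<String> phrases = new ArrayList<>();
        for (String address : ADDRESS) {
            for (String request : REQUEST) {
                for (String type : REQUEST_TYPES) {
                    phrases.add(address + " " + request + " " + type);
                }
            }
        }
        return phrases;
    }

    public static void main(String[] args) {
        for (String phrase : allPhrases()) {
            System.out.println(phrase);
        }
    }
}
```

Note that the rule also accepts combinations that make little sense, such as "tree what a joke". The sketch only reacts to the phrases it explicitly compares against, so the others are simply ignored.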

Save your grammar file in the gram folder in the libraries folder that we created earlier. Call it "tree.gram". 

Change the grammar name from "digits" to "tree" in your voce initialization code (notice that we don't write the .gram extension) and try it out. Your sketch should now be attempting to recognize the words we defined rather than the numbers from before.

//the following initiates the voce library
  voce.SpeechInterface.init("libraries/voce-0.9.1/lib", true, true,"libraries/voce-0.9.1/lib/gram","tree");

comic3.jpg
Now that we have our tree recognizing some words and command structures, we can start creating some functions that will animate him dynamically.

Let's start with some simple dynamic information: the time and the day.

First, comment out or remove the line...

respond(s);

from the if statement in your draw function. To comment it out, just precede it with “//”, like this...

//respond(s);

We don't really want to make a parrot, but rather a tree that seems a little bit smart.

Add the two new if statements shown below to your draw function.

void draw(){

  if (voce.SpeechInterface.getRecognizerQueueSize()>0){    //if voce recognizes anything being said
    String s = voce.SpeechInterface.popRecognizedString();      //assign the string that voce heard to the variable s
    println("you said: " + s);                          //print what was heard to the debug window.
    //respond(s);
    if(s.equals("tree what time is it")){
      getTime();
    }
    if(s.equals("tree what day is it")){
      whatDay();
    }
  }

}

What this does is check to see if the string contained in s is equal to our comparison strings, and if it is, then a call is made to a function called getTime() or whatDay().
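As the list of commands grows, the chain of if statements can get long. One alternative, offered only as a sketch and not what this Instructable uses, is a lookup table from phrase to action. The class CommandTable and its method names are hypothetical, and the lambda syntax in the usage note assumes Java 8 or later, which older versions of Processing may not accept.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical alternative to a chain of if(s.equals(...)) tests:
// each recognized phrase is mapped to the action it should trigger.
class CommandTable {
    private final Map<String, Runnable> actions = new HashMap<>();

    // Associate an exact phrase with an action.
    void register(String phrase, Runnable action) {
        actions.put(phrase, action);
    }

    // Run the action for the phrase if one is registered.
    // Returns true when the phrase was known, false otherwise.
    boolean handle(String heard) {
        Runnable action = actions.get(heard);
        if (action == null) {
            return false;
        }
        action.run();
        return true;
    }
}
```

In use, you would call something like table.register("tree what time is it", () -> getTime()) once in setup, and table.handle(s) in draw in place of the individual comparisons.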

Scroll down to the bottom of your sketch, and add the following getTime() function, which will get the time, assign it to a String variable and then send it to our respond() function.

// Function for getting the time
void getTime(){

   int m = minute();  // Values from 0 - 59
   int h = hour();    // Values from 0 - 23
   String time;
   String daynight = "Ay em";    //A.M. is read as a single word with regard to our animation function so we cheat here.

   if(h>=12){         //noon and later is P.M.
     daynight = "pee em";  //P.M. is read as a single word with regard to our animation function so we cheat here.
   }
   if(h>12){
     h = h - 12;      //convert to a 12 hour clock
   }
   if(h==0){
     h=12;            //midnight is spoken as 12
   }

   if(m==0){
     time = "It is now " + h + " " + daynight;      //if minutes are at zero just say 5 pm
   }
   else if(m<10){     //if minutes are less than ten, say oh instead of zero, we don't say 5 zero one pm
     time = "It is now " + h + "! oh " + m + " " + daynight;
   }
   else{              //if minutes are ten or more just say them normally
     time = "It is now "+ h + "! " + m + " " + daynight;
   }
   println(time);
   message = time;
   respond(message);
}
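The hour handling is easy to get wrong around noon and midnight, so it can be worth checking the sentence building on its own. Here is a minimal plain Java sketch of the same idea as a function that takes the hour and minute as arguments, so you can try any time of day without waiting for the clock. The class and method names (SpokenTime, phrase) are hypothetical.

```java
// Hypothetical standalone version of the sentence building in getTime().
class SpokenTime {

    // hour24 is 0..23 and minute is 0..59, as returned by hour() and minute()
    static String phrase(int hour24, int minute) {
        String suffix = (hour24 >= 12) ? "pee em" : "Ay em";   // noon counts as P.M.
        int h = hour24 % 12;                                   // 12 hour clock
        if (h == 0) {
            h = 12;                                            // midnight and noon are spoken as 12
        }
        if (minute == 0) {
            return "It is now " + h + " " + suffix;
        }
        if (minute < 10) {
            return "It is now " + h + "! oh " + minute + " " + suffix;
        }
        return "It is now " + h + "! " + minute + " " + suffix;
    }

    public static void main(String[] args) {
        System.out.println(phrase(0, 0));
        System.out.println(phrase(12, 5));
        System.out.println(phrase(17, 30));
    }
}
```

For example, phrase(12, 5) gives "It is now 12! oh 5 pee em" and phrase(0, 0) gives "It is now 12 Ay em".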

And now let's add code to ask what day it is as well.

First we will add a line to the declarations section of our sketch. Remember, that is the code before our setup function, at the top of the sketch.

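//these two imports belong with the other import statements at the very top of the sketch
//they are harmless if your version of Processing already provides the classes
import java.util.Calendar;
import java.util.GregorianCalendar;

//gregorian calendar for determining the day
GregorianCalendar gcal = new GregorianCalendar();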

Add the following function to the bottom of your sketch.

//get the day of the week
void whatDay(){
   gcal = new GregorianCalendar();              //refresh the calendar so it holds the current date
   int day = gcal.get(Calendar.DAY_OF_WEEK);    //1 = Sunday through 7 = Saturday

   println("Day of week: " + day);

   switch(day){
     case 1:
       println("Sunday");
       respond("Sunday");
       break;
     case 2:
       println("Monday");
       respond("Monday");
       break;
     case 3:
       println("Tuesday");
       respond("Tuesday");
       break;
     case 4:
       println("Wednesday");
       respond("Wednesday");
       break;
     case 5:
       println("Thursday");
       respond("Thursday");
       break;
     case 6:
       println("Friday");
       respond("Friday");
       break;
     case 7:
       println("Saturday");
       respond("Saturday");
       break;
   }
}
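A more compact way to look up the day name is to use the value of Calendar.DAY_OF_WEEK as an index into an array. This is a minimal plain Java sketch of that idea; the class name DayNames and the method nameOf are hypothetical. It takes the calendar as an argument so that it can be tried with any date.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical helper: maps Calendar.DAY_OF_WEEK (1 = Sunday .. 7 = Saturday)
// to the name the tree should speak.
class DayNames {
    static final String[] NAMES = {
        "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"
    };

    static String nameOf(Calendar cal) {
        // subtract Calendar.SUNDAY (which is 1) to get a zero-based array index
        return NAMES[cal.get(Calendar.DAY_OF_WEEK) - Calendar.SUNDAY];
    }

    public static void main(String[] args) {
        // a new GregorianCalendar holds the date and time at which it was created
        System.out.println(nameOf(new GregorianCalendar()));
    }
}
```

Inside the sketch, the equivalent would be to pass the result of nameOf() straight to respond().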

Save your file, and try running it. You should be able to ask your tree for the time and what day it is if you follow the correct structure and speak clearly.

You need to say any of the comparison strings we put in our code, like “tree, what time is it?”.

It is completely acceptable to speak your punctuation, the program is indifferent to it.

Reading a Newsfeed

comic5.jpg
weather.jpg
Now to add another fun and slightly more useful feature: reading from a news feed. To keep things simple, I have just included code to read a simple weather feed, and it only retrieves the current weather conditions. This was done more as a proof of concept than as a robust weather forecasting robot.

I will not go into great detail about RSS feeds; there is loads of information available out there. Simply put, we will be reading data in from an .xml file and parsing it for the text we require.

Add this line to the declaration section

//the newsfeed to load
String url = "http://rss.theweathernetwork.com/weather/caon0696";

You will need to change the feed to represent your city. Search the weather network for your city and copy the city code at the end of the address bar once you've found it.

Now add an if statement beneath the others in the draw function, defining the command words that will call the getWeather function.

if(s.equals("tree get the weather")){
          getWeather();
     
        }

Finally add the following function to the bottom of your sketch.

//get the weather
void getWeather(){
  String currentWeather;
  //load the feed (XMLElement is the XML class from Processing 1.x)
  XMLElement rss = new XMLElement(this,url);
  XMLElement[] titleXMLElements = rss.getChildren("channel/item/description");

  String weather = titleXMLElements[0].getContent();
  int index = weather.indexOf(",");               //the conditions are the text before the first comma
  currentWeather = weather.substring(0,index);
  index = weather.indexOf("&");                   //the temperature is the two characters before the first "&"
  String temp = weather.substring(index-2,index);

  currentWeather = "the current weather is " + currentWeather + "! , with a temperature of " + temp + " degrees Celsius";

  println(currentWeather);
  message = currentWeather;
  respond(message);
}

This function just loads the feed and parses the returned text in the String weather to return only the actual weather. Then a new String is composed to sound more natural when finally it is spoken back to us.
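The string parsing can be tried without a network connection. The plain Java sketch below applies the same idea (conditions before the first comma, temperature just before the first "&") to a sample description. The sample text is an assumption made up for illustration and is not taken from the real feed, and the class name WeatherParser is hypothetical. Unlike the fixed two-character substring, the pattern used here also picks up a minus sign and temperatures of one or three digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the conditions and temperature out of a
// feed description shaped like "Light snow, -12&nbsp;&deg;C".
class WeatherParser {
    // an optional minus sign and digits, directly followed by "&"
    private static final Pattern TEMPERATURE = Pattern.compile("(-?\\d+)\\s*&");

    // the conditions are everything before the first comma
    static String condition(String description) {
        int comma = description.indexOf(',');
        String text = (comma < 0) ? description : description.substring(0, comma);
        return text.trim();
    }

    // returns the temperature digits, or an empty string if none are found
    static String temperature(String description) {
        Matcher m = TEMPERATURE.matcher(description);
        return m.find() ? m.group(1) : "";
    }

    static String sentence(String description) {
        return "the current weather is " + condition(description)
            + "! , with a temperature of " + temperature(description) + " degrees Celsius";
    }

    public static void main(String[] args) {
        // made-up sample; check the real feed to see its actual layout
        System.out.println(sentence("Light snow, -12&nbsp;&deg;C"));
    }
}
```

If your feed lays out its description differently, the comma and "&" markers are the two things to adjust.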

Go ahead and give it a try. Then join me in the next step where we will look at loading text.

Loading Strings From Text Files

comic6.jpg
By loading our answers from text files, we will be able to have the tree provide different answers to the same questions.

We will start with a simple greeting. By loading a random greeting from a text file we can have a theoretically unlimited number of ways that the tree can respond to a simple 'hello'. Ours won't be unlimited, but you can put as many as you like.

So let's start by adding yet another function to our sketch. Once again, go down beneath everything and enter the following.

//generic get answer...loads a random line from a text file
void getAnswer(String fileName){
  String lines[] = loadStrings(fileName + ".txt");
  int index = int(random(lines.length));  // random index from 0 to lines.length - 1
  println(lines[index]);  // prints the chosen line to the debug window
  message = lines[index];
  respond(message);
}

 
This simple little function will allow us to load a random line from a specific text file, which we specify when we make a call to the function.

Before we can use it, we need to create a text file and place it in our sketch folder.

Open up Notepad and create a new file. Type a bunch of greetings that you would like to hear your robot respond with, each being followed by a line return. Like this.

Hello
Hey
Howdy
Ho Ho Ho
Hello Merry Christmas!
oh hello


Put as many zany entries as you like. I kept mine pretty tame, but feel free to give your tree some character. Sometimes you may want to use your own knowledge of how the program is running and include a few exclamation points for some blinking or a pause. Once you have a few entries, save it to the same folder as your sketch and give it a name that you can remember. I called mine "greetings.txt".

Now we just call the function specifying our text file when we want a greeting. Add this if statement to your code with the others.
 
if(s.equals("hello tree")){
          getAnswer("greetings");
     
        }

Using the same function, we can easily add other functionality. Just create the text file for the robot's responses, add the required words to the .gram file, and call the getAnswer() function specifying the correct text file.

Let's add a response for whenever the tree hears the words "Merry Christmas" and also "Thank you", because I find myself thanking the tree anyways. Good habits die hard?

So first we whip off two text files. I am calling the first one "christmas.txt" and filling it with cheerful Christmas messages.

Merry Christmas! 
Happy Holidays!
Look how cheerful my balls look
Bling! I am a Christmas Tree


The other I am calling "thanks.txt" and filling with thanks.

Thank you for making me feel loved
you are welcome
I aim to please
It is a Christmas thing
If I was not stuck in this pot I would kiss you
kiss me
I am alive
no problem


Now just add to your vocabulary in your tree.gram file, like this:

public <vocabulary> = (<address> hello | hello <address>| thank you | merry christmas) * ;
 
and add the calls to the getAnswer() to your recognition handling code.
 
if(s.equals("merry christmas")){
          getAnswer("christmas");
    
        }

if(s.equals("thank you")){
          getAnswer("thanks");
   
        }

 
That is all there is to it. Go ahead, save your file and give it a try. Your Animatronic Christmas Tree should be turning into a regular little chatterbox. My tree is quickly becoming my plastic pal that is fun to be with.
 

Telling Jokes

comic7.jpg
I am using a simple example of telling a joke to get you started with some more complex conversation. Telling a joke is more complex in the sense that there is a small amount of back and forth. The robot needs to pose the riddle, wait until the "user" supplies an answer, and then give the punchline.

We will be employing the techniques from the last step in more creative ways.

This time we will start with the function and then create our text files.

Add the following function.

//tell a joke
void tellJoke(){
  boolean joking = true;
  String lines[] = loadStrings("jokes1.txt");
  String answers[] = loadStrings("jokes2.txt");   //line N here is the punchline for line N of jokes1.txt
  int index = int(random(lines.length));  // retrieve a random joke from the file
  println(lines[index]);  // print to debug
  respond(lines[index]);
  while(voce.SpeechInterface.getRecognizerQueueSize() > 0){   //discard anything heard before the question was finished
    voce.SpeechInterface.popRecognizedString();
  }
  while(joking == true){
    if(voce.SpeechInterface.getRecognizerQueueSize() > 0){    //the user said something, so deliver the punchline
      String s = voce.SpeechInterface.popRecognizedString();
      println("You said: " + s);
      println(answers[index]);
      respond(answers[index]);
      joking = false;
    }
    else{
      delay(50);   //wait a moment rather than looping at full speed
    }
  }
}

Notice that while we tell the joke we have a boolean value, joking, set to true until the joke is finished. That is how we keep Tree attuned to what it is he is doing. You will also notice that we first read a line from jokes1.txt and then retrieve the matching line from jokes2.txt for an answer. It is as simple as that. You can apply this simple logic to a number of simple conversational 'hooks' to keep someone engaged.
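Since the question and its punchline are matched purely by line number, the two files must stay the same length and in the same order. Here is a minimal plain Java sketch of that pairing with a check up front; the class name JokeBook and its methods are hypothetical, and the arrays stand in for the results of loading the two text files.

```java
import java.util.Random;

// Hypothetical helper: keeps questions and punchlines paired by index
// and refuses to start if the two lists are different lengths.
class JokeBook {
    private final String[] questions;
    private final String[] punchlines;

    JokeBook(String[] questions, String[] punchlines) {
        if (questions.length != punchlines.length) {
            throw new IllegalArgumentException(
                "expected one punchline per question, got "
                + questions.length + " questions and " + punchlines.length + " punchlines");
        }
        this.questions = questions;
        this.punchlines = punchlines;
    }

    // choose a random joke and return its index
    int pick(Random rng) {
        return rng.nextInt(questions.length);
    }

    String question(int index) {
        return questions[index];
    }

    // the punchline lives at the same index as its question
    String punchline(int index) {
        return punchlines[index];
    }
}
```

A check like this turns a mismatched pair of files into a clear error at startup rather than a wrong or missing punchline in the middle of a joke.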

Add this if() statement to your draw() function along with our other ones.

if(s.equals("tree tell a joke")){
          tellJoke();
     
        }



Just add the two text files to your sketch folder along with the others. "jokes1.txt" for the questions to your jokes, "jokes2.txt" for the answers.

By this point, your Animatronic Tree is just like mine, so congratulations for making it this far. If you have been reading this as an entry point into voice controlling your own project, then I hope that it has been informative.


Vote for me!!


Share and enjoy!