
Introduction to Rasa: the NLU chatbot framework

Introduction to Rasa

Written by: shyambhu mukherjee

Motivation:


With the onset of 2021, I planned to up-skill in chatbot creation. There are a number of frameworks available for chatbot creation, such as Dialogflow, Rasa and others. I have already written about what chatbots are and created a small appointment scheduler chatbot using Dialogflow by following Google's developer course on the same. If you don't know what a chatbot is, read the above linked article first, and then continue with this post.

Summary of the article:

In this article, we will first describe what Rasa is and what a normal chatbot's anatomy looks like. Then we will quickly go over a few concepts related to chatbot frameworks. Finally, we will document our process of creating a healthcare chatbot by providing a step-by-step guide to install, initialize, train and deploy a rasa bot on a linux machine. By the end of this post, you will be able to create simple chatbots using rasa. For advanced concepts, I will write a more advanced article in the coming days and the link will be available in the conclusion of this article.

What is Rasa?

Rasa, like many other upcoming open source tech companies, is a tech company focused on building a conversational AI platform and enabling companies to create better, more advanced chatbots by leveraging state-of-the-art natural language understanding and dialogue management frameworks. It was founded in 2016 by Alex Weidauer. Rasa is also the open source github repository which contains the whole tech behind rasa. It has more than 10k stars, as well as 450+ contributors. So when you talk about rasa in a technical context, you will probably be referring to the rasa repository.

Now that we have a bit of an idea of what Rasa is, let's quickly see what a chatbot is and what we need to learn about it as a whole.

What are chatbots and what is there to learn about them?

Chatbots are basically software programs designed to chat with humans like a human would, in order to serve different purposes. In general, chatbots are used in customer service, in healthcare for patient handling, and in many other scenarios. In recent times, with AI, chatbots have also been used as companions for humans, as well as to impersonate famous persons in some places.

A chatbot basically has three parts. The first part is the front end, which takes the user's responses. The second part is a natural language understanding engine, which processes the user's responses, detects intents and entities from them, and passes these on to the third part. The third part handles the user's requests and information: the entities and information from the user's responses and interaction are provided to an API or a service, and the processed results are then used to create a response back to the user.

Now, you may want to know what intents and entities are. An intent refers to the goal or aim of the user's message. For example, a "hi there" message carries an introductory/greeting intent, while "schedule an appointment tomorrow 4pm" carries an appointment scheduling intent. It is necessary to track the intent of the user's message so that the chatbot can generate correct responses to the user.

Entities are, roughly speaking, important keywords present in user messages which represent values worth storing and which are often needed to generate the chatbot's response. These include names, organizations, locations, times and other types of words/phrases mentioned in user messages. Machine learning models such as named entity recognition models are used to detect entities from user messages.
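For example, for the appointment message above, the NLU engine would ideally produce something like the sketch below (the intent and entity names here are hypothetical and only for illustration):

user message : "schedule an appointment tomorrow 4pm"
intent       : schedule_appointment
entities     : time = "tomorrow 4pm"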

Now, let's talk about what you therefore need in order to create a chatbot.

Training items of a chatbot:

Chatbots basically need to do three main things:

(1) detect the intent of the user's message

(2) detect the entities present in the user's message

(3) given the intent and entities, decide what action the chatbot should take.

Therefore, to create any chatbot, we need to first imagine all the possible chat scenarios. For this, the developer will often follow the "Wizard of Oz" technique. Just like in the Wizard of Oz movie, where behind the curtain the wizard was just a man, the developer sits in place of the chatbot and emulates chats with users, to get an idea of what all the conversations can be. For a more complex bot, there will be many paths of conversation and therefore extensive dialogue management is also needed.

Once the dialogue scenarios are created, one needs to work out which intents and entities need to be captured in each of these paths; the developer will then also have to write the response actions.

So this is how chatbot creation works at a high level. Now that we have a good hold of the abstract process, let's get into actually creating a sample rasa bot.

Installing rasa on your system:

rasa can be downloaded as a python library. As python software, rasa supports python 3.6 and 3.7 only. It is recommended to create a virtual environment when creating a new project. You can create one by writing

python3 -m venv <env_name> 

or by writing python3.6 -m venv <env1_name>
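After creating the virtual environment, you need to activate it so that everything gets installed inside it. On a linux machine that is typically done with (replace <env_name> with the name you used):

source <env_name>/bin/activate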

Now, the fun of a virtual environment is that it doesn't come with any python packages installed. So let's install rasa now by typing:

pip install rasa

Once this is installed; we can start working with rasa.

The first rasa chatbot:

We will build our first rasa chatbot, a moodbot, by just writing the following command in the terminal:

rasa init

Once you run that, rasa creates a whole bot, with proper training of the models inside it, configuration settings and other things. Check out the output in the terminal.

 

Now, you will have to press 'yes' a bunch of times here and there, before you face the question:

"? Do you want to speak to the trained assistant on the command line? 🤖  "

You will answer 'yes' to talk to and experience the bot from the terminal. Once you do that, rasa connects the bot and initiates a rasa server which enables the bot to talk with you. Check the relevant conversation with our bot below.


Basically the bot does only three things:

(1) it greets you

(2) it asks you about your mood, and returns a tiger cub's imgur image link if you sound sad.

(3) if you sound happy, it returns "great! carry on".

Finally, when you are done talking with the bot, you can type "/stop"; the server is killed and the bot is stopped.
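For reference, a typical exchange with the moodbot looks roughly like the sketch below (the exact bot utterances may differ slightly between rasa versions):

user: hello
bot:  Hey! How are you?
user: I am feeling sad
bot:  Here is something to cheer you up: <tiger cub image link>
bot:  Did that help you?
user: yes
bot:  Great, carry on!
user: /stop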

If you have learned React, you will realize that this is similar to running create-react-app for the first time.

Into the rasa project structure:

In this section we will explore the code and file structure of the rasa bot created above, dissect a generic rasa project's architecture, and see how such projects work in general.

Open the rasa project directory using Visual Studio Code or the PyCharm IDE. Both allow you to look at the hierarchical structure of the project. At the top level, the rasa project contains:

the directories (1) actions (2) data (3) models (4) tests, and the files

(5) config.yml (6) credentials.yml (7) domain.yml and (8) endpoints.yml
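If you list the project from the terminal, the layout created by rasa init looks roughly like this (minor differences are possible depending on your rasa version):

.
├── actions/
│   └── actions.py
├── data/
│   ├── nlu.yml
│   ├── rules.yml
│   └── stories.yml
├── models/
│   └── <timestamped model>.tar.gz
├── tests/
│   └── test_stories.yml
├── config.yml
├── credentials.yml
├── domain.yml
└── endpoints.yml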

Clearly, 

actions contains the actions the bot can take at the fulfillment level.

data contains the following files: (a) nlu.yml (b) rules.yml (c) stories.yml

nlu.yml contains the training data for intents. rules.yml contains the rules the bot must obey.

stories.yml contains the dialogue management paths, or stories, which the bot is supposed to follow. Each story has a name and contains steps; the steps list intents and actions one after another, and together they represent a dialogue.
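To make this concrete, here is roughly what small pieces of the default nlu.yml and stories.yml look like in the moodbot project (trimmed; the generated files contain more intents, examples and stories):

In data/nlu.yml:

nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi there

In data/stories.yml:

stories:
- story: happy path
  steps:
  - intent: greet
  - action: utter_greet
  - intent: mood_great
  - action: utter_happy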

models contains the zipped versions of the models trained by rasa.

For now, we will not get into the details of the config, credentials and endpoints files. The domain.yml file contains all of the important information for the bot, such as all the intents and all the responses.
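A trimmed sketch of what domain.yml contains for the moodbot (the generated file lists every intent and response):

intents:
  - greet
  - mood_great
  - mood_unhappy

responses:
  utter_greet:
  - text: "Hey! How are you?"
  utter_happy:
  - text: "Great, carry on!"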

How to create new intents and train the bot for them?

To add a new intent, we have to add new items in 3 places. First, we need to add the intent to the nlu.yml file: there you add the intent name and the training examples related to it. Then, in domain.yml, you have to add the new intent name under intents, and under responses you need to create a response for the new intent. Finally, in stories.yml, add new stories by linking the new response and the new intent, as sketched below.
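For instance, suppose we want our health bot to understand a hypothetical ask_opening_hours intent (the intent name, training examples and response text below are made up purely for illustration). The three edits would look roughly like this:

In data/nlu.yml:

- intent: ask_opening_hours
  examples: |
    - when are you open
    - what are your opening hours
    - till what time is the clinic open

In domain.yml:

intents:
  - ask_opening_hours

responses:
  utter_opening_hours:
  - text: "We are open from 9am to 6pm, Monday to Saturday."

In data/stories.yml:

- story: opening hours path
  steps:
  - intent: ask_opening_hours
  - action: utter_opening_hours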

After adding all that information, you have to run "rasa train" in the terminal to train the model on the new intent and training data. And voila! Your new intent will be set up correctly.

The rasa commands:

So, in summary, there are a number of rasa commands we use when interacting with rasa. Here is a small list of these commands and their uses:

(1) rasa init: begins a rasa moodbot chatbot project.

(2) rasa train: trains the NLU model using rasa. 

(3) rasa shell: starts the open-source server from rasa and starts a chat session in the terminal. This session ends with "/stop" input from the user.

(4) rasa interactive: starts an interactive chat session where the bot works interactively, i.e. it makes each decision and then lets you select whether that decision is correct or not; the bot keeps learning in this way. To end this session, you have to press ctrl+c.

Conclusion:

So in this project, we have explored what rasa is and what a chatbot is; we then installed rasa on our local machine and created the mood bot. We also explored what a normal chatbot project looks like, what its components are, and how to train it for custom intents. With all this knowledge, you can now go ahead and train the health bot which is taught in the basic course mentioned in the references. Thanks for reading the first post of our rasa series. Stay tuned for the next posts in the rasa series, where we will discuss more complex chatbot features like forms, incorporating actions and others.

References:

Rasa for beginners
