In December, Amazon announced Amazon Lex, and we’ve been excited about building an Amazon Lex chat-bot inside a mobile app ever since. I’ve put together a quick tutorial below on how to let consumers talk to your mobile app by typing or by voice.
But let’s back up a few steps. First, what is Lex? It’s a new service inside the Amazon Web Services (AWS) platform that lets developers take advantage of the same technology that powers Amazon Alexa, the voice assistant behind the popular Echo devices, which lets people control everything from their lights to their music with simple voice commands.
Lex gives mobile developers like us the ability to create sophisticated chat-bots right inside an Android or iOS application. We’re able to tap into the brains of Lex, which do much more than speech-to-text conversion. Lex tries to understand user input in meaningful ways, so an application can return appropriate results without much work. It also integrates with other AWS services, such as Lambda, API Gateway, and EC2, to give feedback to the user.
So how do you get started building an Amazon Lex chat-bot?
Building Blocks of a Chat Bot
At first glance, building a chat bot seems rather complicated. Taking any kind of user input and returning something meaningful seems challenging. But the goal should be to handle small amounts of data so that a user can complete a task quickly. Anything more and the user will probably be overwhelmed. Simple is best for these bots.
Where to start with a bot? Determining a bot’s intents is the first step. An intent in a chat-bot is defined as “the goal the user wants to achieve”. Let’s use a pizza ordering chat-bot as an example. One intent could be defined as “Ordering a Pizza” while another intent could be “What Toppings Are Available”. There can be many intents per bot. Intents should be kept basic and small, each tied to a well-defined goal, so they don’t overcomplicate a user’s experience.
An utterance is a spoken or typed phrase that invokes the intent. For example, “I’d like to order a pizza” and “Can I order a Meat Lovers pizza for delivery?” are both phrases that would invoke the “Ordering a Pizza” intent. Once the bot has worked out the user’s desired intent from an utterance, it can use that intent to ask the questions needed to fulfill the user’s goal.
Now that the intent has been identified as “Ordering a Pizza”, the chat bot needs to fill in all the necessary data slots. A slot is a piece of input (a string, date, boolean, number, etc.) that is needed to reach the goal of the intent. The chat bot needs to be smart enough to figure out which questions to ask the user in order to fill all the slots. For example, the “Ordering a Pizza” intent might need to ask the following:
- What type of pizza?
  - Meat Lovers
- Delivery Address?
  - Address Input Type
- Payment Method?
  - Credit Card
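The slot-filling idea above can be sketched in a few lines of Swift. This is only an illustration of the concept, not the Lex API: the `Slot` and `Intent` types, the `fill` method, and the slot names are all hypothetical.

```swift
// Hypothetical model of an intent and its slots; not the Lex API.
struct Slot {
    let name: String
    let prompt: String
    var value: String? = nil
}

struct Intent {
    let name: String
    var slots: [Slot]

    // The first slot that still has no value, if any.
    var nextUnfilledSlot: Slot? {
        return slots.first(where: { $0.value == nil })
    }

    // True once every slot has a value and the order can be processed.
    var isReadyToFulfill: Bool {
        return nextUnfilledSlot == nil
    }

    // Record a value for the named slot; unknown slot names are ignored.
    mutating func fill(_ slotName: String, with value: String) {
        guard let index = slots.firstIndex(where: { $0.name == slotName }) else { return }
        slots[index].value = value
    }
}

var order = Intent(name: "OrderPizza", slots: [
    Slot(name: "pizzaType", prompt: "What type of pizza?"),
    Slot(name: "deliveryAddress", prompt: "Delivery address?"),
    Slot(name: "paymentMethod", prompt: "Payment method?")
])

// After the user names a pizza, the bot should ask about the next empty slot.
order.fill("pizzaType", with: "Meat Lovers")
print(order.nextUnfilledSlot?.prompt ?? "All slots filled")  // prints "Delivery address?"
```

The bot keeps asking for `nextUnfilledSlot` until `isReadyToFulfill` is true, which is roughly the loop Lex runs on our behalf.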
Here is an example text conversation showing how the bot fills the slots:
User: I’d like to order a pizza
Bot: Ok, what kind of pizza would you like to order?
User: I would really like to try out your Meat Pizza please!!!
Bot: Ok great! One Meat Lovers pizza has been added to your order.
Bot: Where should I deliver the pizza?
User: 5155 Financial Way
Bot: I’ll deliver to 5155 Financial Way Mason, OH 45040
Bot: How do you plan on paying for the pizza?
User: Credit card
Bot: Great, you’re all set! Your order will be delivered in approximately 30 minutes.
Each slot has a name, a slot type, a version, a prompt, and a flag for whether it is required. The prompt is what Lex will use to ask the user for the correct input. The slot type defines the valid values a user can respond with. Slot types can be either custom defined or one of the Amazon built-in types. Notice in the example the use of the built-in StreetAddress slot type. Lex can help determine and sanitize a user’s address without any code. The pizzaType and paymentMethod slot types are defined on another screen and are simply lists of string values that are valid responses.
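To make the slot fields concrete, here is a minimal sketch of how they might be modeled in Swift. The `SlotType` and `SlotDefinition` types are hypothetical and only mirror the fields described above; the list of pizza values is made up for illustration, and the way built-in types are handled here is a stand-in, since Lex performs that validation itself.

```swift
import Foundation

// Hypothetical types mirroring the slot fields described above; not the Lex API.
enum SlotType {
    case custom(name: String, values: [String])  // a list of valid string values
    case builtIn(name: String)                   // validated by the service itself

    // Custom types accept only listed values (case-insensitive). Built-in types
    // are assumed to be checked by Lex, so this sketch accepts any non-empty text.
    func accepts(_ input: String) -> Bool {
        switch self {
        case .custom(_, let values):
            return values.contains(where: { $0.caseInsensitiveCompare(input) == .orderedSame })
        case .builtIn:
            return !input.isEmpty
        }
    }
}

struct SlotDefinition {
    let name: String
    let slotType: SlotType
    let version: String
    let prompt: String
    let isRequired: Bool
}

// Illustrative values only; a real bot would list its own menu.
let pizzaTypeSlot = SlotDefinition(
    name: "pizzaType",
    slotType: .custom(name: "PizzaTypes", values: ["Meat Lovers", "Pepperoni"]),
    version: "1",
    prompt: "What type of pizza?",
    isRequired: true
)

let addressSlot = SlotDefinition(
    name: "deliveryAddress",
    slotType: .builtIn(name: "StreetAddress"),
    version: "1",
    prompt: "Where should I deliver the pizza?",
    isRequired: true
)

print(pizzaTypeSlot.slotType.accepts("meat lovers"))  // prints "true"
```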
What Lex Provides
Lex is the engine that has to figure out what the user wants. The user says they want a “Meat Pizza”, but it is up to the Lex bot to translate that to “Meat Lovers”. The Lex bot must also ignore unimportant input and recognize responses that fill more than one slot. For example, the user might have said, “I’d like to pay cash for a pepperoni pizza please”. It is up to Lex to recognize that the user actually filled two slots with a single voice or text input.
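To show the shape of that problem (one utterance in, several slot values out), here is a deliberately naive keyword matcher in Swift. It is not how Lex works internally, since Lex relies on trained language understanding rather than substring checks; the `slotSynonyms` table and the `resolveSlots` function are hypothetical.

```swift
import Foundation

// Hypothetical synonym tables: each slot maps trigger phrases to a canonical value.
let slotSynonyms: [String: [String: String]] = [
    "pizzaType": ["meat": "Meat Lovers", "pepperoni": "Pepperoni"],
    "paymentMethod": ["cash": "Cash", "credit card": "Credit Card"]
]

// Returns every slot for which a trigger phrase appears in the utterance.
// If two phrases for the same slot match, which one wins is unspecified,
// because dictionary iteration order is not guaranteed.
func resolveSlots(in utterance: String) -> [String: String] {
    let text = utterance.lowercased()
    var resolved: [String: String] = [:]
    for (slotName, synonyms) in slotSynonyms {
        for (phrase, canonicalValue) in synonyms where text.contains(phrase) {
            resolved[slotName] = canonicalValue
        }
    }
    return resolved
}

// One utterance fills two slots at once.
let filled = resolveSlots(in: "I’d like to pay cash for a pepperoni pizza please")
print(filled["pizzaType"] ?? "none", filled["paymentMethod"] ?? "none")  // prints "Pepperoni Cash"
```

Even this toy version shows why the work belongs in the service: every new phrasing would need another table entry, whereas Lex generalizes from sample utterances.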
Business Logic Takes Over
Now that the chat-bot has gathered all the necessary data, it can be passed along in a normal HTTP request or to a Lambda function to be processed. For example, if the application server had a URL endpoint, it might take three parameters: pizzaType, deliveryAddress, and paymentMethod. The slots set up in the intent can now be used to supply those parameters. This can be done within the AWS console by passing the data to a Lambda function, or the parameters can be returned to the client application, which then calls a REST endpoint.
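As a rough sketch of the client-side option, the function below turns a dictionary of filled slots into a request URL using Foundation’s URLComponents. The `orderURL` function and the `https://example.com/orders` endpoint are placeholders, not part of any real service, and passing the values as query parameters is just one assumption about how such an endpoint might accept them.

```swift
import Foundation

// Builds the request URL for a hypothetical order endpoint from the filled slots.
func orderURL(slots: [String: String]) -> URL? {
    // Placeholder host and path; substitute the real application server.
    var components = URLComponents(string: "https://example.com/orders")
    let parameterNames = ["pizzaType", "deliveryAddress", "paymentMethod"]
    var items: [URLQueryItem] = []
    for name in parameterNames {
        // Refuse to build a request while any required slot is missing.
        guard let value = slots[name] else { return nil }
        items.append(URLQueryItem(name: name, value: value))
    }
    // URLComponents percent-encodes the values (for example, spaces) for us.
    components?.queryItems = items
    return components?.url
}

let url = orderURL(slots: [
    "pizzaType": "Meat Lovers",
    "deliveryAddress": "5155 Financial Way",
    "paymentMethod": "Credit Card"
])
print(url?.absoluteString ?? "missing slots")
```

Returning nil when a slot is missing keeps a half-finished order from ever reaching the server, which matches the idea that fulfillment only happens once every slot is filled.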
Integrating AWS Lex with an iOS or Android application
AWS provides libraries for iOS and Android developers to integrate with its services. With the announcement of Lex, AWS also released client libraries for the bot service. This post walks through an iOS example of Lex integration, but the concepts for Android are similar.
Configuration - iOS/Swift
This configuration sets up our application to talk to our AWS account (the CognitoIdenityId is specific to an AWS account). Once authentication and region configuration are complete, the app can integrate text or voice bot services.
Invoking the Bot with Voice or Text
There is actually very little work to do for the app itself. Simply invoke the bot with either text or voice and handle the output parameters when the user is done.
Voice Chat Button - View Controller
Use AWSLexVoiceButton and the corresponding AWSLexVoiceButtonDelegate for output callbacks.
The view controller simply links a button from the storyboard and provides success/error callbacks with the bot’s response.
A text view controller works similarly to a voice view controller. To send text to a bot, use AWSLexInteractionKit. It has several methods that accept text input and respond with text output or even audio output. This class also helps store state, such as the slots already filled by the user’s input and the intent being invoked.
While Amazon Lex is still in beta, it is becoming increasingly easy to create chat bots within a mobile application. This can be one more way for an app or product to reach consumers in a meaningful way.