Building Better Bots Using Amazon Lex (Part 1)
As Jeff Barr showed in his introductory blog post, Amazon Lex is a service that allows developers to build conversational interfaces for voice and text into applications. With Amazon Lex, the same deep learning technologies that power Amazon Alexa are now available to any developer, so you can quickly and easily build sophisticated, natural language conversational bots (chatbots
What does it take to develop a functional chatbot using Amazon Lex? If you use one of the examples in the
The basics
Chatbot design is a nascent discipline with few established norms. Only interactions with real users can teach us what’s frustrating and what’s delightful. We recommend that you treat this section as an exploration of design considerations, not as a guide for bot design. Here’s what we’ve learned from the millions of interactions with Amazon Alexa.
Simon Sinek urges us to start with why: Why does this chatbot exist? Good design starts with a clear goal: Who is the user, and what is that user trying to accomplish?
You also need to think about modality and medium. A helpful voice interface should anticipate that there are times when the user isn’t paying attention or simply can’t hear what was said, and offer the ability to repeat the last prompt or handle responses like “What?” or “Where were we?” gracefully. For a text interface, this may not be necessary. Some text interfaces might even support response cards, with images and buttons.
For the interface designer, emphasis is an essential tool. It lays out the norms for presenting information and recommending choices. The rules for achieving clarity vary depending on the mode (web UI, text chat, or voice). Depending on the goal and mode, the tools used for emphasis can differ. Consider these examples of how an order could be placed in each of the three modes:
Web (Point and Click) | Chat (Text) | Voice (Speech) |
[“Place your order” button] | Would you like me to place this order? (yes/no) | You would like me to order ____. Is that right? |
With a web experience, the user experience can emphasize “Place your order” to clearly call it out as the recommended choice and to remind the user that this action is a commitment on the customer’s part. In a chat experience, the convention is to list options in parentheses to let the user know what’s expected. You also can use text placement, whitespace, and capital letters for emphasis.
In a voice interaction, you replace this convention with a new norm that assures the user that she is in control of when the order will be placed. For example, you can use a confirmation when making significant changes to the order. Additionally, with voice, you might be able to control the speech rate, for example, by speaking more slowly for emphasis. If no other means are available, you might simply use repetition. Consider this example.
Voice – Repetition for Emphasis |
Your order total comes to $55. Would you like me to place this order and charge $55 on the card saved with this account? |
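The mode-specific conventions above can be sketched in a few lines of Python. This is only an illustration: the function name `confirmation_prompt` and the mode labels are hypothetical, and the wording of each prompt is taken from the examples in this section.

```python
def confirmation_prompt(mode: str, total: str) -> str:
    """Return an order confirmation suited to the interaction mode.

    The mode names ("web", "chat", "voice") are labels made up for this sketch.
    """
    if mode == "web":
        # On the web, this text would be the label of an emphasized button.
        return "Place your order"
    if mode == "chat":
        # Chat convention: list the expected answers in parentheses.
        return "Would you like me to place this order? (yes/no)"
    if mode == "voice":
        # Voice has no visual emphasis, so the amount is repeated instead.
        return (
            f"Your order total comes to {total}. "
            f"Would you like me to place this order and charge {total} "
            "on the card saved with this account?"
        )
    raise ValueError(f"unsupported mode: {mode}")


print(confirmation_prompt("voice", "$55"))
```

Keeping the wording per mode in one place makes it easier to review the three experiences side by side.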
Enter CoffeeBot
Let’s say that you need a voice bot to support conversations involving uncommon words such as when you order your latté: “May I have a triple mocha please?” Or, if you’re comfortable ordering a bot around, “Get me a triple mocha.”
Using the Amazon Lex console, we create a custom bot. We call it “CoffeeBot” and use an IAM role that has the appropriate permissions to invoke Amazon Lex. To learn how to do this, see the Amazon Lex Getting Started guide.
Lex terminology
As Jeff explained in his blog, Amazon Lex uses intents, slot types, and slots. To promote reusability, intents and slot types are associated with an AWS account and can be used by multiple Amazon Lex bots. As you make changes, you will notice that Amazon Lex automatically tracks a version for these resources so you know exactly what’s used for the particular version of the bot you’re testing. While this might seem insignificant at first, it’s essential to support concurrent and continuous development.
Conversation flow
When starting on a new bot, it’s often helpful to simply record how people normally engage in conversation for a single type of request, analyze those requests, and then extrapolate to a larger scope of requests. For CoffeeBot, let’s focus on how someone might order a mocha at a coffee shop.
Live Conversation | Phase of Conversation | Intents and Alternatives |
Hi – what can I get you today? > Oh hi! Can I get a large non-fat mocha please? | Greetings & initial request – Beverage type: mocha – Beverage size: large – Creamer: non-fat | Could start with just the drink type or another combination. |
Would you like that iced? > No thanks. | Configure – Beverage temp: hot | Just hot or iced? What are the other values? |
What kind of chocolate? > Dark | Configure – Chocolate type: dark | All drinks don’t allow chocolate. Only mocha? |
Any whip? > No, thanks. | Configure – Whipped cream: no | Are there different types of whipped cream? Maybe change to a different drink or size? |
Okay, one large single dark non-fat mocha, no whip. > Oh, can you add a pump of toffee? | Confirm → Configure – Flavor: 1 pump toffee | Could have just accepted here. Two kinds of flavors? Is there a limit? |
Sure! One large single dark non-fat mocha, no whip with 1 pump of toffee. What’s your name? > Jenny | Confirm → Label – Name: Jenny | Store name as preference in app |
Thanks, Jenny! Your drink will be on the right in just a few. That’ll be $4.17. Can I get you anything else? > No, thanks. | Check out | Charge options for this |
Out of five. Here’s your change. Have a great day! > Yeah – you too! | Check out → Done | placeOrder: name, beverageConfig – places back-end Order – send email confirmation? – earn points? |
Real orders are a lot more complicated than this simple mocha example. There might be over a hundred thousand valid coffee drinks. Natural conversations don’t follow a rigid order; the user might change her mind partway through the conversation or skip parts of the conversation. This flexibility of flow is a hallmark of natural conversations.
Consider the case where CoffeeBot hears “double short mocha.” Did the user ask for a “short (8 oz.) mocha with two shots of espresso” or did something get lost while the user was actually asking for a “double shot mocha” without mentioning a size? Validating the input is still necessary. The next post in this series will cover input validation.
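To make the “double short mocha” ambiguity concrete, here is a minimal sketch that flags size words that are a near miss for another ordering word, so the bot can ask rather than guess. It is not how Amazon Lex resolves input. The word lists and the function `words_to_confirm` are hypothetical, and the similarity check relies on `difflib` from the Python standard library.

```python
from difflib import get_close_matches

# Hypothetical vocabularies for this sketch only.
SIZE_WORDS = ["short", "small", "medium", "large"]
CONFUSABLE_WORDS = ["shot"]  # words that are easily misheard as a size


def words_to_confirm(utterance: str) -> list:
    """Return size words that closely resemble a confusable word."""
    flagged = []
    for word in utterance.lower().split():
        is_size = word in SIZE_WORDS
        # A high cutoff keeps unrelated words such as "small" from matching.
        near_miss = get_close_matches(word, CONFUSABLE_WORDS, n=1, cutoff=0.8)
        if is_size and near_miss:
            flagged.append(word)
    return flagged


print(words_to_confirm("double short mocha"))
```

A flagged word is a signal to confirm the size with the user instead of silently accepting it.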
When designing conversational interfaces, it’s important to remain focused. Some of the complexity we noted naturally disappears when you start with an app that “knows” who the user is and can handle payment. If it was a particularly cold or wet day, this conversation might have included the weather as a topic. Although it might be fun to model such tangents, that might not be the best use of your time. Focus on the development tasks that substantially enhance the user experience instead of those that are merely nice to have.
Conversation: information
Analysis like this reveals the structure of these conversations. The snippets of information might be shared separately or together, in one of many possible sequences, and there might be optional add-ons for certain beverages. Even so, there is a set of values that must be known in order to place an order for a mocha. Accepting that you are catering to an essential subset of what you have discovered, and that these slots will be refined over time using a feedback loop, you start with some slot types for CoffeeBot.
Development teams will create their own conventions for making it easy to find shared resources, but we’ll use “cafe” as a prefix for ours.
Slot Type | Slot Values |
cafeBeverageType | coffee; cappuccino; latte; mocha; chai; espresso; smoothie |
cafeCreamerType | two percent; skim milk; soy; almond; whole; non-fat; skim; half and half |
cafeStrength | single; double; triple; quad; quadruple |
cafeFlavor | vanilla; almond; French vanilla; caramel; hazelnut |
cafeBeverageSize | kids; small; medium; large; extra large; short; six ounce; eight ounce; twelve ounce; sixteen ounce; twenty ounce |
cafeBeverageTemp | kids; hot; iced |
cafeBeverageExtras | half sweet; semi sweet |
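If you prefer to keep these slot types in code as well as in the console, the table above could be captured as plain data. The sketch below makes no AWS calls. The helper `slot_type_definition` is hypothetical, and the `name` and `enumerationValues` field names are an assumption about what a slot type definition looks like, so check them against the Amazon Lex API reference before relying on them.

```python
# Slot types and values copied from the table above.
SLOT_TYPES = {
    "cafeBeverageType": ["coffee", "cappuccino", "latte", "mocha", "chai",
                         "espresso", "smoothie"],
    "cafeCreamerType": ["two percent", "skim milk", "soy", "almond", "whole",
                        "non-fat", "skim", "half and half"],
    "cafeStrength": ["single", "double", "triple", "quad", "quadruple"],
    "cafeFlavor": ["vanilla", "almond", "French vanilla", "caramel", "hazelnut"],
    "cafeBeverageSize": ["kids", "small", "medium", "large", "extra large",
                         "short", "six ounce", "eight ounce", "twelve ounce",
                         "sixteen ounce", "twenty ounce"],
    "cafeBeverageTemp": ["kids", "hot", "iced"],
    "cafeBeverageExtras": ["half sweet", "semi sweet"],
}


def slot_type_definition(name, values):
    """Build a dict describing one slot type (field names are assumed)."""
    if not name.startswith("cafe"):
        # Enforce the team's naming convention for shared resources.
        raise ValueError(f"{name!r} does not use the 'cafe' prefix")
    return {"name": name, "enumerationValues": [{"value": v} for v in values]}


definitions = [slot_type_definition(n, v) for n, v in SLOT_TYPES.items()]
print(definitions[0]["name"], len(definitions[0]["enumerationValues"]))
```

Keeping the values in one structure also makes it easy to review additions as the slots are refined over time.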
Conversation: goal
Next, define a goal, known as an ‘intent’ in Amazon Lex terminology. Although the bot might ultimately use multiple intents, let’s start with just one. (Multiple bots can use the same intent, and potentially different versions of the same intent.)
It’s important to provide some examples of what you might expect users to actually say. Amazon Lex calls these examples “utterances.” This is important because Amazon Lex uses them to train machine learning models to recognize the right intents. This text doesn’t have to exactly match what the user says, and it’s good to start with a small set and then add permutations as needed.
Each utterance refers to one or more slots, and those slots must be defined. Each slot can also be associated with one or more prompts that Amazon Lex uses to elicit the value from the user. The Amazon Lex dialog manager keeps track of the slot values. It can also use the priority of each required slot to decide which one to prompt for next.
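The idea of prompting for the highest-priority unfilled slot can be sketched as follows. This is a toy model, not the Amazon Lex dialog manager. The `Slot` class, the priority numbers, and the Creamer slot are hypothetical; the BeverageType and BeverageSize slot names and the size prompt are the ones used in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    name: str
    slot_type: str
    required: bool
    priority: int  # lower number means "ask sooner" in this sketch
    prompt: str


SLOTS = [
    Slot("BeverageType", "cafeBeverageType", True, 1,
         "What kind of beverage would you like?"),
    Slot("BeverageSize", "cafeBeverageSize", True, 2,
         "What size? tall, medium, large?"),
    Slot("Creamer", "cafeCreamerType", False, 3,
         "What kind of milk or creamer?"),
]


def next_prompt(filled: dict) -> Optional[str]:
    """Return the prompt for the most urgent missing required slot, if any."""
    missing = [s for s in SLOTS if s.required and s.name not in filled]
    if not missing:
        return None  # every required slot has a value
    return min(missing, key=lambda s: s.priority).prompt


print(next_prompt({"BeverageType": "mocha"}))
```

Optional slots such as the creamer are never prompted for in this sketch; they are filled only if the user volunteers them.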
At this point, the bot is ready to place an order for coffee. We will worry about fulfillment in our Part 2 post.
Conversation: test
Next, you build and test the bot. It’s very useful to test what you have so far because it’s easier to find and fix errors at this step than when you have a mobile app with a lot more going on. You can test it in the overlay that appears in the Amazon Lex console.
What just happened? Although we didn’t configure this exact text as an utterance, the input text “I want a mocha” was matched to the cafeOrderBeverageIntent you created, and the utterance was interpreted as “I want a {BeverageType: mocha}”. Then, Amazon Lex determined that BeverageSize was required and prompted with the default prompt for this slot, namely, “What size? tall, medium, large?”
Finally, when all required slots have been filled, Amazon Lex simply displayed the values as requested (“Return parameters to client”).
Conversation: flow
What if Lex doesn’t understand? You can use clarification prompts in the intent editor on the console to try to elicit something different and, failing all else, exit gracefully.
With this configuration, we see that Amazon Lex uses different clarification prompts up to three times and then gives up, using one of the hang-up phrases. Why multiple prompts? It’s true that you need just one, but having multiple prompts allows Amazon Lex to choose among them, which maintains spontaneity.
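The retry behavior described above can be modeled with a short loop. This is a sketch of the observable behavior only, not of Amazon Lex internals. The prompt texts, the `bot_replies` function, and the `is_understood` callback are hypothetical.

```python
import random

# Hypothetical prompt texts; in Amazon Lex these are configured on the console.
CLARIFICATION_PROMPTS = [
    "Sorry, can you please repeat that?",
    "I didn't catch that. What would you like to order?",
    "Could you say that another way?",
]
HANG_UP_PHRASES = [
    "Sorry, I could not understand. Goodbye.",
    "I'm having trouble understanding. Please try again later.",
]
MAX_CLARIFICATIONS = 3


def bot_replies(user_turns, is_understood, rng=None):
    """Return the bot's replies to a run of user turns it may not understand."""
    rng = rng or random.Random()
    replies = []
    for turn in user_turns:
        if is_understood(turn):
            break  # normal slot filling would continue from here
        if len(replies) == MAX_CLARIFICATIONS:
            # Out of retries: exit gracefully with a hang-up phrase.
            replies.append(rng.choice(HANG_UP_PHRASES))
            break
        # Picking at random keeps repeated prompts from sounding identical.
        replies.append(rng.choice(CLARIFICATION_PROMPTS))
    return replies


print(bot_replies(["blah"] * 4, lambda text: "mocha" in text))
```

With four unintelligible turns in a row, the bot asks for clarification three times and then hangs up on the fourth.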
You also can configure the bot to prompt the user right before the order is placed. The confirmation prompts are listed in the intent editor on the console.
Confirmation prompts don’t need to include every single slot that you have set up, especially when some slots are required only if others have been filled. It’s a good idea to include required slots in the prompt, but you can also use a code hook to customize the prompt. (More on that in the next post.)
Notice that the confirmation prompt can include slot values. Let’s see how that works out.
As expected, Amazon Lex determined that the required slots had been filled and presented the confirmation prompt. It also interpreted “nope” correctly and hung up. The second time around, Amazon Lex correctly interpreted “yep” as the affirmative and presented values.
What does this show? With no additional configuration, Amazon Lex correctly interpreted the user’s desire to swap the size, right from the confirmation prompt.
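The confirmation step can be illustrated with a small, rule-based sketch. Amazon Lex interprets replies with its own language models; the keyword sets, size list, template text, and function names below are hypothetical stand-ins that only mimic the outcomes described above.

```python
# The brace placeholders echo the slot notation used in this post; Python's
# str.format happens to use the same syntax.
CONFIRMATION_TEMPLATE = (
    "You would like me to order a {BeverageSize} {BeverageType}. Is that right?"
)
AFFIRMATIVE = {"yes", "yep", "yeah", "sure"}
NEGATIVE = {"no", "nope", "nah"}
SIZES = {"small", "medium", "large"}  # hypothetical subset of cafeBeverageSize


def render_confirmation(slots: dict) -> str:
    """Fill the confirmation prompt with the current slot values."""
    return CONFIRMATION_TEMPLATE.format(**slots)


def handle_reply(reply: str, slots: dict):
    """Return (outcome, slots) for the user's reply to the confirmation."""
    words = [w.strip(".,!?") for w in reply.lower().split()]
    new_sizes = [w for w in words if w in SIZES]
    if new_sizes:
        # The user swapped the size, so update it and confirm again.
        return "reconfirm", {**slots, "BeverageSize": new_sizes[-1]}
    if any(w in AFFIRMATIVE for w in words):
        return "confirmed", slots
    if any(w in NEGATIVE for w in words):
        return "denied", slots
    return "unclear", slots


order = {"BeverageSize": "large", "BeverageType": "mocha"}
print(render_confirmation(order))
outcome, order = handle_reply("Actually, make that a small", order)
print(outcome, render_confirmation(order))
```

Checking for a changed slot value before checking for yes or no lets a reply such as “yes, but make it a small” trigger a new confirmation rather than placing the wrong order.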
Conclusion
In this post, we looked at some elementary bot design decisions. We started with some raw observations about conversation flow, narrowed in on a particular transaction, and quickly built an interactive bot with some defaults. In the test console, we made sure that the bot behaves as expected.
In Part 2, we look at some more considerations and develop this elementary bot so that it can understand voice.
Note: The code for Part 1 and Part 2 is located in our GitHub repo.
If you have questions or suggestions, please leave a comment below.
About the Authors
As a Solutions Architect, Niranjan Hira is often found near a white board helping our customers assemble the right building blocks to address their business challenges. In his spare time, he breaks things to see if he can put them back together.
As a Product Manager on the Amazon Lex team, Harshal Pimpalkhute spends his time trying to get machines to engage (nicely) with humans.