Thinking for Voice: Design conversations, not logic
What if, instead of focusing on what our program does, we focused on what our users are saying when they talk with us? This conversation-first approach can more easily uncover some of the issues we saw at the end of the last session. It also keeps things fairly simple, since our design can focus on one thing at a time.
We can restructure that flowchart as a number of conversations. We'll go through just a few conversations that are on the "happy path", but keep in mind that a real design would have many more of these scripts, exploring other things that might happen. Later on, we'll annotate the conversations to see what we do at each point.
Script 1:
User: Talk to number or letter
Agent: Welcome! Would you like to hear a number or letter?
User: How about a number
Agent: 1 is the loneliest number. Would you like another?
User: Yes
Agent: 2 is the only even prime. Care for another?
User: No
Agent: Ok, would you like to hear a number or letter?
Script 2:
User: Talk to number or letter
Agent: Welcome! Would you like to hear a number or a letter?
User: Number, please
Agent: 42 is the answer. Another one, so to speak?
User: sure
Agent: 21 is blackjack. Care for another?
User: nope
Agent: Ok, care for a letter or a number?
Script 3:
User: Talk to number or letter
Agent: Howdy! Would you like a letter or a number?
User: Let's go for a number
Agent: A googol is 10 to the 100th power. Ready for another?
User: How about a letter this time
Agent: The letters "A" and "I" are also words. Want another?
There are a few points to learn from this:
- People respond in many different ways when asked a question. The tools we will use collect many sample phrases, and mix in some ML, to match what the user actually says against the kinds of replies we expect.
- In real conversations, users can take the conversation in any direction. So although we’ve asked a yes or no question, they may try to take the conversation in a whole different direction instead.
- How we reply depends on two things:
(1) What state we were in, and
(2) What the user says.
A consequence of point (1) is that we need to keep track of the user's state to determine what we say, so setting the new state becomes part of our reply, even if the user never sees it.
We commonly break these conversations down by what the user says. These are called Intents, since they reflect what the user is intending to say or do, rather than the specific words they might use. Intents are never about what our code will do; instead, they focus on one of the most important points in any conversation: what the other person is saying to us.
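To make that concrete, here is a deliberately simplified sketch of intents defined by sample phrases, using the intent names that appear in the annotated scripts below. The real tools match with trained language models rather than the string comparison shown here, so treat this as an illustration of the idea, not of the mechanism.

```typescript
// A sketch of intents defined by sample phrases. The intent names are the
// ones used in the annotated scripts below; the sample phrases are taken
// from the scripts, plus a few extras we might expect.
interface IntentDefinition {
  name: string;
  samplePhrases: string[];
}

const intents: IntentDefinition[] = [
  { name: "intent.number", samplePhrases: ["how about a number", "number, please", "let's go for a number"] },
  { name: "intent.letter", samplePhrases: ["a letter", "how about a letter this time"] },
  { name: "intent.yes", samplePhrases: ["yes", "sure", "yep"] },
  { name: "intent.no", samplePhrases: ["no", "nope", "not now"] },
];

// Deliberately naive matching; real tools use trained models, not substrings.
function matchIntent(utterance: string): string | undefined {
  const text = utterance.toLowerCase();
  return intents.find(i => i.samplePhrases.some(p => text === p || text.includes(p)))?.name;
}
```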
With these points in mind, let's add a little more information to the conversation, breaking it down by what the user says at each stage. We will add which Intent might get matched and then what our code would do: both the state it sets and the reply it sends.
Script 1:
User: Talk to number or letter
Match: intent.welcome
Logic: Set replyState to "prompt"
Pick a response for the current replyState ("prompt")
and the intent that was matched ("intent.welcome")
Agent: Welcome! Would you like to hear a number or letter?
User: How about a number
Match: intent.number
Logic: Set replyState to "number"
Pick a response for the current replyState ("number")
Agent: 1 is the loneliest number. Would you like another?
User: Yes
Match: intent.yes
Logic: Pick a response for the current replyState ("number")
Agent: 2 is the only even prime. Care for another?
User: No
Match: intent.no
Logic: Set replyState to "prompt"
Pick a response for the current replyState ("prompt")
and the intent that was matched (not "intent.welcome")
Agent: Ok, would you like to hear a number or letter?
With this, we can see that our replies are based on a combination of the current state and the user’s intent. (Our state could be more complex, to keep track of what the user has heard, how many times they’ve visited, etc. This is very simplified.)
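If we did want richer state, it might look something like this sketch; the fields beyond replyState are just illustrations of what we could track, not anything the scripts above require.

```typescript
// A sketch of richer conversation state. Only replyState is used in this
// post; the other fields are illustrative.
interface SessionState {
  replyState: "prompt" | "number" | "letter";
  factsHeard: string[]; // what the user has already heard, so we don't repeat ourselves
  visitCount: number;   // how many times they've visited, so we could greet them differently
}
```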
We also see that "yes" doesn't change the state. It doesn't need to. Some of our processing doesn't care which intent triggered it, and that's ok. Other parts might vary the conversation slightly, for example by saying "Welcome!" only when we start the conversation.
Finally, we’ll also note that we’ve combined our replies (the information and the prompt) into a single response sent back. This is how back-and-forth conversations typically work: we reply only when our conversational partner says something first.
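Putting this together, here is a rough sketch of the logic behind script 1. The state and intent names come from the annotated script above; the function name handleTurn and the placeholder pickFact helper are our own inventions for illustration, not any particular platform's API.

```typescript
type ReplyState = "prompt" | "number" | "letter";

// Placeholder: returns one fact for a state. A version that picks among
// several facts is sketched a little later.
const pickFact = (state: ReplyState): string =>
  state === "letter" ? 'The letters "A" and "I" are also words.'
                     : "1 is the loneliest number.";

// Sketch: given the state we were in and the intent that was matched,
// decide the new state and assemble a single reply (information + prompt).
function handleTurn(current: ReplyState, intent: string): { replyState: ReplyState; speech: string } {
  // Decide the new state. intent.yes keeps whatever state we were already in.
  let next: ReplyState = current;
  if (intent === "intent.welcome" || intent === "intent.no") next = "prompt";
  if (intent === "intent.number") next = "number";
  if (intent === "intent.letter") next = "letter";

  // Pick a response for the new replyState, varying slightly by intent:
  // "Welcome!" only at the start, "Ok," when the user declines.
  let speech: string;
  if (next === "prompt") {
    speech = intent === "intent.welcome"
      ? "Welcome! Would you like to hear a number or letter?"
      : "Ok, would you like to hear a number or letter?";
  } else {
    // The information and the prompt for the next turn go back as one reply.
    speech = `${pickFact(next)} Care for another?`;
  }
  return { replyState: next, speech };
}
```

Tracing script 1 through this sketch: intent.welcome puts us in "prompt", intent.number moves us to "number", intent.yes keeps us there, and intent.no takes us back to "prompt", matching the Logic lines above.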
If we look at script 2, we’ll see it plays out identically:
User: Talk to number or letter
Match: intent.welcome
Logic: Set replyState to "prompt"
Pick a response for the current replyState ("prompt")
and the intent that was matched ("intent.welcome")
Agent: Welcome! Would you like to hear a number or a letter?
User: Number, please
Match: intent.number
Logic: Set replyState to "number"
Pick a response for the current replyState ("number")
Agent: 42 is the answer. Another one, so to speak?
User: sure
Match: intent.yes
Logic: Pick a response for the current replyState ("number")
Agent: 21 is blackjack. Care for another?
User: nope
Match: intent.no
Logic: Set replyState to "prompt"
Pick a response for the current replyState ("prompt")
and the intent that was matched (not "intent.welcome")
Agent: Ok, care for a letter or a number?
In fact, if you look at the “Match” and “Logic” parts, they are identical conversations as far as the program is concerned. The only difference is the exact words the user used and the exact phrases we replied with. They are, fundamentally, the exact same conversation.
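One way to get that effect is to keep several phrasings for each state and pick one at random, so the wording varies between runs while the Match and Logic stay identical. A minimal sketch, using only facts that appear in the scripts (the pickFact name and structure are assumptions of ours):

```typescript
// Several candidate facts per state, taken from the scripts above. Which
// one comes back is a random choice, so two runs of the "same" conversation
// can sound different even though the Match and Logic are identical.
const facts = {
  number: [
    "1 is the loneliest number.",
    "2 is the only even prime.",
    "42 is the answer.",
    "21 is blackjack.",
    "A googol is 10 to the 100th power.",
  ],
  letter: [
    'The letters "A" and "I" are also words.',
  ],
};

function pickFact(state: keyof typeof facts): string {
  const options = facts[state];
  return options[Math.floor(Math.random() * options.length)];
}
```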
How does this play out in script 3?
User: Talk to number or letter
Match: intent.welcome
Logic: Set replyState to "prompt"
Pick a response for the current replyState ("prompt")
and the intent that was matched ("intent.welcome")
Agent: Howdy! Would you like a letter or a number?
User: Let's go for a number
Match: intent.number
Logic: Set replyState to "number"
Pick a response for the current replyState ("number")
Agent: A googol is 10 to the 100th power. Ready for another?
User: How about a letter this time
Match: intent.letter
Logic: Set replyState to "letter"
Pick a response for the current replyState ("letter")
Agent: The letters "A" and "I" are also words. Want another?
Here, the user has suddenly requested we jump to an entirely different state. But that isn't a problem: our program treats this the same as if they had requested that state from the prompting question, and the handler for that state reacts the same way.
So instead of having to build in many follow-up intents, we aim to capture what the user is saying, and then use our webhook to change the state based on that.
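As a sketch of that idea, the webhook below routes purely on the intent that was matched and lets the handler set the new state, no matter what state the user was in when they said it. The request and response shapes are simplified stand-ins for illustration, not any particular platform's webhook format.

```typescript
// Simplified stand-in shapes, not a real platform's webhook format.
interface WebhookRequest  { intent: string; replyState: string }
interface WebhookResponse { speech: string; replyState: string }

function handleWebhook(req: WebhookRequest): WebhookResponse {
  switch (req.intent) {
    case "intent.letter":
      // Works the same whether we were at the prompt or in the middle of
      // hearing numbers; no follow-up intent is needed for the pivot.
      return { replyState: "letter", speech: 'The letters "A" and "I" are also words. Want another?' };
    case "intent.number":
      return { replyState: "number", speech: "A googol is 10 to the 100th power. Ready for another?" };
    case "intent.yes": {
      // Keep whatever state we were in and give another fact for it.
      const fact = req.replyState === "letter"
        ? 'The letters "A" and "I" are also words.'
        : "2 is the only even prime.";
      return { replyState: req.replyState, speech: `${fact} Care for another?` };
    }
    default:
      // intent.no (and anything we don't otherwise handle) goes back to the prompt.
      return { replyState: "prompt", speech: "Ok, would you like to hear a number or letter?" };
  }
}
```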