Friday, January 22, 2016

Training your digital assistant

Despite reading articles like, every time I use a digital assistant application to interact with my device, I'm reminded of how limited they really are. Still useful for a certain set of commands (notes, reminders, factual questions, etc.), but not even close to the type of "conversational computing" imagined in videos like

So what I'm asking for is much simpler: just let me easily program a set of voice commands for my own device. Or, more precisely, let me teach my digital assistant how to interpret my words over time. When you search Google for how to train a digital assistant like Siri or Cortana, all the information seems to be about training the voice recognition. But that's only a small part of the problem. Why isn't it easier to create simple verbal commands that trigger a specific action on your device?

For example, if there is an app I use a lot, I should be able to set a verbal launch command (without buying a third-party app launcher). I should be able to record "macros" of events on the phone and then assign a verbal command to replay that exact set of events (e.g., open an app, click a button in the app, swipe to its second tab/screen).
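To make the macro idea concrete, here is a minimal sketch in Python of what "assign a verbal command to a recorded set of events" could look like. Everything in it is hypothetical: the `Macro` and `CommandRegistry` names, the step strings, and the assumption that speech has already been transcribed to text. It is not how Siri or Cortana work; it only illustrates the mapping I have in mind.

```python
from dataclasses import dataclass, field


@dataclass
class Macro:
    """A recorded sequence of UI events, stored as plain strings (hypothetical format)."""
    name: str
    steps: list[str] = field(default_factory=list)


class CommandRegistry:
    """Maps normalized spoken phrases to recorded macros."""

    def __init__(self) -> None:
        self._macros: dict[str, Macro] = {}

    @staticmethod
    def _normalize(phrase: str) -> str:
        # Lowercase and collapse whitespace so small transcription
        # differences still match the same command.
        return " ".join(phrase.lower().split())

    def assign(self, phrase: str, macro: Macro) -> None:
        """Bind a spoken phrase to a macro, replacing any earlier binding."""
        self._macros[self._normalize(phrase)] = macro

    def replay(self, phrase: str) -> list[str]:
        """Return the steps for a phrase, or an empty list if it is unknown."""
        macro = self._macros.get(self._normalize(phrase))
        return list(macro.steps) if macro else []


registry = CommandRegistry()
registry.assign(
    "Open my budget",
    Macro("budget", ["open app: Budget", "tap: Reports", "swipe: tab 2"]),
)
print(registry.replay("open  my BUDGET"))
```

A real assistant would need fuzzier matching than exact normalized text, but even this much would cover the "launch my favorite app and jump to the right screen" case.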

Basically, I think the voice recognition should evolve with me to understand the types of things I'm likely to ask for. If it misinterprets what I'm asking for, I should be able to give it a simple piece of negative feedback so that it can try a different interpretation in the future. Or better yet, it could learn to measure its success based on my implicit feedback (if I redo a very similar command, I'm probably looking for a different result).
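Here is a rough sketch of that feedback loop, again in Python and again purely illustrative. I'm assuming the recognizer hands back several candidate actions with confidence scores; the `InterpretationLearner` class, the penalty value, and the rule "an exact repeat of the last phrase counts as implicit negative feedback" are all my own simplifications (a real system would need to detect "very similar" commands, not just identical ones).

```python
from collections import defaultdict


class InterpretationLearner:
    """Re-ranks candidate actions for a phrase based on negative feedback."""

    def __init__(self, penalty: float = 1.0) -> None:
        self.penalty = penalty
        # _scores[phrase][action] holds the learned adjustment (starts at 0.0).
        self._scores = defaultdict(lambda: defaultdict(float))
        self._last = None  # (phrase, action) of the most recent choice

    def choose(self, phrase: str, candidates: dict[str, float]) -> str:
        """Pick the action with the highest confidence plus learned adjustment."""
        key = phrase.lower().strip()
        # Implicit feedback: repeating the same phrase right away suggests
        # the previous interpretation was not what the user wanted.
        if self._last is not None and self._last[0] == key:
            self.reject()
        best = max(candidates, key=lambda a: candidates[a] + self._scores[key][a])
        self._last = (key, best)
        return best

    def reject(self) -> None:
        """Explicit negative feedback on the most recent choice."""
        if self._last is not None:
            phrase, action = self._last
            self._scores[phrase][action] -= self.penalty
            self._last = None


learner = InterpretationLearner(penalty=0.5)
candidates = {"call Mom": 0.6, "call Tom": 0.5}
first = learner.choose("call tom", candidates)
second = learner.choose("call tom", candidates)  # immediate repeat
print(first, "->", second)
```

In this toy run the assistant first picks the higher-confidence "call Mom", and the immediate repeat pushes it to the other interpretation the second time.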

I really do hope that WIRED article is right and that smarter digital assistants are right around the corner, but I wouldn't hold my breath. If they do arrive, we'll just have to solve that silly little problem of getting people more comfortable with talking to their devices.

Training the next generation to dictate

Recently, I've shared some thoughts on the current user experience of digital assistants like Siri and Cortana. A couple of years ago, I bemoaned the lack of a more direct voice-to-system channel in our modern consumer technology:

One of the reasons I'm so stuck on this is that I feel like we finally have the technology to make conversational computing a normal mode of interaction with our devices. I've been surprised to see that so many people carry smartphones and yet very few "normals" use voice input.

I doubt that many would disagree that mobile devices are not good for typing. And yet, much to my surprise, younger generations have turned to text messaging as a primary form of communication, defacing the English language to make it slightly less painful to type their messages on their tiny little devices.

Sure, there are many contexts in which a voice input channel doesn't make sense, but I would argue that much of the time it does. So, if dictation is a better, more efficient, input mechanism for the majority of users, in the majority of cases, why isn't it more common? One explanation for this is that dictation is hard.

When I'm driving and try to dictate an email, text message, or note on my phone, it's always full of stutters and breaks. To do it properly, I need to think out my entire sentence before I verbalize it to my device. While typing, I find it much easier to finish the thought in the midst of entry. It's funny how different it feels to talk to your device than it does to talk to a person... even if that person is not in the same physical location.

My theory is that this is learned, societal behavior. Further, I suspect that if I had started dictating at a very young age, the dictation process would feel much more natural. Has anyone done an experiment like that? I'm seriously considering teaching my daughters to get into the habit of talking to a device just to see... maybe some kind of diary app?

Self-aware digital assistants

Knowing others is intelligence; knowing yourself is true wisdom.
                                                                                      - Lao Tzu

Does anyone else find it annoying that the digital assistants that are built into modern consumer technology (like Apple's Siri and Microsoft's Cortana) can answer questions about the world, but when you start to experience problems with your device they are the least helpful "person" in the world?

I recently upgraded to Windows 10 at home and I think the first 3 times I used Cortana, I tried to ask her questions about my installation. Here are the types of questions I expect a smart digital assistant to be able to answer:

1. Is everything OK?  (did the installation go smoothly?  any errors/warnings I should know about?)
2. Why is my computer/phone going so slow?  (what is the current bottleneck?  CPU?  Memory?  What is using all the resources?)
3. Why am I out of disk space?  (what is the largest set of folders and files in my system?)
4. Why aren't my speakers working?  (scan device drivers, tell me what speakers are/aren't connected/make suggestions)
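
None of these questions needs much intelligence to answer. As a rough illustration of question 3, here is a short Python sketch that ranks the immediate subfolders of a directory by total size. The `largest_folders` helper is hypothetical and only uses the standard library; the demo builds a throwaway directory so it doesn't depend on anything on your machine.

```python
import os
import tempfile
from pathlib import Path


def largest_folders(root, top=5):
    """Return up to `top` (name, bytes) pairs for the biggest subfolders of `root`."""
    totals = {}
    for entry in Path(root).iterdir():
        if not entry.is_dir() or entry.is_symlink():
            continue
        size = 0
        # os.walk does not follow symlinked directories by default.
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                try:
                    size += os.lstat(os.path.join(dirpath, name)).st_size
                except OSError:
                    pass  # file vanished or is unreadable; skip it
        totals[entry.name] = size
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Demo on a temporary directory with two folders of known size.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "videos").mkdir()
    (Path(root) / "docs").mkdir()
    (Path(root) / "videos" / "clip.bin").write_bytes(b"x" * 2048)
    (Path(root) / "docs" / "note.txt").write_bytes(b"x" * 10)
    demo = largest_folders(root)

print(demo)
```

An assistant could run something like this against my home folder and simply read me the top few entries.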

Questions like these might not be sexy, but I think it will feel natural to ask your phone/computer about itself. Maybe digital assistants like Siri and Cortana should spend some time learning about themselves before they turn outward and help their users...