Virtual Voice-Controlled Assistants: How Do Siri, Google Now And Corta

Smartphones have had voice control for many years now. I remember spending a good while trying to train an old Nokia dumbphone to recognize my voice, only for it to misdial every time I tried it, until I eventually kept just one number with a voice dial profile so that whatever it heard, it would work. Classic BlackBerry devices had a voice function that was accurate enough to call somebody from your ‘phone book without any training. And in 2011, Apple announced that the iPhone 4S was to benefit from their advanced virtual personal assistant, Siri. The Apple marketing machine went into overdrive explaining how wonderful Siri was and how it would revolutionize how we used our smartphones. What they didn’t explain is how to avoid looking like a chump saying something into your iPhone, repeating it, sighing, repeating it again and then either putting the device away or tapping on the tiny little screen yourself.

Apple’s Siri is recognized as the first cloud-based voice recognition system to be included with an operating system. Cloud-based means that when you talk to Siri or one of its peers, it sends the voice recording to a cloud server, which interprets what you mean and sends instructions back. You therefore need a connection to the Internet in order to use the service, and the slower your connection, the slower it is to respond. It’s designed to cope with regional dialects and choice expressions, and Apple gave the assistant an artificial sense of humor. You can ask Siri to tell you a story and it will say no. Or you can torment Siri by asking it to “open the pod bay doors,” and it will eventually give you the famous line from the movie 2001. For a while, I thought Siri’s inability to understand me was a comedy feature too, but as it happens this isn’t the case. Siri’s voice recognition isn’t bad in isolation, but there are times when it mishears what I say and tries to act on it anyway. The Google and Microsoft alternatives are able to edit what they heard and work out what I really meant. This “fuzzy logic” is beautiful to behold, because it means I can use Google Now in a busy, noisy environment (if I’m okay looking chumpy talking into a ‘phone or smartwatch). When I carried an iPhone, I first gave up trying to use Siri anywhere that wasn’t quiet and later gave up on using Siri altogether.
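To give a feel for what correcting a misheard phrase can look like, here is a minimal sketch in Python. It is only an illustration: it uses plain string similarity from the standard library's difflib module, the list of known phrases and the best_guess function are made up for this example, and none of it reflects how Google, Microsoft or Apple actually do it.

```python
import difflib

# Hypothetical phrases the assistant knows how to act on.
KNOWN_PHRASES = ["weather in edinburgh", "call home", "set an alarm"]

def best_guess(heard):
    """Return the known phrase closest to what was heard, or None."""
    matches = difflib.get_close_matches(
        heard.lower(), KNOWN_PHRASES, n=1, cutoff=0.6
    )
    return matches[0] if matches else None

# A slightly misheard request still maps onto a phrase the assistant knows.
print(best_guess("whether in edinburgh"))
```

The cutoff of 0.6 is difflib's default similarity threshold. Raising it makes the assistant fussier, and lowering it makes it more willing to guess.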

Google Now also has a built-in voice service, one that is cloud-based too and can cope with regional accents. Google haven’t given their voice assistant a sense of humor but instead have made it context-aware: if I ask Google about the capital of Scotland, it tells me. If I next ask Google if it’s raining there, it knows I’m talking about Edinburgh. Siri and Microsoft’s Cortana unhelpfully ask where I mean. Google Now also has a tremendous advantage in that it’s aware of whatever is in your Gmail account. It knows your location, diary and contacts, and it also understands your favorite sports teams, flight or travel arrangements and a whole lot else besides. This, when combined with the context-aware code, means that Google Now feels much more aware than Cortana and Siri.
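The Scotland example boils down to remembering the last place mentioned and using it when a follow-up question says "there". A toy sketch of that idea in Python follows. The Conversation class, the tiny lookup table and the canned replies are all hypothetical, and this is not a description of Google's implementation.

```python
# Hypothetical lookup table standing in for a real knowledge source.
CAPITALS = {"scotland": "Edinburgh", "france": "Paris"}
PREFIX = "what is the capital of "

class Conversation:
    def __init__(self):
        self.last_place = None  # most recent place given in an answer

    def ask(self, question):
        q = question.lower().strip().rstrip("?")
        if q.startswith(PREFIX):
            capital = CAPITALS.get(q[len(PREFIX):])
            if capital is None:
                return "I don't know."
            self.last_place = capital  # remember it for follow-up questions
            return capital
        if q == "is it raining there":
            if self.last_place is None:
                # No context to fall back on, so ask the user.
                return "Where do you mean?"
            return "Checking the weather in " + self.last_place
        return "Sorry, I didn't catch that."

chat = Conversation()
print(chat.ask("What is the capital of Scotland?"))
print(chat.ask("Is it raining there?"))
```

An assistant without that one remembered value has no choice but to take the "Where do you mean?" branch every time.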

Microsoft Cortana is now standard on Windows Phone 8.1. It’s in beta and only available in the United States of America, but despite these limitations it is very close in terms of usability to Google Now and Siri. Its voice recognition is more accurate than Siri’s and, like Google, it can edit in-line what you’ve said. It isn’t context-aware, and when you ask it for information on something it has the handicap of relying on Microsoft Bing as a search engine. Nor will it talk to you about what you’ve asked it, as Google does. Cortana has one very powerful contact management feature: it has the ability to remind you of something when you next engage with a contact. Sure, Siri and Google have time and location reminders, and Cortana does these too. But being reminded of something when I receive a call or text from a contact is a surprisingly useful feature. It’s the sort of thing that an Android Wear device could show when receiving a call from somebody (Google, are you reading this?). And finally, Cortana has a sense of humor; she can be fun to engage with.
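The contact reminder idea is simple enough to sketch: keep a list of notes per contact and hand them over the next time that contact calls or texts. The Python below is a guess at the general shape, with a hypothetical ContactReminders class and method names of my own. It is not Cortana's actual API.

```python
from collections import defaultdict

class ContactReminders:
    def __init__(self):
        # Maps a lower-cased contact name to the notes waiting for them.
        self._pending = defaultdict(list)

    def add(self, contact, note):
        self._pending[contact.lower()].append(note)

    def on_contact_event(self, contact):
        """Call when a call or text arrives; returns and clears the notes."""
        return self._pending.pop(contact.lower(), [])

reminders = ContactReminders()
reminders.add("Alice", "Ask about the invoice")
print(reminders.on_contact_event("Alice"))
```

Clearing the notes once they have been shown means each reminder fires once, on the next contact, rather than on every call afterwards.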

I’ve written about the sense of humor aspect and here I’m in two minds. Google Now gives me the information that I ask for, when I ask for it. It does so in an efficient way, but it doesn’t have any style. Siri and Cortana are less capable assistants: they are a little slower in execution, have much less background information and are unaware of context. They’re less useful but more fun. My issue here is that I don’t want to have a friendship with an artificial assistant on my handset, as it can’t bake me a cheesecake or fetch the coffee, and the fun factor evaporates pretty quickly when it doesn’t recognize what I’m saying. Perhaps with Google’s awareness of the user, it might be open to lawsuits if it started answering back? If you ask me what’s the best virtual assistant, right now I must side with Google Now. Not because I’m a Google fan, although I am, and not even because of the voice recognition, but because the voice assistant is a part of Google Now. Google Now’s ability to second-guess what I need and put the information onto my smartwatch, tablet, Chromebook and smartphone, and the way it relates this to what I talk to it about, is what makes it a winner.

And any smart product business that ignores how Google’s product is aware of more than you directly tell it is in serious danger of missing the boat.