Q Actions 2.0

Do more with Voice! Q Actions 2.0 now available on Google Play

By | Action Recipes, App Actions, Artificial Intelligence, Conversation, Digital Assistants, Natural Language, Voice, Voice Search | No Comments

Do more with Voice

Q Actions 2.0 is here. With this release, we wanted to focus on empowering users throughout their day. As voice plays an increasingly prevalent role in our everyday lives, we’re uncovering more use cases where Q Actions can help. In Q Actions 2.0, you’ll find new features and enhancements that are more conversational and useful.

Directed Dialogue™

Aiqudo believes the interaction with a voice assistant should be casual, intuitive, and conversational. Q Actions understands naturally spoken commands and is aware of the apps installed on your phone, so it will only return personalized actions that are relevant to you. When a bit more information is required from you to complete a task, Q Actions will guide the conversation until it fully understands what you want to do. Casually chat with Q Actions and get things done.

Sample commands:

  • “create new event” (Google Calendar)
  • “message Mario” (WhatsApp, Messenger, SMS)
  • “watch a movie/tv show” (Netflix, Hulu)
  • “play some music” (Spotify, Pandora, Google Play Music, Deezer)

Q Cards™

In addition to providing relevant app actions from personal apps that are installed on your phone, Q Actions will now display rich information through Q Cards™. Get up-to-date information from cloud services on many topics: flight status, stock pricing, restaurant info, and more. In addition to presenting the information in a simple and easy-to-read card, Q Cards™ support Talkback and will read aloud relevant information.

Sample commands:

  • “What’s the flight status of United 875?”
  • “What’s the current price of AAPL?”
  • “Find Japanese food”

Voice Talkback™

There are times when you need information but do not have the luxury of looking at a screen. Voice Talkback™ is a feature that reads aloud the critical snippets of information from an action. This enables you to continue to be productive, without the distraction of looking at a screen. Execute your actions safely and hands-free.

Sample commands:

  • “What’s the stock price of Tesla?” (E*Trade)
    • Q: “Tesla is currently trading at $274.96”
  • “Whose birthday is it today?” (Facebook)
    • Q: “Nelson Wynn and J Boss are celebrating birthdays today”
  • “Where is the nearest gas station?”
    • Q: “Nearest gas at Shell on 2029 S Bascom Ave and 370 E Campbell Ave, 0.2 miles away, for $4.35”

Compound Commands

As an enhancement to our existing curated Action Recipes, users can now create Action Recipes on the fly using Compound Commands. Simply join two of your favorite actions into a single command using “and”. This gives users the ability to create millions of Action Recipe combinations from our database of 4,000+ actions.

Sample commands:

  • “Play Migos on Spotify and set volume to max”
  • “Play NPR and navigate to work”
  • “Tell Monica I’m boarding the plane now and view my boarding pass”

Simply do more with voice! Q Actions is now available on Google Play.

Q Actions - Action Recipes and Compound Commands

Q Actions – Complex tasks through Compound Commands

By | Artificial Intelligence, Command Matching, Conversation, Uncategorized | No Comments

In many cases, a single action does the job.

Say it. Do it!

Often, however, a task requires multiple actions to be performed across multiple independent apps. On the go, you just want things done quickly and efficiently, without having to worry about which actions to run and which apps need to be in the mix.

Compound commands allow you to do just that: just say what you want to do, naturally, and, provided the request makes sense and you have access to the relevant apps, the right actions are magically executed. It’s not that complicated. Just say “navigate to the tech museum and call Kevin”, firing off Maps and WhatsApp in the process. Driving, and in a hurry to catch the train? Just say “navigate to the Caltrain station and buy a train ticket”, launching Maps and the Caltrain app in sequence. Did you just hear the announcement that your plane is ready to board? Say “show my boarding pass and tell Susan I’m boarding now” (American, United, Delta, …; WhatsApp, Messenger, …) and you’re ready to get on the flight home. One, two … do!
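At its simplest, a compound command can be thought of as splitting a single utterance into its constituent action commands. This minimal sketch (not Aiqudo’s actual implementation) illustrates the idea:

```python
def split_compound(command, connector=" and "):
    """Split a compound command into its individual action commands.

    A real system must decide which "and" joins two actions and which is
    part of a parameter (e.g. inside a message body); this naive sketch
    splits on every connector, which is enough for simple examples.
    """
    return [part.strip() for part in command.split(connector) if part.strip()]

split_compound("navigate to the Caltrain station and buy a train ticket")
# -> ["navigate to the Caltrain station", "buy a train ticket"]
```

Each resulting fragment can then be matched to an action independently, which is why ambiguous connectors (“tell Monica I’m boarding the plane now and view my boarding pass”) need smarter handling than a plain split.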

Compound commands are … complex magic to get things done … simply!

Q Actions - Voice Talkback

Q Actions – Voice feedback from apps using Talkback™

By | App Actions, Conversation, Digital Assistants, User Interface | No Comments

Wonder why you can’t talk to your apps, and why your apps can’t talk back to you?  Stop wondering, as Talkback™ in Q Actions does exactly that. Ask “show my tasks” and the system executes the right action (Google Tasks) and, better yet, tells you what your tasks are – safely and hands-free, as you drive your car.

Driving to work and stuck in traffic?  Ask “whose birthday is it today?” and hear the short list of your friends celebrating their birthdays (Facebook). You can then say  “tell michael happy birthday”  to wish Mike (WhatsApp or Messenger). And if you are running low on gas, just say “find me a gas station nearby” and Talkback™ will tell you where the nearest gas station is and how much you’ll pay for a gallon of unleaded fuel.

Say it. Do it. Hear it spoken back!

Q Actions - Directed Dialogue

Q Actions – Task completion through Directed Dialogue™

By | Conversation, Digital Assistants, Natural Language, User Interface, Voice | No Comments

When an action or a set of actions requires specific input parameters, Directed Dialogue™ allows the user to submit the required information through a very simple, natural back-and-forth conversation. Enhanced with parameter validation and user confirmation, Directed Dialogue™ allows complex tasks to be performed with confidence. Directed Dialogue™ is not about open-ended conversation; it is about getting things done, simply and efficiently.

With Q Actions, Directed Dialogue™ is automatically enabled for every action in the system because we know the semantic requirements of each and every action’s parameters. It is not constrained, and it applies across all actions in all verticals.

Another application of Directed Dialogue™ is input refinement. Let’s say I want to purchase batteries. If I just say, “add batteries to my shopping cart”, I can get the wrong product added to my cart, as on Alexa, which does the wrong thing for a new product order (the right thing happens on a reorder). With Q Actions, I can provide the brand (Duracell) and the type (9V, 4-pack) through a very simple Directed Dialogue™, and exactly the right product is added to my cart in the Amazon or Walmart app.
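The core of this kind of slot filling can be sketched in a few lines. The action schema, parameter names, and prompts below are invented for illustration; they are not Aiqudo’s API:

```python
def directed_dialogue(action_params, provided, ask):
    """Prompt only for parameters the user has not already supplied,
    re-asking until each value passes its validator."""
    filled = dict(provided)
    for name, validate in action_params.items():
        while name not in filled or not validate(filled[name]):
            filled[name] = ask(f"What {name} would you like?")
    return filled

# Example: "add batteries to my cart" supplies the product but not the
# brand or pack type; the dialogue fills in the rest.
params = {"product": bool, "brand": bool, "pack_type": bool}  # non-empty check
answers = iter(["Duracell", "9V 4 pack"])
result = directed_dialogue(params, {"product": "batteries"},
                           lambda question: next(answers))
# result == {"product": "batteries", "brand": "Duracell", "pack_type": "9V 4 pack"}
```

Because the system already knows each parameter’s semantic requirements, a real validator would check types (a date, a party size, a contact name) rather than just non-emptiness.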

Get Q Actions today.

Thought to Action

Thought to Action!

By | Artificial Intelligence, Machine Learning | No Comments

Here at Aiqudo, we’re always working on new ways to drive Actions and today we’re excited to announce a breakthrough in human-computer interaction that facilitates these operations.  We’re calling it “Thought to Action™”. It’s in early-stage development, but shows promising results.

Here’s how it works. We capture user brainwave signals via implanted neural-synaptic receptors and transfer the resulting waveforms over BLE to our cloud, where advanced AI and machine learning models translate the user’s “thoughts” into specific app actions that are then executed on the user’s mobile device. In essence, we’ve transcended the use of voice to drive actions. Just think about the possibilities. Reduce messy and embarrassing moments when your phone’s speech recognizer gets your command wrong: “Tweet Laura, I love soccer” might end up as “Tweet Laura, I’d love to sock her”. With “Thought to Action™” we get it right all the time. It’s also perfect for use in today’s noisy environments. Low on gas while driving your kids’ entire soccer team home from a winning match? Simply think “Find me the nearest gas station” and let Aiqudo do the rest. Find yourself in a boring meeting? Send a text to a friend using just your thoughts.

Stay tuned as we work to bring this newest technology to a phone near you.

Auto in-cabin experience

The Evolution of Our In-Car Experience

By | Digital Assistants, User Interface, Voice | No Comments

As the usage model for cars continues to shift away from traditional ownership and leasing to on-demand, ridesharing, and in the future, autonomous vehicle (AV) scenarios, how we think about our personal, in-car experience will need to shift as well.

Unimaginable just a few short years ago, today, we think nothing of jumping into our car and streaming our favorite music through the built-in audio system using our Spotify or Pandora subscription. We also expect the factory-installed navigation system to instantly pull up our favorite or most-commonly used locations (after we’ve entered them) and present us with the best route to or from our current one. And once we pair our smartphone with the media system, we can have our text and email messages not only appear on the onboard screen but also read to us using built-in text-to-speech capabilities.  It’s a highly personalized experience in our car.

When we use a pay-as-you-go service, such as Zipcar, we know we’re unlikely to have access to all of the tech comforts of our own vehicle, but we can usually find a way to get our smartphone paired for handsfree calling and streaming music using Bluetooth. If not, we end up using the navigation app on our phone and awkwardly holding it while driving, trying to multitask. It’s not pretty. And when we hail a rideshare, we don’t expect to have access to any of the creature comforts of our own car.

But what if we could?

Just as our relationship to media shifted from an ownership model–CDs or MP3 files on iPods–to subscription-based experiences that are untethered to a specific device but can be accessed anywhere at any time, it’s time to shift our thinking about in-car experiences in the same way.

It’s analogous to accessing your Amazon account and continuing to watch the new season of “True Detective” on the TV at your Airbnb–at the exact episode where you left off last week. Or listening to your favorite Spotify channel at your friend’s house through her speakers.

All your familiar apps (not just the limited Android Auto or Apple CarPlay versions) and your personalized in-car experience–music, navigation, messaging, even video (if you’re a passenger, of course)–will be transportable to any vehicle you happen to jump into, whether it’s a Zipcar, rental car or some version of a rideshare that’s yet to be developed. What’s more, you’ll be able to easily and safely access these apps using voice commands. Whereas today our personal driving environment is tied to our own vehicle, it will become something that’s portable, evolving as our relationship to cars changes over time.

Just on the horizon of this evolution in our relationship with automobiles? Autonomous vehicles, or AVs, in which we become strictly a passenger, perhaps one of several people sharing a ride. Automobile manufacturers today are thinking deeply about what this changing relationship means to them and to their brands. Will BMW become “The Ultimate Riding Machine?”(As a car guy, I personally hope not!)  And if so, what will be the differentiators?

Many car companies see the automobile as a new digital platform, for which each manufacturer creates its own, branded, in-car digital experience. In time, when we hail a rideshare or an autonomous vehicle, we could request a Mercedes because we know that we love the Mercedes in-car digital experience, as well as the leather seats and the smooth ride.

What happens if we share the ride in the AV, because, well, they are rideshare applications after all? The challenge for the car companies becomes creating a common denominator of services that define that branded experience while still enabling a high degree of personalization. Clearly, automobile manufacturers don’t want to become dumb pipes on wheels, but if we all just plug in our headphones and live on our phones, comfy seats alone aren’t going to drive brand loyalty for Mercedes. On the other hand, we don’t all want to listen to that one guy’s death metal playlist all the way to the city.  

The car manufacturers cannot create direct integrations to all services to accommodate infinite personalization. In the music app market alone there are at least 15 widely used apps, but what if you’re visiting from China? Does your rideshare support China’s favorite music app, QQ?  We’ve already made our choices in the apps we have on our phones, so transporting that personalized experience into the shared in-car experience is the elegant way to solve that piece of the puzzle.

This vision of the car providing a unique digital experience is not that far-fetched, nor is it that far away from becoming reality. It’s not only going to change our personal ridesharing experience, but it’s also going to be a change-agent for differentiation in the automobile industry.

And it’s going to be very interesting to watch.

Semiotics

AI for Voice to Action – Part 3: The importance of Jargon to understanding User Intent

By | Artificial Intelligence, Command Matching, Machine Learning | No Comments

In my last post I discussed how semiotics and observing how discourse communities interact had influenced the design of our machine learning algorithms. I also emphasized the importance of discovering jargon words as part of our process of understanding user commands and intents.

In this post, we describe in more depth how the “theory” behind our algorithms actually works. We previously discussed what constitutes a good jargon word: “computer” is a poor example because it is too broad in meaning, whereas a term relating to a computer chip, e.g. “Threadripper” (a gaming processor from AMD), is a better example because it is more specific in meaning and is used in fewer contexts.

Jargon terms and Entropy

So – how do we identify good jargon terms and what do we do with them in order to understand user commands?

To do this we use entropy. In general, entropy is a measure of chaos or disorder; in an information-theory context, it can be used to determine how much information a term conveys. Because jargon words have a very narrow and specific meaning within specific discourse communities, they have lower entropy (more information value) than broader, more general terms.

To determine entropy, we take each term in our synthetic documents (see this post for more information on how we create this data set) and build a probability profile of its co-occurring terms. The diagram below shows an example (partial) probability distribution for the term ‘computer’.

Entropy

Figure 1: Entropy – probability distributions for jargon terms

These co-occurring terms can be thought of as the context for each potential jargon word. We then use this probability profile to determine the entropy of the word. If that entropy is low then we consider it to be a candidate jargon word.
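As a rough sketch of this step (the terms and counts below are invented), Shannon entropy over a term’s co-occurrence distribution separates broad terms from jargon candidates:

```python
import math
from collections import Counter

def entropy(cooccurrence_counts):
    """Shannon entropy (in bits) of a term's co-occurrence distribution."""
    total = sum(cooccurrence_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in cooccurrence_counts.values())

# A broad term like "computer" co-occurs with many contexts fairly evenly...
broad = Counter({"screen": 40, "work": 35, "buy": 30,
                 "game": 30, "fast": 25, "desk": 25})
# ...while a jargon term like "threadripper" concentrates in a few contexts.
jargon = Counter({"overclock": 80, "socket": 60, "benchmark": 45})

entropy(jargon) < entropy(broad)  # True: lower entropy => candidate jargon word
```

In the real pipeline the distributions come from the synthetic corpus and a threshold on the entropy flags the candidates.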

Having identified the low-entropy jargon words in our synthetic command documents, we then use their probability distributions as attractors for the documents themselves. In this way (as seen in the diagram below) we create a set of document clusters where each cluster relates semantically to a jargon term. (Note: in the interest of clarity, clusters in the figure below are described using high-level topics rather than the jargon words themselves.)

Clusters derived from Synthetic Documents

Figure 2: Using jargon words as attractors to form clusters
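A minimal illustration of the attractor idea (the jargon words, profiles, and similarity measure are invented stand-ins for whatever the production system uses): each document gravitates to the jargon term whose co-occurrence profile it most resembles.

```python
import math
from collections import Counter

def cosine(p, q):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(p[t] * q.get(t, 0.0) for t in p)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

# Hypothetical jargon "attractors": jargon word -> co-occurrence profile.
attractors = {
    "cassette": {"gear": 0.4, "ride": 0.3, "chain": 0.3},
    "threadripper": {"core": 0.5, "benchmark": 0.3, "socket": 0.2},
}

def cluster_for(document_terms):
    """Assign a document to the jargon attractor with the most similar profile."""
    profile = Counter(document_terms)
    return max(attractors, key=lambda j: cosine(profile, attractors[j]))

cluster_for(["ride", "chain", "gear", "gear"])  # -> "cassette"
```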

We then build a graph within each cluster that connects documents based on how similar they are in terms of meaning. We identify ‘neighborhoods’ within these graphs that relate to areas of intense similarity. For example a cluster may be about “cardiovascular fitness” whereas a neighborhood may be more specifically about “High Intensity Training”, or “rowing” or “cycling”, etc.

Clusters and Neighborhoods

Figure 3: Neighborhoods for the cluster “cardiovascular fitness”
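This step can be sketched with a toy graph construction. Jaccard similarity and connected components below stand in for whatever similarity measure and community detection the production system uses; the documents are invented:

```python
def jaccard(a, b):
    """Similarity between two documents represented as sets of terms."""
    return len(a & b) / len(a | b) if a | b else 0.0

def neighborhoods(docs, similarity, threshold=0.3):
    """Connect documents whose pairwise similarity exceeds a threshold,
    then return the connected components as 'neighborhoods'."""
    n = len(docs)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if similarity(docs[i], docs[j]) > threshold:
                adj[i].add(j)
                adj[j].add(i)
    seen, comps = set(), []
    for i in range(n):
        if i in seen:
            continue
        stack, comp = [i], set()
        while stack:
            k = stack.pop()
            if k in comp:
                continue
            comp.add(k)
            stack.extend(adj[k] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Toy documents from a "cardiovascular fitness" cluster:
docs = [{"hiit", "cardio"}, {"hiit", "interval"}, {"rowing", "erg"}]
neighborhoods(docs, jaccard)  # -> [{0, 1}, {2}]
```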

These neighborhoods can be thought of as sub-topics within the overall cluster topic. Within each sub-topic we can then extract important meaning-based phrases that precisely describe what that neighborhood is about, e.g. “HIIT”, “anaerobic high-intensity period”, “cardio session”, etc.

Meaning based phrases for sub-topics

Figure 4: Meaning based phrases for the “high intensity training” sub-topic

In this way we create meaning-based structure from completely unstructured content. Documents from the same cluster relate to the same discourse community. Documents from the same cluster that share similar important terms or phrases can be regarded as relating to the same sub-topic. If two clusters share a large number of important phrases, this represents a dialogue between two discourse communities. If multiple important phrases are shared among many clusters, this represents a dialogue among multiple communities.

So having described a little bit about the algorithms themselves, how do they help us understand the correct meaning behind a user’s command? Given this contextual partitioning of the data into discourses based on jargon terms, we can disambiguate among the many different meanings a term can have. For example, if the user were to say ‘open the window’ – we will be able to understand that there is a meaning (discourse) relating to both buildings and to software but if the user were to say ‘minimize the window’, we would understand that this could only have a software meaning and context. Fully understanding the nuances behind a user’s command is, of course, much more complicated than what I have just described, but the goal here is to give a high level overview of the approach.
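The “window” example can be made concrete with a toy disambiguator. The discourse vocabularies and stopword list below are invented for illustration, not learned output:

```python
# Hypothetical discourse vocabularies derived from jargon-based clustering.
discourses = {
    "buildings": {"open", "close", "window", "door", "room"},
    "software": {"open", "close", "minimize", "window", "tab", "app"},
}

STOPWORDS = {"the", "a", "my"}

def candidate_discourses(command):
    """Return the discourses whose vocabulary covers every content term."""
    terms = set(command.lower().split()) - STOPWORDS
    return [d for d, vocab in discourses.items() if terms <= vocab]

candidate_discourses("open the window")      # -> ["buildings", "software"]  (ambiguous)
candidate_discourses("minimize the window")  # -> ["software"]
```

As the post notes, real commands need far more than vocabulary coverage, but the principle is the same: the jargon-partitioned data tells us which discourses a command could belong to.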

In subsequent posts, we will discuss how we extract parameters from commands, accurately determine which app action to execute, and how we pass the correct parameters to that action.  

David Patterson and Vladimir Dobrynin

Aiqudo, Inc.

Silicon Valley Voice Pioneer Aiqudo Unveils Its Latest Software Platform

By | Press | No Comments

Enables Anyone to Use their Voice to Control and Interact with 1000’s of Mobile Apps

SAN JOSE CALIFORNIA (BUSINESS WIRE), December 12, 2018

Aiqudo unveiled a set of breakthrough advances to Q Actions, its industry-leading voice enablement platform, that for the first time makes it possible for anyone to navigate their lives through their mobile apps seamlessly using a natural voice. Now, mobile applications can talk back to users to confirm instructions, conduct multi-step processes and even proactively alert users to new messages and read them back.

“Our Directed Dialogue feature helps users to easily complete complex tasks”

Unlike other voice platforms, Aiqudo serves users by working directly with apps users have downloaded on their mobile phones, eliminating the self-serving walled gardens erected by other voice platforms. Consumers may never be able to check Facebook instant messages from Alexa or access an Amazon wish list from Google Assistant and go shopping. Aiqudo removes this obstacle and makes voice the simplest, fastest, most intuitive interface for consumer technologies.

“By focusing on extending dominance in their legacy businesses such as ecommerce or search, the major voice platforms have failed to deliver on their own hype around voice,” said John Foster CEO of Aiqudo. “We’ve taken a better route focused on making voice truly useful today. We’re app-centric, platform-agnostic and let consumers use voice on their own terms, not just when they’re standing next to a device in their living rooms. Our voice assistant needs to be available to us whether we’re in a car, on a train with our hands full or wandering around an amusement park.”

At the center of the latest version of Aiqudo are features such as:

  • Directed Dialogue: Aiqudo quickly and easily guides users to successful actions, prompting them to provide all required pieces of information, whether it’s a calendar event requiring start and end times, location and event name, or providing party size and time for booking a table at a restaurant.
  • Compound Commands: Your favorite apps and mobile phone features can now work collaboratively to get everyday requests completed. Executing multiple actions with a single command is easier than ever – navigate with Waze or other traffic app and notify your friends of a late arrival with your favorite messaging app– and it’s done with one single request.
  • Voice Talkback: Don’t want to be distracted looking at your phone? Aiqudo can read back results from your favorite apps such as news headlines, stock quotes and message responses.

“We’ve taken a better route focused on making voice truly useful today. We’re app-centric, platform-agnostic and let consumers use voice on their own terms, not just when they’re standing next to a device in their living rooms.”

“Our Directed Dialogue feature helps users to easily complete complex tasks,” said Rajat Mukherjee CTO of Aiqudo. “A user is only prompted to provide any missing information required by an action that she has not already provided in a command. Because we understand the semantics of all actions in the system, directed dialog works out-of-the-box for every one of our actions and does not require configuration, customized training or huge volumes of training data.”

Deploying a semiotics-based language modeling platform enables multi-lingual natural language commands, while Aiqudo’s app analysis engine allows rapid onboarding of apps to provide high utility and broad coverage across apps. Today Aiqudo supports thousands of applications, ranging from ecommerce apps like Amazon, Walmart, or eBay, and entertainment apps like Netflix, Spotify, or Pandora, to favorite messaging and social apps including WhatsApp, WeChat, Messenger and more.

Aiqudo Q Actions 2.0 will be available on Google Play by year end, and the company has already struck OEM relationships with the likes of Motorola for the technology to be embedded directly into phones.

To view product demo videos, visit Aiqudo’s YouTube channel.

About Aiqudo
Aiqudo (pronounced: “eye-cue-doe”) is a software pioneer that connects the nascent world of digital voice assistants to the useful, mature world of mobile apps through its Voice-to-Action™ platform. It lets people use voice commands to execute actions in mobile apps across devices. Aiqudo’s SaaS platform uses machine learning (AI) to understand natural-language requests and then triggers instant actions via mobile apps consumers prefer to use to get things done quickly and with less effort. For more info, visit: http://www.aiqudo.com

Business Wire: Silicon Valley Voice Pioneer, Aiqudo, Unveils Its Latest Software Platform

 

Voice Enable System Settings with Q Actions 1.3.3!

By | App Actions, Digital Assistants, News, Voice Search | No Comments

Somewhere in the Android Settings lies the option to turn on Bluetooth, turn off Wifi, or change sound preferences. These options are usually buried deep under menus and sub-menus. Discoverability is an issue, and navigating to an option usually means multiple taps within the Settings app. Yes, there’s a search bar within the Settings app, but it’s clunky, requires typing, and only returns exact matches. Some of these options are accessible through the quick-settings bar, but the discovery and navigation issues still exist.

In the latest release, simply tell Q Actions what System Settings you want to change. Q Actions can now control your Bluetooth, Wifi, music session, and sound settings through voice.

Configure your Settings:

  • “turn on/off bluetooth”
  • “turn wifi on/off”

Control your music:

  • “play next song”
  • “pause music”
  • “resume my music”

Toggle your sound settings:

  • “enable do not disturb”
  • “mute ringer”
  • “increase the volume”
  • “put my phone on vibrate”

In addition to placing calls to your Contacts, Q Actions helps you manage Contacts via voice. Easily add a recent caller as a contact in your phonebook or share a friend’s contact info with simple commands. If you have your contact’s address in your Contacts, you can also get directions to the address using your favorite navigation app.

Place calls to Contacts:

  • “call Jason Chen”
  • “dial Mario on speaker”

Manage and share your Contacts:

  • “save recent number as Mark Johnson”
  • “edit Helen’s contact information”
  • “share contact info of Daniel Phan”
  • “view last incoming call”

Bridge the gap between your Contacts and navigation apps:

  • “take me to Rob’s apartment”
  • “how do I get to Mike’s house?”

Unlock your phone’s potential with voice! Q Actions is now available on Google Play.

Poison Bottle

AI for Voice to Action – Part 2: Machine Learning Algorithms

By | Artificial Intelligence, Command Matching, Machine Learning, Natural Language | No Comments

My last post discussed the important step of automatically generating vast amounts of relevant content relating to commands to which we apply our machine learning algorithms. Here I want to delve into the design of our algorithms.

Given a command, our algorithms need to:

  1.   Understand the meaning and intent behind the command
  2.   Identify and extract parameters from it
  3.   Determine which app action is most appropriate
  4.   Execute the chosen action and pass the relevant parameters to the action

This post and the next one will address point 1. The other points will be covered in subsequent posts.

So how do we understand what a user means based on their command? Typically commands are short (3 or 4 terms), which makes it very difficult to disambiguate among the multiple meanings a term can have. If someone says “search for Boston”, do they want directions to a city or do they want to listen to a rock band on Spotify? In order to disambiguate among all the possibilities we need to know: a) whether any of the command terms can have different meanings, b) what those meanings are, and finally c) which is the correct one based on context.

Semiotics

In order to do this we developed a suite of algorithms which feed off the data we generated previously (See post #3). These algorithms are inspired by semiotics, the study of how meaning is communicated. Semiotics originated as a theory of how we interpret the meaning of signs and symbols. Given a sign in one context, for example a flag with a skull and crossbones on it, you would assign a particular meaning to it (i.e. Pirates).

Pirate Symbol

Whereas if you change the context to a bottle, the meaning changes completely:

Poison Bottle

Poison – do not drink!

Linguists took these ideas and applied them to language and how, given a term (e.g. ‘window’), its meaning can change depending on the meaning of the words around it in the sentence (meanings could be physical window in a room, software window, window of opportunity, etc.).  By applying these ideas to our data we can understand the different meanings a term can have based on its context.

Discourse Communities

We also drew inspiration from discourse communities. A discourse community is a group of people involved in and communicating about a particular topic. They tend to use the same language for important concepts (sometimes called jargon) within their community, and these terms have a specific, understood, and agreed meaning within the community that makes communication easier. For example, members of a cycling community have their own set of terms, fairly unique to them, that they all understand and adhere to. If you want to see what I mean, go here and learn the meanings of such terms as an Athena, a Cassette, a Chamois (very important!) and many other terms. Similarly, motor enthusiasts have their own ‘lingo’. If you want to be able to differentiate your AWS from your ABS and your DDI from your DPF, then get up to speed here.

Our users use apps, so in addition we would expect to discover gaming discourses, financial discourses, music discourses, social media discourses, and so on. Our goal was to develop a suite of machine learning algorithms that could automatically identify these communities through their important jargon terms. By identifying the jargon terms, we can build a picture of the relationship between these terms and the other terms used by each discourse community within our data. A characteristic of jargon words is that they have a very narrow meaning within a discourse compared to other terms. For example, the term ‘computer’ is a very general term that can have multiple meanings across many discourses: programming, desktop, laptop, tablet, phone, firmware, networks, etc. ‘Computer’ isn’t a very good example of a jargon term, as it is too general and broad in meaning. We want to identify narrow, specific terms that have a very precise meaning within a single discourse, e.g. a specific type of processor, or a motherboard. Our algorithms do a remarkable job of identifying these jargon terms, and they are foundational to our ability to extract meaning, precisely understand user commands, and thereby the real intent that lies behind them.
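One crude proxy for this narrowness (purely illustrative, and not the entropy-based algorithm described in the next post) is how many discourse communities use a term at all. The corpora and counts below are invented:

```python
from collections import Counter

# Toy discourse corpora: term frequencies per community (invented numbers).
corpora = {
    "computing": Counter({"computer": 50, "threadripper": 12, "motherboard": 9}),
    "cycling":   Counter({"computer": 8, "cassette": 20, "chamois": 15}),
    "finance":   Counter({"computer": 5, "dividend": 30}),
}

def discourse_spread(term, min_count=3):
    """Number of discourse communities in which a term is commonly used."""
    return sum(1 for freq in corpora.values() if freq[term] >= min_count)

discourse_spread("computer")      # -> 3  (broad: used across communities)
discourse_spread("threadripper")  # -> 1  (narrow: a jargon candidate)
```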

In my next post I will go into the details behind the algorithms that enable us to identify these narrow-meaning, community-specific jargon terms and ultimately to build a model that understands the meaning and intent behind user queries.