Monthly Archives

October 2020

Enhancing Aiqudo’s Voice AI Natural Language Understanding with Deep Learning

Aiqudo provides the most extensive library of voice-triggered actions for mobile apps and other IoT devices. During the Covid-19 pandemic, voice has become essential as more organizations see the need for contactless interactions. To further improve the performance of Aiqudo voice, we enhanced our unique Intent Matching using Semiotics with Deep Learning (DL) for custom Named Entity Recognition (NER) and Part of Speech (POS) tagging.

The task in question was to recognize the relevant named entities in users’ commands, a task known as Named Entity Recognition (NER) in the Natural Language Processing (NLP) community. For example, ‘play Adele on YouTube’ involves two named entities, ‘Adele’ and ‘YouTube’. Extracting both entities correctly is critical for understanding the user’s intent, retrieving the right app and executing the correct action. Publicly available NER tools, such as NLTK, spaCy and Stanford NLP, proved unsuitable for our purposes for the following reasons:

  1. They often make mistakes, especially when processing the short sentences typical of user commands.
  2. They assign generic labels, such as tagging ‘YouTube’ as an ‘Organization’ and ‘Adele’ as a ‘Person’, rather than the entity types we need in this command context: ‘App’ and ‘Artist’.
  3. They don’t provide the granularity we need. Because we support a very broad set of verticals or domains, our granularity needs for parameter types are very high: we need to identify almost 70 different parameter types in total (and this number continues to grow). It’s not enough to identify a parameter as an ‘Organization’; we need to know whether it is a ‘Restaurant’, a ‘Business’ or a ‘Stock ticker’.

Part of Speech (POS) tagging is another essential input for both NER and action retrieval, but, again, public POS taggers such as NLTK, spaCy and Stanford NLP don’t work well for short commands. The situation is worse for verbs such as ‘show’, ‘book’, ‘email’ and ‘text’, which most existing POS taggers treat as nouns. We therefore needed to develop our own custom NER module that also produces more accurate POS information.

Fortunately, we already had a database of 13K+ commands relating to actions in our platform, and this provided the training data to build an integrated DL model. Example commands (with parameters extracted) in our database included ‘play $musicQuery on $mobileApp’, ‘Show my $shoppingList’ and ‘Navigate from $fromLocation to $toLocation’ (our named entity types start with ‘$’). For each entity, we created a number of realistic values, such as ‘grocery list’ and ‘DIY list’ for ‘$shoppingList’, and ‘New York’ and ‘Atlanta’ for ‘$fromLocation’. We created around 3.7 million instantiated queries, e.g., ‘play Adele on YouTube’, ‘Show my DIY list’ and ‘Navigate from New York to Atlanta’. We then used existing POS tools to label all words, chose the most popular POS pattern for each template, and finally labelled each relevant query accordingly.
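To make the template-expansion step concrete, here is a minimal Python sketch of how instantiated queries could be generated; the templates and value lists below are illustrative stand-ins, not our actual command database:

```python
# Sketch of expanding command templates into instantiated training queries.
# Templates and entity values are made up for illustration.
import itertools

templates = {
    "play $musicQuery on $mobileApp": ["musicQuery", "mobileApp"],
    "navigate from $fromLocation to $toLocation": ["fromLocation", "toLocation"],
}

entity_values = {
    "musicQuery": ["Adele", "jazz"],
    "mobileApp": ["YouTube", "Spotify"],
    "fromLocation": ["New York", "Boston"],
    "toLocation": ["Atlanta", "Chicago"],
}

def instantiate(template, params):
    """Yield (query, {param: value}) pairs for every combination of values."""
    for combo in itertools.product(*(entity_values[p] for p in params)):
        query = template
        for param, value in zip(params, combo):
            query = query.replace(f"${param}", value)
        yield query, dict(zip(params, combo))

for template, params in templates.items():
    for query, slots in instantiate(template, params):
        print(query, slots)
```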

To make the data understandable to a neural network, we then needed to represent each word or token numerically, i.e. as a vector of fixed dimension. This is called word embedding. We tried several embedding methods, including a Transformer tokenizer, ELMo, Google 300d, GloVe, and random embeddings of various dimensions. A pre-trained Transformer produced the best results but required the most expensive computing resources, such as a GPU. ELMo produced the second-best results but also needed a GPU for acceptable computing time. Random embeddings of 64 dimensions work well on a CPU and produce results comparable to ELMo at much lower cost. Such tradeoffs are critical when you go from a theoretical AI approach to rolling AI techniques into production at scale.
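For illustration, the two ends of this tradeoff can be sketched in Keras; the vocabulary size is an arbitrary placeholder, and a random matrix stands in for real pretrained vectors:

```python
# Two of the embedding options discussed above, sketched in Keras.
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 20_000  # assumed vocabulary size

# (a) Randomly initialized, trainable 64-d embedding: the CPU-friendly option
random_embedding = tf.keras.layers.Embedding(VOCAB_SIZE, 64)

# (b) Frozen layer initialized from pretrained 300-d vectors (e.g. GloVe);
#     a random matrix stands in here for real pretrained weights
pretrained = np.random.rand(VOCAB_SIZE, 300).astype("float32")
glove_embedding = tf.keras.layers.Embedding(
    VOCAB_SIZE, 300,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained),
    trainable=False,
)
```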

Our research and experiments were based on the state-of-the-art DL NER architecture of a residual bidirectional LSTM. We integrated two additional tasks alongside NER: POS tagging and multi-label, multi-class classification of potential entity types. Our present solution is therefore a multi-input, multi-output DL model. The neural architecture and data flow are illustrated in Fig. 1. The input module takes the user’s speech and transforms it into text; the embedding layer represents the text as a sequence of vectors; the two bidirectional layers capture important recurrent patterns in the sequence; the residual connection restores some lost features; these patterns and features are then used to label named entities and create POS tags, or are flattened to make a global classification of entity (parameter) types.

Fig. 1 Neural architecture for Aiqudo Multitask Flow
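The following Keras sketch approximates the architecture in Fig. 1: a shared embedding, two bidirectional LSTM layers with a residual connection, token-level NER and POS heads, and a flattened branch for multi-label parameter-type classification. All sizes and label counts are illustrative assumptions, not our production configuration.

```python
# Minimal sketch of the multi-input/multi-output model described above.
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN, VOCAB_SIZE, EMBED_DIM = 16, 20_000, 64        # assumed sizes
N_NER_TAGS, N_POS_TAGS, N_PARAM_TYPES = 70, 40, 70     # assumed label counts

tokens = layers.Input(shape=(MAX_LEN,), name="token_ids")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(tokens)

# two bidirectional LSTM layers capture recurrent patterns in the sequence
h1 = layers.Bidirectional(layers.LSTM(EMBED_DIM // 2, return_sequences=True))(x)
h2 = layers.Bidirectional(layers.LSTM(EMBED_DIM // 2, return_sequences=True))(h1)

# residual connection restores features lost in the second recurrent layer
h = layers.Add()([h1, h2])

# token-level heads: named entity tags and POS tags
ner = layers.TimeDistributed(layers.Dense(N_NER_TAGS, activation="softmax"), name="ner")(h)
pos = layers.TimeDistributed(layers.Dense(N_POS_TAGS, activation="softmax"), name="pos")(h)

# flattened branch: global multi-label classification of parameter types
flat = layers.Flatten()(h)
types = layers.Dense(N_PARAM_TYPES, activation="sigmoid", name="param_types")(flat)

model = tf.keras.Model(inputs=tokens, outputs=[ner, pos, types])
model.compile(
    optimizer="adam",
    loss={"ner": "sparse_categorical_crossentropy",
          "pos": "sparse_categorical_crossentropy",
          "param_types": "binary_crossentropy"},
)
model.summary()
```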

One real-life scenario would be as follows: a user wants to greet his friend Rodrigo on WhatsApp. He speaks the following command to his phone: ‘Whatsapp text Rodrigo good morning’ (not a well-formed command, but this is common in practice). Each word in his speech is mapped to an integer token, which indexes a 64-dimensional vector; the resulting sequence of vectors goes through the two bidirectional LSTM layers and the residual connection; the network outputs parameter-value pairs and POS tags over the token sequence, while a flattened branch outputs parameter types. Our platform now has all the information needed to pass on to the next Natural Language Understanding (NLU) component in our system (see Fig. 2), to fully understand the user’s intent and execute the correct action.
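Reusing the model sketched after Fig. 1, this walk-through might look as follows in code; the toy vocabulary and the interpretations in the comments are illustrative only:

```python
# Hypothetical inference pass for the scenario above, reusing `model`
# and MAX_LEN from the architecture sketch. Vocabulary is a toy stand-in.
import numpy as np

vocab = {"<pad>": 0, "whatsapp": 1, "text": 2, "rodrigo": 3, "good": 4, "morning": 5}

command = "Whatsapp text Rodrigo good morning"
ids = [vocab[w.lower()] for w in command.split()]
ids += [vocab["<pad>"]] * (MAX_LEN - len(ids))  # pad to fixed length

ner_probs, pos_probs, type_probs = model.predict(np.array([ids]))
# ner_probs:  per-token distributions over entity tags; ideally these peak
#             at an app tag for 'whatsapp' and a contact tag for 'rodrigo'
# pos_probs:  per-token POS distributions, with 'text' resolved as a verb
# type_probs: multi-label scores over the parameter types
```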

Fig. 2 Aiqudo Online Intent Pipeline

Before going live in production, we needed to test the performance of the pipeline thoroughly. We devised 600K test scenarios spanning 114 parameter distributions and covering command lengths from very short 2-term commands to much longer 15-term commands. We also focused on out-of-vocabulary parameter terms (terms that do not occur in the training data, such as names of cities and movies) to ensure that the model could handle these as well.
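As a rough illustration of how such out-of-vocabulary test commands might be generated (again with made-up templates and values):

```python
# Sketch of generating test commands with out-of-vocabulary parameter values;
# these city names are assumed NOT to appear in the training values, so they
# exercise the model's OOV handling.
import random

oov_values = {
    "fromLocation": ["Reykjavik", "Timbuktu"],
    "toLocation": ["Ulaanbaatar", "Casablanca"],
}

def make_test_command(template, params, values):
    cmd = template
    for p in params:
        cmd = cmd.replace(f"${p}", random.choice(values[p]))
    return cmd

print(make_test_command(
    "navigate from $fromLocation to $toLocation",
    ["fromLocation", "toLocation"],
    oov_values,
))
```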

Analysis of this approach in conjunction with the Aiqudo platform showed a clear improvement in platform performance: general entity recall increased by over 10%. This integrated multitask model fits Aiqudo’s requirements particularly well:

  1. The model was trained on our own corpus and produces entities and POS tags compatible with our onboarded mobile app commands.
  2. The three related tasks share most hidden layers, so better weight optimization can be achieved very efficiently.
  3. The system can easily be adapted to newly onboarded actions by expanding or adjusting the training corpus and/or annotating tags.
  4. The random embedding model runs fast enough even on CPUs and produces much better results than publicly available NLP tools.

We plan to continue to use DL where appropriate within our platform to complement and augment our existing Semiotics-based NLU engine. Possible future work includes: 

  1. extending the solution to other languages (our system has commands onboarded in several languages to use for training)
  2. leveraging the tagging information and multi-label outputs, which haven’t been explicitly utilized yet, to further improve NER performance
  3. expanding the DL model by integrating other subtasks, such as predicting relevant mobile apps from commands and/or actions

This flexible combination of Semiotics, Deep Learning and grammar-based algorithms will power even more capable Aiqudo voice services in the future.

Xiwu Han, Hudson Mendes and David Patterson – Aiqudo R&D

QTime: What I Learned as an Aiqudo Intern

Intern Voice: Mithil Chakraborty

Hi! My name is Mithil Chakraborty and I’m currently a senior at Saratoga High School. During the summer of 2020, I had the privilege of interning at Aiqudo for six weeks as a Product Operations intern. Although I had previously coded in Java, HTML/JavaScript, and Python, this was my first internship at a company. Coming in, I was excited but a bit uncertain, thinking that I would not be able to fully understand the core technology (the Q System) or how the app’s actions are created. But even amid the COVID-19 pandemic, I learned a tremendous amount, not only about onboarding and debugging actions, but also about how startups work; the drive shown by each of the employees was admirable and really stood out to me. As the internship progressed, I felt like a part of the team. Phillip, Mark, and Steven did a great job making me feel welcome and explaining the Q Tools program, the Q App, and onboarding procedures.

As I played around with the app, I realized how cool its capabilities were. During the iOS stage of my internship, I verified and debugged numerous iOS Q App actions and contributed to the latest release of the iOS Q Actions app. From there, I researched new actions to onboard for Android, focusing on relevant information and new apps. As a result, I proposed actions that would display COVID-19 information in Facebook and open Messenger Rooms. Through this process, I also learned how to implement Voice Talkback for the Facebook COVID-19 info action, using Android Device Monitor and Q Tools. The unique actions I ultimately onboarded included:

  • “show me coronavirus info” >> talks back the first 3 headlines in the COVID-19 Info Center pane on Facebook
  • “open messenger rooms” >> creates and opens a Messenger Room

Users don’t have to say an exact phrase for the app to execute the correct action; the smart AI-based intent-matching system runs only the relevant actions from Facebook or Messenger based on the user’s query. The user does not even have to mention the app by name – the system picks the right app automatically.

When these actions finally went live, it felt rewarding to see my work easily accessible on smartphones; I told my friends and family about the amazing Q Actions app so they could see it too. Throughout my Aiqudo internship, the team was incredibly easy to talk to and always encouraged questions. The internship showed me real-life applications of software engineering and AI, which I hadn’t been exposed to before, and taught me the importance of collaboration and perseverance, especially when I was debugging pesky actions for iOS. This opportunity taught me, in a hands-on way, the business and technical skills a startup like Aiqudo needs to be nimble and successful, which I greatly appreciated. Overall, my time at Aiqudo was incredibly memorable and I hope to be back soon.

Thank you Phillip, Mark, Steven, Rajat and the rest of the Aiqudo team for giving me this valuable experience this summer! 

mCloud Brings Natural Language Processing to Connected Workers through Partnership with Aiqudo

CANADA NEWSWIRE, VANCOUVER, OCTOBER 1, 2020

mCloud Technologies Corp. (TSX-V: MCLD) (OTCQB: MCLDF) (“mCloud” or the “Company”), a leading provider of asset management solutions combining IoT, cloud computing, and artificial intelligence (“AI”), today announced it has entered into a strategic partnership with Aiqudo Inc. (“Aiqudo”), leveraging Aiqudo’s Q Actions® Voice AI platform and Action Kit SDK to bring new voice-enabled interactions to the Company’s AssetCare™️ solutions for Connected Workers.

By combining AssetCare with Aiqudo’s powerful Voice to Action® platform, mobile field workers will be able to interact with AssetCare solutions through a custom digital assistant using natural language.

In the field, industrial asset operators and field technicians will be able to communicate with experts, find documentation, and pull up relevant asset data instantly and effortlessly. This will expedite the completion of asset inspections and operator rounds – an industry first – using hands-free, simple, and intuitive natural commands via head-mounted smart glasses. Professionals will be able to call up information on demand with a single natural language request, eliminating the need to search using complex queries or special commands.

Here’s a demonstration of mCloud’s AssetCare capabilities on smart glasses with Aiqudo.

“mCloud’s partnership with Aiqudo provides AssetCare with a distinct competitive edge as we deliver AssetCare to our oil and gas, nuclear, wind, and healthcare customers all around the world,” said Dr. Barry Po, mCloud’s President, Connected Solutions and Chief Marketing Officer. “Connected workers will benefit from reduced training time, ease of use, and support for multiple languages.”

“We are excited to power mCloud solutions with our Voice to Action platform, making it easier for connected workers using AssetCare to get things done safely and quickly,” said Dr. Rajat Mukherjee, Aiqudo’s Co-Founder and CTO. “Our flexible NLU and powerful Action Engine are perfect for creating custom voice experiences for applications on smart glasses and smartphones.”

Aiqudo technology will join the growing set of advanced capabilities mCloud is now delivering by way of its recent acquisition of kanepi Group Pty Ltd. (“kanepi”). The Company announced on September 22 it expected to roll out new Connected Worker capabilities to 1,000 workers in China by the end of the year, targeting over 20,000 in 2021.

BUSINESSWIRE: mCloud Brings Natural Language Processing to Connected Workers through Partnership with Aiqudo

Official website: www.mcloudcorp.com | Further information: mCloud Press

About mCloud Technologies Corp.

mCloud is creating a more efficient future with the use of AI and analytics, curbing energy waste, maximizing energy production, and getting the most out of critical energy infrastructure. Through mCloud’s AI-powered AssetCare™ platform, mCloud offers complete asset management solutions in five distinct segments: commercial buildings, renewable energy, healthcare, heavy industry, and connected workers. IoT sensors bring data from connected assets into the cloud, where AI and analytics are applied to maximize their performance.

Headquartered in Vancouver, Canada with offices in twelve locations worldwide, the mCloud family includes an ecosystem of operating subsidiaries that deliver high-performance IoT, AI, 3D, and mobile capabilities to customers, all integrated into AssetCare. With over 100 blue-chip customers and more than 51,000 assets connected in thousands of locations worldwide, mCloud is changing the way energy assets are managed.

mCloud’s common shares trade on the TSX Venture Exchange under the symbol MCLD and on the OTCQB under the symbol MCLDF. mCloud’s convertible debentures trade on the TSX Venture Exchange under the symbol MCLD.DB. For more information, visit www.mcloudcorp.com.

About Aiqudo

Aiqudo’s Voice to Action® platform voice-enables applications across multiple hardware environments, including mobile phones, IoT and connected-home devices, automobiles, and hands-free augmented reality devices. Aiqudo’s Voice AI comprises a unique natural language command understanding engine; the largest Action Index and action execution platform available; and the company’s Voice Graph analytics platform, which drives personalization based on behavioral insights. Aiqudo powers customizable white-label voice assistants that give our partners control of their voice brand and enable them to define their users’ voice experience. Aiqudo currently powers the Moto Voice digital assistant experience on Motorola smartphones in 7 languages across 12 markets in North and South America, Europe, India and Russia. Aiqudo is based in Campbell, CA, with offices in Belfast, Northern Ireland.

SOURCE mCloud Technologies Corp.

For further information:

Wayne Andrews, RCA Financial Partners Inc., T: 727-268-0113, wayne.andrews@mcloudcorp.com; Barry Po, Chief Marketing Officer, mCloud Technologies Corp., T: 866-420-1781