Preference Based Software Interface
To Improve the Responses of Virtual Voice Assistants
Jacob Onbreyt, Angel Sandoval, Joe Thomas Jorge, Haiping Wu
City College of New York
Summary
The presence of artificial intelligence in everyday objects has increased drastically over the past decade. With the release of virtual voice assistants on smart devices, smartphones have become hosts to artificial intelligence technology. Although they are useful for hands-free phone operation, virtual voice assistants are limited when asked questions: whatever they cannot answer, they simply redirect to a Google search. Updating the software of current virtual voice assistants to expand their knowledge and search database would increase the efficiency and quality of their responses, because results, while still varied, would be more topic-specific and to the point.
Author Note
This paper was prepared for English 21007, taught by Susan Delamare.
Table of Contents
Introduction
Objective
Preliminary Literature Review
Technical Description of Innovation
Budget
References
Appendix 1
Appendix 2
List of Figures
Fig 1. Software Architecture for Virtual Assistant
List of Tables
Introduction
People are always looking for ways to simplify the tasks that life throws at them. It is human nature to choose the easier route when a task can be completed with minimal effort, so people turn to electronic devices. Phones, tablets, smart TVs, and similar devices are extremely versatile, especially since many of them have come to be equipped with a commercial spoken dialog system known as a virtual voice assistant.
According to Yang and Lee (2019), current commercial virtual personal assistants can understand voice commands and provide information and services accordingly: “A dialog function enables users to interact with devices for specific commands such as calling somebody. Through a web search, users can search and identify specific information in the same way as using the traditional internet” (Yang and Lee, 2019). However, since their release in 2010, virtual personal assistants have struggled to diffuse into daily life: within the second week of release, only 3% of users were engaging with their phone’s voice assistant (Yang and Lee, 2019). This suggests that users are not particularly attracted to current virtual voice assistants. Cross (2020), in an article about how Apple can improve its iPhones, states that “Siri needs better voice recognition, faster response times… it needs to give more accurate answers to a much broader set of questions.” This indicates that the issue with virtual voice assistants lies in their software.
In this paper, we propose a method to improve current virtual personal assistant responses through a software update. Without changing the basic overall functionality of the virtual voice assistant software, our update will feature a preference-based response system that generates queries and answers based on preset user preferences regarding how users want their answers delivered. If a user does not like the result set provided, they can tell the virtual voice assistant how they want the answer given, and the assistant will find a better answer and remember the user’s preferences for that type of question for future use. Such an update will benefit consumer use of this technology, because more efficient responses will further the assistant’s evolution into a hub device that can operate phones, homes, and workplaces. Improved responses will decrease the time needed for web searches and, in turn, the time needed to complete tasks, since information is readily provided with a mere voice command.
Objective
In the article “Build your first Voice Assistant,” Nagesh Singh Chauhan (2019) describes the operating mechanism of a voice assistant: a wake word activates the voice system, the user provides voice input, and the system returns feedback. Once the device is activated, it enters a listening mode; when the user speaks, it receives and processes the voice command to obtain a speech recognition result. The “communication” between the voice assistant application and a person therefore has three main parts: capturing the voice input, processing the input information, and responding with the results. At present, the main problem with artificial intelligence virtual voice assistants is their response efficiency, and improving it is key to their further development. Fortunately, our research suggests that software updates can improve the efficiency of the response system. Below we examine how the proposed upgrade would improve the response efficiency of voice assistants. The goal is to enable them to better extract answers from the Internet and, where possible, save those responses for future use, resulting in more accurate and efficient answer extraction.
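The three-part exchange described above (capture, process, respond) can be illustrated with a minimal text-only sketch. It is not taken from Chauhan's article: the wake word, the function names, and the small answer table are all hypothetical, and typed strings stand in for real audio and speech recognition.

```python
from dataclasses import dataclass
from typing import Dict, Optional

WAKE_WORD = "hey assistant"  # hypothetical wake word, chosen for illustration


@dataclass
class Response:
    recognized: str  # the command as understood by the assistant
    answer: str      # the reply given to the user


def capture(utterance: str) -> Optional[str]:
    """Part 1: return the command that follows the wake word, or None."""
    text = utterance.strip().lower()
    if not text.startswith(WAKE_WORD):
        return None  # the device stays idle without the wake word
    return text[len(WAKE_WORD):].strip(" ,")


def process(command: str) -> str:
    """Part 2: normalize the input (a stand-in for speech recognition)."""
    return " ".join(command.split())


def respond(command: str, answers: Dict[str, str]) -> Response:
    """Part 3: answer from known answers, else fall back to a web search."""
    answer = answers.get(command, "Searching the web for: " + command)
    return Response(recognized=command, answer=answer)


def handle(utterance: str, answers: Dict[str, str]) -> Optional[Response]:
    """Run the three parts in order for a single utterance."""
    command = capture(utterance)
    if not command:
        return None
    return respond(process(command), answers)
```

The fallback branch in `respond` mirrors the limitation this proposal targets: any question outside the known answers is handed to a generic web search.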
Preliminary Literature Review
Today it might be perfectly normal to ask Siri to send a message. The newest development in virtual voice assistants is the chatbot. Automotive News (2020) reported that “Artificial intelligence has become a staple at U.S. dealerships. Often, it takes the form of chatbots, software programs that conduct instant-messaging conversations with Web visitors. Some dealerships even give their AI personas names and “assistant” titles, which has led to people coming in and asking to speak to these virtual beings.” In business, chatbots give dealerships a way to stay connected with customers. A human salesperson might send an email or two and then disengage, but an AI is indifferent to emotion and does not judge a customer’s interest level; it simply keeps tapping the customer on the shoulder. At the same time, the next phase of development for voice assistants is the improvement of their own systems. The main problems with artificial intelligence virtual voice assistants are a low response rate and inaccurate results: current assistants fail to identify the user’s problem, search for the wrong information, and run slowly or unevenly.
Tim Tuttle (2015) states, “Despite scientists dedicating their lives to the challenge over several decades, until recently, only very slow progress had been made in teaching machines to understand spoken language at all, let alone with human-level proficiency.” Given the limitations of today’s technology, voice assistant systems still face challenges on the road to success. Human conversation is extremely rich and interactive, which is a major challenge for voice assistant systems. Sometimes friends understand a person’s words before they are spoken. The response time when people talk to each other is just a few tens of milliseconds. Everyday communication would become awkward if pauses in normal conversation lasted several seconds at a time, or if questions or commands constantly needed to be repeated. Many aspects of the underlying technology contribute to the slow response of voice assistants in “conversation”.
With continuous improvement and upgrading, voice assistants will continue to play a very important role in the smart device market. The “Voice Assistant Market Research Report – Global Forecast till 2025” (unknown author, 2018) states, “The increasing popularity of connected devices, shifting preference toward smart homes, increasing instances of voice search, and high demand for self-service applications are primarily driving the growth of this market… Amazon.com Inc. holds a substantial share of the voice assistant market with Echo, while Google Home ranks next in line in the market. Due to the increasing adoption of IoT, the market has been witnessing tremendous growth in the adoption of connected devices and IoT data generation. In the connected home services, voice assistants are expected to serve as a primary interface for the users and provide them with personalized services triggered with voice commands.” Thus, voice assistants are still valued by today’s top e-commerce companies, which intend to expand into smart home living. Voice systems touch many aspects of our lives, and voice search will become a new trend that changes traditional search behavior. Studying upgrades to voice assistant systems is therefore both necessary and full of potential.
Technical Description of Innovation
The proposed software update for a virtual assistant is largely based on an existing virtual assistant software design. In that design, the virtual assistant cross-references saved information about user preferences with the question at hand and, based on a preference inference, searches its database for a possible answer.
We propose to use similar software so that the phone user can get a response in the manner that suits them best. After the user activates the virtual voice assistant on their mobile device, they interact with it by asking a question. The phone then uses its sensor data to send the interaction to an inference engine (Master, Ehsani, and Witt-Ehsani, 2017). The inference engine compares the interaction to previously recorded knowledge preferences to formulate a query and direct it at a specific search engine, or to formulate a query that answers a possible follow-up question (Master, Ehsani, and Witt-Ehsani, 2017). After the result set is generated, a reply is sent back to the phone along with a question asking whether the interaction was addressed properly, in order to narrow down a better result or address follow-up questions.
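As a rough illustration of how an inference engine might turn a question and stored preferences into a query, consider the sketch below. It is our own simplification, not the method of the cited design: the keyword-based topic classifier, the `Preference` fields, and the use of a `site:` search operator are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical keyword lists used to guess the topic of a question.
TOPIC_KEYWORDS = {
    "weather": ("weather", "forecast", "rain"),
    "recipe": ("recipe", "cook", "bake"),
}


@dataclass
class Preference:
    source: str = ""  # preferred website, e.g. "example.com"
    depth: str = ""   # preferred answer style, e.g. "simple"


@dataclass
class KnowledgeBase:
    by_topic: Dict[str, Preference] = field(default_factory=dict)


def classify(question: str) -> str:
    """Guess the question's topic from keywords (a stand-in for inference)."""
    words = question.lower().replace("?", " ").split()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return topic
    return "general"


def formulate_query(question: str, kb: KnowledgeBase) -> str:
    """Combine the question with any saved preference for its topic."""
    preference = kb.by_topic.get(classify(question))
    parts = [question.strip().rstrip("?")]
    if preference and preference.depth:
        parts.append(preference.depth + " explanation")
    if preference and preference.source:
        # Assumption: the target search engine supports a site: operator.
        parts.append("site:" + preference.source)
    return " ".join(parts)
```

With a saved recipe preference of `Preference(source="example.com", depth="simple")`, the question "How do I bake bread?" becomes a narrower query aimed at that source, while a question with no matching preference passes through unchanged.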
The virtual voice assistant will converse with the phone user to answer their question through a link between the phone, a network, a query engine, a knowledge database, and a search engine (research database) (Figure 3). This process occurs as follows:
Fig. 3A – The user interacts with the virtual assistant by asking a question.
Fig. 3B – The question is captured as sensor data and transmitted over a network connection to a query engine.
Fig. 3C – The query engine compares the question to the user’s preset preferences and permitted history storage to see whether the interaction type corresponds to any particular result generation, and formulates the query for search.
Fig. 3D – The query is passed to a search engine that runs it through various internet databases and returns a result set based on the preferences that shaped the query.
Fig. 3E – The narrowed result set is sent back to the virtual voice assistant and presented to the user, followed by a question asking whether the results addressed the user’s question.
Fig. 3F – If the user responds that their question was not addressed, the virtual assistant asks them to specify the type of answer they want (simpler, more in depth, from a specific source, etc.). The virtual assistant then sends that interaction over the network connection back to the query engine, and steps C through F repeat.
Figure 3 – Diagram of Virtual Assistant Interaction
Collage created with images sourced from:
https://www.clipart.email/clipart/clipart-cloud-internet-338241.html
https://www.iconfinder.com/icons/4937165/listen_sound_speak_talk_talking_voice_waves_icon
https://www.stickpng.com/img/electronics/iphones/iphone-7-template
If a second search is prompted, the virtual voice assistant asks the user to specify how they would like their answer presented, and the user can save that preference to the knowledge database for future interactions.
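The repeat-until-satisfied loop of steps C through F, together with saving a preference, might be sketched as follows. This is a simplified illustration under stated assumptions: `search` and `ask_user` are hypothetical callbacks standing in for the search engine and the spoken follow-up question, preferences are stored as one plain string per topic, every refinement is saved automatically, and the `max_rounds` cap is our addition.

```python
from typing import Callable, Dict, List, Optional


def answer_with_feedback(
    question: str,
    topic: str,
    preferences: Dict[str, str],
    search: Callable[[str], List[str]],
    ask_user: Callable[[List[str]], Optional[str]],
    max_rounds: int = 3,
) -> List[str]:
    """Search, ask whether the results helped, and refine if they did not."""
    results: List[str] = []
    for _ in range(max_rounds):
        # Step C: shape the query using any saved preference for this topic.
        style = preferences.get(topic, "")
        query = f"{question} {style}".strip()
        # Step D: run the query through the search engine.
        results = search(query)
        # Step E: present results; None means the user is satisfied.
        refinement = ask_user(results)
        if refinement is None:
            return results
        # Step F: remember how the user wants this kind of answer delivered.
        preferences[topic] = refinement
    return results
```

Because the preference is written into `preferences` before the next round, a later question on the same topic starts from the refined query rather than the generic one.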
Budget
Our budget covers the costs associated with developing an interface capable of reaching our objectives, that is, a system with improved responsiveness and versatility. Since new technologies are introduced all the time, our projected costs will need to be re-evaluated after the initial year. Budget estimates are based on those of previous smart interface development groups.
We plan to have a full-time development team composed of a data scientist and software and machine learning researchers and engineers. This part of the team will develop the technical components of the system. A conversational UX designer will also be on the team and will be in charge of developing a personalized, “user friendly” environment for customers. A product manager and a growth hacker will focus on the business aspects of developing the smart interface. Other expenses, such as equipment and a workspace, have been included.
Table 1 Project Budget Details, created by Angel Sandoval, 5/13/2020.
References
https://patentimages.storage.googleapis.com/3e/c9/c6/e417d85eff5a1b/US20130204813A1.pdf
Automotive News (2020). Retrieved from the CCNY library database.
Tim Tuttle (2015). “The Future of Voice: What’s Next After Siri, Alexa and Ok Google.”
Nagesh Singh Chauhan (2019). “Build your first Voice Assistant.” https://towardsdatascience.com/build-your-first-voice-assistant-85a5a49f6cc1
Unknown author (2018). “Voice Assistant Market Research Report – Global Forecast till 2025.” https://www.marketresearchfuture.com/reports/voice-assistant-market-4003
Liam Tung (2019). “Apple Siri vs Amazon Alexa vs Google Assistant: Tests reveal which is smartest.” https://www.zdnet.com/article/apple-siri-vs-amazon-alexa-vs-google-assistant-tests-reveal-which-is-smartest/
Unknown author (2019). “Voice Assistants: Raising Expectations by Empowering the Customer.” https://medium.com/datadriveninvestor/voice-assistants-raising-expectations-by-empowering-the-customer-7551f724df71
Alan Friedman (2018). “Siri finishes last in test against other smart speaker virtual assistants.” https://www.phonearena.com/news/Siri-finishes-last-in-test-among-smart-speaker-assistants_id102400
Appendix A – Task Schedule
Table 2 Project Task Schedule, created by Jacob Onbreyt, 5/13/2020
Appendix B:
Figure #3: Chart about questions voice assistants answered correctly. Reprint from
Figure #4: Chart about questions voice assistants answered correctly. Reprint from https://www.phonearena.com/news/Siri-finishes-last-in-test-among-smart-speaker-assistants_id102400
Voice assistants do a good job of answering their users’ questions, even complex ones. They are not, however, one hundred percent accurate. As helpful as they can be, artificial voice assistants have room for improvement in their responsiveness and accuracy. Many factors may contribute to inaccurate answers, such as being unable to distinguish a specific voice when several people speak at the same time, or simply receiving a question that is harder to answer. Remedying these problems could significantly improve the effectiveness and efficiency of the assistants’ responses.