Amazon Echo and Google Home are both vying for control of your smart home, and both platforms are ushering in an era where voice-first devices become as ubiquitous as the smartphone is today. The Amazon Alexa Voice Service, the technology that powers the Echo family of devices, already boasts over 10,000 skills, the functions that add usefulness or entertainment value to the platform, and third parties are adding more all the time. Amazon is even licensing the technology for third-party devices such as Sonos speakers and Ford's SYNC car entertainment system. Testing these custom skills and devices will become as commonplace as mobile app testing is today, and voice-first devices pose a unique testing challenge.
Manual Testing Drawbacks
How would you structure a testing environment for these voice-first devices? Would you take the usual manual testing route, or develop automation for a consistent workflow? Manual testing has several drawbacks. It can be expensive to implement at scale, and offshoring this kind of testing to countries like India, China, and Vietnam is difficult if the skill under test is primarily English-speaking. If the app is intended for global use, you would need to hire testers who speak every language your skill supports, and locating testers with that specific skill set is expensive and time consuming. Having testers vocally exercise the devices may not be feasible either, since after a long testing session you will probably end up with testers with hoarse throats. There are tools that can help by generating the required commands ahead of time in whatever language you need.
The drawback of this method is that everything must be generated beforehand, which is a time-consuming process, and you may not know all the commands you want to test up front. This rigidity is a real detriment when testing devices that are conversational in nature. Take a pizza ordering application: not all pizza chains offer the same toppings, so testing an app like this would mean compiling all the toppings the various chains offer and recording those lines ahead of time. A dynamic solution would be preferable.
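A dynamic approach can build spoken test commands from data sets instead of recording every phrase in advance. The sketch below is a minimal, hypothetical example: the skill name, sizes, and toppings are illustrative placeholders, not tied to any real pizza skill, and the generated strings would be fed to whatever text-to-speech layer your framework uses.

```python
from itertools import product

# Illustrative data sets; in practice these would come from test data files.
SIZES = ["small", "medium", "large"]
TOPPINGS = ["pepperoni", "mushrooms", "extra cheese", "olives"]
# "Pizza Bot" is a hypothetical skill name used only for this example.
TEMPLATE = "Alexa, ask Pizza Bot to order a {size} pizza with {topping}"

def generate_utterances(sizes, toppings, template):
    """Build every size/topping combination as a spoken test command."""
    return [template.format(size=s, topping=t)
            for s, t in product(sizes, toppings)]

utterances = generate_utterances(SIZES, TOPPINGS, TEMPLATE)
print(len(utterances))  # 12 commands from 3 sizes x 4 toppings
```

Swapping in a new chain's topping list is then a data change, not a re-recording session.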
Automated testing of voice-first devices is the answer. Because the tests operate at the voice level, they are platform-agnostic and device-agnostic: the same automated tests can be executed against any voice app running on the device. Automated voice testing can be leveraged for Alexa skill testing, Google Home testing, voice-enabled websites and apps, virtual assistant testing (Siri, Cortana, OK Google on Android), voicemail, and call center software. One set of tests can be reused across platforms and devices. Automation does pose its own set of challenges. When testing a device like the Amazon Echo, you will find that it does not always respond the same way to the same request. The speech recognition engine is not perfect and may have difficulty understanding you, or the skill may be programmed to vary its responses to sound more human. So the automation framework has to be smart enough to handle multi-turn conversations: it must interpret Alexa's responses and reply in a way that keeps the conversation going until it ends naturally. Being able to run the automation on autopilot makes test execution more robust and useful.
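A multi-turn driver of this kind can be sketched as a loop that reacts to what the device says rather than replaying a fixed script. In the hedged example below, `speak_and_listen` is a stand-in for whatever text-to-speech plus speech-recognition layer your framework provides (it is not a real API), and the reply rules map patterns in the device's response to the tester's next utterance.

```python
import re

def run_conversation(speak_and_listen, opening, reply_rules, max_turns=10):
    """Drive a voice conversation until the device stops prompting.

    speak_and_listen: callable that speaks an utterance and returns the
        device's transcribed response (hypothetical framework hook).
    reply_rules: list of (regex, next_utterance) pairs, checked in order.
    Returns the transcript as (utterance, response) pairs.
    """
    transcript = []
    utterance = opening
    for _ in range(max_turns):
        response = speak_and_listen(utterance)
        transcript.append((utterance, response))
        next_utterance = None
        for pattern, reply in reply_rules:
            if re.search(pattern, response, re.IGNORECASE):
                next_utterance = reply
                break
        if next_utterance is None:  # no recognized prompt: conversation over
            return transcript
        utterance = next_utterance
    return transcript
```

Because the next utterance is chosen at run time, the same rules can absorb Alexa's varied phrasings of the same question, which a fixed list of pre-recorded lines cannot.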
Pros and Cons of Manual vs Automation Testing of Voice First Apps
| | Automation Testing | Manual Testing |
| --- | --- | --- |
| Device and platform coverage | Device-agnostic and platform-agnostic when testing at the voice level; can test on a wide range of devices and platforms. | Device-agnostic and platform-agnostic when testing at the voice level. |
| Autopilot execution | Needs to be well programmed to interact with the system under test, keep the conversation going, and produce dynamic responses. | More natural speech; mimics real users' commands and mistakes better than automation can. |
| Test execution speed | Can test a wide variety of commands much faster than a human could. | Hard to manually record every speech command if a single tester must execute them through a computer. |
| Multiple languages / accents | Can test a wider range of languages and accents than a single tester. | Hard for a single tester to cover different languages or regional speech variations. |
| Outsourcing the testing effort | Easy to outsource the work effort. | Difficult to outsource to foreign countries if you want to test in English only. |
| Physical strain | Testers do not have to speak every command under test. | Physical strain on testers if there is a wide variety of commands to execute. |
Testing voice-first devices is the next frontier, and embracing automation is a move in the right direction for tackling this challenge. A purely manual testing focus has drawbacks: you may not have testers available who speak all the languages you want to test, vocally exercising the device day in and day out puts a huge strain on testers, and offshoring this type of work is difficult if you want to test mainly in English. A robust automation framework can dynamically produce the responses you want and generate the speech files from a set of input data. Automation at the voice level can be reused across devices and platforms, allowing you to greatly expand your test coverage. Head over to LogiGear Magazine to read how we are solving these testing challenges through automation with TestArchitect.
Disclaimer: Amazon, Alexa and all related logos and motion marks are trademarks of Amazon.com, Inc. or its affiliates.