Voice-Activated Technology Must Advance to Support Hybrid Workplaces

The pandemic changed everything about people’s lives, including how they interact with speech technology. The Smart Audio report from NPR reveals that more people are using their smart devices daily; the number of people using voice commands at least once a day increased by 6 percentage points from December 2019 to April 2020.

Before COVID-19, many workers spent eight hours or more a day outside their homes. They did not have access to their smart devices during that time, and they generally felt more comfortable using voice commands in private. But the shift to teleworking meant more time at home and more opportunities to explore the technology.

This trend toward voice-activated technology shows no sign of stopping. More than 50% of employees want to continue teleworking, and about 25% want a mix of in-person and remote work, according to a survey by Office Depot. As the routines that people formed over the past year become firmly cemented, smart speakers and voice assistants will become cornerstones of hybrid work.

How Voice Tech Can Evolve to Support Hybrid Workplaces

Voice technology has come a long way since Siri was first announced. During the pandemic, grocery stores and other retailers added voice tech and non-contact payment options to self-service kiosks to provide customers with more secure experiences. Researchers are also investigating how voice assistants can support the healthcare industry.

The future of speech technology is undoubtedly bright, but it will have to keep evolving to become an integral part of the new hybrid workplace. People expect voice technology to fit naturally into existing workflows, so any obstacles or flaws that hold back adoption could pose problems for the continued use of voice-first technologies.

Here’s what needs to change as more remote workers buy and rely on smart devices:

1. Algorithms must be trained on a variety of voices.

Clearly, some voice recognition technology has been trained and programmed using perfect diction, “standard North American English,” and crystal-clear audio. Unfortunately, these algorithms are not very useful in the real world.

Smart speakers and other devices must be able to navigate ambient noise, background voices, regional dialects, international accents, imperfect pronunciations, speech impediments, and more before they can be useful in hybrid workspaces.

Fortunately, some companies are tackling these issues directly. I recently spoke with a woman whose child had a speech impediment. They had spent hours in a Google recording studio helping to improve the programming of the company’s assistants. In addition, Apple has compiled a database of nearly 30,000 audio clips from a wide range of speakers. Perfect voice recognition does not happen overnight, but accounting for different ages, pitches, and other vocal characteristics should help algorithms become as accurate as possible.

2. New users need a superlative experience.

Much depends on first experiences. When someone turns on their smart speaker or voice assistant and asks it to make a call, they expect the call to go through without any problems. If the technology botches that initial exchange, users will be less likely to try it again in the future. All of this comes down to basic learned behaviors.

While smart speakers tend to get all the press, smartphone users’ adoption rate for voice tech remains significantly higher. For smart devices to become more useful to hybrid workers, companies need to prioritize the “wow” factor and pull out all the stops to make a good first impression.

Can the technology integrate with laptops and desktop computers, for example? Can devices be controlled remotely? These are the questions workers will ask in the future.

3. Speech technology needs to become more diverse and inclusive.

There are plenty of examples of algorithms taking on bias, such as Amazon’s hiring assistant, which favored men, and a recidivism-prediction tool called COMPAS, which misclassified Black defendants as more likely to commit further crimes. These inequities show that the technology industry as a whole needs to do better when it comes to diversity, equity, and inclusion.

In a study examining speech recognition tools from Amazon, Google, IBM, Apple, and Microsoft, the collective software was 16% more likely to misidentify words if the speaker was Black. That may not sound like much, but imagine having to correct four out of every 25 words you speak or dictate. Until it is solved, this problem will keep people from embracing speech technology.
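To make the “four out of every 25 words” figure concrete: recognition mistakes are conventionally measured as word error rate (WER), the word-level edit distance between what was said and what was transcribed, divided by the number of words spoken. The sketch below is a minimal illustration of that metric (with a made-up example phrase), not code from the study cited above.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance, computed over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One misheard word in a five-word command is a 20% error rate;
# misrecognizing 4 of every 25 words corresponds to a 16% WER.
print(wer("please call my manager now",
          "please tall my manager now"))  # prints 0.2
```

At a 16% WER, a worker dictating a 250-word email would have to stop and fix roughly 40 words, which goes a long way toward explaining why users abandon the technology.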

As it did in many other areas, the pandemic accelerated the adoption of voice-activated technologies. With employees around the world demanding greater flexibility and stronger security measures, voice tech has likely secured a permanent place as a cornerstone of the future of work.

David Ciccarelli

Founder and CEO of Voices

David Ciccarelli is the founder and CEO of Voices, the #1 creative services marketplace with more than 2 million registered users. David is responsible for defining the vision, executing the growth strategy, creating a vibrant culture, and managing the company on a day-to-day basis. He is frequently published in outlets such as The Globe and Mail, Forbes, and The Wall Street Journal.
