The Sonic Internet


Whether in the smart home or in in-car entertainment, the audio market is changing rapidly. The audio industry is enjoying a sustained period of rapid growth, particularly in key audio markets like the US and UK, and audio is becoming more important as developers look to create more immersive experiences in the fields of gaming and virtual reality.

According to Adriaan Thierry, Managing Director for EMEA at Sonos, consumers are attaching much greater importance to ease of use and now expect a seamless user experience when using different products and platforms.

Speaking at the Audio Collaborative Event organised by market analyst Futuresource Consulting last month, Thierry said that there needed to be much more collaboration.

“We need more open networks and platforms and we need to work more closely with content providers.”

Sonos is opening up its platform to partners and upwards of 100 developers are now planning to work with the company next year.

“The ‘Works with Sonos’ programme is intended to support an inclusive spectrum of partners,” Thierry explained. “We’re piloting a development platform so anyone wanting to build using it will have the tools and documentation they need. We’ll also be offering a certification programme - the Sonos badge.”

A key driver behind the growth in the sector has been the use of voice and the arrival of the Virtual Personal Assistant (VPA), for many the next-generation human-machine interface for connecting devices and applications.

According to Simon Bryant, Director of Research, Futuresource Consulting, “Today there are 9 platforms, 33 brands and 64 products offering voice assisted technology. These platforms are being integrated into an increasingly long list of products such as smart speakers, TVs, gaming devices and set top boxes, all of which are now integrating voice assistance support.”

One company that is looking to address this market is XMOS, which supplies advanced embedded voice and audio solutions to the consumer electronics market.

Its newly launched XVF3500 voice processor delivers 2-channel full duplex acoustic echo cancellation (AEC) and is aimed at developers working on voice-enabled devices that require stereo-AEC support for “across the room” voice-interface solutions.

The XVF3500 voice processor from XMOS

According to Mark Lippett, President and CEO at XMOS, “This device and accompanying evaluation kit will help to fuel further growth in the integration of embedded voice controlled devices at the edges of our rooms, particularly those used for high quality music and TV control.”

The solution captures voice commands from across a room, even in complex acoustic environments, and passes them to a cloud-based speech recognition system for processing.

Increasingly sophisticated, these types of devices offer de-reverberation, automatic gain control and noise suppression to deliver crystal clear voice interaction, enabling VPAs to bring much greater functionality to a range of dumb objects.
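To make one of these processing stages concrete, the following is a minimal, illustrative sketch of frame-based automatic gain control in Python. It is not the XMOS implementation or any vendor's algorithm; the function name `apply_agc` and all of its parameters (target level, frame size, gain ceiling and smoothing factor) are assumptions chosen purely for illustration.

```python
import math


def apply_agc(samples, target_rms=0.1, frame_size=160, max_gain=10.0, smoothing=0.9):
    """Hypothetical frame-based AGC: nudge each frame's level towards target_rms.

    samples is a list of floats in the range -1.0 to 1.0. All parameter
    names and defaults are illustrative assumptions, not taken from a product.
    """
    out = []
    gain = 1.0
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        # Root-mean-square level of the current frame.
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms > 1e-9:
            # Gain that would hit the target, capped so that quiet
            # background noise is not amplified without limit.
            desired = min(target_rms / rms, max_gain)
        else:
            # Effectively silent frame: hold the current gain.
            desired = gain
        # Smooth the gain across frames to avoid audible jumps in level.
        gain = smoothing * gain + (1.0 - smoothing) * desired
        out.extend(x * gain for x in frame)
    return out
```

Fed a quiet 440Hz tone, for example, the sketch raises the gain gradually over successive frames until it reaches the ceiling, rather than jumping there in a single step. A real far-field voice front end would combine a stage like this with echo cancellation, de-reverberation and noise suppression.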

Golden age

Thierry argued that the industry was entering a ‘golden age’ but warned that if the ‘Sonic Internet’ was to succeed, platforms would need to be opened up so that anyone with an idea or innovation could develop them using a platform.

“Another major problem is how you funnel content through devices that have not been designed, or built, for sound,” suggested Bryant. “While we are being encouraged to buy more devices, they aren’t particularly smart. So, if we want to enjoy new services, companies will have to work much more closely with their existing partners.”

According to James Chapman, VP Product Development at Qualcomm, “In a more connected world it’s likely we’ll need interfaces that will be capable of aggregating usage, and audio artificial intelligence will most likely be the new user interface of choice.”

He continued, “I think when you look at audio as an interface, its impact will go far beyond simply music; we’re looking at an entirely new user interface, something we will be able to talk to without needing key words.

“How we interact with technology will change in a way that we’ve not seen since the advent of the touch screen.”

Chapman highlighted the possible impact a voice user interface (UI) could have on devices such as headphones.

“We take headphones off as a matter of course throughout the day and form factors are varied. But as our relationship with voice UI changes and we embrace new services, new value is created and users will quite possibly start to see the benefit of wearing headphones throughout the day.

“For companies like Qualcomm, our focus has traditionally been on providing audio quality but, if you are looking to engage with a VPA all day, the industry is going to have to develop devices with much greater processing capabilities and much smaller form factors, as well as more efficient batteries.

“If you’re able to talk to a VPA throughout the day, content providers will also be able to provide many more varied services.”

Microphone technology

As device makers look to integrate voice assistants into their products, more durable microphones will be required that will enable them to build flexible hands-free devices.

With the introduction of new voice-forward products, device makers need to consider how the environment in which their product will be used will impact the components adopted during the design of the device.

Arrays with four or more microphones tend to offer clearer audio pickup, enabling VPAs to better understand the user, but these larger microphone arrays are also more fragile.

One solution is piezoelectric MEMS microphones that enable device makers to add larger mic arrays into devices.

Vesper Technologies has developed the first commercially available piezoelectric MEMS microphone. Without the need for a back plate, the microphone plates can bend and experience stress without quality degradation.

“As we move to voice driven HMIs and operating systems, its impact will go far beyond the music and radio sector,” said Patrice Slupowski, VP Digital Innovation, Orange.

“Voice provides an opportunity to engage with consumers in many different ways. Business models will have to adapt and I’m not sure that the companies currently managing set top boxes or pay TV services, for example, will be happy to hand over the control of the interface to the likes of Amazon or Google via their VPAs.”

So who is going to control the touch points with consumers? Brands and companies need to think about this, according to Chapman.

While consumers will have a strong relationship with voice interfaces, according to Nikolaj Hviid, CEO of Bragi, “It will only be one element of how consumers will interact with suppliers. What is crucial is how you combine hardware with the consumer experience and new business models. That is what will generate value. Focusing solely on the hardware seems to be the wrong strategy.”

According to Chapman, more processing will be done at the local level.

“That’ll be a challenge, but once that’s achieved and we’re able to run artificial intelligence on even the tiniest of devices to better understand the users’ intentions, there will be no need to send data to the cloud and for the user’s data to be transferred to someone else.”

Interoperability is crucial to the successful roll-out of VPAs. With only a few platforms currently available, the scale needed to deliver the benefits of these new systems, especially when using artificial intelligence, has yet to be reached, but it is certainly coming.