With upgrades to the iPhone's audio capabilities rumoured to be announced next week at the iPhone 7 launch, let's look at how this could affect the Hi-Res Audio sector.
It's fairly safe to say that the way we consume music has changed significantly over the last five years. The CD is clinging on, but if Google Trends (see graph below) is accurate, this is mostly as Christmas presents. Vinyl has made a comeback among audiophiles. And streaming services have really taken off; Spotify alone receives a greater proportion of Google searches than CDs, having overtaken them for the first time in March 2015 and stayed ahead ever since, except over the Christmas period.
I've mentioned before that voice interfaces will provide new opportunities for great innovative products. But while big technology companies like Amazon, Google, Apple, Microsoft and Facebook invest huge amounts of money in developing the natural language interfaces that enable speech recognition, we've so far seen few smaller companies taking advantage of the rich diversity of hardware and software that is available.
So it's great news that the Chinese hardware company Seeed Studio has announced its latest product on Kickstarter: a modular development platform called ReSpeaker, with the objective of adding voice interaction to products in your home or office.
A lot has changed in the streaming audio sector since I wrote my blog last year. While outwardly it looks similar, some companies have gone into receivership and others are seriously contemplating IPOs, mainly due to the costs involved in delivering a lossless streaming service. Yet the rewards are likely to be great for those that can make it work, so let's see what's happened to the main players in the last year.
Dag Kittlaus, CEO of Viv Labs (viv.ai), has recently given some impressive demonstrations of his latest AI technologies, but one thing always looks odd during the Viv demonstrations: he holds the phone very close to his mouth. He’s almost eating his phone every time; it’s like something from Trigger Happy TV. Dag needs to make sure that the captured speech is clear enough for the software to decipher his commands, but it’s not a real-world scenario. We’re not all going to want to "eat" our phones in order to book a flight for next week.
I've just watched a promotional film that Orange recorded in 1999 called The Future's Bright The Future's Orange. It was striking that all the film's ambitions are as relevant to voice interactions today as they were to the world of mobile communications almost two decades ago.
A massive change is coming in the way that we engage with the electronics that surround us at home, at work and in the built environment. We will be liberated from the computer keyboards, touch screens and apps that keep us glued to our smartphones and laptops. Instead, there will be billions of voice-aware products that we will talk to in order to get information and entertainment, and to manage our everyday tasks.
How voice user interfaces and natural language programming will change the way we interface with the world around us
Microphones capture analog signals and, thanks to digitization, have ridden the back of Moore’s Law to get ever smaller and cheaper, to the point where the ability to capture decent sound in a tiny device has improved dramatically. This has taken audio capture out of sound booths and recording studios and democratized it. Thanks to their ubiquity in handsets, laptops, headsets and media tablets, nearly 5 billion MEMS microphones are expected to be shipped in 2016 alone, at an average price point that will deliver four or five of them for less than a Euro or dollar.
Voice user interfaces: you can scarcely avoid the current hype in the media as giants like Amazon and Google jostle to exploit the explosion of possibilities that advancements in natural language technologies are providing. Today’s neural networks use algorithms to process language through ever-deeper layers of complexity. Machines can now understand the meaning and intent of spoken words with unprecedented levels of accuracy. This has sparked a revolution for the power of voice.
We love the Amazon Echo at XMOS. It’s a new-to-the-world category of product, brimming with possibilities as a digital assistant, a hub for home automation as well as a point of presence to allow us to access all of Amazon’s goods and services.
At its heart is a piece of technology known as a smart microphone. This enables the Echo to capture voice samples with a high degree of accuracy and transmit them to Amazon's Alexa Voice Service in the cloud. There the query is processed, and the answer is returned to the device in the form of Alexa’s soothing tones contained in an MP3 file.
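As a rough illustration of that round trip, here is a minimal Python sketch of the capture, send and reply steps. It is not Amazon's API or the Echo's actual firmware: every name in it (`capture_voice`, `fake_cloud_service`, `handle_query`, `CloudReply`) is hypothetical, the channel averaging is only a stand-in for whatever the smart microphone really does, and the "MP3" is just placeholder bytes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CloudReply:
    """Hypothetical reply from the cloud service."""
    text: str                          # what the assistant would say
    audio: bytes                       # placeholder for the returned MP3 payload
    content_type: str = "audio/mpeg"   # MIME type used for MP3 audio


def capture_voice(channels: List[List[float]]) -> List[float]:
    """Merge several microphone channels into one by simple averaging.

    Assumption: a crude stand-in for the smart microphone's capture stage.
    """
    if not channels:
        raise ValueError("at least one microphone channel is required")
    length = min(len(ch) for ch in channels)
    return [sum(ch[i] for ch in channels) / len(channels) for i in range(length)]


def fake_cloud_service(samples: List[float]) -> CloudReply:
    """Pretend cloud back end: 'processes' the query and returns an answer."""
    text = f"received {len(samples)} samples"
    return CloudReply(text=text, audio=text.encode("utf-8"))


def handle_query(
    channels: List[List[float]],
    cloud: Callable[[List[float]], CloudReply] = fake_cloud_service,
) -> CloudReply:
    """Device-side flow: capture locally, send to the cloud, return the reply."""
    samples = capture_voice(channels)
    return cloud(samples)


if __name__ == "__main__":
    reply = handle_query([[0.0, 1.0], [1.0, 1.0]])
    print(reply.content_type, reply.text)
```

The point of the sketch is the division of labour: the device only has to capture clean audio and play back what it is sent, while the query itself is understood and answered in the cloud.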