XMOS 2-mic dev kit for Alexa Voice Service qualified by Amazon

29th October 2019, Bristol, UK – The VocalFusion 2-mic dev kit for Alexa Voice Service, launched by XMOS in July 2019 for smart TVs and set-top boxes, has been qualified by Amazon, the company announces today.

The qualification from Amazon highlights the product’s engineering innovation, enabling manufacturers to build Amazon Alexa voice capabilities into their products more easily and cost effectively than ever before.

XMOS’s dev kit, which is based on the XMOS XVF3510 voice processor, is a BOM-efficient solution built for modern living spaces. The product’s next-generation acoustic algorithms create a compelling, premium experience with far-field voice control, opening the door to a more natural conversation with technology.

Mark Lippett, president and CEO at XMOS, said: “Amazon is seen by many experts and consumers as the leader in the world of voice, and our engineering expertise will be helping to transform the way consumers interact with everyday devices like TVs.

“This qualification from Amazon is a real endorsement for the VocalFusion 2-mic dev kit and the XMOS brand.”

The XMOS XVF3510 voice processor is available at a market-leading cost of $0.99 on orders over one million units a year.


About XMOS
XMOS stands at the intersection between voice-processing, edge-AI and the IoT (AIoT). Backed by some of the best names in high-tech venture capital, XMOS’ unique silicon architecture and differentiated software deliver class-leading voice-enabled solutions to a wide variety of AIoT applications.

Media contact
Sanjay Dove/Ben Musgrove
+44 (0)20 8408 8000

IFA 2019 was a showcase of connected living

Berlin: 4-9 September

At Europe’s largest tech show, voice assistants were out in full force. Last year, IFA was all about smart speakers. This year, Google’s Assistant and Amazon Alexa were everywhere inside Berlin’s sprawling Messe convention centre. The two leading voice assistants were ever-present in the expo halls, integrated in a vast number of partner products and demonstrations, from soundbars and TVs to laptops and cars. Google and Amazon could have foregone their own stands and still maintained massive presence throughout the event. Bixby, by contrast, was nowhere to be seen, despite Samsung having one of the largest stands of the entire show.

Shortly before IFA kicked off, XMOS’s new 2-mic voice interface passed Amazon AVS qualification with flying colours. The XVF3510 developer kit is aimed at enabling manufacturers to embed top-performing voice interfaces into smart products at an industry-leading price. While news of the qualification arrived too close to IFA for us to create a big splash about it at the show, we did get an early mention on the Alexa Developer Blog. And we took advantage of the acoustically challenging conference environment to give demonstrations of the XVF3510’s impressive performance in noisy conditions. A successful barge-in from a distance of 3 metres against a backdrop of 90 dB conference noise makes for a compelling demo, especially when you add in a separate point source of music from a mobile phone for good measure.

Amazon’s smart home stand featured a wide range of products, including 3rd party smart home products and a selection of developer kits. Today more than 60,000 smart home products can be controlled with Alexa, the vast majority of which are built by 3rd parties using chips like the new XMOS XVF3510. We were invited behind the scenes to the Alexa partner drinks, where we had a chance to meet other systems integrators, ODMs and dev kit providers, including our friends from Fraunhofer and Bragi, together with several new contacts made at the event.

Smart TVs played a prominent role on the expo floor. We saw TVs with integrated smart mics from Vestel and Toshiba, while Grundig had a smart mic integrated into a soundbar. Sony has packed its new Android TV with a plethora of assistants and services, including Google Assistant, Amazon Alexa and Apple AirPlay, giving end users the full range of services to choose from. LG has voice control with both Alexa and Google Assistant, but so far has only opted for push-to-talk. We believe the days of the remote control are numbered, and we’ll start to see more and more TVs with a built-in far-field voice interface.

Whichever way you look at it, it is clear that the Voice Market is crossing the chasm into general adoption, and we’re excited to be leading the charge in this growing ecosystem.

XMOS is a Gold Sponsor of the Bristol Technology Showcase

Bristol, 17 September 2019 – XMOS is delighted to announce Gold Sponsorship of the forthcoming Bristol Technology Showcase, due to be held on Friday, 8th November at Aerospace Bristol, home of the iconic Concorde.

Designing intelligent human-machine interfaces for the IoT, XMOS is backed by some of the best names in high-tech venture capital. With a 10-year pedigree in Hi-Res and multichannel USB audio, the organisation now sits at the forefront of the far-field voice interface market, opening the door to a more natural conversation with technology.

Bristol Technology Showcase (BTS) will focus on emerging technologies and themes encompassed in the 4th industrial revolution. The impressive line-up of speakers includes Dr Paul Neil, VP Marketing and Product Management at XMOS, who, with over 27 years’ international experience in the semiconductor industry, will deliver a session exploring the future of voice.

XMOS will also feature in the exhibition showcase, enabling visitors to engage and interact with the very latest in voice technology.

Speaking about the event, BTS Founder Nick Rutherford said “We’re delighted to be bringing a new discussion to Bristol and locating it at the home of Concorde. The speed of technological change is so quick today and it’s only going to get faster. It’s wonderful to be able to host an event with some amazing technology companies, speakers and wider businesses and allow delegates to experience some of these technologies first hand. Great Bristol businesses, amazing technology and the ability to showcase and demonstrate them, it’s going to be a great day.”

XMOS CEO, Mark Lippett said “This is a fantastic opportunity to partner with a local event and showcase the latest voice technology from XMOS. BTS will really raise the profile of technology in the South West and beyond and we’re delighted to offer our support.”

Tickets can be purchased here.

view website bristoltechnologyshowcase
follow on Twitter @Btechshowcase
follow on LinkedIn: Bristol Technology Showcase

PLEN Robotics announce PLEN Cube AI assistant, voice-enabled by XMOS.

Japan, 31 July 2019 – PLEN Cube is a new AI assistant service with a big personality! It’ll face you when you speak, give a little shake when it wakes up, nod its head as it ponders – it’ll even dance to your music. The PLEN Cube has plenty to make you smile.

Don’t be fooled by its size. It may fit in the palm of your hand, but this playful, smart cube is a technology powerhouse with a built-in full-HD camera, display, microphone array and speakers, along with cutting-edge software. It’s equipped with the latest technology such as facial recognition, face-tracking and voice recognition.

Thanks to the XMOS XVF3100 voice processor, you can control PLEN using your voice. It’ll hear your voice command accurately from close up and all the way across the room. Get the weather forecast, stream music, set reminders and update your schedule – it’s good to talk!

Simply tell it to take a photo or video and it’ll move into action. PLEN Cube can also be understood immediately with “body language” and audiovisual cues, bringing a natural, intuitive feel to the way you communicate with your cube.

PLEN Cube can rotate 360 degrees to take a panoramic photo and uses computer vision technology to follow your face wherever you go. So PLEN can capture all your special moments and – since it’s Wi-Fi and Bluetooth-enabled – post them to your social media sites too!

We think there’s PLENty to get excited about.


About XMOS

XMOS stands at the interface between voice processing, biometrics and artificial intelligence. Today our unique silicon architecture and highly differentiated software deliver class-leading far-field voice capture for consumer electronics, and we’re building for a more natural human-machine interface tomorrow. For more information, please visit

About PLEN Robotics

PLEN Robotics is based on the experience of PLEN Project Co., Ltd., which has been developing small robots for over 10 years in Japan, with the aim of developing home and personal service robots that are more practical than ever and that make people’s lives more efficient.


XMOS Contact (Japan)

Okawa Takashi

One step closer to voice-controlled everything

A candid review of our XVF3510 voice processor from Max Maxfield.

XMOS delivers voice interface dev kit

Electronics Weekly coverage of the new VocalFusion dev kit announcement at Voice Summit ’19.

XMOS unveils TV-optimised voice processor for Amazon AVS

New Electronics announces launch of the new XVF3510 voice processor at Voice Summit ’19.

XMOS announces its next-generation voice processor

XMOS drives the voice-enabled TV market forward with its next-generation voice processor and reveals a new dev-kit for the Amazon Alexa Voice Service.

All the performance at half the price – with intelligent algorithms and a $0.99* price tag, the new 2-mic voice processor from XMOS is for developers who want the best choice in voice.

VOICE summit, New Jersey, 23 July 2019 – At the world’s largest voice event, XMOS announces its VocalFusion dev kit for Amazon AVS*, a far-field 2-mic solution optimised for smart TVs and set-top boxes.

The new dev kit is based on the XMOS XVF3510 voice processor, which delivers big performance at a market-leading cost of $0.99*, enabling manufacturers to embed a voice interface into mass-market smart TVs and set-top boxes economically.

Developed in the UK and purpose-built for modern living spaces, XMOS’ next-generation acoustic algorithms produce a compelling experience in far-field voice control, opening the door to a more natural conversation with technology.

Working intelligently to analyse the acoustic environment, XMOS technology identifies and isolates a voice command from every other sound in the room (including any media streaming through the device itself), enabling far-field voice capture with close range precision.

XMOS algorithms:

  • Stereo Acoustic Echo Canceller: Suppresses unwanted speaker echo and enables barge-in.
  • Interference Canceller: Nulls point noise sources to cancel out unwanted background noise.
  • Adaptive Delay Estimator: Dynamically adjusts audio reference signal latency, ensuring the Acoustic Echo Cancellation algorithms deliver a smooth, real-time experience.
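To give a feel for what a delay estimator does, here is a minimal Python sketch. It is not the XMOS implementation; it simply assumes the captured microphone signal contains a delayed, attenuated copy of the reference audio and finds the lag with the strongest cross-correlation. The function names, signal lengths and noise level are all illustrative assumptions.

```python
import random


def estimate_delay(reference, captured, max_delay):
    """Return the lag (in samples) at which `captured` best matches `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        # Correlate reference[i] with captured[i + lag] over the overlapping region.
        n = min(len(reference), len(captured) - lag)
        score = sum(reference[i] * captured[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def make_test_signals(true_delay, length=2000, seed=0):
    """Build a synthetic reference and a delayed, quieter, slightly noisy capture."""
    rng = random.Random(seed)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(length)]
    captured = [0.0] * true_delay + [0.5 * x for x in reference]
    captured = [y + rng.gauss(0.0, 0.05) for y in captured]
    return reference, captured


# Hypothetical scenario: the echo reaches the microphones 37 samples late.
reference, captured = make_test_signals(true_delay=37)
estimated = estimate_delay(reference, captured, max_delay=100)
print("estimated delay in samples:", estimated)
```

A real-time system would track the delay continuously rather than search a whole buffer at once, but the idea is the same: once the reference is aligned with the echo, the echo canceller has far less work to do.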

“The recent uptake and demand for devices featuring voice assistants has continued to surge”, said Simon Bryant, Research Director at Futuresource Consulting. “The strong adoption in multimedia and entertainment is expected to continue and this new device from XMOS addresses what manufacturers who want to add far-field voice control to their product lines will be looking for.”

“Giving people the freedom to control the TV with their voice from anywhere in the room is a more natural experience – you simply tell your TV what you want to watch,” said Mark Lippett, CEO, XMOS. “Our new far-field voice processor gives developers an easy to implement solution, at a very compelling price.”

  • Watch a demo of the product here.
  • Find out more and buy the dev kit from today (it will be available in distribution end August 2019).
  • For XMOS news: “Alexa, ask XMOS what’s new”


Notes to editors

* $0.99 unit price on orders over one million units a year. For smaller orders, other prices apply. Raspberry Pi not included with VocalFusion dev kit for AVS. Amazon, Alexa and all related logos are trademarks of Amazon.com, Inc. or its affiliates.

The power of voice: A new era for TV enthusiasts everywhere

Today we launched the next generation of voice technology that’s designed to drive the voice-enabled TV market forward and transform how we search and discover content.

Futuresource Consulting forecasts that nearly 700 million new smart speaker, smart TV, set-top box and smart home devices will ship in 2019, and voice assistants will be built into an increasing number of them. Built-in voice interfaces have attracted a lot of attention from manufacturers, but cost and integration complexity concerns pushed the early voice implementations towards ‘Alexa compatible’ and ‘Push to Talk’ solutions. However, people are switching on to the power of voice and the trend is set to rise over the coming years – which is bringing manufacturers back to the ‘built-in’ solution, because it delivers a more relaxed, intuitive experience.

XMOS’ new technology presents a real opportunity for manufacturers to bring a compelling voice control experience to the masses, effectively and economically.

When the TV in your living room offers far-field voice capture that works with close range precision (i.e. you can just tell it what you’d like to watch from anywhere in the room), the remote control and push-to-talk voice experience starts to feel outdated. Voice-control unshackles us from hierarchical menus and never-ending pages of content, freeing us from push-button and touch-based interfaces. Whilst we may still need the remote control to switch between different hardware, the ability to just say: “Alexa, play Friends series two, episode three” or even “Alexa, find the pivot episode in Friends” provides a much richer search experience. And it’s easy to see how voice control is perfect for TVs and set-top boxes – and how far-field makes that a more natural, conversational interaction.

The XMOS solution

Our new XVF3510 two-mic voice processor is designed with our modern living spaces in mind. Our new VocalFusion algorithms work intelligently to analyse the acoustic environment, detecting and isolating a voice command from every other sound in the room (including any media streaming through the device itself or other devices nearby), making it ideal for smart home devices and integration into smart TVs and set-top boxes. And with a price tag of just $0.99, it hurdles the cost barrier nicely.

Simon Bryant, Research Director at Futuresource Consulting said: “The strong adoption in multimedia and entertainment is expected to continue and this new device from XMOS addresses what manufacturers who want to add far-field voice control to their product lines will be looking for.”

The XVF3510 offers a clear incentive for manufacturers to move beyond the world of push-to-talk, ‘Alexa compatible’ and touch-based control interfaces. With that, we’ll start to see the real power of voice emerge.

For further technical details, the full product brief can be found here.

The XVF3510 will enter general distribution in August 2019. Developers can request a dev kit from XMOS here.

Embedding voice – DSP chip or DSP algorithms on the Applications Processor?

Why smart developers choose a DSP chip rather than run DSP algorithms on the Applications Processor …

In our increasingly connected, intelligent world, voice-control opens the door for a more natural, engaging conversation with technology. Reliable, accurate voice capture relies on advanced digital signal processing (DSP) algorithms and good acoustic design to ‘hear’ the wake-word and pick up the voice command – even in a noisy environment. Some of the key algorithms include:

  • Acoustic-echo cancellation: When you give a voice command to a TV, the microphones will capture both your command and the audio track coming from the TV speakers. That captured audio track – the acoustic echo – needs to be cancelled from the captured signal so it ‘hears’ the wake-word first time, every time and captures a clean voice command to send to the speech recognition service (e.g. Alexa). Being able to interrupt the device while it is playing audio is known as ‘barge-in’.
  • Beamforming: This detects and tracks where the voice is coming from, so the command is captured accurately, even if you’re walking across the living room.
  • Interference Canceller: This ‘scans’ the soundscape of the room and ignores (cancels out) the point noise sources, i.e. anything that’s not the voice of interest, in the surrounding space. The improved voice signal can then be sent to the speech recognition service.
  • Noise suppression: Noise suppression algorithms target diffuse noise sources such as air conditioning and road noise. They remove the stationary and non-stationary background sounds to enable accurate, reliable voice detection.
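As a rough illustration of the acoustic-echo cancellation idea in the list above, the Python sketch below uses a normalised least-mean-squares (NLMS) adaptive filter, a common textbook technique. It is not a description of the XMOS algorithms. The echo path, filter length and step size are hypothetical values, and the example leaves out near-end speech and double-talk handling to keep it short.

```python
import random


def nlms_echo_cancel(reference, mic, taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `reference` from `mic`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    output = []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # what is left after removing the estimated echo
        power = sum(h * h for h in history) + eps
        # NLMS update: nudge the weights in proportion to the normalised error.
        weights = [w + step * error * h / power for w, h in zip(weights, history)]
        output.append(error)
    return output


def energy(samples):
    return sum(s * s for s in samples)


rng = random.Random(1)
# The reference is the audio the TV is playing (white noise stands in for it here).
reference = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.3, -0.2, 0.1]  # hypothetical speaker-to-microphone response
mic = [
    sum(echo_path[k] * reference[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(reference))
]

residual = nlms_echo_cancel(reference, mic)
before = energy(mic[-500:])
after = energy(residual[-500:])
print("echo energy before:", before, "after:", after)
```

Once the filter has adapted, the residual echo energy in this synthetic case is a tiny fraction of the original. In a real room the echo path is much longer and changes as people move around, which is part of why the processing load matters in the discussion that follows.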

As voice starts to move beyond smart-speakers and into the living room, developers are having to figure out how best to build a voice interface into a smart TV or set-top box. And one of the common questions we hear is whether to embed the DSP on a separate voice processor (chip) or run DSP algorithms on the Applications Processor.

Should you run DSP algorithms on the Applications Processor?

Most consumer electronics devices are built around an Applications Processor. Put simply, the more powerful the processor, the quicker your programmes, apps, games and features will appear. As a developer, you may choose to simply execute the DSP algorithms on the Applications Processor (host processor). At first glance, this seems cost effective and easy to integrate – primarily because there’s no additional chip to purchase and integrate. However, there are some significant downsides to this approach that developers need to consider.

  • Adverse impact on capacity: Because the host processor handles the core system processes, it’s one of the most expensive elements of the electrical design. The more powerful the host processor, the more tasks it can handle – but in turn, it’ll cost more, consume more power and require more space. As a developer, you’ll want the cheapest processor that’s capable of running all the core functions, with minimal power. Adding DSP algorithms to it therefore imposes additional processing that burdens the chip and takes up capacity that could otherwise be used for core functions.
  • Bill of Materials (BoM): This will be pushed up beyond original estimates as additional components will be required to support the integration (e.g. a microphone aggregator).
  • Performance risk: The DSP algorithms will be constrained by the capacity that’s available on the host processor and performance may be compromised.
  • Integration complexity: Adding algorithms onto the host processor puts all of the integration demands onto the software team and can rapidly increase the cost of development. It can also create challenges in delivering within the real-time constraints needed to produce a glitch-free audio stream, without increasing the latency of the system. Further challenges may arise in the future around in-field updates and whether there’s sufficient capacity to run the update on the host processor.

How does that compare with running DSP algorithms on a separate chip?

A standalone DSP chip solution offers some compelling advantages over licensing DSP algorithms and integrating them into the host processor.

  • Transfers work away from host processor: Running the DSP on a separate chip keeps the host processor free for core functions – and avoids impacting the software team.
  • Easy to integrate: A ringfenced solution needs to be planned into the electrical design, but using an external DSP allows you to use standard hardware interfaces (such as I2S or USB for connectivity), which simplifies the integration task significantly. A separate chip ensures there are no dependencies between the code on the DSP chip and that on the host processor; there’s simply an API to deliver processed voice samples in an uninterrupted stream.
  • Future-proof solution: You benefit from the latest developments in voice technology; plus, in-field software releases are delivered easily via firmware update.
  • Accelerated time to market: A DSP chip offers a plug and play solution which separates the voice-capture solution from the rest of the TV electronic design, enabling developers to deliver a built-in voice interface rapidly.

Choosing the right far-field voice interface for your TV or set-top box is a key decision for your company. A separate voice processor such as XMOS’ VocalFusion often provides a more flexible and cost-effective solution over the complete lifecycle of a TV or set-top box. It reduces project risk, minimises dependencies between software functions and avoids burdening the host processor.

XMOS solutions are cost-effective and offer the flexibility to remove additional costs from your system design. Find out more about our voice solutions here. Or get in touch with one of our sales team here.

We’re here to help you transform the way people find and enjoy content through your products.