Edge-computing voice recognition on DSP-capable RISC-V processors


Cyberon, an embedded speech solution provider, and Andes Technology, a supplier of 32/64-bit RISC-V processor cores, are collaborating on an edge-computing voice recognition solution, the Cyberon DSpotter.

The DSpotter uses Andes DSP-capable RISC-V CPU cores, such as the D25F, together with a comprehensive software development environment to provide a cost-effective, high-performance, and easy-to-deploy solution.

AI has been driving the voice recognition market and, in addition to voice assistant services based on cloud-computing architecture, there is growing demand for local voice recognition on edge-computing devices. Locally executed offline command recognition gives users a quick-response voice interface, protects personal privacy, and reduces development and maintenance costs for device manufacturers.

Cyberon's DSpotter has been developed for products where there is strong demand for voice control, such as wearable devices, home appliances, and IoT devices, and which need low computing resource requirements combined with high recognition performance.

The DSpotter has adopted a phoneme-based acoustic model to improve customers’ product development efficiency. Developers do not need to collect a large training corpus in advance; instead, they can create the required commands by simply entering text.
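To illustrate why a phoneme-based approach lets commands be defined from text alone, here is a minimal sketch. It assumes a toy pronunciation lexicon; the `LEXICON` table, the phoneme symbols, and the `command_to_phonemes` function are hypothetical and are not part of DSpotter's tools or API.

```python
# Hypothetical toy lexicon: word -> phoneme sequence (ARPAbet-style symbols).
# This is an illustration only, not DSpotter's data or API.
LEXICON = {
    "turn": ["T", "ER", "N"],
    "on": ["AA", "N"],
    "off": ["AO", "F"],
    "the": ["DH", "AH"],
    "light": ["L", "AY", "T"],
}


def command_to_phonemes(text):
    """Convert a typed command into a flat list of phoneme symbols."""
    phonemes = []
    for word in text.lower().split():
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(LEXICON[word])
    return phonemes


# New commands are added by typing text; no recordings of the
# command itself are needed, because the acoustic model scores
# phonemes rather than whole command phrases.
commands = {
    text: command_to_phonemes(text)
    for text in ["turn on the light", "turn off the light"]
}
```

In a sketch like this, adding a command is only a matter of extending the list of text strings, provided every word has an entry in the lexicon.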

Cyberon has developed DSpotter support for more than 40 languages worldwide. In terms of recognition performance, DSpotter offers high accuracy and high noise robustness thanks to an acoustic model based on the TDNN-F architecture. In addition, Cyberon has optimised the algorithm to fit general MCU platforms without a dedicated neural network processor, which means that manufacturers can provide products with voice interfaces on cost-effective hardware.

The performance of DSpotter is also increased significantly by leveraging RISC-V DSP/SIMD P-extension (RVP) instructions on the AndesCore D25F, a 32-bit RISC-V CPU core with a highly optimised 5-stage pipeline.

The RVP enables multiple data elements packed into integer registers to be processed in a single cycle, which efficiently speeds up computation for voice, audio, image and signal processing. It also greatly improves performance for edge AI involving these data types. The D25F is the first market-proven RISC-V RVP-capable processor, and has the most complete ecosystem of development tools, libraries for DSP and neural networks, and audio/voice codecs.
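The idea of processing several values held in one integer register can be sketched in software. The example below emulates a wrapping add on two 16-bit lanes packed into a 32-bit word; it is an illustration of the packed-SIMD concept only, not an actual RVP instruction or Andes intrinsic, and the helper names `pack16`, `unpack16` and `add16` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
HIGH_BITS = 0x80008000  # top bit of each 16-bit lane
LOW_BITS = 0x7FFF7FFF   # remaining 15 bits of each lane


def pack16(hi, lo):
    """Pack two 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack16(word):
    """Split a 32-bit word back into its (hi, lo) 16-bit lanes."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def add16(a, b):
    """Add both 16-bit lanes at once, wrapping each lane independently."""
    # Add the low 15 bits of every lane; a carry can reach at most
    # the lane's own top bit, so it never spills into the next lane.
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    # Fold in the top bits with XOR, which discards the carry out
    # of each lane.
    return (low_sum ^ ((a ^ b) & HIGH_BITS)) & MASK32


x = pack16(1, 0xFFFF)
y = pack16(2, 1)
print(unpack16(add16(x, y)))  # prints (3, 0): the low lane wraps, the high lane is unaffected
```

A hardware packed-SIMD instruction performs this kind of lane-wise operation directly, without the masking steps that the software emulation needs.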

“The AI technology of edge computing has gradually entered people’s lives,” said Alex Liou, VP of Cyberon Embedded solution BU. “Cyberon’s DSpotter algorithm helps developers to reduce development costs of voice recognition applications. We offer a convenient and easy-to-use tool to create customised commands of global languages.

“Developers can create various voice recognition applications efficiently to meet the strong and diverse demands of the market. The collaboration with Andes extends the application of DSpotter technology to RISC-V platforms and demonstrates excellent computing and recognition performances. It is hoped that it will bring more products with intelligent and convenient voice interface to people’s lives.”