The year 2024 has seen a torrent of announcements about new large language model (LLM)-based artificial intelligence (AI) products from companies such as OpenAI and Google. In consumer and computing endpoints such as smartphones and personal computers, AI has not just joined the mainstream: arguably it is the mainstream.
So, it is striking that embedded device manufacturers have not yet widely adopted AI or machine learning (ML) in endpoint devices. There are multiple reasons for this, not least the constraints on compute resource and power consumption in traditional microcontrollers, applications processors and systems-on-chip (SoCs).
Device designers’ hesitancy about embedding ML algorithms has also slowed the momentum behind ML in IoT and other types of devices that are normally based on an MCU. This hesitancy is understandable, but the strong appeal and novel capabilities of AI software in consumer devices suggest that quicker adoption of ML technology in the IoT could offer huge potential value.
So, what should MCU manufacturers be doing to help embedded device designers at OEMs overcome the technical and operational barriers to adopting ML software?
The uncertainty principle
If embedded device designers are hesitant about building ML algorithms into MCU-based devices, it is in part because their training and development methods are adapted for a completely different kind of software that is deterministic and programmatic.
A classic real-time control function receives an input such as a sensor’s temperature measurement and performs a specified action, such as turning off a device if the temperature exceeds a safety threshold. The MCU has evolved to become the pre-eminent hardware basis for this type of real-time and deterministic control function. An MCU based on a RISC CPU such as an Arm Cortex-M core provides the guaranteed latency and high-speed sequential execution of functions required in applications from motor control to sensor data processing to display control.
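As a minimal sketch of this kind of deterministic rule, the Python below models a single pass of a temperature cut-off. The threshold value and the function name are assumptions made for illustration, and production MCU firmware would normally be written in C rather than Python.

```python
SAFETY_THRESHOLD_C = 85.0  # hypothetical safety limit, chosen only for illustration


def control_step(temperature_c: float, device_on: bool) -> bool:
    """Return the new on/off state after one pass of the control loop."""
    if temperature_c > SAFETY_THRESHOLD_C:
        return False  # over the limit: always switch the device off
    return device_on  # otherwise leave the current state unchanged


# Feed a fixed sequence of readings through the rule.
readings = [70.2, 84.9, 85.1, 60.0]
state = True
history = []
for reading in readings:
    state = control_step(reading, state)
    history.append(state)
```

The same sequence of readings always yields the same sequence of states, which is the property that makes this style of code straightforward to test and debug.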
For the application developer, the writing of traditional programmatic ‘if/then’ software code has a logical thread, and its operation is bounded by knowable conditions that can be explicitly defined. Once debugged, the code has an entirely predictable and dependable output in a device such as a security camera or power tool – classic applications for MCU-based control. Today, such devices offer new scope to add value through the addition of ML functions to the existing control functions. And this ML software can run at the edge, directly in the MCU.
In the security camera, for instance, real-time monitoring for potential intruders can be automated: powerful cloud-based ML software can accurately detect and analyse the behaviour of people in the field of view. But the cost and power consumption of such a camera can be dramatically reduced if the video feed is pre-scanned by a local processor which can distinguish human shapes from other objects in the field of view. This means that the camera will only trigger the system to upload frames to the cloud when the video feed contains potentially relevant information, rather than continuously uploading the entire feed.
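The gating idea can be sketched as follows. This is an illustration under stated assumptions, not a real camera pipeline: the scoring function is a stand-in for an on-device neural network, the frames are plain dictionaries carrying a pre-baked score, and the 0.6 cut-off is hypothetical.

```python
from typing import Callable, List

PERSON_THRESHOLD = 0.6  # hypothetical confidence cut-off


def frames_to_upload(
    frames: List[dict],
    person_score: Callable[[dict], float],
    threshold: float = PERSON_THRESHOLD,
) -> List[dict]:
    """Keep only the frames that the local model scores as likely to contain a person."""
    return [frame for frame in frames if person_score(frame) >= threshold]


def fake_person_score(frame: dict) -> float:
    """Stand-in for a local neural network: reads a score stored in the frame."""
    return frame["score"]


feed = [
    {"id": 0, "score": 0.05},
    {"id": 1, "score": 0.10},
    {"id": 2, "score": 0.91},
    {"id": 3, "score": 0.72},
    {"id": 4, "score": 0.08},
]
selected = frames_to_upload(feed, fake_person_score)
uploaded_ids = [frame["id"] for frame in selected]
upload_fraction = len(selected) / len(feed)
```

In this made-up feed only two of the five frames would be sent to the cloud; the saving in a real deployment depends entirely on how often people actually appear in the field of view.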
ML offers similar transformative value in an electric drill. Equipped with an ultra-wideband (UWB) radio, the drill can receive RF signals which vary depending on the material that it is boring into. This enables the drill to verify that the correct type of bit is in use, and to detect hidden hazards such as a water pipe buried in a wall.
Writing traditional programmatic software to perform this type of function effectively might be impossible, and would at the very least require an unfeasible amount of development effort and time. By contrast, training an appropriate neural network to recognise, for instance, the pattern of reflections of a UWB transmission from a copper pipe embedded in concrete is a relatively straightforward task. Likewise, large open-source training datasets and neural networks are readily available for ML systems that detect the motion of people in a video stream.
But the entire approach to training such an ML algorithm is alien to a classically trained developer of programmatic software for MCUs or embedded processors. Instead of structuring, writing and debugging code, the embedded developer working on ML algorithms has to think about:
• How to assemble an adequate training dataset
• The evaluation and selection of a suitable neural network
• The configuration and operation of a deep learning framework, such as TensorFlow Lite, that is compatible with the target MCU or other hardware.
While programmatic software might be improved through processes such as debugging and detailed code analysis, optimising the operation of ML inferencing is an entirely different type of task: it involves the analysis of factors such as the potential to improve the accuracy and speed of inferencing by increasing the size of the training dataset, or by improving the quality of the training dataset’s samples.
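The shape of this workflow (assemble a dataset, hold some of it back, fit a model, measure accuracy) can be sketched in a few lines. Everything here is an assumption made for illustration: the data is synthetic, and a nearest-centroid classifier stands in for a neural network so that the example needs nothing beyond the Python standard library.

```python
import random


def make_dataset(n_per_class, seed=0):
    """Build a synthetic two-class dataset of 2-D points (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def fit_centroids(samples):
    """'Train' by computing the mean point of each class."""
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab]) for lab, (sx, sy) in sums.items()}


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    x, y = point
    return min(
        centroids,
        key=lambda lab: (x - centroids[lab][0]) ** 2 + (y - centroids[lab][1]) ** 2,
    )


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, point) == label for point, label in samples)
    return hits / len(samples)


dataset = make_dataset(n_per_class=100)
split = int(0.8 * len(dataset))           # 80% for training, 20% held out
train, test = dataset[:split], dataset[split:]
model = fit_centroids(train)
test_accuracy = accuracy(model, test)
```

The held-out accuracy is the figure a developer would watch while enlarging or cleaning the training dataset, in place of the breakpoints and code reviews used to improve programmatic software.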
And the output from an ML algorithm is also qualitatively different from that of classic software: it is probabilistic, providing an answer to a question (‘Is a copper pipe buried in the wall?’) in the form of an inference drawn with a certain degree of confidence. It is not deterministic – the answer is probably right but could be wrong.
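A short sketch makes the difference concrete. A classifier typically produces raw scores that are converted to probabilities with a softmax function, and the application then decides how much confidence it needs before acting. The labels, the example scores and the 0.8 confidence floor below are assumptions chosen for illustration.

```python
import math

LABELS = ["no pipe", "copper pipe"]  # hypothetical classes for the drill example


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels, min_confidence=0.8):
    """Return (label, confidence), or ('uncertain', confidence) below the floor."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = labels[best] if probs[best] >= min_confidence else "uncertain"
    return label, probs[best]


clear_case = classify([0.2, 3.1], LABELS)       # scores far apart
borderline_case = classify([1.0, 1.3], LABELS)  # scores close together
```

When the two scores are far apart the function reports a copper pipe with high confidence; when they are close together it declines to commit. Choosing that floor is an application-level design decision that has no equivalent in classic if/then code.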
So, to gain the value of ML functionality, designers of MCU-based devices have to adopt a new development method and accept a new type of output that is probabilistic rather than deterministic. This is unfamiliar territory: it is unsurprising that the embedded world has been somewhat hesitant about implementing ML.
So, what can MCU manufacturers do to ease the transition to an AI-centric embedded world, and to make the operating environment friendly to ML software?
Three features of an ML-friendly MCU
For Alif Semiconductor, this is an existential question: since its founding in 2019, the company’s mission has been to provide manufacturers of embedded and IoT devices with a new range of MCUs and fusion processors which provide the best AI/ML at the lowest power. Its business model depends on the widespread adoption of ML in the embedded world.
According to Alif’s analysis, three key features of an MCU give manufacturers the best chance of succeeding with new ML-based products.
1. Provide a hardware environment that helps rather than hinders the operation of neural networking algorithms. A RISC CPU is at the heart of the control functions of an MCU, but its sequential mode of operation is poorly suited to executing the multiply-accumulate (MAC) operations that dominate a neural network’s workload. A neural processing unit (NPU), on the other hand, is optimised for MAC execution and other neural networking operations. An MCU architecture which has one or more NPUs operating alongside one or more CPUs provides the best basis for the fast and low-power inferencing required for edge AI applications.
Testing against standard industry benchmarks for ML functions such as voice activity detection, noise cancellation, echo cancellation, and automatic speech recognition shows that a combination of an Arm Cortex-M55 CPU and Ethos-U55 NPU provides a 50x improvement in inference speed compared to a high-end Cortex-M7 CPU alone, and a 25x reduction in power consumption.
2. Allow the device design team to work on ML applications in a familiar development environment. For the control functions performed by a CPU, the MCU market has succeeded in consolidating the choices down to one architecture: Arm’s Cortex-M. But some MCU manufacturers complement the CPU with a proprietary NPU, forcing users to leave the familiar Arm environment for the ML portion of their designs.
Eventually, it is highly likely that the MCU market will converge on Arm for the NPU as well as the CPU. An MCU with an Arm Ethos NPU alongside a Cortex-M CPU enables developers to share the same Arm tools and software ecosystem across both the control and ML portions of the application.
3. Enable early-stage experimentation with popular ML application types. The probabilistic nature of ML inferencing lends itself to a trial-and-error approach to proof-of-concept development, based on the use and refinement of open-source neural networking models and training datasets.
Consequently, Alif Semiconductor provides its AI/ML AppKit, development hardware which is pre-configured for the collection of vibration, voice and vision data, and is supplied with a broad set of demonstration systems for various AI use cases.
The kit features a 4” colour LCD, an integrated camera module, PDM and I2S microphones, and inertial sensors. Device driver and operating system packs, as well as demonstration applications and examples, are published on the GitHub platform.
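Returning to the first of these three features, a rough sketch shows why multiply-accumulate (MAC) throughput matters so much. A fully connected neural network layer is essentially a nest of MAC operations, and the count grows with the product of the input and output sizes. The layer sizes and weights below are made up for illustration, and the pure-Python loop mimics the one-at-a-time execution of a sequential CPU; it does not model any particular NPU.

```python
def dense_layer(inputs, weights, biases):
    """Compute a fully connected layer and count the MAC operations it needs."""
    outputs = []
    macs = 0
    for row, bias in zip(weights, biases):   # one row of weights per output
        acc = bias
        for w, x in zip(row, inputs):        # one MAC per weight
            acc += w * x
            macs += 1
        outputs.append(acc)
    return outputs, macs


def macs_for_layer(n_inputs, n_outputs):
    """MAC count for a dense layer: one per (input, output) pair."""
    return n_inputs * n_outputs


# A tiny layer: 3 inputs, 2 outputs.
outputs, macs = dense_layer(
    inputs=[1.0, 2.0, 3.0],
    weights=[[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]],
    biases=[0.0, 1.0],
)

# A still modest layer of 128 inputs and 64 outputs (hypothetical sizes).
modest_layer_macs = macs_for_layer(128, 64)
```

Even the modest hypothetical layer needs 8,192 MACs for a single inference, and practical networks stack many such layers. Hardware that performs many MACs in parallel therefore has a structural advantage over a core that executes them one after another.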
Making embedded devices more valuable
The opportunity to bring the transformative capability of ML to embedded devices is available now: the technology is real and ready for mainstream deployment. While adoption might previously have been slowed by the lack of ML-native MCUs operating in a familiar Arm development environment, this reason for hesitating to implement ML no longer applies.
The introduction of products such as the Ensemble family of MCUs and fusion processors from Alif Semiconductor, which is available in single- and multi-core options based on the Arm Cortex-M55 CPU and Ethos-U55 NPU, has given embedded developers a new, ML-friendly hardware and development platform.
With a development tool such as the AI/ML AppKit in hand, it is time to take the plunge into the world of machine learning at the edge!
Author details: Henrik Flodell, Senior Marketing Director, Alif Semiconductor