Could employing an RTOS help deliver more reliable systems?

Delivering ultra-low power operation in IoT nodes means they will spend most of their time in a dormant mode, waiting to exchange data with a gateway. Could employing an RTOS help deliver more reliable systems?

Connectivity is ubiquitous and the ability for electronic devices to exchange data using various methods and protocols existed long before the Internet. But connectivity has become more relevant in recent years because it underpins the IoT and this has led to significant growth in technologies that provide or support connectivity.

While adding basic connectivity between two devices isn’t difficult, extending that to an infrastructure like the Internet takes some careful design. In this context, ‘careful’ is synonymous with ‘complex’, which could make building out the IoT a challenge that just keeps getting bigger. However, while ‘careful’ may be an intangible quantity, complexity is something the electronics industry manages very well.

If viewed as a system, the IoT requires a systemic approach to design – technologies that scale, interoperate and adapt to new scenarios and challenges, all while meeting constraints. Embedded systems are often described as constrained devices, which makes embedded design a major driver in the IoT.

Not just for PCs
Outside of the embedded community, operating systems (OS) are probably seen only as the software that runs our smartphones, tablet computers and, perhaps, servers. The devices we see as embedded – typically single purpose, closed ‘black boxes’ – may not be seen as even needing such an operating system. However, the IoT is likely to see billions of new devices interconnected over the next several decades, and this will require an operating system that can offer real-time functionality.

Most OSs offer a platform to build complex systems, providing key functions that typically include connectivity. An RTOS also offers these ancillary functions, but differs from other OSs in one crucial area: the ability to offer deterministic execution. But why is this important in the IoT?

The IoT will comprise innumerable distributed networks, which will be managed by the infrastructure that is the Internet. Each network will likely be managed by a local gateway, preserving the inherent hierarchy of communication networks. In turn, those gateways will be responsible for managing communications between various nodes, each likely to be an embedded (and therefore constrained) device.

It is also likely that many of these devices will be sensor-based and powered by either harvested energy or simple batteries. That makes them slaves to ultra-low power operation. Couple that constraint with the fact that they are likely to be connected using some form of wireless interface and it becomes clear that power conservation will be vital.

Using an RTOS to manage these nodes makes sense, particularly if they need to communicate with a gateway wirelessly. The middleware needed to enable communications will be available in an RTOS that has been optimised for constrained devices. Furthermore, using an RTOS to manage those communications could prove to be the most effective way to deliver power-efficient operation.
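
To put the power constraint into numbers, the short Python sketch below estimates the average current of a node that sleeps for most of each reporting period and wakes briefly to transmit. Every figure in it (sleep current, transmit current, wake time, reporting period and battery capacity) is an illustrative assumption rather than a measurement of any particular device, and the function names are hypothetical.

```python
def average_current_ma(sleep_ma, active_ma, active_s, period_s):
    """Time-weighted average current for a node that wakes once per period."""
    if not 0 <= active_s <= period_s:
        raise ValueError("active time must fit within the period")
    duty = active_s / period_s  # fraction of each period spent awake
    return active_ma * duty + sleep_ma * (1.0 - duty)


def battery_life_years(capacity_mah, avg_ma):
    """Idealised battery life; ignores self-discharge and ageing."""
    hours = capacity_mah / avg_ma
    return hours / (24 * 365)


# Illustrative assumptions only: 2 uA asleep, 20 mA while transmitting,
# awake for 50 ms in every 60 s, powered by a 220 mAh cell.
avg = average_current_ma(sleep_ma=0.002, active_ma=20.0,
                         active_s=0.05, period_s=60.0)
years = battery_life_years(capacity_mah=220.0, avg_ma=avg)
print(f"average current: {avg * 1000:.1f} uA, ideal life: {years:.2f} years")
```

Even with a duty cycle well below 1%, the short transmit burst accounts for most of the average current in this example, which is why keeping the radio on for no longer than necessary matters so much.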

It’s all in the timing

Many IoT nodes, particularly those that need to work for multiple years from a single battery, will spend much of their time in sleep mode to conserve energy, waking periodically to transmit data. If the gateway were only serving one node, this would be simple, but if it is managing hundreds of nodes, ensuring data is exchanged within specific time slots will require deterministic execution – and each node may believe it is the only node on the network (see fig 1). The protocol used may not offer the ability for long exchanges that would consume valuable energy. Instead, it might limit operation to a burst from the node which must be received by the gateway at a specific time (see fig 2).

Fig 1: Nodes on the IoT may not be aware of other nodes on the same network, making their timing more critical.

If the node and/or the gateway isn’t prepared for the exchange, packets could be lost. That may not be serious in one or two instances, but if it happens repeatedly, network integrity could be compromised.
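
The timing problem can be sketched in a few lines of Python. The example assumes a simple scheme in which each node owns one fixed-length slot per frame and aims to centre its burst in that slot; the slot length, burst length and clock drift are hypothetical values chosen for illustration, not parameters of any particular protocol.

```python
def slot_start_ms(node_id, slot_ms):
    """Offset of a node's slot from the start of the frame."""
    return node_id * slot_ms


def wake_error_ms(drift_ppm, sleep_s):
    """Worst-case timing error accumulated by the node's clock while asleep."""
    return drift_ppm * 1e-6 * sleep_s * 1000.0


def burst_fits_slot(slot_ms, burst_ms, error_ms):
    """The node centres its burst in the slot, so the burst still fits
    if the timing error is no larger than the slack on either side."""
    slack_ms = (slot_ms - burst_ms) / 2.0
    return abs(error_ms) <= slack_ms


# Hypothetical network: 10 ms slots, 4 ms bursts, 20 ppm clock drift.
for sleep_s in (60, 300):
    err = wake_error_ms(drift_ppm=20, sleep_s=sleep_s)
    ok = burst_fits_slot(slot_ms=10, burst_ms=4, error_ms=err)
    print(f"sleep {sleep_s} s: error {err:.1f} ms, fits slot: {ok}")
```

Under these assumptions, a node that sleeps for a minute wakes comfortably inside its slot, while one that sleeps for five minutes without resynchronising drifts far enough to miss it. That is the kind of repeated loss the paragraph above warns about.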

Using an RTOS could guarantee synchronisation between nodes and gateways – and many RTOSs are available that address this use case, such as µC/OS, ThreadX and FreeRTOS, all of which have been ported to a wide range of 16- and 32-bit MCUs.

All of these RTOSs are robust and capable of delivering deterministic operation in embedded devices when the application code is written to conform with RTOS behaviour (see below). However, all RTOSs are subject to real world anomalies that can impact performance. As these anomalies could have unforeseen consequences, it is important to understand their causes and the possible remedies.


Fig 2: Gateways may need to service hundreds of nodes in a deterministic way to preserve the integrity of the network's operation.


Finding fault
While desktop OSs are known for their ability to handle multiple tasks at once, each processor core will only be executing one thread at any time. Even multicore processors are limited in this respect, and it is the OS’s job to minimise the time it takes to switch between contexts in order to deliver multitasking.

In an RTOS, deterministic operation means task switches must be prioritised, to ensure the task with the highest priority is always the one being executed. As one task terminates, the ready task with the next highest priority will run until it completes or is itself interrupted by a higher priority task. This prioritisation is fundamental to the operation of an RTOS, but it can result in tasks being stopped or interrupted, which means achieving real-time execution across all task priorities could be difficult.
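
A small simulation makes this behaviour easier to see. The Python sketch below models fixed-priority pre-emptive scheduling one tick at a time; it is a simplified illustration, not the scheduler of any particular RTOS. The task names are hypothetical, and the convention that a larger number means a higher priority is an assumption (RTOSs differ on this).

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int  # larger number = more urgent (assumed convention)
    release: int   # tick at which the task becomes ready
    work: int      # ticks of CPU time the task needs


def simulate(tasks, ticks):
    """Run a fixed-priority pre-emptive schedule one tick at a time.

    Returns the name of the task that ran in each tick (None when idle).
    """
    remaining = {t.name: t.work for t in tasks}
    timeline = []
    for now in range(ticks):
        ready = [t for t in tasks
                 if t.release <= now and remaining[t.name] > 0]
        if not ready:
            timeline.append(None)
            continue
        # The highest-priority ready task always gets the CPU.
        running = max(ready, key=lambda t: t.priority)
        remaining[running.name] -= 1
        timeline.append(running.name)
    return timeline


tasks = [
    Task("logger", priority=1, release=0, work=4),
    Task("radio", priority=3, release=2, work=2),
]
timeline = simulate(tasks, ticks=8)
print(timeline)
```

In this run the low-priority logger task needs four ticks of CPU time but does not finish until the end of the sixth tick, because the radio task pre-empts it for two ticks. The radio task meets its timing; the logger pays for it.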

While this is part of designing with an RTOS, there is also the presence of jitter – the small amount of variability in how long it takes to execute tasks. Most of the time, jitter can be compensated for in the design through prioritisation. However, each design is different and runtime events may conspire to render even the most diligently designed code prone to anomalies such as jitter.
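
Jitter of this kind can be quantified from a list of task activation times. The Python sketch below assumes the timestamps have already been captured in microseconds, for example from a trace recording; the function name and the sample values are hypothetical.

```python
def period_jitter(timestamps_us, nominal_period_us):
    """Summarise how far observed periods stray from the nominal period."""
    if len(timestamps_us) < 2:
        raise ValueError("need at least two timestamps")
    # Observed period = gap between consecutive activations.
    periods = [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]
    deviations = [p - nominal_period_us for p in periods]
    return {
        "min_us": min(deviations),  # most negative = earliest activation
        "max_us": max(deviations),  # most positive = latest activation
        "peak_to_peak_us": max(periods) - min(periods),
    }


# Hypothetical activation times for a task meant to run every 1000 us.
stamps = [0, 1000, 2003, 2998, 4000]
stats = period_jitter(stamps, nominal_period_us=1000)
print(stats)
```

The peak-to-peak figure is a useful single number to compare against the slack available in the design: if it approaches the margin a task has before its deadline, the prioritisation needs another look.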

Tracking down this kind of activity in embedded code is difficult and requires close inspection of the code’s execution under real-world conditions. Debugging technologies can provide insights into code execution; probes like Segger’s J-Link and J-Trace support this type of debugging through breakpoints inserted into the code and the ability to capture core activity through debug ports. Similarly, Micrium’s µC/Probe is a Windows application that allows memory locations in the target MCU to be read from or written to during runtime. Both these technologies can be complemented by Percepio’s Tracealyzer technology, which displays CPU activity on a per-task or per-interrupt basis, as well as mapping that activity over time. This provides an effective way of identifying timing variations and finding resource conflicts during runtime.

Building out the IoT will require a range of technologies, some of which are yet to emerge. Employing an RTOS in IoT nodes could deliver major benefits by ensuring the reliable behaviour of network nodes, however large or small.

Author profile:
Dr Johan Kraft is CEO and founder of Percepio.