09 August 2011
Real-time operating system architectures are important for medical designers
Medical designers note the importance of choosing the right operating system
Medical device manufacturers understand the importance of the operating system (OS) and, contrary to common practice for embedded systems design, often select the OS before they choose the board.
The business needs that drive OS selection for medical devices are much like those for most other devices and require little elaboration: cost, quality, time to market, portability, support, ecosystem, and vendor track record.
However, before a medical device can go to market, it must comply with legislation in the jurisdictions where it will be sold: for example, in the US, the Food and Drug Administration (FDA) 510(k) premarket notification; in Europe, the Medical Devices Directive (MDD); and myriad national standards.
Though agencies such as the FDA evaluate devices as a whole, compliance can also be affected by how the OS and other device components are developed and by how their functional safety claims are validated.
Things to look for from an OS vendor include:
• development in a good manufacturing process/quality management environment (for example, ISO 9001)
• validation of functional safety claims, including testing data, proven-in-use data and design verification, with certification to the appropriate standards (such as IEC 61508 and IEC 62304)
• a tool set that can provide concrete evidence of functionality and behaviours in a given system: code coverage, system profiling and memory analysis artefacts.
For the purposes of this discussion, consumer-grade medical devices, whose failure implies nothing more than an inconvenience, are excluded. For devices whose failure carries serious consequences, we can group key OS characteristics as follows:
• dependability: a correct and timely response to events, for as long as required
• connectivity: communication with diverse equipment and systems, either directly or through networks
• data integrity and security: safe storage of data and protection from unauthorised scrutiny
To these we can add:
• power management: important for any device running on battery power, even temporarily
• graphics capabilities: support for the user interface design that the device requires, including the concurrent use of multiple technologies such as OpenGL ES, Adobe Flash and Qt
• platform independence: an OS that can run on different hardware architectures allows development of modular systems that can be reused for different products
• multicore support: future projects will almost certainly require multicore processing
While each of these characteristics merits in-depth discussion, we will focus on the one that is, arguably, most important: dependability.
GPOS or RTOS?
Dependability is a combination of availability (how often the system responds to requests in a timely manner) and reliability (how often these responses are correct). A real-time OS (RTOS) is engineered explicitly to guarantee availability and reliability and is, therefore, a better candidate than a general-purpose OS (GPOS), which can only offer best-effort performance.
Since an OS's architecture has a profound effect on a system's dependability, it should be the first item under scrutiny. The three most common RTOS architectures are real-time executive, monolithic, and microkernel.
With the real-time executive model, all software components (kernel, networking stacks, file systems, drivers and applications) run together in one memory address space. Though efficient, this architecture has two immediate drawbacks: a pointer error in any module can corrupt memory used by the kernel or another module and cause a system failure; and the system can crash without leaving diagnostic information.
Some RTOSs use a monolithic architecture, where user applications run as memory-protected processes. While this architecture protects the kernel from errant user code, the kernel still shares one address space with file systems, protocol stacks, drivers and other system services. Hence, a programming error in any of these services can cause the entire system to fail.
In a microkernel RTOS (see fig 2), device drivers, file systems, networking stacks and applications reside outside the kernel in separate address spaces, which means they are isolated from the kernel and from each other. This has three consequences: a fault in one component will not bring down the entire system; memory faults in a component cannot corrupt other processes or the kernel; and the OS can restart any failed component without a system reboot.
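The isolation and restart behaviour described above can be loosely illustrated on a desktop OS. The sketch below is an analogy only, not RTOS code: it assumes that an ordinary operating system process stands in for a component with its own address space, and the names `FAULTY_DRIVER`, `HEALTHY_DRIVER`, `run_component` and `demo` are hypothetical.

```python
import subprocess
import sys

# Hypothetical component sources; a real driver would be a separate executable.
FAULTY_DRIVER = "raise RuntimeError('simulated driver fault')"
HEALTHY_DRIVER = "print('driver ready')"


def run_component(source):
    """Run one component in its own OS process, and so its own address space."""
    return subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )


def demo():
    # State owned by the supervising process; the child process cannot touch it.
    supervisor_state = {"readings": [72, 75, 71]}
    before = dict(supervisor_state)

    crashed = run_component(FAULTY_DRIVER)     # the component fails...
    restarted = run_component(HEALTHY_DRIVER)  # ...and is simply started again

    return {
        "first_exit": crashed.returncode,
        "state_intact": supervisor_state == before,
        "restart_output": restarted.stdout.strip(),
    }


if __name__ == "__main__":
    print(demo())
```

The faulty component exits with a non-zero status, the supervisor's data is unchanged, and the replacement component starts without the supervisor itself restarting.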
• Real-time commitments
To ensure that high-priority processes always get the CPU cycles they need, the RTOS must allow kernel operations to be preempted. However, the time windows during which preemption may not occur should be extremely brief, and there should be an upper limit on how long preemption is held off and interrupts are disabled. Further, the RTOS kernel must be simple, so that the longest non-preemptible code path through the kernel has a known upper bound.
• Protect against priority inversions
Priority inversion infamously plagued the Mars Pathfinder project in July 1997. It is a condition where a low-priority task prevents a higher-priority task from completing its work, typically because the low-priority task holds a resource that the higher-priority task needs. Priority inheritance is a technique for preventing priority inversions: the lower-priority thread doing the blocking is assigned the priority of the blocked higher-priority task until it releases the resource (see fig 3).
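The effect of priority inheritance can be seen in a small simulation. This is a minimal sketch under stated assumptions, not a model of any particular RTOS scheduler: one CPU, time measured in whole ticks, a single shared lock, and lock users that hold the lock for all of their work. The task names, priorities and tick counts are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int     # larger number means higher priority
    release: int      # tick at which the task becomes ready
    work: int         # ticks of CPU time still needed
    needs_lock: bool  # assumption: the task holds the shared lock for all its work


def simulate(inherit):
    """Return the tick at which each task finishes, with or without inheritance."""
    tasks = [
        Task("low", 1, release=0, work=3, needs_lock=True),
        Task("high", 3, release=1, work=2, needs_lock=True),
        Task("medium", 2, release=2, work=4, needs_lock=False),
    ]
    owner = None  # task currently holding the lock
    finish = {}
    tick = 0
    while len(finish) < len(tasks):
        ready = [t for t in tasks if t.release <= tick and t.work > 0]

        def effective(t):
            # With inheritance, the lock owner runs at the priority of its
            # highest-priority waiter.
            if inherit and t is owner:
                waiting = [w.priority for w in ready if w.needs_lock and w is not owner]
                return max([t.priority] + waiting)
            return t.priority

        # A task that needs the lock cannot run while another task owns it.
        runnable = [
            t for t in ready
            if not (t.needs_lock and owner is not None and owner is not t)
        ]
        if not runnable:
            tick += 1  # idle tick
            continue
        current = max(runnable, key=effective)
        if current.needs_lock:
            owner = current
        current.work -= 1
        tick += 1
        if current.work == 0:
            finish[current.name] = tick
            if owner is current:
                owner = None  # lock released on completion
    return finish


if __name__ == "__main__":
    print("without inheritance:", simulate(inherit=False))
    print("with inheritance:   ", simulate(inherit=True))
```

With these invented numbers, the high-priority task finishes at tick 9 without inheritance, because the medium-priority task runs while the low-priority task still holds the lock. With inheritance it finishes at tick 5.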
• Guaranteed availability
For many systems, guaranteeing resource availability is critical. For example, a heart monitor that loses connectivity may fail to trigger an alarm, with dire consequences for the patient. Time partitioning addresses resource starvation by enforcing CPU budgets and preventing processes or threads from monopolising CPU cycles.
Two time partitioning approaches are possible: fixed and adaptive. With fixed partitioning, the system designer divides tasks into partitions, allocating a portion of CPU time to each. The tasks in a partition may not, together, consume more than that partition's percentage of CPU time, even when other partitions are idle.
Adaptive partitioning also enforces resource budgets, but when CPU cycles are available, it uses a dynamic scheduling algorithm to reassign them from partitions that are not using them to those which can benefit from extra processing time.
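The difference between the two approaches can be shown with a little arithmetic. The sketch below is illustrative only: the partition names, budgets and demands are invented, and sharing spare time in proportion to each partition's budget is an assumption made for this example, not a description of any vendor's scheduling algorithm.

```python
def fixed_partitioning(budgets, demands):
    """Each partition gets at most its budget; unused time is left idle."""
    return {name: min(demands[name], budgets[name]) for name in budgets}


def adaptive_partitioning(budgets, demands):
    """Budgets are still guaranteed, but unused time is lent to busy partitions.

    Assumption for this sketch: spare time is shared among partitions that
    want more, in proportion to their budgets.
    """
    alloc = fixed_partitioning(budgets, demands)
    spare = 100.0 - sum(alloc.values())
    while spare > 1e-9:
        hungry = [n for n in budgets if demands[n] - alloc[n] > 1e-9]
        total_budget = sum(budgets[n] for n in hungry)
        if not hungry or total_budget == 0:
            break
        given = 0.0
        for n in hungry:
            share = spare * budgets[n] / total_budget
            extra = min(share, demands[n] - alloc[n])
            alloc[n] += extra
            given += extra
        spare -= given
    return alloc


if __name__ == "__main__":
    # Percentages of CPU time over one scheduling window (hypothetical values).
    budgets = {"alarms": 50, "ui": 30, "logging": 20}
    demands = {"alarms": 20, "ui": 60, "logging": 10}
    print("fixed:   ", fixed_partitioning(budgets, demands))
    print("adaptive:", adaptive_partitioning(budgets, demands))
```

In this example the user interface partition wants 60% of the CPU. Fixed partitioning caps it at its 30% budget and leaves 40% of the CPU idle, while the adaptive version lends it the time the other partitions are not using. When every partition demands its full share, both functions return the budgets unchanged.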
• Monitor, stop, and restart processes
Self-healing capabilities and safeguards against process failures cascading through the system are crucial to a highly dependable OS. Devices that require availability or safety guarantees may implement hardware-based high availability solutions, as well as a software watchdog.
A software watchdog monitors the system and performs multistage recoveries or clean shutdowns as required. This process must be self-monitoring and resilient to internal failures; if it is stopped abnormally, it must reconstruct its own state immediately and completely by handing over to a mirror process.
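A multistage recovery policy of this kind might be sketched as follows. This is a toy model under stated assumptions: the `Component` and `Watchdog` classes, the heartbeat check and the limit of three restarts are all hypothetical, and the hand-over to a mirror process is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    restarts_needed: int  # hypothetical: restarts required before it stays healthy
    restarts: int = 0

    def heartbeat(self):
        """Return True if the component is responding."""
        return self.restarts >= self.restarts_needed

    def restart(self):
        self.restarts += 1


@dataclass
class Watchdog:
    max_restarts: int = 3
    log: list = field(default_factory=list)

    def check(self, component):
        """Stage 1: restart the component. Stage 2: give up and shut down cleanly."""
        while not component.heartbeat():
            if component.restarts >= self.max_restarts:
                self.log.append(f"{component.name}: clean shutdown")
                return "shutdown"
            component.restart()
            self.log.append(f"{component.name}: restart {component.restarts}")
        return "healthy"


if __name__ == "__main__":
    watchdog = Watchdog()
    print(watchdog.check(Component("ecg_driver", restarts_needed=2)))
    print(watchdog.check(Component("pump_controller", restarts_needed=10)))
    print(watchdog.log)
```

The first component recovers after two restarts. The second never recovers, so the watchdog stops retrying after three attempts and escalates to a clean shutdown.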
It's the OS, it's the vendor
Device manufacturers can improve their products' chances of success by paying careful attention to the OS. Devices that cannot be allowed to fail and reboot are best served by a microkernel RTOS, as this architecture is well suited to ensuring system dependability and can support a full range of features and capabilities. An RTOS from a supplier with a track record of successful safety and security certifications can help reduce the costs associated with obtaining FDA, MDD and other certifications.
Justin Moon is product manager, medical, for QNX Software Systems.