The hypervisor advantage


Operating systems (OSs) such as Linux and Android are widely used in embedded systems. However, they are large and complex and inevitably contain numerous flaws. Once such an OS is compromised, an attacker can violate its security and take control of the whole system.

One method for improving the security of these systems is to use a hypervisor based on a secure microkernel that guarantees separation between the system software components. The microkernel is a secure layer of software below the OS that runs at a higher privilege level than the OS and virtualizes the hardware resources. The hypervisor allows the guest OS to run as it would directly on hardware. Because of the higher privilege level, the integrity of the system remains intact even if the guest OS is compromised. A hypervisor that is designed to be secure and reliable from the ground up offers significant advantages over hardware for implementing low-level security. It can also provide multiple levels of privilege, so that a service with sensitive data can run in an isolated “compartment”, or partition, alongside a service with less sensitive information.

Since it is virtually impossible to test millions of lines of code exhaustively, it is inevitable that Linux and Android will continue to contain security vulnerabilities and software bugs. The increasingly interconnected nature of embedded systems also makes it easier for hackers to exploit those vulnerabilities. To make such systems more secure, the first requirement is to lock down and control the capabilities of the whole device. Using a multilevel protection approach such as Multiple Independent Levels of Security (MILS) is one way to ensure that systems running vulnerable operating systems remain secure.

This is where virtualization comes into play: it provides the ability to run an operating system, named Guest, on top of another operating system, named Host. The hypervisor provides the virtualization environment to the Guest. As a hypervisor proceeds with the execution of the various Guest OSes on the system, it keeps the isolation in place, providing security comparable to physical separation.

The Separation Kernel

In parallel, the Separation Kernel is widely accepted as the best-in-class OS design for both safety and security: a system in which only the minimum amount of code runs at the highest privilege level. The Separation Kernel divides the system into "compartments" called partitions, and a software process runs in each of them. Separation is not guaranteed inside a partition, but if the software in a partition has flaws or becomes compromised, the damage cannot spread outside that partition to change the operation or behaviour of other partitions or of the Separation Kernel itself.

Virtualization combined with a Separation Kernel offers embedded software developers advanced features: it ensures that heterogeneous software components are free from interference, and it can reinforce the security and safety of the communication system by adding intermediate components that filter or encrypt traffic.

Here the Separation Kernel delivers all the security functions that the guest cannot guarantee. The part of the hypervisor that enables the guest OS runs in the same application space as the guest. It does not even need to reside in privileged mode, and it therefore runs as an application on top of the Separation Kernel.

In this scenario an untrusted Guest operating system runs in a partition. The Separation Kernel ensures memory protection and secure access control for I/O and other system objects, and it schedules workloads securely and efficiently across cores. Because system components are separated into different partitions, an attacker would have to compromise multiple partitions to modify or access sensitive data.

Protecting sensitive data

All sensitive data should be isolated in separate partitions, while other applications can remain on the Guest, e.g. Linux or Android. In this way, the security-critical applications are isolated from the less secure, connected applications, while the flexible frameworks available on Linux remain usable.

However, a malicious application running on the Guest OS could exploit virtualization software bugs in some types of virtualization systems to obstruct or access other Guests or partitions and therefore breach confidentiality, integrity, and availability of their code or data. For this reason, the virtualization system must be able to guarantee the separation between Guests and partitions.

The microkernel architecture of some Separation Kernels, like the INTEGRITY RTOS from Green Hills Software, ensures that the kernel remains small and is therefore easier to test and to verify as free of bugs and security holes. In microkernel architectures, only basic services are part of the kernel, i.e. communication between partitions (IPC), virtual memory management and scheduling.

Figure 1: Security-critical application running on Embedded Linux OS

There are other approaches to supporting multiple OS contexts besides a Separation Kernel or a classic Type-1 hypervisor. Linux Containers (LXC) is a method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel. However, containers do not address kernel-level attacks, in particular those against device drivers, which run with kernel privileges on Linux/BSD. Containers are less secure than Separation Kernel hypervisors because the kernel that hosts them has a much larger attack surface than a Separation Kernel. The smaller attack surface of the latter decreases the probability that a privilege escalation attack will allow an attacker to compromise the security of a virtual machine and affect other components on the system.

Consider the scenario shown in Figure 1, where a generic security-critical application needs to exchange confidential messages over the internet, store confidential data and show an HMI to the end user. By exploiting Linux vulnerabilities or installing malware, an attacker can potentially take control of the whole system.

In contrast, by using a secure Separation Kernel that also provides virtualization, it is possible to improve the security of the system. Figure 2 shows a system where the security-critical components of the Linux application have been ported as Separation Kernel native applications and isolated in different partitions. The software components can interact using secure Inter Partition Communication (IPC) systems, so that secure requests can be performed by the Linux application interacting with the Secure Services partition. In this way the Secure Services partition can be responsible for the security-critical functionality, whereas the Network stack/Gateway partition can handle and filter the requests coming from the network.
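The request path described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the API of INTEGRITY or any other kernel: the partition names, the `ALLOWED_CHANNELS` and `ALLOWED_OPERATIONS` tables, and the `send` and `secure_services` functions are all hypothetical. It only illustrates two ideas: the kernel delivers messages solely over preconfigured channels, and the Secure Services partition exposes a narrow set of operations.

```python
# Toy model of inter-partition communication (IPC) policy.
# All names here are illustrative assumptions, not a real kernel API.

# Static channel table: (sender, receiver) pairs the kernel allows.
ALLOWED_CHANNELS = {
    ("linux_guest", "secure_services"),
    ("network_gateway", "linux_guest"),
}

# Operations the Secure Services partition is willing to perform.
ALLOWED_OPERATIONS = {"sign", "decrypt"}


def secure_services(request):
    """Handle a request inside the (hypothetical) Secure Services partition."""
    if request.get("op") not in ALLOWED_OPERATIONS:
        return {"status": "rejected", "reason": "unknown operation"}
    # Key material never leaves this partition; only a result is returned.
    return {"status": "ok", "op": request["op"]}


# Partitions that accept requests in this sketch.
HANDLERS = {"secure_services": secure_services}


def send(sender, receiver, request):
    """Kernel-side IPC: deliver a request only over a configured channel."""
    if (sender, receiver) not in ALLOWED_CHANNELS:
        return {"status": "denied", "reason": "no channel"}
    handler = HANDLERS.get(receiver)
    if handler is None:
        return {"status": "denied", "reason": "no handler"}
    return handler(request)
```

In this sketch a compromised Linux guest can still ask Secure Services for a signature, but it cannot request an operation outside the allowed set, and the network-facing partition has no channel to Secure Services at all.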

Figure 2: Improving the security of the Linux OS using a Separation Kernel and Virtualization

In this scheme, the Separation Kernel provides security equivalent to that of ARM TrustZone or other supplemental isolation mechanisms, but it does so using more modern and flexible hardware-assisted virtualization technology such as ARM VE or Intel VT.

I/O device handling

Another important aspect of the security of a virtualized system is device management, in particular the management of devices that are made available to a Guest OS. If the Separation Kernel were to give the Linux Guest complete control of DMA devices, the security of the whole system could be compromised. Using DMA, the Guest OS could instruct the device to read or write directly to any area of main memory, including the kernel's. Unless specific protection is in place, it is possible to bypass all security mechanisms.

For this reason, many modern SoCs have introduced functionality that limits what a DMA device can access: the IOMMU. Several implementations are on the market, such as ARM SMMU, Intel VT-d and Renesas IPMMU. The IOMMU, much like a CPU MMU, provides a programmable interface to define which address ranges the device can access. This allows device drivers to run entirely within a Separation Kernel partition or a Guest OS. Direct device access from the Guest is strongly discouraged where no IOMMU is capable of protecting the system. Sadly, it remains common practice, accepted as a compromise for the sake of maintainability or time-to-market.
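The address-range check that an IOMMU performs can be illustrated with a small Python model. It is deliberately simplified: real IOMMUs work with page tables and address translation rather than a flat list of windows. The `Iommu` class, the `guest_nic` device name and the addresses are assumptions made only for this sketch.

```python
# Toy model of an IOMMU: each device may only touch the address
# windows assigned to it. Names and addresses are illustrative assumptions.

class Iommu:
    def __init__(self):
        # device name -> list of (start, end) half-open address windows
        self._windows = {}

    def grant(self, device, start, length):
        """Allow `device` to access the range [start, start + length)."""
        if length <= 0:
            raise ValueError("length must be positive")
        self._windows.setdefault(device, []).append((start, start + length))

    def allows(self, device, addr, length):
        """Return True only if the whole transfer fits inside one window."""
        if length <= 0:
            return False
        end = addr + length
        return any(start <= addr and end <= stop
                   for start, stop in self._windows.get(device, []))


iommu = Iommu()
# Assume the guest's network buffer pool is 1 MiB at 0x8000_0000.
iommu.grant("guest_nic", 0x8000_0000, 0x10_0000)
```

With this configuration, a transfer inside the buffer pool is accepted, while a transfer that targets another address, runs past the end of the window, or comes from a device with no granted window is refused, whatever the Guest driver asked the device to do.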

Where such hardware protection does not exist, DMA devices should instead be managed by the Separation Kernel (with a Virtual Driver partition for each DMA device) to ensure that a flaw in a Guest OS device driver cannot mis-program the DMA hardware and cause potentially fatal memory corruption. More precisely, the DMA requests need to be handled by the Separation Kernel, while the more complex part of the driver can still run in the Guest. This pushes the driver implementation toward a device-specific, paravirtualized model so that the driver continues to function correctly.
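The split described above, in which the Guest keeps the complex driver logic and a Virtual Driver partition is the only code that programs the DMA hardware, might look like the following Python sketch. The memory layout, the `DmaRequest` descriptor and the `VirtualDmaDriver` class are hypothetical, and a `bytearray` stands in for physical memory and for the DMA engine.

```python
from dataclasses import dataclass

# Toy physical memory: the kernel owns the first 256 bytes,
# the guest owns the next 256. The layout is an illustrative assumption.
MEMORY = bytearray(512)
GUEST_BASE, GUEST_SIZE = 256, 256


@dataclass
class DmaRequest:
    """Descriptor the guest-side driver submits instead of touching hardware."""
    dest: int      # destination address in physical memory
    data: bytes    # payload the device should write


class VirtualDmaDriver:
    """Runs in its own partition; the only code allowed to program DMA."""

    def submit(self, req: DmaRequest) -> bool:
        end = req.dest + len(req.data)
        inside_guest = GUEST_BASE <= req.dest and end <= GUEST_BASE + GUEST_SIZE
        if not req.data or not inside_guest:
            return False  # reject: transfer would leave the guest's memory
        # Stand-in for programming the real DMA engine.
        MEMORY[req.dest:end] = req.data
        return True


driver = VirtualDmaDriver()
ok = driver.submit(DmaRequest(dest=300, data=b"packet"))
bad = driver.submit(DmaRequest(dest=10, data=b"overwrite kernel"))
```

The first request lands in the guest's own region and is carried out. The second targets kernel memory and is rejected before any hardware is programmed, so a buggy or malicious guest driver cannot corrupt memory outside its partition.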

The added complexity is the price to pay for keeping the system robust, safe and secure, and experience demonstrates that the overhead is actually smaller than anticipated when the interface is correctly designed.

Author details: Carmelo Loiacono is a Field Application Engineer, Green Hills Software