Access Denied


Some software problems just never seem to go away.

The Common Weakness Enumeration (CWE) list of security vulnerabilities compiled by US defence institute Mitre covers much more than embedded software, but many of the problems it catalogues wind up in the edge devices now attached to the internet.

Even cross-site request forgery, an attack that exploits logical flaws in web requests, has been used against printers and other embedded devices that implement browser-based setup scripts.

The out-of-bounds write continues to be the runaway winner. The most famous example of this kind of problem is the stack poisoning attack, caused by deliberately overflowing a buffer used to cache incoming data. The overflowing data winds up being used by another process, leading to the hacker gaining control over at least part of the target. Protections in the desktop and cloud environments have served to curtail stack attacks. Research performed by Microsoft several years ago found stack poisoning had become far less common as an attack vector.
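The mechanics of an out-of-bounds write can be sketched with a toy model of flat memory. This is an illustration only: the layout, the `memory` array and the `unchecked_copy` and `checked_copy` helpers are hypothetical names invented here, with a four-byte control word standing in for whatever the attacker wants to overwrite, such as a return address.

```python
# Toy flat memory: an 8-byte input buffer sits directly below a
# 4-byte control word (a stand-in for, say, a return address).
memory = bytearray(12)
BUF_START, BUF_SIZE = 0, 8
CONTROL_START = 8

memory[CONTROL_START:CONTROL_START + 4] = b"SAFE"


def unchecked_copy(data: bytes) -> None:
    """Copy incoming data with no length check, as careless code might."""
    for i, byte in enumerate(data):
        memory[BUF_START + i] = byte


def checked_copy(data: bytes) -> None:
    """Same copy, but reject payloads larger than the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("payload exceeds buffer")
    for i, byte in enumerate(data):
        memory[BUF_START + i] = byte


# A 12-byte payload fills the 8-byte buffer and spills into the control word.
unchecked_copy(b"A" * 8 + b"EVIL")
control_after_overflow = bytes(memory[CONTROL_START:CONTROL_START + 4])
```

After the unchecked copy, the control word holds the attacker's bytes rather than its original value. The checked variant shows the kind of limit test that a compiler or a careful programmer can add in software.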

However, the focus seems to have shifted and now the heap is the big target, with pointers being abused to load unwanted data into system data structures that can then be used to take control of the target.

Protective tools have appeared in embedded systems as well. Last month, Segger launched a version of its compiler that can automatically insert limit checks into stack code to catch attempted overflows at runtime. Another approach, supported by vendors such as Trustinsoft, is to check the source code itself to catch situations where code implicitly accepts oversized or corrupted payloads.

Because pointers are almost inherently untrustworthy, they come in for special treatment in the hardware extensions defined by a group from the University of Cambridge and SRI International. Though early funding came from DARPA when the project got underway in 2010, it has now received backing from the UK government’s R&D-funding agencies. In spring 2022, engineers gathered at a Digital Security by Design (DSbD) event organised by Digital Catapult to hear about the UKRI funding being put behind the programme. Its aim is to bridge the chasm between academic research and the point at which chipmakers feel confident enough about tool and market support to put the technology into their own processors.

Not a new concept

The underlying concept is far from new, dating back to ideas introduced in minicomputers designed in the late 1960s and early 1970s.

Intel attempted to bring the concept to the microprocessor with its ill-fated iAPX432 project and even carried over some of the operations into the 80286 and its successors. But with the protection mechanisms controlled by microcode, they incurred overhead on the order of hundreds of cycles.

The work on Capability Hardware Enhanced RISC Instructions (CHERI) called for a more streamlined implementation. Working initially with the MIPS64 architecture, the team came up with a heavily expanded format that doubles the size of a regular pointer. On top of the 64bit pointer, which the program can manipulate, is a 64bit area that contains tags that tell the processor what the program is allowed to do in the space and information on how big a memory space that pointer can cover. User programs can only reduce the size of the memory window. Any attempt to access memory outside the defined space will trigger an exception. Shadow registers in the processor avoid having to go to main memory to work out what accesses can take place and implementors see caches as being important to further speed up operations.
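The behaviour described above can be sketched as a small software model. It is a simplification under stated assumptions, not the CHERI encoding: the `Capability` class, its field names, the `narrow` and `offset` methods and the `CapabilityFault` exception are all hypothetical names chosen for illustration, and real hardware packs the bounds and permissions into a compressed format rather than separate fields.

```python
from dataclasses import dataclass


class CapabilityFault(Exception):
    """Stands in for the hardware exception raised on a bad access."""


@dataclass(frozen=True)
class Capability:
    base: int          # lowest address the holder may touch
    length: int        # size of the permitted window in bytes
    perms: frozenset   # permitted operations, e.g. {"load", "store"}
    cursor: int        # the address part the program can manipulate
    tag: bool = True   # validity marker

    def narrow(self, new_base, new_length, new_perms=None):
        """Derive a capability with a smaller window and/or fewer permissions."""
        perms = self.perms if new_perms is None else frozenset(new_perms)
        inside = (new_base >= self.base and new_length >= 0
                  and new_base + new_length <= self.base + self.length)
        if not (self.tag and inside and perms <= self.perms):
            raise CapabilityFault("a capability can only be reduced")
        return Capability(new_base, new_length, perms, new_base)

    def offset(self, delta):
        """Move the cursor; in this model bounds are checked at access time."""
        return Capability(self.base, self.length, self.perms,
                          self.cursor + delta, self.tag)

    def check(self, perm, size=1):
        ok = (self.tag and perm in self.perms
              and self.base <= self.cursor
              and self.cursor + size <= self.base + self.length)
        if not ok:
            raise CapabilityFault(f"{perm} at {self.cursor:#x} denied")


memory = bytearray(64)


def store(cap, value):
    cap.check("store")
    memory[cap.cursor] = value


def load(cap):
    cap.check("load")
    return memory[cap.cursor]


# A root capability covers all 64 bytes; user code carves out an 8-byte window.
root = Capability(0, 64, frozenset({"load", "store"}), 0)
buf = root.narrow(16, 8)
store(buf.offset(7), 0xAB)   # last byte inside the window: allowed
```

In this model, `store(buf.offset(8), 1)` raises `CapabilityFault` because the access falls one byte past the window, and `buf.narrow(16, 16)` fails because it would enlarge the window rather than reduce it.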

Since 2014, the Cambridge group has worked with Arm on implementing the technology efficiently. The result is an experimental prototype called Morello, built first in simulation using Arm’s Fast Models technology and then as a superscalar hardware implementation on a 7nm process. This hardware is central to the latest DSbD initiative, which will involve trials to identify how much effort porting to CHERI involves and what the performance effects are.

By spring this year, close to 30 companies had signed up for the DSbD programme. The Defence and Security Accelerator (DASA) has also engaged with CHERI, launching a competition for projects to experiment with adding protections to existing software.

Vehicle-immobiliser specialist CAN-Phantom reported at the DSbD demo day this spring that reworking code for its products pointed to a minor performance hit, on the order of a couple of microseconds for network traffic. The delays were unlikely to affect the application, though it was difficult to compare application speed directly.

In terms of protection, CHERI extensions successfully protected against buffer-overflow attacks on the stack and other attempts to read outside a pointer’s expected address range. Similarly, Microsoft’s analysis of CHERI, performed in 2020, suggested that having the technology in place would prevent just over 30 per cent of reported memory vulnerabilities from being exploited without a patch.

The key question is whether the hardware overhead provides a sufficient benefit to make it worth deploying over and above static analysis and improved practice. It cannot trap all problems. CAN-Phantom’s work showed it is still possible to use memory after the operating system allocator has notionally released it, indicating that code would still need to be careful to remove capabilities after heap space had been supposedly cleared and shut down access to those locations. In its analysis from 2020, Microsoft’s security analysis team suggested the use of runtime software such as Cambridge’s Cornucopia to trap these issues.
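The use-after-free gap, and the idea of a runtime sweep that closes it, can be sketched in a toy model. This is not the Cornucopia implementation: the `Cap` and `Heap` classes, the `issued` list standing in for every capability held in memory and registers, and the `revoke` method are hypothetical names used only to illustrate the principle of invalidating capabilities that point into freed memory.

```python
class Cap:
    """Minimal capability: a bounded window plus a validity tag."""

    def __init__(self, base, length):
        self.base, self.length, self.tag = base, length, True


class Heap:
    """Bump allocator that tracks every capability it hands out."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.next_free = 0
        self.issued = []       # stand-in for all capabilities in the system
        self.quarantine = []   # freed (base, length) regions awaiting a sweep

    def malloc(self, length):
        cap = Cap(self.next_free, length)
        self.next_free += length
        self.issued.append(cap)
        return cap

    def free(self, cap):
        # Freeing only quarantines the region; the capability itself stays valid.
        self.quarantine.append((cap.base, cap.length))

    def revoke(self):
        """Clear the tag on every capability that overlaps a freed region."""
        for cap in self.issued:
            for base, length in self.quarantine:
                if cap.base < base + length and base < cap.base + cap.length:
                    cap.tag = False
        self.quarantine.clear()

    def load(self, cap, offset=0):
        if not cap.tag or not 0 <= offset < cap.length:
            raise PermissionError("capability fault")
        return self.memory[cap.base + offset]


heap = Heap(32)
p = heap.malloc(8)
heap.memory[p.base] = 0x42
heap.free(p)
stale_read = heap.load(p)   # bounds still hold, so the stale access succeeds
heap.revoke()               # after the sweep, heap.load(p) raises a fault
```

The bounds check alone cannot catch the stale read, because the capability is still well formed. Only the sweep, which clears the tag, turns the later access into a fault.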

Some of the benefits may come from the restructuring of software designed to take advantage of the hardware control rather than simply protecting against common hacks.

One method to protect against heap attack is to use address randomisation, but this can cause unexpected behaviour in circumstances that are difficult to replicate. CHERI adds programmable shadow data and program registers that let the hypervisor or operating system move data around in physical space and avoid the need for user-level randomisation.

Though CHERI was designed for 64bit systems, there may also be benefits in a slimmed-down version being applied to 32bit microcontrollers. The mechanism could be used to replace the fixed memory-protection units that some architectures use as a halfway house between a flat space and full virtual memory. Because these units can rely on power-hungry associative memories, a CHERI-based replacement that provides isolation and address translation could offer an advantage in terms of energy, perhaps at the cost of a small reduction in instruction throughput.

Some groups are investigating CHERI for use with unikernels, an approach that Lynx Software Technologies has embraced for high-criticality applications. The idea behind the unikernel is that by stripping all unused functions out from the operating system and its libraries, the target has a much smaller attack surface. But because unikernels are designed for a flat memory space, they need some form of protection to avoid functions inadvertently or deliberately corrupting the memory owned by other tasks.

Though the technology has picked up momentum over the past few years, hardware support remains restricted to Arm’s prototype. Several groups, including the Cambridge team, are working on RISC-V implementations, and Microsoft has proposed a 32bit version for use with RISC-V in addition to the other 64bit projects.

However, the push being given to CHERI on the software side may start the final transition to commercial hardware in the coming years.