As intelligence heads towards the edge of the IoT, the container concept is gaining in popularity

The internet of things (IoT) has a connectivity problem. It relies fundamentally on the ability to ensure ‘things’ can stream bytes with abandon so that servers can mine their data. But reality is beginning to intervene as it becomes clear even a 5G-powered IoT would fold under the weight of such an architecture.

Mark Skarpness, Intel’s software and services group director, told delegates at last autumn’s Embedded Linux Conference Europe that advanced cyber-physical systems, such as cars, could generate 4Tbyte of data per day. “Obviously, with that amount of data being generated, you can’t just send all that data to the cloud and do analytics on it there.”

There are other problems for those systems. The finite speed of light means the round-trip time for calculations performed in the cloud is too long to support real-time decisions. Software needs to run locally, or at least within 100 miles or so of a robot or similar cyber-physical system, to be able to respond in time to sudden problems – even assuming the connections are reliable.

Many systems will not generate anything like the amount of data produced by a self-driving car. But they are beginning to demand that the kind of machine-learning and data-mining tasks currently deployed in the cloud be migrated to where they are located. At the company’s Ignite conference last autumn, Microsoft Azure senior programme manager Olivier Bloch described a system being introduced by Schneider Electric to monitor the many lonely ‘nodding donkey’ or sucker-rod oil-well pumps that pepper the southern US.

“They are out there in the wild not supervised and extracting oil from wells that are a mile deep sometimes. Any number of things can go wrong,” Bloch explains. The remoteness of the pumps means that, when a problem arises and a crew arrives to fix it, the damage is already done.

One of the main problems is the pump pulling sediment into the flow of oil. “You have to slow down and allow things to settle before you can start to pump again,” Bloch says.

"The big upside of containers [is that] when running an inside a container, you will see almost no difference in performance."
Cedric Vincent, Witekio

Often, pressure readings can identify problems such as pipeline leaks and sediment contamination. But a direct reading of pressure does not show the problem – it is seen as a divergence from a normal two-phase cycle of high and low pressure as the rod in the well rises and falls. Traditionally, these readings are plotted on a ‘dynacard’ that shows graphically how the pressure is varying over a pumping cycle.

Schneider wants to use machine-learning algorithms to determine when the problems kick in. The pumps could report their running status to servers that decide whether conditions are deteriorating, but that would involve sending large quantities of data over low-bitrate links, such as GSM. So the company is now installing Linux-based computers at the wellheads to perform the analysis onsite and to raise an alarm when something needs attention – possibly halting the pumps if the issue is serious.
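Schneider has not published the details of its models, but the idea can be illustrated with a short sketch: compare each pumping cycle’s pressure trace against a reference dynacard and raise an alarm when the deviation grows too large. The reference values, threshold and alarm logic below are hypothetical.

```python
# Illustrative sketch only: Schneider's actual models are not public.
# Compares one pumping cycle's pressure trace against a reference
# "dynacard" and flags the cycle when the deviation grows too large.

from statistics import mean

REFERENCE_CYCLE = [2.1, 3.8, 5.6, 7.0, 6.4, 4.9, 3.2, 2.3]  # bar, hypothetical
ALARM_THRESHOLD = 0.8                                        # mean absolute deviation, bar

def cycle_deviation(samples, reference=REFERENCE_CYCLE):
    """Mean absolute deviation between a measured cycle and the reference."""
    return mean(abs(s - r) for s, r in zip(samples, reference))

def check_cycle(samples):
    """Return True if the cycle looks abnormal and an alarm should be raised."""
    return cycle_deviation(samples) > ALARM_THRESHOLD

if __name__ == "__main__":
    healthy  = [2.0, 3.9, 5.5, 6.9, 6.5, 5.0, 3.1, 2.2]   # close to the reference
    sediment = [2.0, 2.6, 3.1, 3.4, 3.3, 3.0, 2.7, 2.4]   # flattened cycle, e.g. sediment
    print(check_cycle(healthy))    # False - within tolerance
    print(check_cycle(sediment))   # True  - raise an alarm, possibly halt the pump
```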

To support systems like Schneider’s, Microsoft and others in the nascent edge-computing market are tweaking technologies originally developed for use in cloud servers so they can be deployed to custom hardware in the field and managed from the network.

“That edge device will be managed from the cloud. When the device comes online, it sends a ping to the hub: ‘I’m the edge device. Do you know me?’ The cloud will say: ‘Yes, you need to work with this configuration’. That configuration will tell the edge device what modules to load from the cloud to run locally.”
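That exchange can be sketched in a few lines of Python, with the cloud hub mocked out as a local lookup table. Real edge platforms use their own protocols and manifest formats; the device name, registry addresses and module names below are placeholders.

```python
# Simplified sketch of the handshake described above, with the cloud hub
# mocked as a local dictionary. Real services use their own protocols and
# manifest formats; all names here are illustrative.

HUB_CONFIGS = {
    "wellhead-gateway-42": {
        "modules": [
            {"name": "modbus-reader",  "image": "registry.example.com/modbus-reader:1.4"},
            {"name": "dynacard-model", "image": "registry.example.com/dynacard-model:2.0"},
        ]
    }
}

def register_with_hub(device_id):
    """'I'm the edge device. Do you know me?' - return the config the hub holds for us."""
    config = HUB_CONFIGS.get(device_id)
    if config is None:
        raise RuntimeError(f"Hub does not recognise device {device_id}")
    return config

def apply_config(config):
    """Pull and start each module the hub has assigned to this device."""
    for module in config["modules"]:
        # In a real gateway this would pull the container image and start it.
        print(f"pulling {module['image']} and starting module {module['name']}")

if __name__ == "__main__":
    apply_config(register_with_hub("wellhead-gateway-42"))
```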

One of the problems with downloading software modules to an edge device is handling configurations and compatibility with other applications, particularly for services that might stay active for a decade or more. One option is to use virtual machines to host the downloadable code. But virtualisation imposes a performance overhead because many I/O requests that would normally go straight to hardware need to be emulated – at least partially – in software. Although cloud operators adopted virtualisation early on to provide better isolation between different tasks and users running on shared hardware, they have wholeheartedly embraced another option over the past five years – the container.

Cedric Vincent, CTO at Witekio, says: “The big upside of containers [is that], when running an app inside a container, you will see almost no difference in performance.”

Now championed by companies such as Docker and Joyent, the container evolved from an idea created by Bill Joy for BSD Unix in the early 1980s as a way to prevent software under development from interfering with other applications. A system utility called chroot moved the root directory for a process to an arbitrary folder in the filesystem. Short of hacks to break out of this virtual jail, the user would have no direct access to files above that point.
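The same trick is still available today. The fragment below, which needs root privileges to run, uses Python’s os.chroot to confine a process to a jail directory; the directory path is hypothetical.

```python
# Minimal illustration of the chroot idea (requires root to run).
# After os.chroot(), path lookups for this process are confined to JAIL_DIR:
# "/" now refers to JAIL_DIR and files above it are no longer reachable directly.

import os

JAIL_DIR = "/var/jail/demo"   # hypothetical directory prepared in advance

os.makedirs(JAIL_DIR, exist_ok=True)
os.chroot(JAIL_DIR)   # move this process's root to the jail
os.chdir("/")         # make sure the working directory is inside the new root

print(os.listdir("/"))  # lists the contents of JAIL_DIR, not the real root
```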

Containers use features built progressively into Linux to extend the jail concept. They can create the illusion for applications of exclusive access to a full Linux system with root privileges. In fact, the application is stuck inside a software container running alongside many others, each oblivious of the others’ existence. A key advantage of this approach is that it overcomes the problem of dependency mismatches. Applications tested with one version of Java, for example, could fail if the Java runtime receives an update. Each container that needs Java can have its own version sitting in its namespace. However, the core operating system is shared to avoid having to duplicate entire memory images.

“You pack inside the container the minimum you need for your application,” Vincent says.
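The effect is easy to demonstrate with the Docker SDK for Python: two containers can carry different interpreter versions while sharing the host’s kernel. The image names are examples and the sketch assumes a local Docker daemon is running.

```python
# Sketch of the dependency-isolation point above, using the Docker SDK for
# Python (the 'docker' package). Each container carries its own runtime
# version while both share the host kernel; image names are examples.

import docker

client = docker.from_env()

for image in ("python:3.8-slim", "python:3.11-slim"):
    # Each container sees only the interpreter packed into its own image.
    output = client.containers.run(image, "python --version", remove=True)
    print(image, "->", output.decode().strip())
```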

Some embedded systems teams already use containers running on conventional servers to avoid conflicts between different releases of their software. “Tools like Docker are changing the way we can isolate builds,” said Niall Cooling, managing director of training company Feabhas, at the Agile for Embedded Conference last year.

In an edge gateway, containers offer the same isolation advantages as virtualisation. However, in contrast to development builds, applications inside the containers provide services to other tasks running on the edge gateway, typically over virtual network connections. In the Schneider system, a custom task reads data from Modbus programmable logic controllers, feeding a stream into a service developed by Microsoft that converts the data into a JSON format the machine-learning service understands.
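A stripped-down version of that translation step might look like the sketch below, which assumes the register values have already been read from the PLC; the field names, scaling and device identifier are illustrative rather than Schneider’s actual schema.

```python
# Sketch of the translation step described above: raw Modbus register values
# (assumed already read from the PLC - the Modbus client itself is omitted)
# are turned into a JSON record for the analytics container to consume.
# Field names and scaling are illustrative, not Schneider's actual schema.

import json
import time

def registers_to_json(device_id, registers):
    """Convert raw 16-bit register values into a JSON telemetry message."""
    return json.dumps({
        "deviceId": device_id,
        "timestamp": time.time(),
        "tubingPressureBar": registers[0] / 100.0,   # assume fixed-point scaling
        "casingPressureBar": registers[1] / 100.0,
        "strokesPerMinute":  registers[2],
    })

if __name__ == "__main__":
    raw = [712, 598, 6]   # example dump from a holding-register read
    message = registers_to_json("wellhead-gateway-42", raw)
    print(message)        # in the gateway this would be sent over the virtual
                          # network to the machine-learning container
```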

Microsoft expects to support a number of flavours of Linux with its edge offering. Embedded specialists such as Witekio favour the Yocto distribution because of its support for building stripped-down systems. Startup Resin.io is using Yocto as the basis for its own open-source project that is aimed squarely at running containers in an embedded environment.

Security remains a potential weakness for containers, particularly for implementations where an edge gateway offers compute services to different local users. Linux features such as control groups and seccomp, which limits access to kernel calls, help stop containerised applications from breaking out of their jails, although they reduce flexibility for the software sitting inside. Intel is using the Clear Linux project to try to bolster security by drawing on its Virtualization Technology hardware extensions.
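In practice, those features tend to be applied through the container runtime’s options rather than directly. The sketch below, again using the Docker SDK for Python, shows the kind of restrictions involved; the image name and seccomp profile path are placeholders.

```python
# Sketch of tightening a container using the Docker SDK for Python: the
# resource limits map onto control groups and the seccomp profile restricts
# which kernel calls the contained application may make. The image name and
# profile path are placeholders.

import docker

client = docker.from_env()

container = client.containers.run(
    "registry.example.com/dynacard-model:2.0",
    detach=True,
    read_only=True,                 # immutable root filesystem
    cap_drop=["ALL"],               # drop all Linux capabilities
    mem_limit="256m",               # cgroup memory limit
    pids_limit=64,                  # cgroup process-count limit
    security_opt=[
        "no-new-privileges",                            # block privilege escalation
        "seccomp=/etc/docker/restricted-profile.json",  # hypothetical seccomp profile
    ],
)
print(container.id)
```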

In the wake of Meltdown and Spectre, hardware designers may simply decide to provide devices in which code that requires high security is allowed to run inside processors that have physically separated memory partitions.

Despite the potential security problems, management from the cloud is likely to see a migration towards edge devices that take advantage of containers and other technologies that were raised in the server farm.