Reality can wait

Since Ivan Sutherland demonstrated the Sword of Damocles half a century ago, head-mounted virtual reality (VR) systems have promised a bright future for human-computer interaction.

And though the latest crop of games and other applications for headsets has achieved greater market penetration than previous attempts, that future remains some way off.

When he took to the stage at the sixth conference organised by Facebook’s virtual-reality subsidiary in September, chief scientist Michael Abrash did not have entirely good news to tell developers keen to get their hands on a new generation of systems.

At the previous year’s conference, the company showed off a prototype that encapsulated many of the concepts that the designers expected would turn VR into a mainstream product. This year, Abrash admitted: “You’re not going to get that shiny headset any time soon. I don’t know when you’ll be able to buy the magical headset I described last year.”

Abrash quoted Hofstadter’s Law to support his view: “It always takes longer than you expect, even when you take into account Hofstadter’s Law.”

The problems exist at two levels. The first, and most obvious, is that VR demands incredible processing throughput just to render a realistic virtual world. The second is that each step towards that primary aim reveals numerous further issues.

One of the biggest obstacles for the moment is that graphics pipelines can provide enormous computing horsepower, but that horsepower demands high power consumption and, even worse, introduces levels of latency that ruin the feeling of immersion. Reducing the compute load through selective optimisations could bring the latency down, but those changes do not come for free: some make latency even more troublesome.

For example, the concept of foveated rendering has been around for some time in VR circles as a way of reducing the compute burden. Qualcomm put a version of it into the software development kit (SDK) for the Snapdragon 835 and various headset makers have put together demonstrations over the years.

HTC claimed to be the first to commercialise the underlying technology for foveated rendering with the launch of the Vive Pro Eye, a £1500 headset aimed at business rather than gaming users.

Conceptually, foveated rendering is simple. It relies on the observation that humans have sharp vision only towards the centre of the field of view. Off to the side, the brain only perceives rough shapes and colours. In principle, if graphics pipelines only perform full rendering on the part of the image the eye can see sharply, they can dramatically reduce the workload as well as latency and power consumption. Practically, things are less simple.

Concept v reality
Early attempts at foveated rendering assumed a fixed position for the eye, but this is a poor substitute for reality as it takes no account of the rapid movements, known as saccades, that the eyes perform when the brain wants to examine an unfamiliar scene.
Practical foveated rendering relies on the ability to track eye movements in real time and this is the key feature that made its way into the Vive Pro Eye. Numerous other projects have incorporated cameras.
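
As a rough illustration of the principle (not any headset maker's actual API), the sketch below picks a shading resolution for each screen tile based on its angular distance from the tracked gaze point. The band thresholds and scale factors are made-up values for the example.

```python
import math

# Illustrative eccentricity thresholds (degrees from the gaze point) and the
# fraction of full shading resolution applied in each band. These numbers are
# assumptions for the sketch, not values from any shipping headset SDK.
FOVEAL_BANDS = [
    (5.0, 1.0),    # within ~5 degrees: full resolution
    (15.0, 0.5),   # mid-periphery: half resolution
    (40.0, 0.25),  # far periphery: quarter resolution
]
FALLBACK_SCALE = 0.125  # everything beyond the last band

def shading_scale(tile_centre_deg, gaze_deg):
    """Pick a resolution scale for a screen tile from its angular distance
    to the tracked gaze point (both given in degrees of visual field)."""
    eccentricity = math.hypot(tile_centre_deg[0] - gaze_deg[0],
                              tile_centre_deg[1] - gaze_deg[1])
    for limit, scale in FOVEAL_BANDS:
        if eccentricity <= limit:
            return scale
    return FALLBACK_SCALE

# Example: with the gaze 2 degrees off-centre, a tile at the middle of the
# view is shaded at full rate, while one 30 degrees out drops to a quarter.
print(shading_scale((0.0, 0.0), (2.0, 0.0)))   # 1.0
print(shading_scale((30.0, 0.0), (2.0, 0.0)))  # 0.25
```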

Even if foveated rendering is never used, multiple cameras trained on the face are likely to make their way into mass-market VR headsets in the future so they can track movement as the user speaks and changes expressions. Oculus and others see this feature as essential to virtual-presence or “social teleportation” applications.

At Oculus Connect 6, Abrash showed videos of his group’s work on “codec avatars”. These are digital simulations of people that are meant to look as realistic as possible, moving in the same way as the people wearing the headsets. As well as sending motion and voice data, the software stitches together a moving virtual face from the video recorded by each of the headset cameras.

Other cameras are trained on the hands and arms, at least when they are in front of the headset wearer: this will let them interact with a virtual space in a game, for example, without having to wield a controller stuffed with accelerometers.

Though headset cameras seem inevitable and are key to implementing foveated rendering, the acceleration the technique promises is harder to achieve than it looks.

John Carmack, renowned for his work as one quarter of the partnership that delivered Wolfenstein and Doom to PCs in the early 1990s, is keen to point out how close the interaction between software and hardware designers needs to be.

In his own talk at Oculus Connect 6, Carmack, who took on the role of Oculus CTO in 2013, pointed to the numerous problems facing foveated rendering and other attempts to squeeze more performance out of limited headset chipsets.

One problem he indicated is the need to deal with multiple layers of graphics software interfaces (APIs) in order to display a scene on a headset. Because these APIs tend to assume a display will want complete frames, that is how rendered output is packaged. But it means rendering much more than is necessary and can inject unwanted latency into the process. If the headset could pick up just groups of scan lines and present them to a display that uses a rolling shutter, it could cut out almost a complete frame’s latency. But that implies a much closer connection between rendering and the display subsystem, where different choices might be made.
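
A back-of-envelope calculation shows where that saving comes from. The sketch below assumes a 72Hz display divided into evenly sized strips, each handed over just before the rolling scanout reaches it; it ignores the rendering time of the strips themselves and display persistence, so the figures are illustrative only.

```python
# Rough comparison of whole-frame versus strip-based scanout latency.
# Assumes a 72 Hz display divided into evenly sized strips and ignores
# per-strip render time and display persistence; illustrative numbers only.

REFRESH_HZ = 72
FRAME_MS = 1000.0 / REFRESH_HZ  # ~13.9 ms per refresh

def worst_case_wait_ms(num_strips):
    """Worst-case wait between a strip being handed over and the rolling
    scanout reaching it, when each strip is submitted just in time."""
    return FRAME_MS / num_strips

# Whole frame: content finished at the start of scanout may wait ~13.9 ms
# before the last lines reach the panel.
print(f"whole frame: up to {worst_case_wait_ms(1):.1f} ms")
# Eight strips handed over just ahead of the rolling shutter: ~1.7 ms each,
# which is where the 'almost a complete frame' of saved latency comes from.
print(f"8 strips:    up to {worst_case_wait_ms(8):.1f} ms")
```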

Rolling shutters are found in OLED-based displays, but these, at least for now, are dimmer than backlit LCDs, which tend to use a global shutter. The results from foveated rendering so far have been disappointing, according to Carmack.

“We still have not seen the slam dunk, perfect, foveated-rendering demo of [where people say] ’this is absolutely unnoticeable’. Even on the PC, where you can throw more hardware and have less latency on it. You can still usually see when you glance around that it’s blurry for an instant. It is going to be worse on mobile because the processing is going to take longer: there’s a longer GPU pipeline for the tiled renderers,” Carmack explains.

The processing reduction is not necessarily as great as originally promised, Carmack says. “You can’t dial it down to this super-precise focus and save lots of power elsewhere. And when it falls down, in some of the areas the sparkling and shimmering going on in the periphery is more objectionable than we might have hoped it would be.”

Remote rendering by a local server is another option being explored by VR headset makers: it does not reduce the amount of processing, but it diverts much of the heavy lifting away from the headset. Such split rendering comes at a cost, however. One is the latency of communication. Although telecom operators are keen to promote 5G as a way of handling VR because it supports much lower latency than many existing high-bandwidth protocols, there are limits on how far the server can be from the user. The speed of light is an obstacle no-one can overcome. Even then, the added complication of having the rendering split may be too much for applications developers to absorb, Carmack says.
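
To see why distance matters, propagation delay alone makes the point. The sketch below assumes signals travel over fibre at roughly two-thirds the speed of light and ignores every other source of delay; the distances and the latency budget mentioned are chosen purely for illustration.

```python
# Back-of-envelope check on how far a split-rendering server can sit from the
# headset before the speed of light alone eats into the latency budget.

SPEED_OF_LIGHT_KM_PER_MS = 300.0   # light covers ~300 km per millisecond in vacuum
FIBRE_SLOWDOWN = 1.5               # signals in fibre travel at roughly 2/3 of c

def round_trip_ms(distance_km):
    """Propagation delay (ms) for a round trip over fibre, excluding any
    processing, queueing or radio-access delay."""
    return 2 * distance_km * FIBRE_SLOWDOWN / SPEED_OF_LIGHT_KM_PER_MS

for km in (10, 100, 1000):
    print(f"{km:>5} km away: {round_trip_ms(km):.2f} ms of the budget gone")
# 10 km is negligible; at 1,000 km, propagation alone costs ~10 ms, a large
# slice of a ~20 ms motion-to-photon target before any rendering happens.
```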

The short-term solution may simply be to be less ambitious: focus less on the graphics and more on the audio played to users and the haptics. Carmack argues these aspects are underused by games and application developers.

In the meantime, the long project to build more realistic VR continues. “It’s going to be a decades-long journey to that promised land,” Abrash says.