GPUs set to enter the mainstream

Until recently, demand for more processor performance has been met by faster clocking, greater bit width and ever smaller fabrication nodes. The introduction of multicore x86 devices has also helped to boost performance while keeping power consumption at a reasonable level.

Yet even multicore processors are not always suitable for applications that must process large quantities of data as quickly as possible. In imaging diagnostics – such as MRI and CT – multicore based solutions may take some time to produce final images, depending on the volume of data involved and the processing complexity. This is contrary to the needs of medical professionals, who want image data to be processed as quickly as possible and displayed at high resolution.

Alongside diagnostics, simulation is one of the most ambitious medical uses for CPUs. Virtual microscopy, used in pharmaceutical research, is one example. Here, complex algorithms are used to simulate molecules and the way in which they react with one another. These applications share performance demands high enough that even the latest multicore devices cannot meet them. While it is possible to combine a number of multicore processors in a high performance cluster, the cost of purchasing and operating that kind of supercomputer, plus its infrastructure, is huge.

An alternative approach can be found in high performance general purpose graphics processing unit (GPGPU) technology and tools, which allow embedded systems developers to harness the massive parallel computing power of modern graphics cards.

GPU technology is developing rapidly, driven in particular by the demands of the consumer market. The ability of GPUs to handle more frames per second at higher resolution and to provide uniform programming interfaces has created interest in using this technology in data processing applications. At the data level, there is not much difference between computing and displaying virtual worlds for games and visualising raw data from a variety of sources, such as an ultrasound or coronary examination. For parallel, data intensive tasks in particular, GPUs can be better suited than CPUs.
Because GPUs are restricted to specific kinds of problem, they can be designed so that most of their transistors are devoted to computing operations rather than to control and caching, as is the case with CPUs. Computer scientists have led the way in using the huge parallel computing power of modern graphics cards. However, such a degree of parallelism made the algorithms extremely complex, restricting the appeal of the approach to idealists and specialists. With the introduction of development environments such as AMD's Accelerated Parallel Processing software development kit, and programming environments like OpenCL as an accompaniment to OpenGL, developers can now access the performance of modern GPUs.

Diagnostic speed can be multiplied simply by using the parallel computing power of modern GPUs. Combined with high resolution and loss free imaging, this performance can produce incredible visualisation results. GPGPU enabled high performance systems also have cost benefits, and not just because it is less expensive to set up powerful graphics units as coprocessors on a standard embedded platform.

Until recently, graphics card technology had one downside: short product life. The problem facing embedded systems developers was the lack of high end embedded graphics with long term availability. AMD has overcome this with the introduction of the ATI Radeon HD 5770 graphics card and the Radeon E6760 embedded discrete GPU (see below). This will help to provide medical apparatus OEMs with the design security they need for their graphics hardware. In addition, OEMs can benefit from the way the GPU is integrated: a PCI Express x16 graphics (PEG) card can be used in a variety of embedded platforms, from a standard server board to a high end PICMG 1.3 backplane configuration. The Radeon E6760 GPU is also available in the MXM form factor.
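
The data parallel model that environments such as OpenCL expose can be illustrated without a GPU. The following is a minimal sketch in plain Python, not OpenCL code: the function names, the `launch` helper and the window/level mapping (a common way to map raw scanner intensities to 8 bit display values) are assumptions chosen for illustration and do not come from AMD's tools. The point is that each call reads and writes only its own element, so the calls could be spread across hundreds of processing elements.

```python
# Hypothetical illustration in plain Python; it does not use OpenCL or a GPU.

def window_level_kernel(gid, raw, out, low, high):
    """Map one raw sample to an 8 bit display value.

    Reads only raw[gid] and writes only out[gid], so every call is
    independent of every other call.
    """
    value = raw[gid]
    if value <= low:
        out[gid] = 0
    elif value >= high:
        out[gid] = 255
    else:
        out[gid] = (value - low) * 255 // (high - low)


def launch(kernel, global_size, *args, order=None):
    """Run the kernel once per index, in the given order.

    A GPU would distribute these calls across its processing elements;
    here they simply run one after another.
    """
    indices = range(global_size) if order is None else order
    for gid in indices:
        kernel(gid, *args)


def window_level(raw, low, high, order=None):
    """Apply the kernel to every sample and return the display values."""
    out = [0] * len(raw)
    launch(window_level_kernel, len(raw), raw, out, low, high, order=order)
    return out
```

Calling `window_level([0, 1000, 2000, 3000, 4095], 1000, 3000)` returns `[0, 0, 127, 255, 255]`, and passing `order=reversed(range(5))` gives the same result. That order independence is what lets this kind of per-sample work be handed to a GPU.
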
Graphics intensive applications are not the only ones that might benefit from high processing capability; the graphics unit's stream processors can take a huge workload off the multicore CPUs when handling different data streams. Combining these processor units in shader units enables vector processing in addition to scalar operations. This potential can be accessed through standard programming platforms and APIs, such as OpenCL or DirectCompute, simplifying and speeding application development.

In terms of cost of ownership, GPGPU solutions are designed to be cost efficient: a system enhanced by modular GPUs does not need multicore processors distributed over a number of server boards, which produces measurable savings. Exchanging a GPU is also less complicated than replacing an entire board. And a GPU can be replaced in the future by a compatible and more powerful model without altering the system configuration significantly.

There are many reasons for using high end graphics hardware in medical embedded systems, and the long term availability of these solutions removes the obstacle of short product life. The belief is that GPGPU technology will increasingly become mainstream, cutting the time needed to develop new applications. To secure the advantages in the offing, it is well worth starting to think about implementing the technology.

PCI Express compatibility

With a GPU clock of 850MHz and 1Gbyte of 1200MHz GDDR5 RAM, the Radeon HD 5770 PCI Express capable card uses AMD Stream technology to accelerate OpenGL 3.2 and DirectX 11 applications. It also supports OpenCL 1.0 and DirectCompute 11. For high end 3D rendering applications – in medical imaging and simulation, for example – up to four graphics cards can be coupled using AMD's CrossFireX technology. Two displays can be driven independently on the two dual link DVI-I interfaces with a resolution of up to 2560 x 1600 pixels.
AMD Eyefinity multidisplay technology allows simultaneous operation of three monitors with different resolutions, frame rates, colour models and video overlays, as well as allowing several monitors to be combined to create one large display.

GPU features 480 processing elements

AMD's Radeon E6760 GPU has been developed for use in embedded applications requiring compute intensive GPGPU functionality. With 480 processing elements, the E6760 GPU delivers up to 576Gflops peak single precision floating point performance in such applications as ultrasound, radar and video imaging. GPGPU functionality on the E6760 is enabled by the OpenCL programming language and by AMD's Stream Software Development Kit. The E6760 GPU, which supports up to six independent displays, HDMI 1.4 stereoscopic video and DisplayPort 1.2, can be paired with the forthcoming AMD A-Series accelerated processing unit to provide additional graphics capability and parallel computing power.

Unlike PEG cards, MXM modules mount parallel to the carrier board, so the height of the embedded design is relatively unaffected. MXM modules build on the same advantages that have led to the success of computer on modules: a standardised footprint, a standard pinout and assembly concept, and suitability as add on components.

Aurelius Wosylus is regional sales manager Europe with AMD's embedded business unit.