Hardware processing blocks give highest video performance yet from Blackfin device


When digital signal processors first appeared commercially in the early 1980s, they took the electronics world by storm, eventually enabling the mobile phone revolution – and lots besides.

But since the turn of the millennium, the dsp's profile has been on the wane. Partly, this was due to dsp functionality becoming available in other formats – as instructions for mainstream processors or integrated into an fpga. Yet dsps continue to provide useful functionality and new devices continue to appear on the market.

Analog Devices has been present in the dsp market since the early 1990s. During this time, it has unveiled a number of dsp families, the latest of which is Blackfin, introduced in 2000. This range of 16/32bit processors has been designed to offer software flexibility and scalability in convergent applications, including multiformat audio, video, voice and image processing. The Blackfin architecture brings together elements from Analog's SHARC architecture and Intel's XScale architecture into one core, thereby offering dsp and microcontroller functionality.

Since launching Blackfin, Analog has introduced around 150 variants. Rich Murphy, an Analog Devices marketing manager, said: "The family has been developed around three vectors: high dsp performance; a focus on connectivity; and price/performance."

Now Analog has added four more models to the Blackfin portfolio – the BF608 and 609, targeted at embedded video applications, and the BF606 and 607, aimed at general dsp applications. Murphy said: "The 608 and 609 are likely to find use in automotive applications, plus some industrial vision systems. Both include a hardware accelerator for vision analytics."

All four devices feature dual Blackfin cores running at 500MHz – 'the same core as on previous products', Murphy pointed out. The cores can communicate with each other and with the outside world via an internal and external memory bus structure, said to have been optimised to enhance system level performance. "It's easy for users to get better performance at the same clock rate," said Murphy, "because of the additional features in the parts."

And, recognising the automotive industry will be a major customer for the BF608/609, Murphy noted the devices have also been provided with some integrated safety oriented features, which are compliant with ISO26262.

Although the cores run at 500MHz, typical power consumption is likely to be around 400mW. "Previously, devices such as these would have drawn about 1W," Murphy noted. "And another benefit of the BF608/609 is the packaging means they can be designed into environments where the ambient temperature may reach 105°C. This is important for automotive applications, but also for some industrial uses." These features have been enabled by targeting the BF60x parts at a 65nm process, rather than the 130nm process used previously.

All four additions share many common features. Each Blackfin core has access to a dedicated 148kbyte L1 sram with parity, while the two cores share a 256kbyte L2 sram, which features error check and correction. The BF606 is the exception, with a 128kbyte L2 memory.

There is also a range of peripherals. "For example," said Murphy, "there are four LinkPorts." These allow point to point communication in multiprocessing systems. "We also have this feature in SHARC processors," he continued, "and it allows multiple Blackfin devices to be joined to create a multiprocessing system or a SHARC to be linked to create a floating point processing system. And it's possible to use this interface to link Blackfin devices to an fpga."

Other connectivity features include a USB interface, as well as Ethernet ports. There are also three enhanced parallel peripheral interfaces – ePPIs. These enable the direct connection of lcd panels to the Blackfin processor, as well as parallel a/d and d/a converters. "This is a great feature," Murphy observed. "It provides connectivity to a cmos imager, for example, and is glueless; it allows data to be brought onto the chips in an easy way." All elements are interconnected using a system crossbar and DMA subsystem.

"The infrastructure has been redesigned to provide better throughput," Murphy claimed. "We've given a lot of thought about how to link two relatively fast cores to memory and peripherals. We have had to ensure there is enough bus throughput for the chip to do what the user wants." The crossbar also provides the cores with access to L3 memories.

The BF608/609 are equipped with dedicated hardware processing units – a pipelined video processor (PVP), a pixel compositor and a pixel crossbar (see fig 1). "Together, these blocks offer flexible image processing," Murphy commented. "While the blocks have been developed primarily for automotive applications, there are generally useful functions included. Any software programmer or engineer familiar with image processing should be able to look at the PVP and understand how to use algorithms on it and how to get the most out of it."

In particular, Murphy pointed to three areas where the PVP is likely to help video system developers. "It's good at object detection, object tracking and identification. In the automotive world, this will support driver assistance systems, but it will also find use in security systems, for example."

The PVP has been designed to support up to five concurrent vision algorithms alongside the Blackfin cores. Video input can be up to high definition (1280 x 960) at 30frame/s. Within the PVP are 12 functional blocks – including convolution, scaling and arithmetic – each of which can be assigned to a camera or memory pipe (see fig 2). These configurable signal processing blocks support a range of commonly used algorithms.

"It's a hardware accelerator," said Murphy, "but it can be set up in a number of ways. We have tried to be flexible by giving the user control over what they use the blocks for and how many times they can call the block. There is also an API library written in C, so users can get involved at the bit level if they want to."

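To give a sense of the kind of operation a convolution block performs, the following is a minimal software sketch in plain C of a 3x3 convolution over an 8bit greyscale image. It is not the PVP's API or Analog Devices' library code: the function name `convolve3x3`, its parameters and the border handling are assumptions made purely for illustration, and a hardware block would process pixels as a stream rather than from a frame buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a signed intermediate result into the 8bit pixel range. */
static uint8_t clamp_u8(int32_t v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/*
 * Hypothetical software model of a 3x3 convolution stage.
 *   src, dst : separate width*height greyscale buffers, row-major
 *   kernel   : nine signed coefficients, row-major, applied without
 *              flipping (identical to true convolution for symmetric kernels)
 *   divisor  : positive normalisation factor (e.g. 9 for a box blur)
 * Border pixels have no full 3x3 neighbourhood and are copied unchanged.
 */
void convolve3x3(const uint8_t *src, uint8_t *dst,
                 size_t width, size_t height,
                 const int16_t kernel[9], int32_t divisor)
{
    assert(src != NULL && dst != NULL && src != dst);
    assert(width >= 3 && height >= 3 && divisor > 0);

    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            if (y == 0 || x == 0 || y == height - 1 || x == width - 1) {
                dst[y * width + x] = src[y * width + x];
                continue;
            }
            int32_t acc = 0;
            for (size_t ky = 0; ky < 3; ++ky) {
                for (size_t kx = 0; kx < 3; ++kx) {
                    acc += (int32_t)kernel[ky * 3 + kx] *
                           src[(y + ky - 1) * width + (x + kx - 1)];
                }
            }
            dst[y * width + x] = clamp_u8(acc / divisor);
        }
    }
}
```

Calling it with a kernel of nine ones and a divisor of 9 gives a simple box blur; swapping in different coefficients gives sharpening or gradient filters, which is the sense in which one configurable block can serve several algorithms.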
Murphy said there are restrictions on how many blocks can be used together at one time and which paths they can be applied to. "The user might bring in some image data," he continued, "and use the PVP for some functions, then send the result to memory. The PVP can then be called again and reconfigured for a different piece of processing." He noted the block is aimed primarily at 'software people'. "It's the software people who will have to get it to work and, in recognition, we have taken a software view of the hardware peripherals."

Citing a traffic sign recognition application, Murphy said much of the processing happens before the dsp gets involved. "For example, rgb data gets taken into the pixel converter and output as YUV, then sent to memory. The PVP can do edge detection, again sending that result to memory. That data is then available to the Blackfin core." PVP operations have been optimised to save memory bandwidth. "Anywhere you can limit the use of external memory will help to boost system throughput," Murphy contended.

The BF60x processors are supported by CrossCore Embedded Studio, a software development tool chain. This Eclipse based development environment supports proprietary and open source tools and technologies, including Micrium µC/OS-III, Linux and GCC. There is also a range of development boards and software available for the devices. BF60x devices are sampling with lead customers and are planned to enter volume production early in 2013.
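For readers who want to picture the traffic sign recognition flow Murphy describes (rgb in, YUV out, then edge detection), the two stages can be sketched in software as below. This is an illustrative model only, not the BF60x hardware or its C API: the function names `rgb_to_yuv` and `edge_map`, the full-range BT.601-style integer coefficients and the simple gradient threshold are all assumptions chosen for the example, and the article does not say which conversion matrix or edge operator the hardware uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Clamp a signed intermediate result into the 8bit range. */
static uint8_t clamp_u8(int32_t v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/*
 * Stage 1 (hypothetical): convert interleaved RGB888 to planar YUV444
 * using an 8bit fixed-point approximation of full-range BT.601.
 * The chroma offset (128 << 8) is added before the shift so that the
 * shifted value is never negative.
 */
void rgb_to_yuv(const uint8_t *rgb, uint8_t *y, uint8_t *u, uint8_t *v,
                size_t pixels)
{
    for (size_t i = 0; i < pixels; ++i) {
        const int32_t r = rgb[3 * i];
        const int32_t g = rgb[3 * i + 1];
        const int32_t b = rgb[3 * i + 2];
        y[i] = clamp_u8((77 * r + 150 * g + 29 * b + 128) >> 8);
        u[i] = clamp_u8((-43 * r - 85 * g + 128 * b + 32768 + 128) >> 8);
        v[i] = clamp_u8((128 * r - 107 * g - 21 * b + 32768 + 128) >> 8);
    }
}

/*
 * Stage 2 (hypothetical): mark a pixel as an edge (255) when the sum of
 * the absolute horizontal and vertical luma differences across it
 * exceeds the threshold. Border pixels are left at 0.
 */
void edge_map(const uint8_t *luma, uint8_t *edges,
              size_t width, size_t height, int threshold)
{
    assert(luma != NULL && edges != NULL && luma != edges);
    assert(width >= 3 && height >= 3);

    memset(edges, 0, width * height);
    for (size_t row = 1; row + 1 < height; ++row) {
        for (size_t col = 1; col + 1 < width; ++col) {
            const int gx = luma[row * width + col + 1] -
                           luma[row * width + col - 1];
            const int gy = luma[(row + 1) * width + col] -
                           luma[(row - 1) * width + col];
            edges[row * width + col] =
                (abs(gx) + abs(gy) > threshold) ? 255 : 0;
        }
    }
}
```

In this model, only the Y plane from the first stage is passed to the second, and the core would then read the resulting edge map from memory. That mirrors the division of labour in Murphy's example, where the dsp sees data only after the conversion and edge detection steps have run.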